SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 2, 2010, Pages 382-394

Model-Based Expectation-Maximization Source Separation and Localization

Authors (3): Mandel, Michael I. (a); Weiss, Ron J. (a); Ellis, Daniel P. W. (a)

(a) Department of Applied Physics and Applied Mathematics (United States)

Author keywords

Maximum likelihood estimation; speech enhancement; time frequency masking; underdetermined source separation

EID: 85008544097 PISSN: 15587916 EISSN: 15587924 Source Type: Journal
DOI: 10.1109/TASL.2009.2029711 Document Type: Article

Times cited: 294

References (32)

1
- 80052339383
- Some experiments on the recognition of speech, with one and with two ears
- C. E. Cherry “Some experiments on the recognition of speech, with one and with two ears,” J. Acoust. Soc. Amer., vol. 25, no. 5, pp. 975–979, 1953.
- (1953) J. Acoust. Soc. Amer. , vol.25 , Issue.5 , pp. 975-979
- Cherry, C.E.¹

2
- 0001991386
- Subjective effects in binaural hearing
- W. Koenig “Subjective effects in binaural hearing,” J. Acoust. Soc. Amer., vol. 22, no. 1, pp. 61–62, 1950.
- (1950) J. Acoust. Soc. Amer. , vol.22 , Issue.1 , pp. 61-62
- Koenig, W.¹

3
- 84867207056
- Source separation based on binaural cues and source model constraints
- R. J. Weiss, M. I. Mandel, and D. P. W. Ellis, “Source separation based on binaural cues and source model constraints,” in Proc. Interspeech, Sep. 2008, pp. 419–422.
- (2008) Proc. Interspeech , pp. 419-422
- Weiss, R.J.¹ Mandel, M.I.² Ellis, D.P.W.³

4
- 3142694930
- Blind separation of speech mixtures via time-frequency masking
- O. Yilmaz and S. Rickard, “Blind separation of speech mixtures via time-frequency masking,” IEEE Trans. Signal Process., vol. 52, no. 7, pp. 1830–1847, Jul. 2004.
- (2004) IEEE Trans. Signal Process. , vol.52 , Issue.7 , pp. 1830-1847
- Yilmaz, O.¹ Rickard, S.²

5
- 33744975847
- Performance measurement in blind audio source separation
- E. Vincent, R. Gribonval, and C. Fevotte, “Performance measurement in blind audio source separation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 4, pp. 1462–1469, Jul. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.4 , pp. 1462-1469
- Vincent, E.¹ Gribonval, R.² Fevotte, C.³

6
- 0033692661
- Blind separation of disjoint orthogonal signals: Demixing N sources from 2 mixtures
- A. Jourjine, S. Rickard, and O. Yilmaz, “Blind separation of disjoint orthogonal signals: Demixing N sources from 2 mixtures,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Jun. 2000, vol. 5, pp. 2985–2988.
- (2000) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.5 , pp. 2985-2988
- Jourjine, A.¹ Rickard, S.² Yilmaz, O.³

7
- 79953654656
- A classification-based cocktail party processor
- S. Thrun, L. Saul, and B. Scholkopf, Eds. Cambridge, MA: MIT Press
- N. Roman, D. L. Wang, and G. J. Brown, “A classification-based cocktail party processor,” in Adv. Neural Info. Process. Syst., S. Thrun, L. Saul, and B. Scholkopf, Eds. Cambridge, MA: MIT Press, 2004, pp. 1425–1432.
- (2004) Adv. Neural Info. Process. Syst. , pp. 1425-1432
- Roman, N.¹ Wang, D.L.² Brown, G.J.³

8
- 33846803485
- On the use of spatial cues to improve binaural source separation
- H. Viste and G. Evangelista, “On the use of spatial cues to improve binaural source separation,” in Proc. Int. Conf. Digital Audio Effects, 2003, pp. 209–213.
- (2003) Proc. Int. Conf. Digital Audio Effects , pp. 209-213
- Viste, H.¹ Evangelista, G.²

9
- 33744971131
- Mask estimation for missing data speech recognition based on statistics of binaural interaction
- S. Harding, J. Barker, and G. J. Brown, “Mask estimation for missing data speech recognition based on statistics of binaural interaction,” IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 1, pp. 58–67, Jan. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.1 , pp. 58-67
- Harding, S.¹ Barker, J.² Brown, G.J.³

10
- 84872736510
- A source localization/separation/respatialization system based on unsupervised classification of interaural cues
- J. Mouba and S. Marchand, “A source localization/separation/respatialization system based on unsupervised classification of interaural cues,” in Proc. Int. Conf. Digital Audio Effects, 2006, pp. 233–238.
- (2006) Proc. Int. Conf. Digital Audio Effects , pp. 233-238
- Mouba, J.¹ Marchand, S.²

11
- 33947677009
- Models of sound localization
- A. N. Popper and R. R. Fay, Eds. New York: Springer, ch. 8
- S. H. Colburn and A. Kulkarni, “Models of sound localization,” in Sound Source Localization, A. N. Popper and R. R. Fay, Eds. New York: Springer, 2005, vol. 25, ch. 8, pp. 272–316.
- (2005) Sound Source Localization , vol.25 , pp. 272-316
- Colburn, S.H.¹ Kulkarni, A.²

12
- 0000466122
- Survey on independent component analysis
- A. Hyvarinen “Survey on independent component analysis,” Neural Comput. Surv., vol. 2, pp. 94–128, 1999.
- (1999) Neural Comput. Surv. , vol.2 , pp. 94-128
- Hyvarinen, A.¹

13
- 11144223199
- A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics
- H. Buchner, R. Aichner, and W. Kellermann, “A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics,” IEEE Trans. Speech Audio Process., vol. 13, no. 1, pp. 120–134, Jan. 2005.
- (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.1 , pp. 120-134
- Buchner, H.¹ Aichner, R.² Kellermann, W.³

14
- 0001887874
- A place theory of sound localization
- L. A. Jeffress “A place theory of sound localization,” J. Comparative Physiol. Psychol., vol. 41, no. 1, pp. 35–39, 1948.
- (1948) J. Comparative Physiol. Psychol. , vol.41 , Issue.1 , pp. 35-39
- Jeffress, L.A.¹

15
- 0000186849
- Equalization and cancellation theory of binaural masking-level differences
- N. I. Durlach “Equalization and cancellation theory of binaural masking-level differences,” J. Acoust. Soc. Amer., vol. 35, no. 8, pp. 1206–1218, 1963.
- (1963) J. Acoust. Soc. Amer. , vol.35 , Issue.8 , pp. 1206-1218
- Durlach, N.I.¹

16
- 0034897090
- Binaural processing model based on contralateral inhibition. I. model structure
- J. Breebaart, S. van de Par, and A. Kohlrausch, “Binaural processing model based on contralateral inhibition. I. model structure,” J. Acoust. Soc. Amer., vol. 110, no. 2, pp. 1074–1088, 2001.
- (2001) J. Acoust. Soc. Amer. , vol.110 , Issue.2 , pp. 1074-1088
- Breebaart, J.¹ van de Par, S.² Kohlrausch, A.³

17
- 0023681706
- Lateralization of complex binaural stimuli: A weighted-image model
- R. M. Stern, A. S. Zeiberg, and C. Trahiotis “Lateralization of complex binaural stimuli: A weighted-image model,” J. Acoust. Soc. Amer., vol. 84, no. 1, pp. 156–165, 1988.
- (1988) J. Acoust. Soc. Amer. , vol.84 , Issue.1 , pp. 156-165
- Stern, R.M.¹ Zeiberg, A.S.² Trahiotis, C.³

18
- 84868663836
- Binaural sound localization
- D. Wang and G. J. Brown, Eds. Piscataway, NJ: Wiley-IEEE Press, ch. 5
- R. M. Stern, G. J. Brown, and D. Wang, “Binaural sound localization,” in Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, D. Wang and G. J. Brown, Eds. Piscataway, NJ: Wiley-IEEE Press, 2006, ch. 5, pp. 147–185.
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , pp. 147-185
- Stern, R.M.¹ Brown, G.J.² Wang, D.³

19
- 0033778326
- Localization of multiple sound sources with two microphones
- C. Liu, B. C. Wheeler, W. D. O'Brien, Jr., R. C. Bilger, C. R. Lansing, and A. S. Feng, “Localization of multiple sound sources with two microphones,” J. Acoust. Soc. Amer., vol. 108, no. 4, pp. 1888–1905, 2000.
- (2000) J. Acoust. Soc. Amer. , vol.108 , Issue.4 , pp. 1888-1905
- Liu, C.¹ Wheeler, B.C.² O'Brien, W.D., Jr.³ Bilger, R.C.⁴ Lansing, C.R.⁵ Feng, A.S.⁶

20
- 30844435714
- Sound source localization in real sound fields based on empirical statistics of interaural parameters
- J. Nix and V. Hohmann, “Sound source localization in real sound fields based on empirical statistics of interaural parameters,” J. Acoust. Soc. Amer., vol. 119, no. 1, pp. 463–479, 2006.
- (2006) J. Acoust. Soc. Amer. , vol.119 , Issue.1 , pp. 463-479
- Nix, J.¹ Hohmann, V.²

21
- 33947651680
- Speech separation based on the statistics of binaural auditory features
- G. J. Brown, S. Harding, and J. P. Barker, “Speech separation based on the statistics of binaural auditory features,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2006, vol. 5, pp. V-949–V-952.
- (2006) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.5 , pp. V-949-V-952
- Brown, G.J.¹ Harding, S.² Barker, J.P.³

22
- 50249118229
- A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures
- H. Sawada, S. Araki, and S. Makino, “A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures,” in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., Oct. 2007, pp. 139–142.
- (2007) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. , pp. 139-142
- Sawada, H.¹ Araki, S.² Makino, S.³

23
- 70349204584
- An EM algorithm for localizing multiple sound sources in reverberant environments
- B. Scholkopf, J. Platt, and T. Hoffman, Eds. Cambridge, MA: MIT Press
- M. I. Mandel, D. P. W. Ellis, and T. Jebara, “An EM algorithm for localizing multiple sound sources in reverberant environments,” in Adv. Neural Info. Process. Syst., B. Scholkopf, J. Platt, and T. Hoffman, Eds. Cambridge, MA: MIT Press, 2007, pp. 953–960.
- (2007) Adv. Neural Info. Process. Syst. , pp. 953-960
- Mandel, M.I.¹ Ellis, D.P.W.² Jebara, T.³

24
- 84892494847
- A probability model for interaural phase difference
- M. I. Mandel and D. P. W. Ellis, “A probability model for interaural phase difference,” in Proc. ISCA Workshop Statist. Percept. Audio Process. (SAPA), 2006, pp. 1–6.
- (2006) Proc. ISCA Workshop Statist. Percept. Audio Process. (SAPA) , pp. 1-6
- Mandel, M.I.¹ Ellis, D.P.W.²

25
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- J. L. Gauvain and C.-H. Lee “Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains,” IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291–298, 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.-H.²

26
- 0036881034
- Self-localizing dynamic microphone arrays
- P. Aarabi, “Self-localizing dynamic microphone arrays,” IEEE Trans. Syst., Man, Cybern. C, vol. 32, no. 4, pp. 474–484, Nov. 2002.
- (2002) IEEE Trans. Syst., Man, Cybern. C , vol.32 , Issue.4 , pp. 474-484
- Aarabi, P.¹

27
- 18744392833
- Localizing nearby sound sources in a classroom: Binaural room impulse responses
- B. Shinn-Cunningham, N. Kopco, and T. Martin, “Localizing nearby sound sources in a classroom: Binaural room impulse responses,” J. Acoust. Soc. Amer., vol. 117, pp. 3100–3115, 2005.
- (2005) J. Acoust. Soc. Amer. , vol.117 , pp. 3100-3115
- Shinn-Cunningham, B.¹ Kopco, N.² Martin, T.³

28
- 0003548585
- [Online]. Available: http://www.ldc.upenn.edu/Catalog/LDC93S1.html
- J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, and N. L. Dahlgren, “DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus CDROM,” 1993 [Online]. Available: http://www.ldc.upenn.edu/Catalog/LDC93S1.html
- (1993) DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus CDROM
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵ Dahlgren, N.L.⁶

29
- 0035681892
- The CIPIC HRTF database
- V. R. Algazi, R. O. Duda, D. M. Thompson, and C. Avendano, “The CIPIC HRTF database,” in IEEE Workshop Applicat. Signal Process. Audio Acoust., Oct. 2001, pp. 99–102.
- (2001) IEEE Workshop Applicat. Signal Process. Audio Acoust. , pp. 99-102
- Algazi, V.R.¹ Duda, R.O.² Thompson, D.M.³ Avendano, C.⁴

30
- 34447100796
- Boca Raton, FL: CRC
- P. C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton, FL: CRC, 2007.
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

31
- 54949092435
- Perceptual evaluation of blind source separation for robust speech recognition
- L. Di Persia, D. Milone, H. Rufiner, and M. Yanagida, “Perceptual evaluation of blind source separation for robust speech recognition,” Signal Process., vol. 88, no. 10, pp. 2578–2583, Oct. 2008.
- (2008) Signal Process. , vol.88 , Issue.10 , pp. 2578-2583
- Di Persia, L.¹ Milone, D.² Rufiner, H.³ Yanagida, M.⁴

32
- 50249183469
- EM localization and separation using interaural level and phase cues
- M. I. Mandel and D. P. W. Ellis, “EM localization and separation using interaural level and phase cues,” in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., Oct. 2007, pp. 275–278.
- (2007) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust. , pp. 275-278
- Mandel, M.I.¹ Ellis, D.P.W.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.