SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 7, 2010, Pages 1872-1883

Evaluating source separation algorithms with reverberant speech

(4) Mandel, Michael I. a; Bressler, Scott b; Shinn-Cunningham, Barbara b; Ellis, Daniel P. W. c

a UNIVERSITÉ DE MONTRÉAL (Canada)

b BOSTON UNIVERSITY (United States)

c Columbia University ^* (United States)

Author keywords

Intelligibility; objective evaluation; reverberation; speech enhancement; time-frequency masking; underdetermined source separation

Indexed keywords

AUTOMATIC SPEECH RECOGNITION; HUMAN LISTENERS; HUMAN PERFORMANCE; INTELLIGIBILITY; OBJECTIVE EVALUATION; OBJECTIVE MEASURE; PRIORI KNOWLEDGE; SEPARATION PERFORMANCE; SIGNAL SEPARATION; SOURCE SEPARATION; SPEECH SEPARATION; TIME FREQUENCY;

ALGORITHMS; REVERBERATION; SIGNAL ANALYSIS; SPEECH ENHANCEMENT; SPEECH RECOGNITION;

SPEECH INTELLIGIBILITY;

EID: 77955698002 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2052252 Document Type: Article

Times cited : (26)

References (38)

1
- 50249115096
- Evaluating speech separation systems
- P. Divenyi, Ed. Norwell, MA: Kluwer, ch.
- D. P. W. Ellis, "Evaluating speech separation systems," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2004, ch. 20, pp. 295-304.
- (2004) Speech Separation by Humans and Machines , vol.20 , pp. 295-304
- Ellis, D.P.W.¹

2
- 33845354768
- Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
- D. S. Brungart, P. S. Chang, B. D. Simpson, and D. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Amer., vol.120, no.6, pp. 4007-4018, 2006.
- (2006) J. Acoust. Soc. Amer , vol.120 , Issue.6 , pp. 4007-4018
- Brungart, D.S.¹ Chang, P.S.² Simpson, B.D.³ Wang, D.⁴

3
- 53949095896
- Speech perception of noise with binary gains
- D. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, "Speech perception of noise with binary gains," J. Acoust. Soc. Amer., vol.124, no.4, pp. 2303-2307, 2008.
- (2008) J. Acoust. Soc. Amer , vol.124 , Issue.4 , pp. 2303-2307
- Wang, D.¹ Kjems, U.² Pedersen, M.S.³ Boldt, J.B.⁴ Lunner, T.⁵

4
- 0003982501
- Ph.D. dissertation, Stanford Univ. Dept. of Elect. Eng., Stanford, CA
- M. Weintraub, "A theory and computational model of auditory monaural sound separation," Ph.D. dissertation, Stanford Univ. Dept. of Elect. Eng., Stanford, CA, 1985.
- (1985) A Theory and Computational Model of Auditory Monaural Sound Separation
- Weintraub, M.¹

5
- 44149106061
- Evaluation of objective quality measures for speech enhancement
- Jan.
- Y. Hu and P. C. Loizou, "Evaluation of objective quality measures for speech enhancement," IEEE Trans. Audio, Speech, Lang. Process., vol.16, no.1, pp. 229-238, Jan. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process , vol.16 , Issue.1 , pp. 229-238
- Hu, Y.¹ Loizou, P.C.²

6
- 54949092435
- Perceptual evaluation of blind source separation for robust speech recognition
- Oct.
- L. D. Persia, D. Milone, H. Rufiner, and M. Yanagida, "Perceptual evaluation of blind source separation for robust speech recognition," Signal Process., vol.88, no.10, pp. 2578-2583, Oct. 2008.
- (2008) Signal Process , vol.88 , Issue.10 , pp. 2578-2583
- Persia, L.D.¹ Milone, D.² Rufiner, H.³ Yanagida, M.⁴

7
- 0028823541
- Speech recognition with primarily temporal cues
- Oct.
- R. V. Shannon, F.-G. Zeng, V. Kamath, J. Wygonski, and M. Ekelid, "Speech recognition with primarily temporal cues," Science, vol.270, no.5234, pp. 303-304, Oct. 1995.
- (1995) Science , vol.270 , Issue.5234 , pp. 303-304
- Shannon, R.V.¹ Zeng, F.-G.² Kamath, V.³ Wygonski, J.⁴ Ekelid, M.⁵

8
- 0033282527
- Testing the ability of speech recognizers to measure the effectiveness of encoding algorithms for digital speech transmission
- C. M. Chernick, S. Leigh, K. L. Mills, and R. Toense, "Testing the ability of speech recognizers to measure the effectiveness of encoding algorithms for digital speech transmission," in Proc. IEEE Military Commun. Conf., 1999, vol.2, pp. 1468-1472.
- (1999) Proc. IEEE Military Commun. Conf. , vol.2 , pp. 1468-1472
- Chernick, C.M.¹ Leigh, S.² Mills, K.L.³ Toense, R.⁴

9
- 84948437341
- Speech recognition performance as an effective perceived quality predictor
- W. Jiang and H. Schulzrinne, "Speech recognition performance as an effective perceived quality predictor," in Proc. IEEE Int. Workshop Quality of Service, 2002, pp. 269-275.
- (2002) Proc. IEEE Int. Workshop Quality of Service , pp. 269-275
- Jiang, W.¹ Schulzrinne, H.²

10
- 4544238561
- Monaural speech separation
- G. Hu and D. Wang, "Monaural speech separation," Adv. Neur. Inf. Process. Syst., vol.15, pp. 1221-1228, 2003.
- (2003) Adv. Neur. Inf. Process. Syst , vol.15 , pp. 1221-1228
- Hu, G.¹ Wang, D.²

11
- 3142694930
- Blind separation of speech mixtures via time-frequency masking
- Jul.
- O. Yılmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Trans. Signal Process., vol.52, no.7, pp. 1830-1847, Jul. 2004.
- (2004) IEEE Trans. Signal Process , vol.52 , Issue.7 , pp. 1830-1847
- Yılmaz, O.¹ Rickard, S.²

12
- 34447100796
- Boca Raton FL: CRC
- P. C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton, FL: CRC, 2007.
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

13
- 58149196390
- On the optimality of ideal binary time-frequency masks
- Mar.
- Y. Li and D. Wang, "On the optimality of ideal binary time-frequency masks," Speech Commun., vol.51, no.3, pp. 230-239, Mar. 2009.
- (2009) Speech Commun , vol.51 , Issue.3 , pp. 230-239
- Li, Y.¹ Wang, D.²

14
- 77950114888
- Effects of pitch and spatial separation on selective attention in anechoic and reverberant environments
- S. Bressler and B. G. Shinn-Cunningham, "Effects of pitch and spatial separation on selective attention in anechoic and reverberant environments," J. Acoust. Soc. Amer., vol.123, no.5, pp. 2978-2978, 2008.
- (2008) J. Acoust. Soc. Amer , vol.123 , Issue.5 , pp. 2978-2978
- Bressler, S.¹ Shinn-Cunningham, B.G.²

15
- 0035106984
- Informational and energetic masking effects in the perception of two simultaneous talkers
- D. S. Brungart, "Informational and energetic masking effects in the perception of two simultaneous talkers," J. Acoust. Soc. Amer., vol.109, no.3, pp. 1101-1109, 2001.
- (2001) J. Acoust. Soc. Amer , vol.109 , Issue.3 , pp. 1101-1109
- Brungart, D.S.¹

16
- 27744532546
- Precedence-based speech segregation in a virtual auditory environment
- D. S. Brungart, B. D. Simpson, and R. L. Freyman, "Precedence-based speech segregation in a virtual auditory environment," J. Acoust. Soc. Amer., vol.118, no.5, pp. 3241-3251, 2005.
- (2005) J. Acoust. Soc. Amer , vol.118 , Issue.5 , pp. 3241-3251
- Brungart, D.S.¹ Simpson, B.D.² Freyman, R.L.³

17
- 29244442934
- The advantage of knowing where to listen
- G. Kidd, T. L. Arbogast, C. R. Mason, and F. J. Gallun, "The advantage of knowing where to listen," J. Acoust. Soc. Amer., vol.118, no.6, pp. 3804-3815, 2005.
- (2005) J. Acoust. Soc. Amer , vol.118 , Issue.6 , pp. 3804-3815
- Kidd, G.¹ Arbogast, T.L.² Mason, C.R.³ Gallun, F.J.⁴

18
- 17344368194
- Release from masking due to spatial separation of sources in the identification of nonspeech auditory patterns
- C. R. Mason, T. L. Rohtla, and P. S. Deliwala, "Release from masking due to spatial separation of sources in the identification of nonspeech auditory patterns," J. Acoust. Soc. Amer., vol.104, no.1, pp. 422-431, 1998.
- (1998) J. Acoust. Soc. Amer , vol.104 , Issue.1 , pp. 422-431
- Mason, C.R.¹ Rohtla, T.L.² Deliwala, P.S.³

19
- 2142813014
- Effect of number of masking talkers and auditory priming on informational masking in speech recognition
- R. L. Freyman, U. Balakrishnan, and K. S. Helfer, "Effect of number of masking talkers and auditory priming on informational masking in speech recognition," J. Acoust. Soc. Amer., vol.115, no.5, pp. 2246-2256, 2004.
- (2004) J. Acoust. Soc. Amer , vol.115 , Issue.5 , pp. 2246-2256
- Freyman, R.L.¹ Balakrishnan, U.² Helfer, K.S.³

20
- 42749094234
- Object-based auditory and visual attention
- May
- B. G. Shinn-Cunningham, "Object-based auditory and visual attention," Trends in Cognitive Sci., vol.12, no.5, pp. 182-186, May 2008.
- (2008) Trends in Cognitive Sci , vol.12 , Issue.5 , pp. 182-186
- Shinn-Cunningham, B.G.¹

21
- 18744392833
- Localizing nearby sound sources in a classroom: Binaural room impulse responses
- B. G. Shinn-Cunningham, N. Kopco, and T. J. Martin, "Localizing nearby sound sources in a classroom: Binaural room impulse responses," J. Acoust. Soc. Amer., vol.117, no.5, pp. 3100-3115, 2005.
- (2005) J. Acoust. Soc. Amer , vol.117 , Issue.5 , pp. 3100-3115
- Shinn-Cunningham, B.G.¹ Kopco, N.² Martin, T.J.³

22
- 85008544097
- Model-based expectation maximization source separation and localization
- Feb.
- M. I. Mandel, R. J. Weiss, and D. P. W. Ellis, "Model-based expectation maximization source separation and localization," IEEE Trans. Audio, Speech, Lang. Process., vol.18, no.2, pp. 382-394, Feb. 2010.
- (2010) IEEE Trans. Audio, Speech, Lang. Process , vol.18 , Issue.2 , pp. 382-394
- Mandel, M.I.¹ Weiss, R.J.² Ellis, D.P.W.³

23
- 0033692661
- Blind separation of disjoint orthogonal signals: Demixing n sources from 2 mixtures
- A. Jourjine, S. Rickard, and O. Yılmaz, "Blind separation of disjoint orthogonal signals: Demixing n sources from 2 mixtures," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2000, vol.5, pp. 2985-2988.
- (2000) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process , vol.5 , pp. 2985-2988
- Jourjine, A.¹ Rickard, S.² Yılmaz, O.³

24
- 50249118229
- A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures
- H. Sawada, S. Araki, and S. Makino, "A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2007, pp. 139-142.
- (2007) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust , pp. 139-142
- Sawada, H.¹ Araki, S.² Makino, S.³

25
- 84872736510
- A source localization/separation/respatialization system based on unsupervised classification of interaural cues
- J. Mouba and S. Marchand, "A source localization/separation/respatialization system based on unsupervised classification of interaural cues," in Proc. Int. Conf. Digital Audio Effects, 2006, pp. 233-238.
- (2006) Proc. Int. Conf. Digital Audio Effects , pp. 233-238
- Mouba, J.¹ Marchand, S.²

26
- 4644336054
- Reconstruction of missing features for robust speech recognition
- Sep.
- B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol.43, no.4, pp. 275-296, Sep. 2004.
- (2004) Speech Commun , vol.43 , Issue.4 , pp. 275-296
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

27
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- Jun.
- M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol.34, no.3, pp. 267-285, Jun. 2001.
- (2001) Speech Commun , vol.34 , Issue.3 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

28
- 33749058582
- Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing data techniques
- D. Kolossa, A. Klimas, and R. Orglmeister, "Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing data techniques," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2005, pp. 82-85.
- (2005) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust , pp. 82-85
- Kolossa, D.¹ Klimas, A.² Orglmeister, R.³

29
- 0025681008
- Hidden Markov model decomposition of speech and noise
- A. P. Varga and R. K. Moore, "Hidden Markov model decomposition of speech and noise," in Proc. IEEE Int. Conf. Acoust. Speech, Signal Process., 1990, vol.2, pp. 845-848.
- (1990) Proc. IEEE Int. Conf. Acoust. Speech, Signal Process , vol.2 , pp. 845-848
- Varga, A.P.¹ Moore, R.K.²

30
- 50249086925
- Monaural speech separation using source-adapted models
- R. J. Weiss and D. P. W. Ellis, "Monaural speech separation using source-adapted models," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2007, pp. 114-117.
- (2007) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process , pp. 114-117
- Weiss, R.J.¹ Ellis, D.P.W.²

31
- 70450194258
- Super-human multi-talker speech recognition: A graphical modeling approach
- Jan.
- J. R. Hershey, S. J. Rennie, P. A. Olsen, and T. T. Kristjansson, "Super-human multi-talker speech recognition: A graphical modeling approach," Comput. Speech Lang., Jan. 2009.
- (2009) Comput. Speech Lang
- Hershey, J.R.¹ Rennie, S.J.² Olsen, P.A.³ Kristjansson, T.T.⁴

32
- 4644317224
- A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
- Sep.
- M. Seltzer, B. Raj, and R. Stern, "A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition," Speech Commun., vol.43, no.4, pp. 379-393, Sep. 2004.
- (2004) Speech Commun , vol.43 , Issue.4 , pp. 379-393
- Seltzer, M.¹ Raj, B.² Stern, R.³

33
- 34547543067
- Missing feature speech recognition using dereverberation and echo suppression in reverberant environments
- H.-M. Park and R. M. Stern, "Missing feature speech recognition using dereverberation and echo suppression in reverberant environments," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2007, vol.4, pp. 381-384.
- (2007) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process , vol.4 , pp. 381-384
- Park, H.-M.¹ Stern, R.M.²

34
- 0141804849
- Effects of small room reverberation upon the recognition of some consonant features
- S. A. Gelfand and S. Silman, "Effects of small room reverberation upon the recognition of some consonant features," J. Acoust. Soc. Amer., vol.66, no.1, pp. 22-29, 1979.
- (1979) J. Acoust. Soc. Amer , vol.66 , Issue.1 , pp. 22-29
- Gelfand, S.A.¹ Silman, S.²

35
- 67149088353
- The 2008 signal separation evaluation campaign: A community-based approach to large-scale evaluation
- E. Vincent, S. Araki, and P. Bofill, "The 2008 signal separation evaluation campaign: A community-based approach to large-scale evaluation," Ind. Compon. Anal. Signal Separat., pp. 734-741, 2009.
- (2009) Ind. Compon. Anal. Signal Separat , pp. 734-741
- Vincent, E.¹ Araki, S.² Bofill, P.³

36
- 33744975847
- Performance measurement in blind audio source separation
- DOI 10.1109/TSA.2005.858005
- E. Vincent, R. Gribonval, and C. Fevotte, "Performance measurement in blind audio source separation," IEEE Trans. Audio, Speech, Lang. Process., vol.14, no.4, pp. 1462-1469, Jul. 2006. (Pubitemid 46547636)
- (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.4 , pp. 1462-1469
- Vincent, E.¹ Gribonval, R.² Fevotte, C.³

37
- 70349204584
- An EM algorithm for localizing multiple sound sources in reverberant environments
- B. Schölkopf, J. Platt, and T. Hoffman, Eds. Cambridge, MA: MIT Press
- M. I. Mandel, D. P. W. Ellis, and T. Jebara, "An EM algorithm for localizing multiple sound sources in reverberant environments," in Advances in Neural Information Processing Systems, B. Schölkopf, J. Platt, and T. Hoffman, Eds. Cambridge, MA: MIT Press, 2007, pp. 953-960.
- (2007) Advances in Neural Information Processing Systems , pp. 953-960
- Mandel, M.I.¹ Ellis, D.P.W.² Jebara, T.³

38
- 50249183469
- EM localization and separation using interaural level and phase cues
- M. I. Mandel and D. P. W. Ellis, "EM localization and separation using interaural level and phase cues," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2007, pp. 275-278.
- (2007) Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust , pp. 275-278
- Mandel, M.I.¹ Ellis, D.P.W.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.