SCOPUS Information Search Platform

Computer Speech and Language

Volume 24, Issue 1, 2010, Pages 1-15

Monaural speech separation and recognition challenge

Authors (3): Cooke, Martin (a, b); Hershey, John R. (c); Rennie, Steven J. (c)

a BASQUE FOUNDATION FOR SCIENCE (Spain)

b UNIVERSITY OF THE BASQUE COUNTRY UPV EHU (Spain)

c IBM T J WATSON RESEARCH CENTER (United States)

Author keywords

Auditory scene analysis; Noise robustness; Simultaneous speech; Speaker identification; Speech recognition; Speech separation

Indexed keywords

AUDITORY SCENE ANALYSIS; NOISE ROBUSTNESS; SIMULTANEOUS SPEECH; SPEAKER IDENTIFICATION; SPEECH SEPARATION; ACOUSTIC NOISE; LOUDSPEAKERS; PATIENT REHABILITATION; SEPARATION; SPEECH ANALYSIS; SPEECH RECOGNITION

EID: 69249202377
PISSN: 08852308
EISSN: 10958363
Source Type: Journal
DOI: 10.1016/j.csl.2009.02.006
Document Type: Article

Times cited: 181

References (53)

1
- 2142812604
- The perception of speech under adverse acoustic conditions
- Greenberg, S, Ainsworth, W.A, Popper, A.N, Fay, R.R, Eds, Springer Handbook of Auditory Research
- Assmann, P., Summerfield, Q., 2004. The perception of speech under adverse acoustic conditions. In: Greenberg, S., Ainsworth, W.A., Popper, A.N., Fay, R.R. (Eds.), Speech Processing in the Auditory System. Springer Handbook of Auditory Research, vol. 18.
- (2004) Speech Processing in the Auditory System , vol.18
- Assmann, P.¹ Summerfield, Q.²

2
- 33749317042
- Learning spectral clustering, with application to speech separation
- Bach F.R., and Jordan M.I. Learning spectral clustering, with application to speech separation. Journal of Machine Learning Research 7 (2006) 1963-2001
- (2006) Journal of Machine Learning Research , vol.7 , pp. 1963-2001
- Bach, F.R.¹ Jordan, M.I.²

3
- 44949219122
- Recent advances in speech fragment decoding techniques
- Pittsburgh, pp
- Barker, J., Coy, A., Ma, N., Cooke, M.P., 2006. Recent advances in speech fragment decoding techniques. In: Proceedings of Interspeech 2006, Pittsburgh, pp. 85-88.
- (2006) Proceedings of Interspeech , pp. 85-88
- Barker, J.¹ Coy, A.² Ma, N.³ Cooke, M.P.⁴

4
- 69249231059
- Speech fragment decoding techniques for simultaneous speaker identification and speech recognition
- Barker J., Ma N., Coy A., and Cooke M. Speech fragment decoding techniques for simultaneous speaker identification and speech recognition. Computer Speech and Language 24 1 (2010) 94-111
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 94-111
- Barker, J.¹ Ma, N.² Coy, A.³ Cooke, M.⁴

5
- 11144316019
- Decoding speech in the presence of other sources
- Barker J.P., Cooke M.P., and Ellis D.P.W. Decoding speech in the presence of other sources. Speech Communication 45 1 (2005) 5-25
- (2005) Speech Communication , vol.45 , Issue.1 , pp. 5-25
- Barker, J.P.¹ Cooke, M.P.² Ellis, D.P.W.³

6
- 0029411030
- An information-maximization approach to blind separation and blind deconvolution
- Bell A.J., and Sejnowski T.J. An information-maximization approach to blind separation and blind deconvolution. Neural Computation 7 6 (1995) 1129-1159
- (1995) Neural Computation , vol.7 , Issue.6 , pp. 1129-1159
- Bell, A.J.¹ Sejnowski, T.J.²

7
- 33745146930
- Benesty J., Makino S., and Chen J. (Eds), Springer
- In: Benesty J., Makino S., and Chen J. (Eds). Speech Enhancement (2005), Springer
- (2005) Speech Enhancement

8
- 0003684441
- MIT Press, Cambridge MA
- Bregman A.S. Auditory Scene Analysis (1990), MIT Press, Cambridge MA
- (1990) Auditory Scene Analysis
- Bregman, A.S.¹

9
- 0026442628
- Effect of multiple speech-like maskers on binaural speech recognition in normal and impaired hearing
- Bronkhorst A.W., and Plomp R. Effect of multiple speech-like maskers on binaural speech recognition in normal and impaired hearing. Journal of the Acoustical Society of America 92 (1992) 3132-3139
- (1992) Journal of the Acoustical Society of America , vol.92 , pp. 3132-3139
- Bronkhorst, A.W.¹ Plomp, R.²

10
- 33845354768
- Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
- Brungart D.S., Chang P.S., Simpson B.D., and Wang D.L. Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation. Journal of the Acoustical Society of America 120 (2006) 4007-4018
- (2006) Journal of the Acoustical Society of America , vol.120 , pp. 4007-4018
- Brungart, D.S.¹ Chang, P.S.² Simpson, B.D.³ Wang, D.L.⁴

11
- 0035169173
- Informational and energetic masking effects in the perception of multiple simultaneous talkers
- Brungart D.S., Simpson B.D., Ericson M.A., and Scott K.R. Informational and energetic masking effects in the perception of multiple simultaneous talkers. Journal of the Acoustical Society of America 110 (2001) 2527-2538
- (2001) Journal of the Acoustical Society of America , vol.110 , pp. 2527-2538
- Brungart, D.S.¹ Simpson, B.D.² Ericson, M.A.³ Scott, K.R.⁴

12
- 0028416938
- Independent component analysis, a new concept?
- Comon P. Independent component analysis, a new concept?. Signal Processing 36 3 (1994) 287-314
- (1994) Signal Processing , vol.36 , Issue.3 , pp. 287-314
- Comon, P.¹

13
- 33750368310
- An audio-visual corpus for speech perception and automatic speech recognition
- Cooke M.P., Barker J., Cunningham S.P., and Shao X. An audio-visual corpus for speech perception and automatic speech recognition. Journal of the Acoustical Society of America 120 (2006) 2421-2424
- (2006) Journal of the Acoustical Society of America , vol.120 , pp. 2421-2424
- Cooke, M.P.¹ Barker, J.² Cunningham, S.P.³ Shao, X.⁴

14
- 37849011878
- The foreign language cocktail party problem: energetic and informational masking effects in non-native speech perception
- Cooke M.P., Garcia Lecumberri M.L., and Barker J.P. The foreign language cocktail party problem: energetic and informational masking effects in non-native speech perception. Journal of the Acoustical Society of America 123 (2008) 414-427
- (2008) Journal of the Acoustical Society of America , vol.123 , pp. 414-427
- Cooke, M.P.¹ Garcia Lecumberri, M.L.² Barker, J.P.³

15
- 0035342414
- Robust automatic speech recognition with missing and uncertain acoustic data
- Cooke M.P., Green P.D., Josifovski L., and Vizinho A. Robust automatic speech recognition with missing and uncertain acoustic data. Speech Communication 34 3 (2001) 267-285
- (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
- Cooke, M.P.¹ Green, P.D.² Josifovski, L.³ Vizinho, A.⁴

16
- 47749094114
- Cooke, M.P., Lee, T.W., 2006. Speech separation challenge website. http://www.dcs.shef.ac.uk/~martin/SpeechSeparationChallenge.htm.
- (2006) Speech separation challenge website
- Cooke, M.P.¹ Lee, T.W.²

17
- 0001698589
- Auditory Grouping
- The Handbook of Perception and Cognition, Academic Press
- Darwin C.J., and Carlyon R.P. Auditory Grouping. The Handbook of Perception and Cognition. Hearing vol. 6 (1995), Academic Press
- (1995) Hearing , vol.6
- Darwin, C.J.¹ Carlyon, R.P.²

18
- 69249224901
- Centertryk
- Dau, T., Buchholz, J., Harte, J., Christiansen, T., 2008. Auditory signal processing in hearing-impaired listeners. Centertryk.
- (2008) Auditory signal processing in hearing-impaired listeners
- Dau, T.¹ Buchholz, J.² Harte, J.³ Christiansen, T.⁴

19
- 44949249754
- Modified phase opponency based solution to the speech separation challenge
- Pittsburgh, pp
- Deshmukh, O., Espy-Wilson, C., 2006. Modified phase opponency based solution to the speech separation challenge. In: Proceedings of Interspeech 2006, Pittsburgh, pp. 101-104.
- (2006) Proceedings of Interspeech , pp. 101-104
- Deshmukh, O.¹ Espy-Wilson, C.²

20
- 84892300819
- Divenyi P. (Ed), Springer
- In: Divenyi P. (Ed). Speech Separation by Humans and Machines (2004), Springer
- (2004) Speech Separation by Humans and Machines

21
- 0036291376
- Uncertainty decoding with splice for noise robust speech recognition
- Droppo, J., Deng, L., Acero, A., 2002. Uncertainty decoding with splice for noise robust speech recognition. In: IEEE Conference on Acoustics Speech and Signal Processing.
- (2002) IEEE Conference on Acoustics Speech and Signal Processing
- Droppo, J.¹ Deng, L.² Acero, A.³

22
- 0003794341
- Ph.D. thesis, MIT, Cambridge MA
- Ellis, D.P.W., 1996. Prediction-driven computational auditory scene analysis. Ph.D. thesis, MIT, Cambridge MA.
- (1996) Prediction-driven computational auditory scene analysis
- Ellis, D.P.W.¹

23
- 44949138160
- Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm
- Pittsburgh
- Every, M.R., Jackson, P.J.B., 2006. Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm. In: Proceedings of Interspeech 2006, Pittsburgh.
- (2006) Proceedings of Interspeech
- Every, M.R.¹ Jackson, P.J.B.²

24
- 85009074657
- Algonquin: Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
- Frey, B., Deng, L., Acero, A., Kristjansson, T., 2001. Algonquin: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition. In: Eurospeech, 901-904.
- (2001) Eurospeech , pp. 901-904
- Frey, B.¹ Deng, L.² Acero, A.³ Kristjansson, T.⁴

25
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- Gales M. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech and Language 12 (1998) 75-98
- (1998) Computer Speech and Language , vol.12 , pp. 75-98
- Gales, M.¹

26
- 0030245128
- Robust continuous speech recognition using parallel model combination
- Gales M., and Young S. Robust continuous speech recognition using parallel model combination. IEEE Transactions on Speech and Audio Processing 4 (1996) 352-359
- (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , pp. 352-359
- Gales, M.¹ Young, S.²

27
- 0029288202
- Speech recognition in noisy environments: a survey
- Gong Y. Speech recognition in noisy environments: a survey. Speech Communication 16 3 (1995) 261-291
- (1995) Speech Communication , vol.16 , Issue.3 , pp. 261-291
- Gong, Y.¹

28
- 44949121548
- CASA based speech separation for robust speech recognition
- Pittsburgh
- Han, R., Zhao, P., Gao, Q., Zhang, Z., Wu, H., Wu, X., 2006. CASA based speech separation for robust speech recognition. In: Proceedings of Interspeech 2006, Pittsburgh.
- (2006) Proceedings of Interspeech
- Han, R.¹ Zhao, P.² Gao, Q.³ Zhang, Z.⁴ Wu, H.⁵ Wu, X.⁶

29
- 69249222720
- Super-human multi-talker speech recognition: a graphical model approach
- Hershey J.R., Rennie S.J., Olsen P.A., and Kristjansson T.T. Super-human multi-talker speech recognition: a graphical model approach. Computer Speech and Language 24 1 (2010) 45-66
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 45-66
- Hershey, J.R.¹ Rennie, S.J.² Olsen, P.A.³ Kristjansson, T.T.⁴

30
- 0027465491
- The Lombard reflex and its role on human listeners and automatic speech recognizers
- Junqua J.-C. The Lombard reflex and its role on human listeners and automatic speech recognizers. Journal of the Acoustical Society of America 93 (1993) 510-524
- (1993) Journal of the Acoustical Society of America , vol.93 , pp. 510-524
- Junqua, J.-C.¹

31
- 44949258898
- Super-human multi-talker speech recognition: The IBM 2006 speech separation challenge system
- Pittsburgh
- Kristjansson, T., Hershey, J., Olsen, P., Rennie, S., Gopinath, R., 2006. Super-human multi-talker speech recognition: the IBM 2006 speech separation challenge system. In: Proceedings of Interspeech 2006, Pittsburgh.
- (2006) Proceedings of Interspeech
- Kristjansson, T.¹ Hershey, J.² Olsen, P.³ Rennie, S.⁴ Gopinath, R.⁵

32
- 84864010278
- Speaker adaptation of continuous density HMMs using linear regression
- Leggetter C.J., and Woodland P.C. Speaker adaptation of continuous density HMMs using linear regression. International Conference on Speech and Language Processing (1994) 451-454
- (1994) International Conference on Speech and Language Processing , pp. 451-454
- Leggetter, C.J.¹ Woodland, P.C.²

33
- 69249203845
- Monaural speech separation based on MAXVQ and CASA for robust speech recognition
- Li P., Guan Y., Wang S., Xu B., and Liu W. Monaural speech separation based on MAXVQ and CASA for robust speech recognition. Computer Speech and Language 24 1 (2010) 30-44
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 30-44
- Li, P.¹ Guan, Y.² Wang, S.³ Xu, B.⁴ Liu, W.⁵

34
- 34447100796
- CRC Press
- Loizou P.C. Speech Enhancement: Theory and Practice (2007), CRC Press
- (2007) Speech Enhancement: Theory and Practice
- Loizou, P.C.¹

35
- 50949092983
- Makino S., Lee T.W., and Sawada H. (Eds), Springer
- In: Makino S., Lee T.W., and Sawada H. (Eds). Blind Speech Separation (2007), Springer
- (2007) Blind Speech Separation

36
- 69249115826
- Combining missing-feature theory, speech enhancement, and speaker-dependent/independent modeling for speech separation
- Ming J., Hazen T., and Glass J. Combining missing-feature theory, speech enhancement, and speaker-dependent/independent modeling for speech separation. Computer Speech and Language 24 1 (2010) 67-76
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 67-76
- Ming, J.¹ Hazen, T.² Glass, J.³

37
- 44949179273
- Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation
- Pittsburgh
- Ming, J., Hazen, T.J., Glass, J.R., 2006. Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation. In: Proceedings of Interspeech 2006, Pittsburgh.
- (2006) Proceedings of Interspeech
- Ming, J.¹ Hazen, T.J.² Glass, J.R.³

38
- 0038021371
- Speech recognition with unknown partial feature corruption: a review of the union model
- Ming J., and Smith F.J. Speech recognition with unknown partial feature corruption: a review of the union model. Computer Speech and Language 17 (2003) 287-305
- (2003) Computer Speech and Language , vol.17 , pp. 287-305
- Ming, J.¹ Smith, F.J.²

39
- 0141475748
- Cambridge University Press, Cambridge, UK
- Odell J., Ollason D., Woodland P., Young S., and Jansen J. The HTK Book for HTK V2.0 (1995), Cambridge University Press, Cambridge, UK
- (1995) The HTK Book for HTK V2.0
- Odell, J.¹ Ollason, D.² Woodland, P.³ Young, S.⁴ Jansen, J.⁵

40
- 0000914334
- Convolutive blind source separation of non-stationary sources
- Parra L., and Spence C. Convolutive blind source separation of non-stationary sources. IEEE Transactions on Speech and Audio Processing (2000) 320-327
- (2000) IEEE Transactions on Speech and Audio Processing , pp. 320-327
- Parra, L.¹ Spence, C.²

41
- 33947677142
- Dynamic noise adaptation
- Rennie, S., Kristjansson, T., Olsen, P., Gopinath, R., 2006. Dynamic noise adaptation. In: International Conference on Acoustics, Speech and Signal Processing.
- (2006) International Conference on Acoustics, Speech and Signal Processing
- Rennie, S.¹ Kristjansson, T.² Olsen, P.³ Gopinath, R.⁴

42
- 69249244618
- Signal separation by efficient combinatorial optimization
- Reyes-Gómez, M.J., Jojic, N., 2006. Signal separation by efficient combinatorial optimization. In: Advances in Models for Acoustic Processing NIPS 2006 Workshop.
- (2006) Advances in Models for Acoustic Processing NIPS 2006 Workshop
- Reyes-Gómez, M.J.¹ Jojic, N.²

43
- 85009230793
- Factorial models and refiltering for speech separation and denoising
- Roweis, S., 2003. Factorial models and refiltering for speech separation and denoising. In: Eurospeech, 1009-1012.
- (2003) Eurospeech , pp. 1009-1012
- Roweis, S.¹

44
- 44949110218
- Single-channel speech separation using sparse non-negative matrix factorization
- Pittsburgh
- Schmidt, M.N., Olsson, R.K., 2006. Single-channel speech separation using sparse non-negative matrix factorization. In: Proceedings of Interspeech 2006, Pittsburgh.
- (2006) Proceedings of Interspeech
- Schmidt, M.N.¹ Olsson, R.K.²

45
- 69249159165
- A computational auditory scene analysis system for speech segregation and robust speech recognition
- Shao Y., Srinivasan S., Jin Z., and Wang D. A computational auditory scene analysis system for speech segregation and robust speech recognition. Computer Speech and Language 24 1 (2010) 77-93
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 77-93
- Shao, Y.¹ Srinivasan, S.² Jin, Z.³ Wang, D.⁴

46
- 27744596913
- Consonant identification in n-talker babble is a nonmonotonic function of n
- Simpson S.A., and Cooke M.P. Consonant identification in n-talker babble is a nonmonotonic function of n. Journal of the Acoustical Society of America 118 (2005) 2775-2778
- (2005) Journal of the Acoustical Society of America , vol.118 , pp. 2775-2778
- Simpson, S.A.¹ Cooke, M.P.²

47
- 40749137520
- A computational auditory scene analysis system for robust speech recognition
- Pittsburgh
- Srinivasan, S., Shao, Y., Zhaozhang, J., Wang, D., 2006. A computational auditory scene analysis system for robust speech recognition. In: Proceedings of Interspeech 2006, Pittsburgh.
- (2006) Proceedings of Interspeech
- Srinivasan, S.¹ Shao, Y.² Zhaozhang, J.³ Wang, D.⁴

48
- 0025681008
- Hidden Markov model decomposition of speech and noise
- Varga, A.P., Moore, R.K., 1990. Hidden Markov model decomposition of speech and noise. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing 1990, pp. 845-848.
- (1990) Proceedings of the International Conference on Acoustics, Speech and Signal Processing , pp. 845-848
- Varga, A.P.¹ Moore, R.K.²

49
- 38149102552
- First stereo audio source separation evaluation campaign: data, algorithms and results
- Vincent E., Sawada H., Bofill P., Makino S., and Rosca J. First stereo audio source separation evaluation campaign: data, algorithms and results. LNCS 4666 (2007) 552-559
- (2007) LNCS , vol.4666 , pp. 552-559
- Vincent, E.¹ Sawada, H.² Bofill, P.³ Makino, S.⁴ Rosca, J.⁵

50
- 44849140301
- Speech recognition using factorial hidden Markov models for separation in the feature space
- Pittsburgh
- Virtanen, T., 2006. Speech recognition using factorial hidden Markov models for separation in the feature space. In: Proceedings of Interspeech 2006, Pittsburgh.
- (2006) Proceedings of Interspeech
- Virtanen, T.¹

51
- 82255178542
- Wang D.-L., and Brown G.J. (Eds), IEEE Press/Wiley-Interscience
- In: Wang D.-L., and Brown G.J. (Eds). Computational Auditory Scene Analysis: Principles, Algorithms and Applications (2006), IEEE Press/Wiley-Interscience
- (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications

52
- 0022907820
- A computational model for separating two simultaneous talkers
- Weintraub, M., 1986. A computational model for separating two simultaneous talkers. In: Proceedings of ICASSP 1986, pp. 81-84.
- (1986) Proceedings of ICASSP 1986 , pp. 81-84
- Weintraub, M.¹

53
- 69249151355
- Speech separation using speaker-adapted eigenvoice speech models
- Weiss R.J., and Ellis D. Speech separation using speaker-adapted eigenvoice speech models. Computer Speech and Language 24 1 (2010) 16-29
- (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 16-29
- Weiss, R.J.¹ Ellis, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.