



Volume 24, Issue 1, 2010, Pages 1-15

Monaural speech separation and recognition challenge

Author keywords

Auditory scene analysis; Noise robustness; Simultaneous speech; Speaker identification; Speech recognition; Speech separation

Indexed keywords

AUDITORY SCENE ANALYSIS; NOISE ROBUSTNESS; SIMULTANEOUS SPEECH; SPEAKER IDENTIFICATION; SPEECH SEPARATION;

EID: 69249202377     PISSN: 08852308     EISSN: 10958363     Source Type: Journal    
DOI: 10.1016/j.csl.2009.02.006     Document Type: Article
Times cited: 181

References (53)
  • 1
    • 2142812604
    • The perception of speech under adverse acoustic conditions
    • Greenberg, S., Ainsworth, W.A., Popper, A.N., Fay, R.R. (Eds.), Springer Handbook of Auditory Research
    • Assmann, P., Summerfield, Q., 2004. The perception of speech under adverse acoustic conditions. In: Greenberg, S., Ainsworth, W.A., Popper, A.N., Fay, R.R. (Eds.), Speech Processing in the Auditory System. Springer Handbook of Auditory Research, vol. 18.
    • (2004) Speech Processing in the Auditory System , vol.18
    • Assmann, P.1    Summerfield, Q.2
  • 2
    • 33749317042
    • Learning spectral clustering, with application to speech separation
    • Bach F.R., and Jordan M.I. Learning spectral clustering, with application to speech separation. Journal of Machine Learning Research 7 (2006) 1963-2001
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 1963-2001
    • Bach, F.R.1    Jordan, M.I.2
  • 3
    • 44949219122
    • Recent advances in speech fragment decoding techniques
    • Pittsburgh
    • Barker, J., Coy, A., Ma, N., Cooke, M.P., 2006. Recent advances in speech fragment decoding techniques. In: Proceedings of Interspeech 2006, Pittsburgh, pp. 85-88.
    • (2006) Proceedings of Interspeech , pp. 85-88
    • Barker, J.1    Coy, A.2    Ma, N.3    Cooke, M.P.4
  • 4
    • 69249231059
    • Speech fragment decoding techniques for simultaneous speaker identification and speech recognition
    • Barker J., Ma N., Coy A., and Cooke M. Speech fragment decoding techniques for simultaneous speaker identification and speech recognition. Computer Speech and Language 24 1 (2010) 94-111
    • (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 94-111
    • Barker, J.1    Ma, N.2    Coy, A.3    Cooke, M.4
  • 5
    • 11144316019
    • Decoding speech in the presence of other sources
    • Barker J.P., Cooke M.P., and Ellis D.P.W. Decoding speech in the presence of other sources. Speech Communication 45 1 (2005) 5-25
    • (2005) Speech Communication , vol.45 , Issue.1 , pp. 5-25
    • Barker, J.P.1    Cooke, M.P.2    Ellis, D.P.W.3
  • 6
    • 0029411030
    • An information-maximization approach to blind separation and blind deconvolution
    • Bell A.J., and Sejnowski T.J. An information-maximization approach to blind separation and blind deconvolution. Neural Computation 7 6 (1995) 1129-1159
    • (1995) Neural Computation , vol.7 , Issue.6 , pp. 1129-1159
    • Bell, A.J.1    Sejnowski, T.J.2
  • 7
    • 33745146930
    • Benesty J., Makino S., and Chen J. (Eds), Springer
    • In: Benesty J., Makino S., and Chen J. (Eds). Speech Enhancement (2005), Springer
    • (2005) Speech Enhancement
  • 9
    • 0026442628
    • Effect of multiple speech-like maskers on binaural speech recognition in normal and impaired hearing
    • Bronkhorst A.W., and Plomp R. Effect of multiple speech-like maskers on binaural speech recognition in normal and impaired hearing. Journal of the Acoustical Society of America 92 (1992) 3132-3139
    • (1992) Journal of the Acoustical Society of America , vol.92 , pp. 3132-3139
    • Bronkhorst, A.W.1    Plomp, R.2
  • 10
    • 33845354768
    • Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
    • Brungart D.S., Chang P.S., Simpson B.D., and Wang D.L. Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation. Journal of the Acoustical Society of America 120 (2006) 4007-4018
    • (2006) Journal of the Acoustical Society of America , vol.120 , pp. 4007-4018
    • Brungart, D.S.1    Chang, P.S.2    Simpson, B.D.3    Wang, D.L.4
  • 12
    • 0028416938
    • Independent component analysis, a new concept?
    • Comon P. Independent component analysis, a new concept? Signal Processing 36 3 (1994) 287-314
    • (1994) Signal Processing , vol.36 , Issue.3 , pp. 287-314
    • Comon, P.1
  • 14
    • 37849011878
    • The foreign language cocktail party problem: energetic and informational masking effects in non-native speech perception
    • Cooke M.P., Garcia Lecumberri M.L., and Barker J.P. The foreign language cocktail party problem: energetic and informational masking effects in non-native speech perception. Journal of the Acoustical Society of America 123 (2008) 414-427
    • (2008) Journal of the Acoustical Society of America , vol.123 , pp. 414-427
    • Cooke, M.P.1    Garcia Lecumberri, M.L.2    Barker, J.P.3
  • 15
    • 0035342414
    • Robust automatic speech recognition with missing and uncertain acoustic data
    • Cooke M.P., Green P.D., Josifovski L., and Vizinho A. Robust automatic speech recognition with missing and uncertain acoustic data. Speech Communication 34 3 (2001) 267-285
    • (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
    • Cooke, M.P.1    Green, P.D.2    Josifovski, L.3    Vizinho, A.4
  • 17
    • 0001698589
    • Auditory Grouping
    • The Handbook of Perception and Cognition, Academic Press
    • Darwin C.J., and Carlyon R.P. Auditory Grouping. The Handbook of Perception and Cognition. Hearing vol. 6 (1995), Academic Press
    • (1995) Hearing , vol.6
    • Darwin, C.J.1    Carlyon, R.P.2
  • 19
    • 44949249754
    • Modified phase opponency based solution to the speech separation challenge
    • Pittsburgh
    • Deshmukh, O., Espy-Wilson, C., 2006. Modified phase opponency based solution to the speech separation challenge. In: Proceedings of Interspeech 2006, Pittsburgh, pp. 101-104.
    • (2006) Proceedings of Interspeech , pp. 101-104
    • Deshmukh, O.1    Espy-Wilson, C.2
  • 23
    • 44949138160
    • Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm
    • Pittsburgh
    • Every, M.R., Jackson, P.J.B., 2006. Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm. In: Proceedings of Interspeech 2006, Pittsburgh.
    • (2006) Proceedings of Interspeech
    • Every, M.R.1    Jackson, P.J.B.2
  • 24
    • 85009074657
    • Algonquin: Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
    • Frey, B., Deng, L., Acero, A., Kristjansson, T., 2001. Algonquin: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition. In: Eurospeech, 901-904.
    • (2001) Eurospeech , pp. 901-904
    • Frey, B.1    Deng, L.2    Acero, A.3    Kristjansson, T.4
  • 25
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • Gales M. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech and Language 12 (1998) 75-98
    • (1998) Computer Speech and Language , vol.12 , pp. 75-98
    • Gales, M.1
  • 26
    • 0030245128
    • Robust continuous speech recognition using parallel model combination
    • Gales M., and Young S. Robust continuous speech recognition using parallel model combination. IEEE Transactions on Speech and Audio Processing 4 (1996) 352-359
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , pp. 352-359
    • Gales, M.1    Young, S.2
  • 27
    • 0029288202
    • Speech recognition in noisy environments: a survey
    • Gong Y. Speech recognition in noisy environments: a survey. Speech Communication 16 3 (1995) 261-291
    • (1995) Speech Communication , vol.16 , Issue.3 , pp. 261-291
    • Gong, Y.1
  • 30
    • 0027465491
    • The Lombard reflex and its role on human listeners and automatic speech recognizers
    • Junqua J.-C. The Lombard reflex and its role on human listeners and automatic speech recognizers. Journal of the Acoustical Society of America 93 (1993) 510-524
    • (1993) Journal of the Acoustical Society of America , vol.93 , pp. 510-524
    • Junqua, J.-C.1
  • 31
    • 44949258898
    • Super-human multi-talker speech recognition: The IBM 2006 speech separation challenge system
    • Pittsburgh
    • Kristjansson, T., Hershey, J., Olsen, P., Rennie, S., Gopinath, R., 2006. Super-human multi-talker speech recognition: the IBM 2006 speech separation challenge system. In: Proceedings of Interspeech 2006, Pittsburgh.
    • (2006) Proceedings of Interspeech
    • Kristjansson, T.1    Hershey, J.2    Olsen, P.3    Rennie, S.4    Gopinath, R.5
  • 33
    • 69249203845
    • Monaural speech separation based on MAXVQ and CASA for robust speech recognition
    • Li P., Guan Y., Wang S., Xu B., and Liu W. Monaural speech separation based on MAXVQ and CASA for robust speech recognition. Computer Speech and Language 24 1 (2010) 30-44
    • (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 30-44
    • Li, P.1    Guan, Y.2    Wang, S.3    Xu, B.4    Liu, W.5
  • 35
    • 50949092983
    • Makino S., Lee T.W., and Sawada H. (Eds), Springer
    • In: Makino S., Lee T.W., and Sawada H. (Eds). Blind Speech Separation (2007), Springer
    • (2007) Blind Speech Separation
  • 36
    • 69249115826
    • Combining missing-feature theory, speech enhancement, and speaker-dependent/independent modeling for speech separation
    • Ming J., Hazen T., and Glass J. Combining missing-feature theory, speech enhancement, and speaker-dependent/independent modeling for speech separation. Computer Speech and Language 24 1 (2010) 67-76
    • (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 67-76
    • Ming, J.1    Hazen, T.2    Glass, J.3
  • 37
    • 44949179273
    • Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation
    • Pittsburgh
    • Ming, J., Hazen, T.J., Glass, J.R., 2006. Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation. In: Proceedings of Interspeech 2006, Pittsburgh.
    • (2006) Proceedings of Interspeech
    • Ming, J.1    Hazen, T.J.2    Glass, J.R.3
  • 38
    • 0038021371
    • Speech recognition with unknown partial feature corruption: a review of the union model
    • Ming J., and Smith F.J. Speech recognition with unknown partial feature corruption: a review of the union model. Computer Speech and Language 17 (2003) 287-305
    • (2003) Computer Speech and Language , vol.17 , pp. 287-305
    • Ming, J.1    Smith, F.J.2
  • 43
    • 85009230793
    • Factorial models and refiltering for speech separation and denoising
    • Roweis, S., 2003. Factorial models and refiltering for speech separation and denoising. In: Eurospeech, 1009-1012.
    • (2003) Eurospeech , pp. 1009-1012
    • Roweis, S.1
  • 44
    • 44949110218
    • Single-channel speech separation using sparse non-negative matrix factorization
    • Pittsburgh
    • Schmidt, M.N., Olsson, R.K., 2006. Single-channel speech separation using sparse non-negative matrix factorization. In: Proceedings of Interspeech 2006, Pittsburgh.
    • (2006) Proceedings of Interspeech
    • Schmidt, M.N.1    Olsson, R.K.2
  • 45
    • 69249159165
    • A computational auditory scene analysis system for speech segregation and robust speech recognition
    • Shao Y., Srinivasan S., Jin Z., and Wang D. A computational auditory scene analysis system for speech segregation and robust speech recognition. Computer Speech and Language 24 1 (2010) 77-93
    • (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 77-93
    • Shao, Y.1    Srinivasan, S.2    Jin, Z.3    Wang, D.4
  • 46
    • 27744596913
    • Consonant identification in n-talker babble is a nonmonotonic function of n
    • Simpson S.A., and Cooke M.P. Consonant identification in n-talker babble is a nonmonotonic function of n. Journal of the Acoustical Society of America 118 (2005) 2775-2778
    • (2005) Journal of the Acoustical Society of America , vol.118 , pp. 2775-2778
    • Simpson, S.A.1    Cooke, M.P.2
  • 47
    • 40749137520
    • A computational auditory scene analysis system for robust speech recognition
    • Pittsburgh
    • Srinivasan, S., Shao, Y., Zhaozhang, J., Wang, D., 2006. A computational auditory scene analysis system for robust speech recognition. In: Proceedings of Interspeech 2006, Pittsburgh.
    • (2006) Proceedings of Interspeech
    • Srinivasan, S.1    Shao, Y.2    Zhaozhang, J.3    Wang, D.4
  • 49
    • 38149102552
    • First stereo audio source separation evaluation campaign: data, algorithms and results
    • Vincent E., Sawada H., Bofill P., Makino S., and Rosca J. First stereo audio source separation evaluation campaign: data, algorithms and results. LNCS 4666 (2007) 552-559
    • (2007) LNCS , vol.4666 , pp. 552-559
    • Vincent, E.1    Sawada, H.2    Bofill, P.3    Makino, S.4    Rosca, J.5
  • 50
    • 44849140301
    • Speech recognition using factorial hidden Markov models for separation in the feature space
    • Pittsburgh
    • Virtanen, T., 2006. Speech recognition using factorial hidden Markov models for separation in the feature space. In: Proceedings of Interspeech 2006, Pittsburgh.
    • (2006) Proceedings of Interspeech
    • Virtanen, T.1
  • 52
    • 0022907820
    • A computational model for separating two simultaneous talkers
    • Weintraub, M., 1986. A computational model for separating two simultaneous talkers. In: Proceedings of ICASSP 1986, pp. 81-84.
    • (1986) Proceedings of ICASSP 1986 , pp. 81-84
    • Weintraub, M.1
  • 53
    • 69249151355
    • Speech separation using speaker-adapted eigenvoice speech models
    • Weiss R.J., and Ellis D. Speech separation using speaker-adapted eigenvoice speech models. Computer Speech and Language 24 1 (2010) 16-29
    • (2010) Computer Speech and Language , vol.24 , Issue.1 , pp. 16-29
    • Weiss, R.J.1    Ellis, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.