Volume 19, Issue 5, 2011, Pages 1206-1220

Efficient MMSE Estimation and Uncertainty Processing for Multienvironment Robust Speech Recognition

Author keywords

Feature vector compensation; minimum mean square error (MMSE) estimation; robust speech recognition; stereo data


EID: 85008009592     PISSN: 15587916     EISSN: 15587924     Source Type: Journal    
DOI: 10.1109/TASL.2010.2087753     Document Type: Article
Times cited: 10

References (40)
  • 2
    • 0029288202
    • Speech recognition in noisy environments: A survey
    • Apr.
    • Y. Gong, “Speech recognition in noisy environments: A survey,” Speech Commun., vol. 16, no. 3, pp. 261–291, Apr. 1995.
    • (1995) Speech Commun. , vol.16 , Issue.3 , pp. 261-291
    • Gong, Y.1
  • 4
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • Apr.
    • J.-L. Gauvain and C.-H. Lee, “Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains,” IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291–298, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.-L.1    Lee, C.-H.2
  • 5
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • Apr.
    • C. J. Leggetter and P. C. Woodland, “Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models,” Comput. Speech Lang., vol. 9, no. 2, pp. 171–185, Apr. 1995.
    • (1995) Comput. Speech Lang. , vol.9 , Issue.2 , pp. 171-185
    • Leggetter, C.J.1    Woodland, P.C.2
  • 6
    • 0030245128
    • Robust continuous speech recognition using parallel model combination
    • Sep.
    • M. J. F. Gales and S. J. Young, “Robust continuous speech recognition using parallel model combination,” IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 352–359, Sep. 1996.
    • (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.5 , pp. 352-359
    • Gales, M.J.F.1    Young, S.J.2
  • 7
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • Apr.
    • S. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113–120, Apr. 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-27 , Issue.2 , pp. 113-120
    • Boll, S.1
  • 9
    • 0021645331
    • Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
    • Dec.
    • Y. Ephraim and D. Malah, “Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, pp. 1109–1121, Dec. 1984.
    • (1984) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-32 , Issue.6 , pp. 1109-1121
    • Ephraim, Y.1    Malah, D.2
  • 10
    • 0016067897
    • Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
    • Jun.
    • B. Atal, “Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification,” J. Acoust. Soc. Amer., vol. 55, pp. 1304–1312, Jun. 1974.
    • (1974) J. Acoust. Soc. Amer. , vol.55 , pp. 1304-1312
    • Atal, B.1
  • 12
    • 2442509974
    • Cepstral domain segmental nonlinear feature transformations for robust speech recognition
    • May
    • J. C. Segura, M. C. Benitez, A. de la Torre, A. J. Rubio, and J. Ramirez, “Cepstral domain segmental nonlinear feature transformations for robust speech recognition,” IEEE Signal Process. Lett., vol. 11, no. 5, pp. 517–520, May 2004.
    • (2004) IEEE Signal Process. Lett. , vol.11 , Issue.5 , pp. 517-520
    • Segura, J.C.1    Benitez, M.C.2    de la Torre, A.3    Rubio, A.J.4    Ramirez, J.5
  • 13
    • 65549153550
    • Ph.D. dissertation, Dept. of Elect. Comput. Eng., Carnegie Mellon Univ.
    • P. Moreno, “Speech Recognition in Noisy Environments,” Ph.D. dissertation, Dept. of Elect. Comput. Eng., Carnegie Mellon Univ., 1996.
    • (1996) Speech Recognition in Noisy Environments
    • Moreno, P.1
  • 14
    • 0032048385
    • Speech recognition in noisy environments using first-order vector Taylor series
    • Apr.
    • D. Y. Kim, C. K. Un, and N. S. Kim, “Speech recognition in noisy environments using first-order vector Taylor series,” Speech Commun., vol. 24, no. 1, pp. 39–49, Apr. 1998.
    • (1998) Speech Commun. , vol.24 , Issue.1 , pp. 39-49
    • Kim, D.Y.1    Un, C.K.2    Kim, N.S.3
  • 15
    • 66149101303
    • Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor
    • Jul.
    • D. Yu, L. Deng, J. Droppo, J. Wu, Y. Gong, and A. Acero, “Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor,” IEEE Trans. Audio Speech Lang. Process., vol. 16, no. 5, pp. 1061–1070, Jul. 2008.
    • (2008) IEEE Trans. Audio Speech Lang. Process. , vol.16 , Issue.5 , pp. 1061-1070
    • Yu, D.1    Deng, L.2    Droppo, J.3    Wu, J.4    Gong, Y.5    Acero, A.6
  • 16
    • 0347968277
    • Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition
    • Nov.
    • L. Deng, J. Droppo, and A. Acero, “Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition,” IEEE Trans. Speech Audio Process., vol. 11, no. 6, pp. 568–580, Nov. 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.6 , pp. 568-580
    • Deng, L.1    Droppo, J.2    Acero, A.3
  • 17
    • 68549125183
    • Stereo-based stochastic mapping for robust speech recognition
    • Sep.
    • M. Afify, X. Cui, and Y. Gao, “Stereo-based stochastic mapping for robust speech recognition,” IEEE Trans. Audio Speech Lang. Process., vol. 17, no. 7, pp. 1325–1334, Sep. 2009.
    • (2009) IEEE Trans. Audio Speech Lang. Process. , vol.17 , Issue.7 , pp. 1325-1334
    • Afify, M.1    Cui, X.2    Gao, Y.3
  • 18
    • 84987702417
    • The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
    • H. Hirsch and D. Pearce, “The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions,” in Proc. ICSLP, 2000, pp. 29–32.
    • (2000) Proc. ICSLP , pp. 29-32
    • Hirsch, H.1    Pearce, D.2
  • 20
    • 85006734596
    • Evaluation of the SPLICE algorithm on the Aurora2 database
    • Aalborg, Denmark
    • J. Droppo, L. Deng, and A. Acero, “Evaluation of the SPLICE algorithm on the Aurora2 database,” in Proc. Eurospeech '01, Aalborg, Denmark, 2001, pp. 217–220.
    • (2001) Proc. Eurospeech '01 , pp. 217-220
    • Droppo, J.1    Deng, L.2    Acero, A.3
  • 21
    • 44849120851
    • Cepstral vector normalization based on stereo data for robust speech recognition
    • Mar.
    • L. Buera, E. Lleida, A. Miguel, A. Ortega, and O. Saz, “Cepstral vector normalization based on stereo data for robust speech recognition,” IEEE Trans. Audio Speech Lang. Process., vol. 15, no. 3, pp. 1098–1113, Mar. 2007.
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.3 , pp. 1098-1113
    • Buera, L.1    Lleida, E.2    Miguel, A.3    Ortega, A.4    Saz, O.5
  • 22
    • 40249103761
    • Issues with uncertainty decoding for noise robust automatic speech recognition
    • H. Liao and M. J. F. Gales, “Issues with uncertainty decoding for noise robust automatic speech recognition,” Speech Commun., vol. 50, no. 4, pp. 265–277, 2008.
    • (2008) Speech Commun. , vol.50 , Issue.4 , pp. 265-277
    • Liao, H.1    Gales, M.J.F.2
  • 23
    • 51449114531
    • MMSE-based stereo feature stochastic mapping for noise robust speech recognition
    • Apr.
    • X. Cui, M. Afify, and Y. Gao, “MMSE-based stereo feature stochastic mapping for noise robust speech recognition,” in Proc. ICASSP'08, Apr. 2008, pp. 4077–4080.
    • (2008) Proc. ICASSP'08 , pp. 4077-4080
    • Cui, X.1    Afify, M.2    Gao, Y.3
  • 25
    • 0242721421
    • HMM-based channel error mitigation and its application to distributed speech recognition
    • Nov.
    • A. M. Peinado, V. Sanchez, J. L. Perez-Cordoba, and A. de la Torre, “HMM-based channel error mitigation and its application to distributed speech recognition,” Speech Commun., vol. 41, no. 4, pp. 549–561, Nov. 2003.
    • (2003) Speech Commun. , vol.41 , Issue.4 , pp. 549-561
    • Peinado, A.M.1    Sanchez, V.2    Perez-Cordoba, J.L.3    de la Torre, A.4
  • 26
    • 64349084660
    • Noise condition-dependent training based on noise classification and SNR estimation
    • Nov.
    • H. Xu, P. Dalsgaard, Z.-H. Tan, and B. Lindberg, “Noise condition-dependent training based on noise classification and SNR estimation,” IEEE Trans. Audio Speech Lang. Process., vol. 15, no. 8, pp. 2431–2443, Nov. 2007.
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.8 , pp. 2431-2443
    • Xu, H.1    Dalsgaard, P.2    Tan, Z.-H.3    Lindberg, B.4
  • 27
    • 56949089751
    • Feature compensation in the cepstral domain employing model combination
    • Feb.
    • W. Kim and J. H. L. Hansen, “Feature compensation in the cepstral domain employing model combination,” Speech Commun., vol. 51, no. 2, pp. 83–96, Feb. 2009.
    • (2009) Speech Commun. , vol.51 , Issue.2 , pp. 83-96
    • Kim, W.1    Hansen, J.H.L.2
  • 28
    • 19944385270
    • Efficient MMSE-based channel error mitigation techniques. Application to distributed speech recognition over wireless channels
    • Jan.
    • A. M. Peinado, V. Sanchez, J. L. Perez-Cordoba, and A. J. Rubio, “Efficient MMSE-based channel error mitigation techniques. Application to distributed speech recognition over wireless channels,” IEEE Trans. Wireless Commun., vol. 4, no. 1, pp. 14–19, Jan. 2005.
    • (2005) IEEE Trans. Wireless Commun. , vol.4 , Issue.1 , pp. 14-19
    • Peinado, A.M.1    Sanchez, V.2    Perez-Cordoba, J.L.3    Rubio, A.J.4
  • 32
    • 84867196386
    • HMM-based estimation of unreliable spectral components for noise robust speech recognition
    • Brisbane, Australia, Sep.
    • B. J. Borgstrom and A. Alwan, “HMM-based estimation of unreliable spectral components for noise robust speech recognition,” in Proc. Interspeech, Brisbane, Australia, Sep. 2008, pp. 1769–1772.
    • (2008) Proc. Interspeech , pp. 1769-1772
    • Borgstrom, B.J.1    Alwan, A.2
  • 33
    • 33750376174
    • Model-based feature enhancement with uncertainty decoding for noise robust ASR
    • Nov.
    • V. Stouten, H. V. Hamme, and P. Wambacq, “Model-based feature enhancement with uncertainty decoding for noise robust ASR,” Speech Commun., vol. 48, no. 11, pp. 1502–1514, Nov. 2006.
    • (2006) Speech Commun. , vol.48 , Issue.11 , pp. 1502-1514
    • Stouten, V.1    Hamme, H.V.2    Wambacq, P.3
  • 34
    • 51449120334
    • An efficient approximation of the forward-backward algorithm to deal with packet loss, with applications to remote speech recognition
    • Apr.
    • B. J. Borgstrom and A. Alwan, “An efficient approximation of the forward-backward algorithm to deal with packet loss, with applications to remote speech recognition,” in Proc. ICASSP, Apr. 2008, pp. 4425–4428.
    • (2008) Proc. ICASSP , pp. 4425-4428
    • Borgstrom, B.J.1    Alwan, A.2
  • 36
    • 0035273996
    • Softbit speech decoding: A new approach to error concealment
    • Mar.
    • T. Fingscheidt and P. Vary, “Softbit speech decoding: A new approach to error concealment,” IEEE Trans. Speech Audio Process., vol. 9, no. 3, pp. 240–251, Mar. 2001.
    • (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.3 , pp. 240-251
    • Fingscheidt, T.1    Vary, P.2
  • 38
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Feb.
    • L. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
    • (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.1
  • 39
    • 84892174007
    • Weighted Viterbi algorithm and state duration modelling for speech recognition in noise
    • May
    • N. B. Yoma, F. R. McInnes, and M. A. Jack, “Weighted Viterbi algorithm and state duration modelling for speech recognition in noise,” in Proc. ICASSP, May 1998, vol. 2, pp. 709–712.
    • (1998) Proc. ICASSP , vol.2 , pp. 709-712
    • Yoma, N.B.1    McInnes, F.R.2    Jack, M.A.3
  • 40
    • 33845666211
    • Combining media-specific FEC and error concealment for robust distributed speech recognition over loss-prone packet channels
    • Dec.
    • A. M. Gomez, A. M. Peinado, V. Sanchez, and A. J. Rubio, “Combining media-specific FEC and error concealment for robust distributed speech recognition over loss-prone packet channels,” IEEE Trans. Multimedia, vol. 8, pp. 1228–1238, Dec. 2006.
    • (2006) IEEE Trans. Multimedia , vol.8 , pp. 1228-1238
    • Gomez, A.M.1    Peinado, A.M.2    Sanchez, V.3    Rubio, A.J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.