



Volume 18, Issue 7, 2010, Pages 1708-1716

Robust speech recognition based on dereverberation parameter optimization using acoustic model likelihood

Author keywords

Automatic speech recognition (ASR); dereverberation; robustness

Indexed keywords

ACOUSTIC MODEL; AUTOMATIC SPEECH RECOGNITION; CONVENTIONAL APPROACH; DEREVERBERATION; GAUSSIAN MIXTURE MODEL; MULTIBAND; OPTIMAL PARAMETER; PARAMETER OPTIMIZATION; RECOGNITION PERFORMANCE; REVERBERANT ENVIRONMENT; ROBUST SPEECH RECOGNITION; ROBUSTNESS; ROOM IMPULSE RESPONSE; SCALE FACTOR; SPEECH RECOGNIZER; WAVE FORMS;

EID: 77955686025     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2010.2052610     Document Type: Article
Times cited : (28)

References (30)
  • 1
    • 34547507403
    • Speech dereverberation
    • P. Naylor and N. Gaubitch, "Speech dereverberation," in Proc. IWAENC, 2005, pp. 173-176.
    • (2005) Proc. IWAENC , pp. 173-176
    • Naylor, P.1    Gaubitch, N.2
  • 3
    • 70350468251
    • Subspace methods for multimicrophone speech dereverberation
    • G. Gannot and M. Moonen, "Subspace methods for multimicrophone speech dereverberation," EURASIP J. Appl. Signal Process., vol.E80-A, pp. 804-808, 1997.
    • (1997) EURASIP J. Appl. Signal Process. , vol.E80-A , pp. 804-808
    • Gannot, G.1    Moonen, M.2
  • 4
    • 34247241719
    • Inverse filtering for speech dereverberation less sensitive to noise and room transfer function fluctuations
    • T. Hikichi, M. Delcroix, and M. Miyoshi, "Inverse filtering for speech dereverberation less sensitive to noise and room transfer function fluctuations," EURASIP J. Adv. Signal Process., vol.2007, no.1, p. 62, 2007.
    • (2007) EURASIP J. Adv. Signal Process. , vol.2007 , Issue.1 , pp. 62
    • Hikichi, T.1    Delcroix, M.2    Miyoshi, M.3
  • 7
    • 0022270364
    • Mixture autoregressive hidden Markov models for speech signals
    • B. Juang and L. Rabiner, "Mixture autoregressive hidden Markov models for speech signals," IEEE Trans. Acoust., Speech, Signal Process., vol.ASSP-33, no.6, pp. 1404-1413, Dec. 1985.
    • (1985) IEEE Trans. Acoust., Speech, Signal Process. , Issue.6 , pp. 1404-1413
    • Juang, B.1    Rabiner, L.2
  • 10
    • 84867192083
    • Rapid unsupervised speaker adaptation robust in reverberant environment conditions
    • R. Gomez, J. Even, H. Saruwatari, and K. Shikano, "Rapid unsupervised speaker adaptation robust in reverberant environment conditions," in Proc. Interspeech, 2008, pp. 1309-1312.
    • (2008) Proc. Interspeech , pp. 1309-1312
    • Gomez, R.1    Even, J.2    Saruwatari, H.3    Shikano, K.4
  • 11
    • 33947694356
    • Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation
    • K. Kinoshita, T. Nakatani, and M. Miyoshi, "Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2006, vol.I, pp. 817-820.
    • (2006) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , vol.1 , pp. 817-820
    • Kinoshita, K.1    Nakatani, T.2    Miyoshi, M.3
  • 12
    • 0037342773
    • Speech-recognizer-based filter optimization for microphone array processing
    • M. Seltzer, "Speech-recognizer-based filter optimization for microphone array processing," IEEE Signal Process. Lett., vol.10, no.3, pp. 69-71, Mar. 2003.
    • (2003) IEEE Signal Process. Lett. , vol.10 , Issue.3 , pp. 69-71
    • Seltzer, M.1
  • 13
    • 50449096811
    • Subband likelihood-maximizing beamforming for speech recognition in reverberant environments
    • M. Seltzer and R. Stern, "Subband likelihood-maximizing beamforming for speech recognition in reverberant environments," IEEE Trans. Audio, Speech, Lang. Process., vol.14, no.6, pp. 2109-2121, Nov. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.6 , pp. 2109-2121
    • Seltzer, M.1    Stern, R.2
  • 15
  • 16
    • 0036753897
    • Speaker adaptive modeling by vocal tract normalization
    • L. Welling, H. Ney, and S. Kanthak, "Speaker adaptive modeling by vocal tract normalization," IEEE Trans. Audio, Speech, Lang. Process., vol.10, no.6, pp. 415-426, Aug. 2002.
    • (2002) IEEE Trans. Audio, Speech, Lang. Process. , vol.10 , Issue.6 , pp. 415-426
    • Welling, L.1    Ney, H.2    Kanthak, S.3
  • 17
    • 0018455310
    • Suppression of acoustic noise in speech using spectral subtraction
    • S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Audio, Speech, Signal Process., vol.27, no.2, pp. 113-120, Apr. 1979.
    • (1979) IEEE Trans. Audio, Speech, Signal Process. , vol.27 , Issue.2 , pp. 113-120
    • Boll, S.F.1
  • 18
    • 0034297865
    • Spectral subtraction based on phonetic dependency and masking effects
    • W. Kim, S. Kang, and H. Ko, "Spectral subtraction based on phonetic dependency and masking effects," in Proc. IEEE Visual Image Signal Process., Oct. 2000, vol.147, pp. 423-427.
    • (2000) Proc. IEEE Visual Image Signal Process. , vol.147 , pp. 423-427
    • Kim, W.1    Kang, S.2    Ko, H.3
  • 19
    • 0026882842
    • Experiments with non-linear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars
    • P. Lockwood and J. Boudy, "Experiments with non-linear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars," Speech Commun., vol.11, no.2-3, pp. 215-228, 1992.
    • (1992) Speech Commun. , vol.11 , Issue.2-3 , pp. 215-228
    • Lockwood, P.1    Boudy, J.2
  • 21
    • 33745195661
    • Efficient blind dereverberation framework for automatic speech recognition
    • K. Kinoshita, T. Nakatani, and M. Miyoshi, "Efficient blind dereverberation framework for automatic speech recognition," in Proc. Interspeech, 2005, pp. 3145-3148.
    • (2005) Proc. Interspeech , pp. 3145-3148
    • Kinoshita, K.1    Nakatani, T.2    Miyoshi, M.3
  • 22
    • 0028812427
    • An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses
    • Y. Suzuki, F. Asano, H.-Y. Kim, and T. Sone, "An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses," J. Acoust. Soc. Amer., vol.92, no.2, pp. 1119-1123, Feb. 1995.
    • (1995) J. Acoust. Soc. Amer. , vol.92 , Issue.2 , pp. 1119-1123
    • Suzuki, Y.1    Asano, F.2    Kim, H.-Y.3    Sone, T.4
  • 23
    • 38649115063
    • A new approach for the adaptation of HMMs to reverberation and background noise
    • H.-G. Hirsch and H. Finster, "A new approach for the adaptation of HMMs to reverberation and background noise," Speech Commun., pp. 244-263, 2008.
    • (2008) Speech Commun. , pp. 244-263
    • Hirsch, H.-G.1    Finster, H.2
  • 24
    • 77955673283
    • H. Kuttruff, Room Acoustics. London, U.K.: Spon Press, 2000.
  • 26
    • 77955703558
    • Rapid unsupervised speaker adaptation using MLLR and speaker selection
    • R. Gomez, T. Toda, H. Saruwatari, and K. Shikano, "Rapid unsupervised speaker adaptation using MLLR and speaker selection," in Proc. Interspeech, 2007, pp. 262-265.
    • (2007) Proc. Interspeech , pp. 262-265
    • Gomez, R.1    Toda, T.2    Saruwatari, H.3    Shikano, K.4
  • 27
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observation of Markov chains
    • J. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observation of Markov chains," Proc. IEEE Trans. Speech Audio Process., vol.2, no.2, pp. 291-298, Apr. 1994.
    • (1994) Proc. IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
    • Gauvain, J.1    Lee, C.H.2
  • 28
    • 0030263447
    • Mean and variance adaptation within the MLLR framework
    • M. J. F. Gales and P. C. Woodland, "Mean and variance adaptation within the MLLR framework," Comput. Speech Lang., vol.10, no.4, pp. 249-264, 1996.
    • (1996) Comput. Speech Lang. , vol.10 , Issue.4 , pp. 249-264
    • Gales, M.J.F.1    Woodland, P.C.2
  • 29
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., pp. 171-185, 1995.
    • (1995) Comput. Speech Lang. , pp. 171-185
    • Leggetter, C.J.1    Woodland, P.C.2
  • 30
    • 77955666626
    • HTK documentation
    • HTK documentation. [Online]. Available: http://htk.eng.cam.ac.uk/docs/docs.shtml


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.