SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 7, 2010, Pages 1708-1716

Robust speech recognition based on dereverberation parameter optimization using acoustic model likelihood

(2) Gomez, Randy (a); Kawahara, Tatsuya (a)

(a) Kyoto University (Japan)

Author keywords

Automatic speech recognition (ASR); dereverberation; robustness

Indexed keywords

ACOUSTIC MODEL; AUTOMATIC SPEECH RECOGNITION; CONVENTIONAL APPROACH; DEREVERBERATION; GAUSSIAN MIXTURE MODEL; MULTIBAND; OPTIMAL PARAMETER; PARAMETER OPTIMIZATION; RECOGNITION PERFORMANCE; REVERBERANT ENVIRONMENT; ROBUST SPEECH RECOGNITION; ROBUSTNESS; ROOM IMPULSE RESPONSE; SCALE FACTOR; SPEECH RECOGNIZER; WAVE FORMS;

IMPULSE RESPONSE; OPTIMIZATION; REVERBERATION; SIGNAL PROCESSING;

SPEECH RECOGNITION;

EID: 77955686025 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2052610 Document Type: Article

Times cited : (28)

References (30)

1
- 34547507403
- Speech dereverberation
- P. Naylor and N. Gaubitch, "Speech dereverberation," in Proc. IWAENC, 2005, pp. 173-176.
- (2005) Proc. IWAENC , pp. 173-176
- Naylor, P.¹ Gaubitch, N.²

2
- 33947683451
- Speech acquisition and enhancement in a reverberant, cocktail-party-like environment
- Y. Huang, J. Benesty, and J. Chen, "Speech acquisition and enhancement in a reverberant, cocktail-party-like environment," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. ICASSP, 2008, vol.V, pp. 25-28.
- (2008) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. ICASSP , vol.5 , pp. 25-28
- Huang, Y.¹ Benesty, J.² Chen, J.³

3
- 70350468251
- Subspace methods for multimicrophone speech dereverberation
- G. Gannot and M. Moonen, "Subspace methods for multimicrophone speech dereverberation," EURASIP J. Appl. Signal Process., vol.E80-A, pp. 804-808, 1997.
- (1997) EURASIP J. Appl. Signal Process. , vol.E80-A , pp. 804-808
- Gannot, G.¹ Moonen, M.²

4
- 34247241719
- Inverse filtering for speech dereverberation less sensitive to noise and room transfer function fluctuations
- T. Hikichi, M. Delcroix, and M. Miyoshi, "Inverse filtering for speech dereverberation less sensitive to noise and room transfer function fluctuations," EURASIP J. Adv. Signal Process., vol.2007, no.1, p. 62, 2007.
- (2007) EURASIP J. Adv. Signal Process. , vol.2007 , Issue.1 , pp. 62
- Hikichi, T.¹ Delcroix, M.² Miyoshi, M.³

5
- 84898688036
- Speech denoising and dereverberation using probabilistic models
- Cambridge, MA: MIT Press
- H. Attias, J. Platt, A. Acero, and L. Deng, "Speech denoising and dereverberation using probabilistic models," in Advances in Neural Information Processing Systems 13. Cambridge, MA: MIT Press, 2001.
- (2001) Advances in Neural Information Processing Systems 13
- Attias, H.¹ Platt, J.² Acero, A.³ Deng, L.⁴

6
- 70350458846
- Speech dereverberation based on maximum-likelihood estimation with time-varying Gaussian source model
- Nov.
- T. Nakatani, B.-H. Juang, T. Yoshioka, K. Kinoshita, M. Delcroix, and M. Miyoshi, "Speech dereverberation based on maximum-likelihood estimation with time-varying Gaussian source model," IEEE Trans. Audio, Speech, Lang. Process., vol.16, no.8, pp. 1512-1527, Nov. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.8 , pp. 1512-1527
- Nakatani, T.¹ Juang, B.-H.² Yoshioka, T.³ Kinoshita, K.⁴ Delcroix, M.⁵ Miyoshi, M.⁶

7
- 0022270364
- Mixture autoregressive hidden Markov models for speech signals
- Dec. ASSP-33
- B. Juang and L. Rabiner, "Mixture autoregressive hidden Markov models for speech signals," IEEE Trans. Acoust., Speech, Signal Process., vol.ASSP-33, no.6, pp. 1404-1413, Dec. 1985.
- (1985) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-33 , Issue.6 , pp. 1404-1413
- Juang, B.¹ Rabiner, L.²

8
- 51449123693
- Distant-talking robust speech recognition using late reflection components of room impulse response
- R. Gomez, J. Even, H. Saruwatari, and K. Shikano, "Distant-talking robust speech recognition using late reflection components of room impulse response," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2008, pp. 4581-4584.
- (2008) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 4581-4584
- Gomez, R.¹ Even, J.² Saruwatari, H.³ Shikano, K.⁴

9
- 50449084234
- Fast dereverberation for hands-free speech recognition
- R. Gomez, J. Even, H. Saruwatari, and K. Shikano, "Fast dereverberation for hands-free speech recognition," in Proc. IEEE Workshop HSCMA, 2008, pp. 140-143.
- (2008) Proc. IEEE Workshop HSCMA , pp. 140-143
- Gomez, R.¹ Even, J.² Saruwatari, H.³ Shikano, K.⁴

10
- 84867192083
- Rapid unsupervised speaker adaptation robust in reverberant environment conditions
- R. Gomez, J. Even, H. Saruwatari, and K. Shikano, "Rapid unsupervised speaker adaptation robust in reverberant environment conditions," in Proc. Interspeech, 2008, pp. 1309-1312.
- (2008) Proc. Interspeech , pp. 1309-1312
- Gomez, R.¹ Even, J.² Saruwatari, H.³ Shikano, K.⁴

11
- 33947694356
- Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation
- K. Kinoshita, T. Nakatani, and M. Miyoshi, "Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2006, vol.I, pp. 817-820.
- (2006) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , vol.1 , pp. 817-820
- Kinoshita, K.¹ Nakatani, T.² Miyoshi, M.³

12
- 0037342773
- Speech-recognizer-based filter optimization for microphone array processing
- Mar.
- M. Seltzer, "Speech-recognizer-based filter optimization for microphone array processing," IEEE Signal Process. Lett., vol.10, no.3, pp. 69-71, Mar. 2003.
- (2003) IEEE Signal Process. Lett. , vol.10 , Issue.3 , pp. 69-71
- Seltzer, M.¹

13
- 50449096811
- Subband likelihood-maximizing beamforming for speech recognition in reverberant environments
- Nov.
- M. Seltzer and R. Stern, "Subband likelihood-maximizing beamforming for speech recognition in reverberant environments," IEEE Trans. Audio, Speech, Lang. Process., vol.14, no.6, pp. 2109-2121, Nov. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.6 , pp. 2109-2121
- Seltzer, M.¹ Stern, R.²

14
- 0029747183
- Speaker normalization using efficient frequency warping procedures
- L. Lee and R. Rose, "Speaker normalization using efficient frequency warping procedures," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1996, pp. 353-356.
- (1996) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 353-356
- Lee, L.¹ Rose, R.²

15
- 0030672082
- Experiments in speaker normalisation and adaptation for large vocabulary speech recognition
- D. Pye and P. C. Woodland, "Experiments in speaker normalisation and adaptation for large vocabulary speech recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1997, pp. 1047-1050.
- (1997) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 1047-1050
- Pye, D.¹ Woodland, P.C.²

16
- 0036753897
- Speaker adaptive modeling by vocal tract normalization
- Aug.
- L. Welling, H. Ney, and S. Kanthak, "Speaker adaptive modeling by vocal tract normalization," IEEE Trans. Audio, Speech, Lang. Process., vol.10, no.6, pp. 415-426, Aug. 2002.
- (2002) IEEE Trans. Audio, Speech, Lang. Process. , vol.10 , Issue.6 , pp. 415-426
- Welling, L.¹ Ney, H.² Kanthak, S.³

17
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- Apr.
- S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Audio, Speech, Signal Process., vol.27, no.2, pp. 113-120, Apr. 1979.
- (1979) IEEE Trans. Audio, Speech, Signal Process. , vol.27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

18
- 0034297865
- Spectral subtraction based on phonetic dependency and masking effects
- Oct.
- W. Kim, S. Kang, and H. Ko, "Spectral subtraction based on phonetic dependency and masking effects," in Proc. IEEE Visual Image Signal Process., Oct. 2000, vol.147, pp. 423-427.
- (2000) Proc. IEEE Visual Image Signal Process. , vol.147 , pp. 423-427
- Kim, W.¹ Kang, S.² Ko, H.³

19
- 0026882842
- Experiments with non-linear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars
- P. Lockwood and J. Boudy, "Experiments with non-linear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars," Speech Commun., vol.11, no.2-3, pp. 215-228, 1992.
- (1992) Speech Commun. , vol.11 , Issue.2-3 , pp. 215-228
- Lockwood, P.¹ Boudy, J.²

20
- 77955701560
- Selective magnitude subtraction for speech enhancement
- I. Soon, S. Koh, and C. Yeo, "Selective magnitude subtraction for speech enhancement," in Proc. 4th Int. Conf./Exhib. High Perform. Comput. The Asia Pacific Region, 2000, vol.2, pp. 692-695.
- (2000) Proc. 4th Int. Conf./Exhib. High Perform. Comput. The Asia Pacific Region , vol.2 , pp. 692-695
- Soon, I.¹ Koh, S.² Yeo, C.³

21
- 33745195661
- Efficient blind dereverberation framework for automatic speech recognition
- K. Kinoshita, T. Nakatani, and M. Miyoshi, "Efficient blind dereverberation framework for automatic speech recognition," in Proc. Interspeech, 2005, pp. 3145-3148.
- (2005) Proc. Interspeech , pp. 3145-3148
- Kinoshita, K.¹ Nakatani, T.² Miyoshi, M.³

22
- 0028812427
- An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses
- Feb.
- Y. Suzuki, F. Asano, H.-Y. Kim, and T. Sone, "An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses," J. Acoust. Soc. Amer., vol.92, no.2, pp. 1119-1123, Feb. 1995.
- (1995) J. Acoust. Soc. Amer. , vol.92 , Issue.2 , pp. 1119-1123
- Suzuki, Y.¹ Asano, F.² Kim, H.-Y.³ Sone, T.⁴

23
- 38649115063
- A new approach for the adaptation of HMMs to reverberation and background noise
- H.-G. Hirsch and H. Finster, "A new approach for the adaptation of HMMs to reverberation and background noise," Speech Commun., pp. 244-263, 2008.
- (2008) Speech Commun. , pp. 244-263
- Hirsch, H.-G.¹ Finster, H.²

24
- 77955673283
- H. Kuttruff, Room Acoustics. London, U.K.: Spon Press, 2000.

25
- 33947644864
- Improving rapid unsupervised speaker adaptation based on HMM sufficient statistics
- R. Gomez, T. Toda, H. Saruwatari, and K. Shikano, "Improving rapid unsupervised speaker adaptation based on HMM sufficient statistics," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. ICASSP, 2008, vol.I, pp. 1001-1004.
- (2008) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. ICASSP , vol.1 , pp. 1001-1004
- Gomez, R.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

26
- 77955703558
- Rapid unsupervised speaker adaptation using MLLR and speaker selection
- R. Gomez, T. Toda, H. Saruwatari, and K. Shikano, "Rapid unsupervised speaker adaptation using MLLR and speaker selection," in Proc. Interspeech, 2007, pp. 262-265.
- (2007) Proc. Interspeech , pp. 262-265
- Gomez, R.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

27
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observation of Markov chains
- Apr.
- J. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observation of Markov chains," Proc. IEEE Trans. Speech Audio Process., vol.2, no.2, pp. 291-298, Apr. 1994.
- (1994) Proc. IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.¹ Lee, C.H.²

28
- 0030263447
- Mean and variance adaptation within the MLLR framework
- M. J. F. Gales and P. C. Woodland, "Mean and variance adaptation within the MLLR framework," Comput. Speech Lang., vol.10, no.4, pp. 249-264, 1996.
- (1996) Comput. Speech Lang. , vol.10 , Issue.4 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

29
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., pp. 171-185, 1995.
- (1995) Comput. Speech Lang. , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

30
- 77955666626
- HTK documentation. [Online]. Available
- HTK documentation. [Online]. Available: http://htk.eng.cam.ac.uk/docs/docs.shtml

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.