SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 17, Issue 2, 2009, Pages 324-334

Static and dynamic variance compensation for recognition of reverberant speech with dereverberation preprocessing

(3) Delcroix, Marc a; Nakatani, Tomohiro a; Watanabe, Shinji a

a NTT Communication Science Laboratories (Japan)

Author keywords

Dereverberation; Model adaptation; Robust automatic speech recognition (ASR); Variance compensation

Indexed keywords

ACOUSTIC MODEL; ADAPTATION SCHEME; ADAPTIVE TRAINING; AUTOMATIC SPEECH RECOGNITION; CONVENTIONAL MODELS; DEREVERBERATION; ERROR RATE; EXPECTATION-MAXIMIZATION ALGORITHMS; MODEL ADAPTATION; MODEL PARAMETERS; NOISE ROBUSTNESS; PARAMETRIC MODELS; PREPROCESSORS; RELATIVE ERROR RATES; REVERBERATION EFFECTS; REVERBERATION TIME; ROBUST AUTOMATIC SPEECH RECOGNITION (ASR); SPEECH FEATURES; SPEECH RECOGNIZER; STATIC AND DYNAMIC; VARIANCE COMPENSATION; WORD ERROR RATE;

ARCHITECTURAL ACOUSTICS; DYNAMIC MODELS; ELECTRIC LOAD SHEDDING; ERROR COMPENSATION; REMELTING; REVERBERATION;

SPEECH RECOGNITION;

EID: 70350450398 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2008.2010214 Document Type: Article

Times cited: (50)

References (37)

1
- 85009070292
- Large-vocabulary speech recognition under adverse acoustic environments
- L. Deng, A. Acero, M. Plumpe, and X. Huang, "Large-vocabulary speech recognition under adverse acoustic environments," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP'00), 2000, vol. 3, pp. 806-809.
- (2000) Proc. Int. Conf. Spoken Lang. Process. (ICSLP'00) , vol.3 , pp. 806-809
- Deng, L.¹ Acero, A.² Plumpe, M.³ Huang, X.⁴

2
- 85032752225
- Missing-feature approaches in speech recognition
- B. Raj and R. M. Stern, "Missing-feature approaches in speech recognition," IEEE Signal Process. Mag., vol. 22, no. 5, pp. 101-116, Sep. 2005.
- (2005) IEEE Signal Process. Mag. , vol.22 , Issue.5 , pp. 101-116
- Raj, B.¹ Stern, R.M.²

3
- 0030263447
- Mean and variance adaptation within the MLLR framework
- M. J. F. Gales and P. C. Woodland, "Mean and variance adaptation within the MLLR framework," Comput. Speech Lang., vol. 10, pp. 249-264, 1996.
- (1996) Comput. Speech Lang. , vol.10 , pp. 249-264
- Gales, M.J.F.¹ Woodland, P.C.²

4
- 0032685060
- Robust speech recognition based on a Bayesian prediction approach
- H. Jiang, K. Hirose, and Q. Huo, "Robust speech recognition based on a Bayesian prediction approach," IEEE Trans. Speech Audio Process., vol. 7, no. 4, pp. 426-440, Jul. 1999.
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.4 , pp. 426-440
- Jiang, H.¹ Hirose, K.² Huo, Q.³

5
- 0028996860
- Robust speech recognition based on stochastic matching
- A. Sankar and C.-H. Lee, "Robust speech recognition based on stochastic matching," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'95), 1995, vol. 1, pp. 121-125.
- (1995) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'95) , vol.1 , pp. 121-125
- Sankar, A.¹ Lee, C.-H.²

6
- 0030245128
- Robust continuous speech recognition using parallel model combination
- M. J. F. Gales and S. J. Young, "Robust continuous speech recognition using parallel model combination," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 352-359, Sep. 1996.
- (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.5 , pp. 352-359
- Gales, M.J.F.¹ Young, S.J.²

7
- 56149111866
- Reverberation reduction for improved speech recognition
- I. Tashev and D. Allred, "Reverberation reduction for improved speech recognition," in Proc. Joint Workshop Hands-Free Speech Commun. Microphone Arrays (HSCMA'05), 2005, CD-ROM.
- (2005) Proc. Joint Workshop Hands-Free Speech Commun. Microphone Arrays (HSCMA'05)
- Tashev, I.¹ Allred, D.²

8
- 1542677825
- Blind model selection for automatic speech recognition in reverberant environments
- L. Couvreur and C. Couvreur, "Blind model selection for automatic speech recognition in reverberant environments," J. VLSI Signal Process. Syst., vol. 36, no. 2-3, pp. 189-203, 2004.
- (2004) J. VLSI Signal Process. Syst. , vol.36 , Issue.2-3 , pp. 189-203
- Couvreur, L.¹ Couvreur, C.²

9
- 33745260725
- Model adaptation by state splitting of HMM for long reverberation
- C. K. Raut, T. Nishimoto, and S. Sagayama, "Model adaptation by state splitting of HMM for long reverberation," in Proc. 9th Eur. Conf. Speech Commun. Technol. (Interspeech'05-Eurospeech), 2005, pp. 277-280.
- (2005) Proc. 9th Eur. Conf. Speech Commun. Technol. (Interspeech'05-Eurospeech) , pp. 277-280
- Raut, C.K.¹ Nishimoto, T.² Sagayama, S.³

10
- 4544339433
- Acoustic model adaptation using first order prediction for reverberant speech
- T. Takiguchi and M. Nishimura, "Acoustic model adaptation using first order prediction for reverberant speech," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04), 2004, vol. 1, pp. 869-972.
- (2004) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'04) , vol.1 , pp. 869-972
- Takiguchi, T.¹ Nishimura, M.²

11
- 34547517494
- A new concept for feature-domain dereverberation for robust distant-talking ASR
- A. Sehr and W. Kellermann, "A new concept for feature-domain dereverberation for robust distant-talking ASR," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'07), 2007, vol. 4, pp. 369-372.
- (2007) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'07) , vol.4 , pp. 369-372
- Sehr, A.¹ Kellermann, W.²

12
- 0036289676
- Acoustic diversity for improved speech recognition in reverberant environments
- B. W. Gillespie and L. E. Atlas, "Acoustic diversity for improved speech recognition in reverberant environments," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'02), 2002, vol. 1, pp. 557-600.
- (2002) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'02) , vol.1 , pp. 557-600
- Gillespie, B.W.¹ Atlas, L.E.²

13
- 51449124417
- Speech dereverberation
- P. A. Naylor and N. D. Gaubitch, "Speech dereverberation," in Proc. Int. Workshop Acoust. Echo and Noise Control (IWAENC'05), 2005, iwaenc05.ele.tue.nl/proceedings/papers/pt03.pdf.
- (2005) Proc. Int. Workshop Acoust. Echo and Noise Control (IWAENC'05)
- Naylor, P.A.¹ Gaubitch, N.D.²

14
- 34548571735
- Harmonicity-based blind dereverberation for single-channel speech signals
- T. Nakatani, K. Kinoshita, and M. Miyoshi, "Harmonicity-based blind dereverberation for single-channel speech signals," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 1, pp. 80-95, Jan. 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.1 , pp. 80-95
- Nakatani, T.¹ Kinoshita, K.² Miyoshi, M.³

15
- 29744447074
- Speech dereverberation algorithm using transfer function estimates with overestimated order
- T. Hikichi, M. Delcroix, and M. Miyoshi, "Speech dereverberation algorithm using transfer function estimates with overestimated order," Acoust. Sci. Technol., vol. 27, no. 1, pp. 28-35, 2006.
- (2006) Acoust. Sci. Technol. , vol.27 , Issue.1 , pp. 28-35
- Hikichi, T.¹ Delcroix, M.² Miyoshi, M.³

16
- 33947694356
- Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation
- K. Kinoshita, T. Nakatani, and M. Miyoshi, "Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'06), 2006, vol. 1, pp. 817-820.
- (2006) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'06) , vol.1 , pp. 817-820
- Kinoshita, K.¹ Nakatani, T.² Miyoshi, M.³

17
- 51449103069
- A linear prediction-based microphone array for speech dereverberation in a realistic sound field
- K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, "A linear prediction-based microphone array for speech dereverberation in a realistic sound field," in Proc. Audio Eng. Soc. (AES) 13th Regional Conv., Tokyo, Japan, 2007, CD-ROM.
- (2007) Proc. Audio Eng. Soc. (AES) 13th Regional Conv.
- Kinoshita, K.¹ Delcroix, M.² Nakatani, T.³ Miyoshi, M.⁴

18
- 33745761716
- A two-stage algorithm for one-microphone reverberant speech enhancement
- M. Wu and D. Wang, "A two-stage algorithm for one-microphone reverberant speech enhancement," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp. 774-784, May 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.3 , pp. 774-784
- Wu, M.¹ Wang, D.²

19
- 18744401086
- Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
- L. Deng, J. Droppo, and A. Acero, "Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion," IEEE Trans. Speech Audio Process., vol. 13, no. 3, pp. 412-421, May 2005.
- (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.3 , pp. 412-421
- Deng, L.¹ Droppo, J.² Acero, A.³

20
- 47049124615
- Recognition of convolutive speech mixtures by missing feature techniques for ICA
- D. Kolossa, H. Sawada, R. F. Astudillo, R. Orglmeister, and S. Makino, "Recognition of convolutive speech mixtures by missing feature techniques for ICA," in Proc. Asilomar Conf. Signals, Syst., Comput. (ACSSC'06), 2006, pp. 1397-1401.
- (2006) Proc. Asilomar Conf. Signals, Syst., Comput. (ACSSC'06) , pp. 1397-1401
- Kolossa, D.¹ Sawada, H.² Astudillo, R.F.³ Orglmeister, R.⁴ Makino, S.⁵

21
- 85009067687
- Using observation uncertainty in HMM decoding
- J. Arrowood and M. Clements, "Using observation uncertainty in HMM decoding," in Proc. Int. Conf. Spoken Lang. Process. (ICSLP'02), 2002, vol. 3, pp. 1562-1564.
- (2002) Proc. Int. Conf. Spoken Lang. Process. (ICSLP'02) , vol.3 , pp. 1562-1564
- Arrowood, J.¹ Clements, M.²

22
- 0036291376
- Uncertainty decoding with splice for noise robust speech recognition
- J. Droppo, A. Acero, and L. Deng, "Uncertainty decoding with splice for noise robust speech recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'02), 2002, vol. 1, pp. 57-60.
- (2002) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'02) , vol.1 , pp. 57-60
- Droppo, J.¹ Acero, A.² Deng, L.³

23
- 33745202806
- Joint uncertainty decoding for noise robust speech recognition
- H. Liao and M. J. F. Gales, "Joint uncertainty decoding for noise robust speech recognition," in Proc. 9th Eur. Conf. Speech Commun. Technol. (Interspeech'05-Eurospeech), 2005, pp. 3129-3132.
- (2005) Proc. 9th Eur. Conf. Speech Commun. Technol. (Interspeech'05-Eurospeech) , pp. 3129-3132
- Liao, H.¹ Gales, M.J.F.²

24
- 0035342414
- Robust automatic speech recognition with missing and uncertain acoustic data
- M. P. Cooke, P. D. Green, L. B. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and uncertain acoustic data," Speech Commun., vol. 34, pp. 267-285, 2001.
- (2001) Speech Commun. , vol.34 , pp. 267-285
- Cooke, M.P.¹ Green, P.D.² Josifovski, L.B.³ Vizinho, A.⁴

25
- 51449102822
- Combined static and dynamic variance adaptation for efficient interconnection of a speech enhancement pre-processor with speech recognizer
- M. Delcroix, T. Nakatani, and S. Watanabe, "Combined static and dynamic variance adaptation for efficient interconnection of a speech enhancement pre-processor with speech recognizer," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'08), 2008, pp. 4073-4076.
- (2008) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP'08) , pp. 4073-4076
- Delcroix, M.¹ Nakatani, T.² Watanabe, S.³

26
- 0030149866
- A maximum-likelihood approach to stochastic matching for robust speech recognition
- A. Sankar and C.-H. Lee, "A maximum-likelihood approach to stochastic matching for robust speech recognition," IEEE Trans. Speech Audio Process., vol. 4, no. 3, pp. 190-202, May 1996.
- (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.3 , pp. 190-202
- Sankar, A.¹ Lee, C.-H.²

27
- 0003870155
- 3rd ed. London, U.K.: Elsevier Science
- H. Kuttruff, Room Acoustics, 3rd ed. London, U.K.: Elsevier Science, 1991.
- (1991) Room Acoustics
- Kuttruff, H.¹

28
- 0003927842
- Upper Saddle River, NJ: Prentice-Hall
- T. F. Quatieri, Discrete-Time Speech Signal Processing. Upper Saddle River, NJ: Prentice-Hall, 2002.
- (2002) Discrete-Time Speech Signal Processing
- Quatieri, T.F.¹

29
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
- (1979) IEEE Trans. Acoust., Speech, Signal Process. , vol.ASSP-27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

30
- 0028420014
- Integrated models of signal and background with application to speaker identification in noise
- R. C. Rose, E. M. Hofstetter, and D. A. Reynolds, "Integrated models of signal and background with application to speaker identification in noise," IEEE Trans. Speech Audio Process., vol. 2, no. 3, pp. 245-257, May 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.3 , pp. 245-257
- Rose, R.C.¹ Hofstetter, E.M.² Reynolds, D.A.³

31
- 0000251971
- Maximum likelihood estimation via the ECM algorithm: A general framework
- X.-L. Meng and D. B. Rubin, "Maximum likelihood estimation via the ECM algorithm: A general framework," Biometrika, vol. 80, pp. 267-278, 1993.
- (1993) Biometrika , vol.80 , pp. 267-278
- Meng, X.-L.¹ Rubin, D.B.²

32
- 33645758265
- NTT speech recognizer with outlook on the next generation: Solon
- T. Hori, "NTT speech recognizer with outlook on the next generation: Solon," in Proc. NTT Workshop Communication Scene Analysis, SP-6, 2004, CD-ROM.
- (2004) Proc. NTT Workshop Communication Scene Analysis, SP-6
- Hori, T.¹

33
- 0038669544
- The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy condition
- H. G. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy condition," in Proc. ISCA Tutorial Research Workshop Autom. Speech Recognition: Challenges for the New Millenium (ITRW ASR2000), Paris, France, Sep. 2000, pp. 18-20.
- (2000) Proc. ISCA Tutorial Research Workshop Autom. Speech Recognition: Challenges for the New Millenium (ITRW ASR2000) , pp. 18-20
- Hirsch, H.G.¹ Pearce, D.²

34
- 70350468459
- IEICE Tech. Rep. 2007-SP-105
- M. Delcroix, T. Nakatani, and S. Watanabe, "Dynamic feature variance adaptation for robust speech recognition with a speech enhancement pre-processor," 2007, IEICE Tech. Rep., 2007-SP-105.
- (2007) Dynamic Feature Variance Adaptation for Robust Speech Recognition with A Speech Enhancement Pre-processor
- Delcroix, M.¹ Nakatani, T.² Watanabe, S.³

35
- 0009589650
- ETSI ES 202 050 v1.1.1
- ETSI ES 202 050 v1.1.1, "STQ; Distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithms," 2002.
- (2002) STQ; Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithms

36
- 50249118229
- A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures
- H. Sawada, S. Araki, and S. Makino, "A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures," in Proc. 2007 IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA'07), 2007, pp. 139-142.
- (2007) Proc. 2007 IEEE Workshop Applicat. Signal Process. Audio Acoust. (WASPAA'07) , pp. 139-142
- Sawada, H.¹ Araki, S.² Makino, S.³

37
- 0000914334
- Convolutive blind separation of non-stationary sources
- L. Parra and C. Spence, "Convolutive blind separation of non-stationary sources," IEEE Trans. Speech Audio Process., vol. 8, no. 3, pp. 320-327, May 2000.
- (2000) IEEE Trans. Speech Audio Process. , vol.8 , Issue.3 , pp. 320-327
- Parra, L.¹ Spence, C.²

* This information was extracted from Elsevier's SCOPUS database through analysis by KISTI.