SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 7, 2010, Pages 1676-1691

Reverberation model-based decoding in the logmelspec domain for robust distant-talking speech recognition

(3) Sehr, Armin (a); Maas, Roland (a); Kellermann, Walter (a)

a UNIVERSITY OF ERLANGEN NUREMBERG (Germany)

Author keywords

Acoustic modeling; distant talking automatic speech recognition (ASR); model based dereverberation; reverberation model; robust ASR

Indexed keywords

ACOUSTIC MODEL; ACOUSTIC MODELING; AUTOMATIC SPEECH RECOGNITION; COMBINATION OPERATION; CONNECTED DIGITS; FEATURE DOMAIN; HIGH FLEXIBILITY; IN-DEPTH ANALYSIS; JOINT DENSITIES; MODEL-BASED; MODEL-BASED DEREVERBERATION; NON-LINEAR OPTIMIZATION ALGORITHMS; OPTIMIZATION OPERATION; OPTIMIZATION PROBLEMS; PARAMETERS ESTIMATED; ROBUST ASR; STATISTICAL PROPERTIES;

CONSTRAINED OPTIMIZATION; CONTINUOUS SPEECH RECOGNITION; HIDDEN MARKOV MODELS; NUMERICAL METHODS; VITERBI ALGORITHM;

REVERBERATION;

EID: 77955683144 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2050511 Document Type: Article

Times cited: (53)

References (42)

1
- 44949167884
- Distant-talking continuous speech recognition based on a novel reverberation model in the feature domain
- A. Sehr, M. Zeller, and W. Kellermann, "Distant-talking continuous speech recognition based on a novel reverberation model in the feature domain," in Proc. Interspeech, 2006, pp. 769-772.
- (2006) Proc. Interspeech , pp. 769-772
- Sehr, A.¹ Zeller, M.² Kellermann, W.³

2
- 70349227947
- The application of hidden Markov models in speech recognition
- M. Gales and S. Young, "The application of hidden Markov models in speech recognition," Foundat. Trends Signal Process., vol.1, no.3, pp. 195-304, 2007.
- (2007) Foundat. Trends Signal Process. , vol.1 , Issue.3 , pp. 195-304
- Gales, M.¹ Young, S.²

3
- 85132840106
- Towards robust distant-talking automatic speech recognition in reverberant environments
- E. Hänsler and G. Schmidt, Eds. Berlin: Springer
- A. Sehr and W. Kellermann, "Towards robust distant-talking automatic speech recognition in reverberant environments," in Topics in Speech and Audio Processing in Adverse Environments, E. Hänsler and G. Schmidt, Eds. Berlin: Springer, 2008, pp. 679-728.
- (2008) Topics in Speech and Audio Processing in Adverse Environments , pp. 679-728
- Sehr, A.¹ Kellermann, W.²

4
- 0030682292
- Recognizing reverberant speech with RASTA-PLP
- B. E. D. Kingsbury and N. Morgan, "Recognizing reverberant speech with RASTA-PLP," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1997, pp. 1259-1262.
- (1997) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 1259-1262
- Kingsbury, B.E.D.¹ Morgan, N.²

5
- 50449102864
- The harming part of room acoustics in automatic speech recognition
- Aug.
- R. Petrick, K. Lohde, M. Wolff, and R. Hoffmann, "The harming part of room acoustics in automatic speech recognition," in Proc. Interspeech, Aug. 2007, pp. 1094-1097.
- (2007) Proc. Interspeech , pp. 1094-1097
- Petrick, R.¹ Lohde, K.² Wolff, M.³ Hoffmann, R.⁴

6
- 0004056285
- Upper Saddle River, NJ: Prentice-Hall
- X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Upper Saddle River, NJ: Prentice-Hall, 2001.
- (2001) Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
- Huang, X.¹ Acero, A.² Hon, H.-W.³

7
- 0023961145
- Inverse filtering of room acoustics
- Feb.
- M. Miyoshi and Y. Kaneda, "Inverse filtering of room acoustics," IEEE Trans. Acoust., Speech, Signal Process., vol.36, no.2, pp. 145-152, Feb. 1988.
- (1988) IEEE Trans. Acoust., Speech, Signal Process. , vol.36 , Issue.2 , pp. 145-152
- Miyoshi, M.¹ Kaneda, Y.²

8
- 34247241719
- Inverse filtering for speech dereverberation less sensitive to noise and room transfer function fluctuations
- T. Hikichi, M. Delcroix, and M. Miyoshi, "Inverse filtering for speech dereverberation less sensitive to noise and room transfer function fluctuations," EURASIP J. Adv. Signal Process., vol.2007, 2007.
- (2007) EURASIP J. Adv. Signal Process. , vol.2007
- Hikichi, T.¹ Delcroix, M.² Miyoshi, M.³

9
- 70350458846
- Speech dereverberation based on maximum-likelihood estimation with time-varying Gaussian source model
- Nov.
- T. Nakatani, B.-H. Juang, T. Yoshioka, K. Kinoshita, M. Delcroix, and M. Miyoshi, "Speech dereverberation based on maximum-likelihood estimation with time-varying Gaussian source model," IEEE Trans. Audio, Speech, Lang. Process., vol.16, no.8, pp. 1512-1527, Nov. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.8 , pp. 1512-1527
- Nakatani, T.¹ Juang, B.-H.² Yoshioka, T.³ Kinoshita, K.⁴ Delcroix, M.⁵ Miyoshi, M.⁶

10
- 79957754961
- TRINICON for dereverberation of speech and audio signals
- P. Naylor and N. Gaubitch, Eds. Berlin: Springer
- H. Buchner and W. Kellermann, "TRINICON for dereverberation of speech and audio signals," in Speech Dereverberation, P. Naylor and N. Gaubitch, Eds. Berlin: Springer.
- Speech Dereverberation
- Buchner, H.¹ Kellermann, W.²

11
- 14344274593
- A new method based on spectral subtraction for speech dereverberation
- K. Lebart and J. M. Boucher, "A new method based on spectral subtraction for speech dereverberation," Acta Acust., vol.87, pp. 359-366, 2001.
- (2001) Acta Acust. , vol.87 , pp. 359-366
- Lebart, K.¹ Boucher, J.M.²

12
- 77955697587
- Late reverberant spectral variance estimation based on a statistical model
- Sep.
- E. Habets, S. Gannot, and I. Cohen, "Late reverberant spectral variance estimation based on a statistical model," IEEE Signal Process. Lett., vol.16, pp. 770-773, Sep. 2009.
- (2009) IEEE Signal Process. Lett. , vol.16 , pp. 770-773
- Habets, E.¹ Gannot, S.² Cohen, I.³

13
- 65249167097
- Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction
- May
- K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, "Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction," IEEE Trans. Audio, Speech, Lang. Process., vol.17, no.4, pp. 534-545, May 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 534-545
- Kinoshita, K.¹ Delcroix, M.² Nakatani, T.³ Miyoshi, M.⁴

14
- 0001379957
- Enhancement of reverberant speech using LP residual signal
- May
- B. Yegnanarayana and P. S. Murthy, "Enhancement of reverberant speech using LP residual signal," IEEE Trans. Speech Audio Process., vol.8, pp. 267-281, May 2000.
- (2000) IEEE Trans. Speech Audio Process. , vol.8 , pp. 267-281
- Yegnanarayana, B.¹ Murthy, P.S.²

15
- 33845361792
- On the use of linear prediction for dereverberation of speech
- Sep.
- N. D. Gaubitch, P. A. Naylor, and D. B. Ward, "On the use of linear prediction for dereverberation of speech," in Proc. IEEE Int. Workshop Acoust. Echo Noise Control (IWAENC), Sep. 2003, pp. 99-102.
- (2003) Proc. IEEE Int. Workshop Acoust. Echo Noise Control (IWAENC) , pp. 99-102
- Gaubitch, N.D.¹ Naylor, P.A.² Ward, D.B.³

16
- 0028517164
- RASTA processing of speech
- Oct.
- H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech Audio Process., vol.2, pp. 578-589, Oct. 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

17
- 0016067897
- Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
- B. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification," J. Acoust. Soc. Amer., vol.55, pp. 1304-1312, 1974.
- (1974) J. Acoust. Soc. Amer. , vol.55 , pp. 1304-1312
- Atal, B.¹

18
- 33745246234
- Multiresolution channel normalization for ASR in reverberant environments
- C. Avendano, S. Tibrewala, and H. Hermansky, "Multiresolution channel normalization for ASR in reverberant environments," in Proc. Eurospeech, 1997, pp. 1107-1110.
- (1997) Proc. Eurospeech , pp. 1107-1110
- Avendano, C.¹ Tibrewala, S.² Hermansky, H.³

19
- 70350439261
- Enhanced speech features by single-channel joint compensation of noise and reverberation
- Feb.
- M. Wölfel, "Enhanced speech features by single-channel joint compensation of noise and reverberation," IEEE Trans. Audio, Speech, Lang. Process., vol.17, no.2, pp. 312-323, Feb. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.2 , pp. 312-323
- Wölfel, M.¹

20
- 0032667502
- Training of HMM with filtered speech material for hands-free recognition
- Mar.
- D. Giuliani, M. Matassoni, M. Omologo, and P. Svaizer, "Training of HMM with filtered speech material for hands-free recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Mar. 1999, vol.1, pp. 449-452.
- (1999) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , vol.1 , pp. 449-452
- Giuliani, D.¹ Matassoni, M.² Omologo, M.³ Svaizer, P.⁴

21
- 0034848767
- Acoustic synthesis of training data for speech recognition in living room environments
- V. Stahl, A. Fischer, and R. Bippus, "Acoustic synthesis of training data for speech recognition in living-room environments," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), May 2001, vol.1, pp. 285-288. (Pubitemid 32839243)
- (2001) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , vol.1 , pp. 285-288
- Stahl, V.¹ Fischer, A.² Bippus, R.³

22
- 33645784228
- Acoustic model adaptation using first-order linear prediction for reverberant speech
- Mar.
- T. Takiguchi, M. Nishimura, and Y. Ariki, "Acoustic model adaptation using first-order linear prediction for reverberant speech," IEICE Trans. Inf. Syst., vol.E89-D, no.3, pp. 908-914, Mar. 2006.
- (2006) IEICE Trans. Inf. Syst. , vol.E89-D , Issue.3 , pp. 908-914
- Takiguchi, T.¹ Nishimura, M.² Ariki, Y.³

23
- 33947694706
- Model adaptation for long convolutional distortion by maximum likelihood based state filtering approach
- May
- C. K. Raut, T. Nishimoto, and S. Sagayama, "Model adaptation for long convolutional distortion by maximum likelihood based state filtering approach," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), May 2006, pp. I-1133-I-1136.
- (2006) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
- Raut, C.K.¹ Nishimoto, T.² Sagayama, S.³

24
- 44949247595
- A new HMM adaptation approach for the case of a hands-free speech input in reverberant rooms
- Sep.
- H.-G. Hirsch and H. Finster, "A new HMM adaptation approach for the case of a hands-free speech input in reverberant rooms," in Proc. Interspeech, Sep. 2006, pp. 781-783.
- (2006) Proc. Interspeech , pp. 781-783
- Hirsch, H.-G.¹ Finster, H.²

25
- 2942539074
- Techniques for handling convolutional distortion with 'missing data' automatic speech recognition
- Jun.
- J. B. K. J. Palomäki and G. J. Brown, "Techniques for handling convolutional distortion with 'missing data' automatic speech recognition," Speech Commun., vol.43, no.1-2, pp. 123-142, Jun. 2004.
- (2004) Speech Commun. , vol.43 , Issue.1-2 , pp. 123-142
- Palomäki, J.B.K.J.¹ Brown, G.J.²

26
- 70350450398
- Static and dynamic variance compensation for recognition of reverberant speech with dereverberation preprocessing
- Feb.
- M. Delcroix, T. Nakatani, and S. Watanabe, "Static and dynamic variance compensation for recognition of reverberant speech with dereverberation preprocessing," IEEE Trans. Audio, Speech, Lang. Process., vol.17, no.2, pp. 324-334, Feb. 2009.
- (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.2 , pp. 324-334
- Delcroix, M.¹ Nakatani, T.² Watanabe, S.³

27
- 77955671150
- Model-based dereverberation in the logmelspec domain for robust distant-talking speech recognition
- A. Sehr, R. Maas, and W. Kellermann, "Model-based dereverberation in the logmelspec domain for robust distant-talking speech recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2010, pp. 4298-4301.
- (2010) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 4298-4301
- Sehr, A.¹ Maas, R.² Kellermann, W.³

28
- 50449103692
- New results for feature-domain reverberation modeling
- A. Sehr and W. Kellermann, "New results for feature-domain reverberation modeling," in Proc. Joint Workshop Hands-free Speech Commun. Microphone Arrays (HSCMA), 2008.
- (2008) Proc. Joint Workshop Hands-free Speech Commun. Microphone Arrays (HSCMA)
- Sehr, A.¹ Kellermann, W.²

29
- 0025681008
- Hidden Markov model decomposition of speech and noise
- A. P. Varga and R. K. Moore, "Hidden Markov model decomposition of speech and noise," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1990, pp. 845-848.
- (1990) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 845-848
- Varga, A.P.¹ Moore, R.K.²

30
- 70349707036
- A simplified decoding method for a robust distant-talking ASR concept based on feature-domain dereverberation
- A. Sehr and W. Kellermann, "A simplified decoding method for a robust distant-talking ASR concept based on feature-domain dereverberation," in Proc. Int. Workshop Acoust. Echo Noise Control (IWAENC), 2008.
- (2008) Proc. Int. Workshop Acoust. Echo Noise Control (IWAENC)
- Sehr, A.¹ Kellermann, W.²

31
- 0003768769
- 2nd ed. Chichester, U.K.: Wiley
- R. Fletcher, Practical Methods of Optimization, 2nd ed. Chichester, U.K.: Wiley, 2000.
- (2000) Practical Methods of Optimization
- Fletcher, R.¹

32
- 77955680516
- M.Sc. thesis, University of Erlangen-Nuremberg, Erlangen, Germany
- R. Maas, "Evaluierung numerischer Optimierungsverfahren für die robuste Spracherkennung nach dem REMOS-Konzept," M.Sc. thesis, University of Erlangen-Nuremberg, Erlangen, Germany, 2009.
- (2009) Evaluierung Numerischer Optimierungsverfahren für Die Robuste Spracherkennung Nach Dem REMOS-Konzept
- Maas, R.¹

33
- 84863742097
- Maximum likelihood estimation of a reverberation model for robust distant-talking speech recognition
- A. Sehr, Y. Zheng, E. Nöth, and W. Kellermann, "Maximum likelihood estimation of a reverberation model for robust distant-talking speech recognition," in Proc. Eur. Signal Process. Conf. (EUSIPCO), 2007, pp. 1299-1303.
- (2007) Proc. Eur. Signal Process. Conf. (EUSIPCO) , pp. 1299-1303
- Sehr, A.¹ Zheng, Y.² Nöth, E.³ Kellermann, W.⁴

34
- 77955696289
- A combined approach for estimating a feature-domain reverberation model in non-diffuse environments
- A. Sehr, J. Y. C. Wen, W. Kellermann, and P. A. Naylor, "A combined approach for estimating a feature-domain reverberation model in non-diffuse environments," in Proc. Int. Workshop Acoust. Echo Noise Control (IWAENC), 2008.
- (2008) Proc. Int. Workshop Acoust. Echo Noise Control (IWAENC)
- Sehr, A.¹ Wen, J.Y.C.² Kellermann, W.³ Naylor, P.A.⁴

35
- 84863766806
- Blind estimation of a feature-domain reverberation model in non-diffuse environments with variance adjustment
- J. Y. C. Wen, A. Sehr, P. A. Naylor, and W. Kellermann, "Blind estimation of a feature-domain reverberation model in non-diffuse environments with variance adjustment," in Proc. Eur. Signal Process. Conf. (EUSIPCO), 2009, pp. 175-179.
- (2009) Proc. Eur. Signal Process. Conf. (EUSIPCO) , pp. 175-179
- Wen, J.Y.C.¹ Sehr, A.² Naylor, P.A.³ Kellermann, W.⁴

36
- 0018455820
- Image method for efficiently simulating small-room acoustics
- Apr.
- J. B. Allen and D. A. Berkley, "Image method for efficiently simulating small-room acoustics," J. Acoust. Soc. Amer., vol.65, no.4, pp. 943-950, Apr. 1979.
- (1979) J. Acoust. Soc. Amer. , vol.65 , Issue.4 , pp. 943-950
- Allen, J.B.¹ Berkley, D.A.²

37
- 0038669544
- The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- H. G. Hirsch and D. Pearce, "The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," in Proc. ISCA ITRW ASR2000 Autom. Speech Recogn.: Challenges Next Millennium, 2000.
- (2000) Proc. ISCA ITRW ASR2000 Autom. Speech Recogn.: Challenges Next Millennium
- Hirsch, H.G.¹ Pearce, D.²

38
- 77955672368
- [Online]. Available
- HTK [Online]. Available: http://htk.eng.cam.ac.uk/

39
- 29144523061
- On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming
- A. Wächter and L. Biegler, "On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming," Math. Program., vol.106, no.1, pp. 25-57, 2006.
- (2006) Math. Program. , vol.106 , Issue.1 , pp. 25-57
- Wächter, A.¹ Biegler, L.²

40
- 0003822743
- (for HTK Version 3.2). Cambridge, U.K.: Cambridge Univ. Eng. Dept.
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.2). Cambridge, U.K.: Cambridge Univ. Eng. Dept., 2002.
- (2002) The HTK Book
- Young, S.¹ Evermann, G.² Kershaw, D.³ Moore, G.⁴ Odell, J.⁵ Ollason, D.⁶ Povey, D.⁷ Valtchev, V.⁸ Woodland, P.⁹

41
- 77955671363
- Sound scene database in real acoustical environments
- "Sound scene database in real acoustical environments," Real World Computing Partnership, 2001.
- (2001) Real World Computing Partnership

42
- 0021226391
- A database for speaker-independent digit recognition
- R. G. Leonard, "A database for speaker-independent digit recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1984, pp. 42.11.1-42.11.4.
- (1984) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 42.11.1-42.11.4
- Leonard, R.G.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.