SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 4623-4627

Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition

(4) Weninger, Felix a,b; Watanabe, Shinji a; Tachioka, Yuuki c; Schuller, Bjorn b

a MITSUBISHI ELECTRIC RESEARCH LABORATORIES (United States)

b TECHNICAL UNIVERSITY OF MUNICH (Germany)

c MITSUBISHI ELECTRIC CORPORATION (Japan)

Author keywords

automatic speech recognition; de-reverberation; feature enhancement; recurrent neural networks

Indexed keywords

LEARNING SYSTEMS; RECURRENT NEURAL NETWORKS; SIGNAL PROCESSING; SPEECH RECOGNITION;

AUTOMATIC SPEECH RECOGNITION; DE-REVERBERATION; FEATURE ENHANCEMENT; FEATURE TRANSFORMATIONS; HUMAN MACHINE INTERACTION; MODEL ADAPTATION; MULTI-CONDITION TRAININGS; ROBUST AUTOMATIC SPEECH RECOGNITIONS (ASR);

REVERBERATION;

EID: 84905216003 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854478 Document Type: Conference Paper

Times cited : (80)

References (28)

1
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- Vancouver, Canada
- M.L. Seltzer, D. Yu, and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 7398-7402.
- (2013) Proc. of ICASSP , pp. 7398-7402
- Seltzer, M.L.¹ Yu, D.² Wang, Y.³

2
- 84906237188
- Reverberant speech recognition based on denoising autoencoder
- Lyon, France
- T. Ishii, H. Komiyama, T. Shinozaki, Y. Horiuchi, and S. Kuroiwa, "Reverberant speech recognition based on denoising autoencoder," in Proc. of INTERSPEECH, Lyon, France, 2013, pp. 3512-3516.
- (2013) Proc. of INTERSPEECH , pp. 3512-3516
- Ishii, T.¹ Komiyama, H.² Shinozaki, T.³ Horiuchi, Y.⁴ Kuroiwa, S.⁵

3
- 84900537286
- The Munich feature enhancement approach to the 2013 CHiME Challenge using BLSTM recurrent neural networks
- Vancouver, Canada
- F. Weninger, J. Geiger, M. Wollmer, B. Schuller, and G. Rigoll, "The Munich feature enhancement approach to the 2013 CHiME Challenge using BLSTM recurrent neural networks," in Proc. The 2nd CHiME Workshop, Vancouver, Canada, 2013, pp. 86-90.
- (2013) Proc. The 2nd CHiME Workshop , pp. 86-90
- Weninger, F.¹ Geiger, J.² Wollmer, M.³ Schuller, B.⁴ Rigoll, G.⁵

4
- 84883396653
- Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory
- M. Wollmer, F. Weninger, J. Geiger, B. Schuller, and G. Rigoll, "Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory," Computer Speech and Language, Special Issue on Speech Separation and Recognition in Multisource Environments, vol. 27, no. 3, pp. 780-797, 2013.
- (2013) Computer Speech and Language, Special Issue on Speech Separation and Recognition in Multisource Environments , vol.27 , Issue.3 , pp. 780-797
- Wollmer, M.¹ Weninger, F.² Geiger, J.³ Schuller, B.⁴ Rigoll, G.⁵

5
- 0242609086
- Blind estimation of reverberation time
- R. Ratnam, D.L. Jones, B.C. Wheeler, W.D. O'Brien, Jr, C.R. Lansing, and A.S. Feng, "Blind estimation of reverberation time," The Journal of the Acoustical Society of America, vol. 114, no. 5, pp. 2877-2892, 2003.
- (2003) The Journal of the Acoustical Society of America , vol.114 , Issue.5 , pp. 2877-2892
- Ratnam, R.¹ Jones, D.L.² Wheeler, B.C.³ O'Brien Jr., W.D.⁴ Lansing, C.R.⁵ Feng, A.S.⁶

6
- 77955671150
- Model-based dereverberation in the Logmelspec domain for robust distant-talking speech recognition
- Dallas, USA
- A. Sehr, R. Maas, and W. Kellermann, "Model-based dereverberation in the Logmelspec domain for robust distant-talking speech recognition," in Proc. of ICASSP, Dallas, USA, 2010, pp. 4298-4301.
- (2010) Proc. of ICASSP , pp. 4298-4301
- Sehr, A.¹ Maas, R.² Kellermann, W.³

7
- 80051616058
- Frame-wise HMM adaptation using state-dependent reverberation estimates
- Prague, Czech Republic
- A. Sehr, R. Maas, and W. Kellermann, "Frame-wise HMM adaptation using state-dependent reverberation estimates," in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Prague, Czech Republic, 2011, pp. 5484-5487.
- (2011) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) , pp. 5484-5487
- Sehr, A.¹ Maas, R.² Kellermann, W.³

8
- 79961153040
- Model-based approaches to handling additive noise in reverberant environments
- Edinburgh, UK
- M.J.F. Gales and Y.Q. Wang, "Model-based approaches to handling additive noise in reverberant environments," in Proc. IEEE Workshop on Hands-free Speech Communication and Microphone Arrays, Edinburgh, UK, 2011, pp. 121-126.
- (2011) Proc. IEEE Workshop on Hands-free Speech Communication and Microphone Arrays , pp. 121-126
- Gales, M.J.F.¹ Wang, Y.Q.²

9
- 79957856980
- A basis representation of constrained MLLR transforms for robust adaptation
- D. Povey and K. Yao, "A basis representation of constrained MLLR transforms for robust adaptation," Computer Speech and Language, vol. 26, pp. 35-51, 2012.
- (2012) Computer Speech and Language , vol.26 , pp. 35-51
- Povey, D.¹ Yao, K.²

10
- 70350450398
- Static and dynamic variance compensation for recognition of reverberant speech with dereverberation pre-processing
- M. Delcroix, T. Nakatani, and S. Watanabe, "Static and dynamic variance compensation for recognition of reverberant speech with dereverberation pre-processing," IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 2, pp. 324-334, 2009.
- (2009) IEEE Transactions on Audio, Speech, and Language Processing , vol.17 , Issue.2 , pp. 324-334
- Delcroix, M.¹ Nakatani, T.² Watanabe, S.³

11
- 56449089103
- Extracting and composing robust features with denoising autoencoders
- Helsinki, Finland
- P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. of ICML, Helsinki, Finland, 2008, pp. 1096-1103.
- (2008) Proc. of ICML , pp. 1096-1103
- Vincent, P.¹ Larochelle, H.² Bengio, Y.³ Manzagol, P.⁴

12
- 0031573117
- Long short-term memory
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

13
- 0034293152
- Learning to forget: Continual prediction with LSTM
- F. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Computation, vol. 12, no. 10, pp. 2451-2471, 2000.
- (2000) Neural Computation , vol.12 , Issue.10 , pp. 2451-2471
- Gers, F.¹ Schmidhuber, J.² Cummins, F.³

14
- 84893622444
- The REVERB Challenge: A common evaluation framework for dereverberation and recognition of reverberant speech
- New Paltz, NY, USA, to appear
- K. Kinoshita, M. Delcroix, T. Yoshioka, T. Nakatani, E. Habets, R. Haeb-Umbach, V. Leutnant, A. Sehr, W. Kellermann, R. Maas, S. Gannot, and B. Raj, "The REVERB Challenge: A common evaluation framework for dereverberation and recognition of reverberant speech," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 2013, to appear.
- (2013) Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- Kinoshita, K.¹ Delcroix, M.² Yoshioka, T.³ Nakatani, T.⁴ Habets, E.⁵ Haeb-Umbach, R.⁶ Leutnant, V.⁷ Sehr, A.⁸ Kellermann, W.⁹ Maas, R.¹⁰ Gannot, S.¹¹ Raj, B.¹²

15
- 85132941272
- Speech dereverberation using statistical reverberation models
- P.A. Naylor and N.D. Gaubitch, Eds. Springer
- E. Habets, "Speech dereverberation using statistical reverberation models," in Speech Dereverberation, P.A. Naylor and N.D. Gaubitch, Eds., pp. 57-93. Springer, 2010.
- (2010) Speech Dereverberation , pp. 57-93
- Habets, E.¹

16
- 84962920708
- Evaluating long-term spectral subtraction for reverberant ASR
- Madonna di Campiglio, Italy, IEEE
- D. Gelbart and N. Morgan, "Evaluating long-term spectral subtraction for reverberant ASR," in Proc. of ASRU, Madonna di Campiglio, Italy, 2001, pp. 103-106, IEEE.
- (2001) Proc. of ASRU , pp. 103-106
- Gelbart, D.¹ Morgan, N.²

17
- 65249167097
- Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction
- K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, "Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction," IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 4, pp. 534-545, 2009.
- (2009) IEEE Transactions on Audio, Speech, and Language Processing , vol.17 , Issue.4 , pp. 534-545
- Kinoshita, K.¹ Delcroix, M.² Nakatani, T.³ Miyoshi, M.⁴

18
- 84906279378
- Speech enhancement with weighted denoising auto-encoder
- Lyon, France
- B.Y. Xia and C.C. Bao, "Speech enhancement with weighted denoising auto-encoder," in Proc. of INTERSPEECH, Lyon, France, 2013, pp. 436-440.
- (2013) Proc. of INTERSPEECH , pp. 436-440
- Xia, B.Y.¹ Bao, C.C.²

19
- 84900542109
- Recurrent neural network feature enhancement: The 2nd CHiME challenge
- IEEE. Vancouver, Canada, June
- A.L. Maas, T.M. O'Neil, A.Y. Hannun, and A.Y. Ng, "Recurrent neural network feature enhancement: The 2nd CHiME challenge," in Proc. The 2nd CHiME Workshop, Vancouver, Canada, June 2013, pp. 79-80, IEEE.
- (2013) Proc. The 2nd CHiME Workshop , pp. 79-80
- Maas, A.L.¹ O'Neil, T.M.² Hannun, A.Y.³ Ng, A.Y.⁴

20
- 84877253028
- Dereverberation method with reverberation time estimation using floored ratio of spectral subtraction
- Y. Tachioka, T. Hanazawa, and T. Iwasaki, "Dereverberation method with reverberation time estimation using floored ratio of spectral subtraction," Acoustical Science and Technology, vol. 34, no. 3, pp. 212-215, 2013.
- (2013) Acoustical Science and Technology , vol.34 , Issue.3 , pp. 212-215
- Tachioka, Y.¹ Hanazawa, T.² Iwasaki, T.³

21
- 0018455310
- Suppression of acoustic noise in speech using spectral subtraction
- S.F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 27, no. 2, pp. 113-120, 1979.
- (1979) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.27 , Issue.2 , pp. 113-120
- Boll, S.F.¹

22
- 84890543083
- Speech recognition with deep recurrent neural networks
- Vancouver, Canada, May, IEEE
- A. Graves, A. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. of ICASSP, Vancouver, Canada, May 2013, pp. 6645-6649, IEEE.
- (2013) Proc. of ICASSP , pp. 6645-6649
- Graves, A.¹ Mohamed, A.² Hinton, G.³

23
- 84865791631
- Speech-based non-prototypical affect recognition for child-robot interaction in reverberated environments
- Florence, Italy
- M. Wollmer, F. Weninger, S. Steidl, A. Batliner, and B. Schuller, "Speech-based non-prototypical affect recognition for child-robot interaction in reverberated environments," in Proc. of INTERSPEECH, Florence, Italy, 2011, pp. 3113-3116.
- (2011) Proc. of INTERSPEECH , pp. 3113-3116
- Wollmer, M.¹ Weninger, F.² Steidl, S.³ Batliner, A.⁴ Schuller, B.⁵

24
- 0028996854
- WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition
- Detroit, MI, USA
- T. Robinson, J. Fransen, D. Pye, J. Foote, and S. Renals, "WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition," in Proc. of ICASSP, Detroit, MI, USA, 1995, pp. 81-84.
- (1995) Proc. of ICASSP , pp. 81-84
- Robinson, T.¹ Fransen, J.² Pye, D.³ Foote, J.⁴ Renals, S.⁵

25
- 84858953642
- The Kaldi speech recognition toolkit
- Big Island, HI, USA
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlícek, Y. Qian, P. Schwarz, et al., "The Kaldi speech recognition toolkit," in Proc. of ASRU, Big Island, HI, USA, 2011.
- (2011) Proc. of ASRU
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlícek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰

26
- 60749097551
- Cambridge University Engineering Department, Cambridge, UK
- S.J. Young, G. Evermann, M.J.F. Gales, D. Kershaw, G. Moore, J.J. Odell, D.G. Ollason, D. Povey, V. Valtchev, and P.C. Woodland, The HTK book version 3.4, Cambridge University Engineering Department, Cambridge, UK, 2006.
- (2006) The HTK Book Version 3.4
- Young, S.J.¹ Evermann, G.² Gales, M.J.F.³ Kershaw, D.⁴ Moore, G.⁵ Odell, J.J.⁶ Ollason, D.G.⁷ Povey, D.⁸ Valtchev, V.⁹ Woodland, P.C.¹⁰

27
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Transactions on Speech and Audio Processing, vol. 7, pp. 272-281, 1999.
- (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , pp. 272-281
- Gales, M.¹

28
- 84890503970
- Effectiveness of discriminative training and feature transformation for reverberated and noisy speech
- Vancouver, Canada
- Y. Tachioka, S. Watanabe, and J.R. Hershey, "Effectiveness of discriminative training and feature transformation for reverberated and noisy speech," in Proc. of ICASSP, Vancouver, Canada, 2013, pp. 6935-6939.
- (2013) Proc. of ICASSP , pp. 6935-6939
- Tachioka, Y.¹ Watanabe, S.² Hershey, J.R.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.