SCOPUS Information Search Platform

2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

2016, Pages 317-323

Time-frequency convolutional networks for robust speech recognition

Mitra, Vikramjit (a); Franco, Horacio (a)

(a) SRI International (United States)

Author keywords

deep convolution networks; robust features; robust speech recognition; time frequency convolution nets

Indexed keywords

CONVOLUTION; REVERBERATION; SPEECH;

CONVOLUTIONAL NETWORKS; DEEP NEURAL NETWORKS; EXPERIMENTAL ANALYSIS; ROBUST FEATURES; ROBUST SPEECH RECOGNITION; ROBUSTNESS TO NOISE; SPECTRAL DISTORTIONS; TIME FREQUENCY;

SPEECH RECOGNITION;

EID: 84964422542
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ASRU.2015.7404811
Document Type: Conference Paper

Times cited: 46

References (28)

1
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G.E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. on ASLP, vol. 20, no. 1, pp. 14-22, 2012
- (2012) IEEE Trans. on ASLP , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

2
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," Proc. of Interspeech, 2011
- (2011) Proc. of Interspeech
- Seide, F.¹ Li, G.² Yu, D.³

3
- 84910065702
- Acoustic modeling with deep neural networks using raw time signal for LVCSR
- Z. Tuske, P. Golik, R. Schluter, and H. Ney, "Acoustic modeling with deep neural networks using raw time signal for LVCSR," Proc. of Interspeech, 2014
- (2014) Proc. of Interspeech
- Tuske, Z.¹ Golik, P.² Schluter, R.³ Ney, H.⁴

4
- 85075436378
- Deep neural network language models
- E. Arisoy, T.N. Sainath, B. Kingsbury, and B. Ramabhadran, "Deep neural network language models," Proc. of NAACL-HLT Workshop, 2012
- (2012) Proc. of NAACL-HLT Workshop
- Arisoy, E.¹ Sainath, T.N.² Kingsbury, B.³ Ramabhadran, B.⁴

5
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- M. Seltzer, D. Yu, and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition," Proc. of ICASSP, 2013
- (2013) Proc. of ICASSP
- Seltzer, M.¹ Yu, D.² Wang, Y.³

6
- 84910075252
- Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions
- V. Mitra, W. Wang, H. Franco, Y. Lei, C. Bartels, and M. Graciarena, "Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions," in Proc. of Interspeech, 2014
- (2014) Proc. of Interspeech
- Mitra, V.¹ Wang, W.² Franco, H.³ Lei, Y.⁴ Bartels, C.⁵ Graciarena, M.⁶

7
- 84928158249
- Robust features and system fusion for reverberation-robust speech recognition
- V. Mitra, W. Wang, Y. Lei, A. Kathol, G. Sivaraman, and C. Espy-Wilson, "Robust features and system fusion for reverberation-robust speech recognition," in Proc. of REVERB Challenge, 2014
- (2014) Proc. of REVERB Challenge
- Mitra, V.¹ Wang, W.² Lei, Y.³ Kathol, A.⁴ Sivaraman, G.⁵ Espy-Wilson, C.⁶

8
- 84959111702
- Combating reverberation in large vocabulary continuous speech recognition
- V. Mitra, J. Van Hout, M. McLaren, W. Wang, M. Graciarena, D. Vergyri, and H. Franco, "Combating reverberation in large vocabulary continuous speech recognition," Proc. of Interspeech, 2015
- (2015) Proc. of Interspeech
- Mitra, V.¹ Van Hout, J.² McLaren, M.³ Wang, W.⁴ Graciarena, M.⁵ Vergyri, D.⁶ Franco, H.⁷

9
- 84946693063
- Deep convolutional nets and robust features for reverberation-robust speech recognition
- V. Mitra, W. Wang, and H. Franco, "Deep convolutional nets and robust features for reverberation-robust speech recognition," in Proc. of SLT, pp. 548-553, 2014
- (2014) Proc. of SLT , pp. 548-553
- Mitra, V.¹ Wang, W.² Franco, H.³

10
- 84890525984
- Deep convolutional neural network for LVCSR
- T. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural network for LVCSR," Proc. of ICASSP, 2013
- (2013) Proc. of ICASSP
- Sainath, T.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

11
- 84867605836
- Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," Proc. of ICASSP, pp. 4277-4280, 2012
- (2012) Proc. of ICASSP , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

12
- 84858953286
- Vocal tract length normalization for LVCSR
- P. Zhan and A. Waibel, "Vocal tract length normalization for LVCSR," Tech. Rep. CMU-LTI-97-150, Carnegie Mellon University, 1997
- (1997) Tech. Rep. CMU-LTI-97-150
- Zhan, P.¹ Waibel, A.²

13
- 84863380535
- Unsupervised feature learning for audio classification using convolutional deep belief networks
- H. Lee, P. Pham, Y. Largman, and A. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," Proc. of Adv. Neural Inf. Process. Syst. 22, pp. 1096-1104, 2009
- (2009) Proc. of Adv. Neural Inf. Process. Syst , vol.22 , pp. 1096-1104
- Lee, H.¹ Pham, P.² Largman, Y.³ Ng, A.⁴

14
- 85007207023
- Exploring hierarchical speech representations using a deep convolutional neural network
- D. Hau and K. Chen, "Exploring hierarchical speech representations using a deep convolutional neural network," Proc. of 11th UK Workshop Comput. Intell. (UKCI '11), 2011
- (2011) Proc. of 11th UK Workshop Comput. Intell. (UKCI '11)
- Hau, D.¹ Chen, K.²

15
- 0024634603
- Phoneme recognition using time-delay neural networks
- A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. Lang, "Phoneme recognition using time-delay neural networks," IEEE Trans. Acoust., Speech, Signal Process., 37(3), pp. 328-339, 1989
- (1989) IEEE Trans. Acoust., Speech, Signal Process , vol.37 , Issue.3 , pp. 328-339
- Waibel, A.¹ Hanazawa, T.² Hinton, G.³ Shikano, K.⁴ Lang, K.⁵

16
- 84964511330
- Single channel blind dereverberation based on auto-correlation functions of frame-wise time sequences of frequency components
- K. Ohta and M. Yanagida, "Single channel blind dereverberation based on auto-correlation functions of frame-wise time sequences of frequency components," Proc. of IWAENC, pp. 1-4, 2006
- (2006) Proc. of IWAENC , pp. 1-4
- Ohta, K.¹ Yanagida, M.²

17
- 84893622444
- The REVERB Challenge: A common evaluation framework for dereverberation and recognition of reverberant speech
- K. Kinoshita, M. Delcroix, T. Yoshioka, T. Nakatani, E. Habets, R. Haeb-Umbach, V. Leutnant, A. Sehr, W. Kellermann, R. Maas, S. Gannot, and B. Raj, "The REVERB Challenge: A common evaluation framework for dereverberation and recognition of reverberant speech," Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013
- (2013) Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
- Kinoshita, K.¹ Delcroix, M.² Yoshioka, T.³ Nakatani, T.⁴ Habets, E.⁵ Haeb-Umbach, R.⁶ Leutnant, V.⁷ Sehr, A.⁸ Kellermann, W.⁹ Maas, R.¹⁰ Gannot, S.¹¹ Raj, B.¹²

18
- 33646677283
- Experimental framework for the performance evaluation of speech recognition front-ends on a large vocabulary task
- G. Hirsch, "Experimental framework for the performance evaluation of speech recognition front-ends on a large vocabulary task," ETSI STQ-Aurora DSR Working Group, June 4, 2001
- (2001) ETSI STQ-Aurora DSR Working Group
- Hirsch, G.¹

19
- 0028996854
- WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition
- T. Robinson, J. Fransen, D. Pye, J. Foote, and S. Renals, "WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition," Proc. ICASSP, pp. 81-84, 1995
- (1995) Proc. ICASSP , pp. 81-84
- Robinson, T.¹ Fransen, J.² Pye, D.³ Foote, J.⁴ Renals, S.⁵

20
- 33846217002
- The multi-channel Wall Street Journal audio visual corpus (MC-WSJAV): Specification and initial experiments
- M. Lincoln, I. McCowan, J. Vepa, and H.K. Maganti, "The multi-channel Wall Street Journal audio visual corpus (MC-WSJAV): Specification and initial experiments," Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding, 2005
- (2005) Proc. of IEEE Workshop on Automatic Speech Recognition and Understanding
- Lincoln, M.¹ McCowan, I.² Vepa, J.³ Maganti, H.K.⁴

21
- 84906260861
- Damped oscillator cepstral coefficients for robust speech recognition
- V. Mitra, H. Franco, and M. Graciarena, "Damped oscillator cepstral coefficients for robust speech recognition," Proc. of Interspeech, pp. 886-890, 2013
- (2013) Proc. of Interspeech , pp. 886-890
- Mitra, V.¹ Franco, H.² Graciarena, M.³

22
- 84867589420
- Normalized amplitude modulation features for large vocabulary noise-robust speech recognition
- V. Mitra, H. Franco, M. Graciarena, and A. Mandal, "Normalized amplitude modulation features for large vocabulary noise-robust speech recognition," Proc. of ICASSP, pp. 4117-4120, 2012
- (2012) Proc. of ICASSP , pp. 4117-4120
- Mitra, V.¹ Franco, H.² Graciarena, M.³ Mandal, A.⁴

23
- 0028287770
- Effect of reducing slow temporal modulations on speech reception
- R. Drullman, J. M. Festen, and R. Plomp, "Effect of reducing slow temporal modulations on speech reception," J. Acoust. Soc. of Am., vol. 95, no. 5, pp. 2670-2680, 1994
- (1994) J. Acoust. Soc. of Am , vol.95 , Issue.5 , pp. 2670-2680
- Drullman, R.¹ Festen, J.M.² Plomp, R.³

24
- 84905269267
- Medium duration modulation cepstral feature for robust speech recognition
- V. Mitra, H. Franco, M. Graciarena, and D. Vergyri, "Medium duration modulation cepstral feature for robust speech recognition," Proc. of ICASSP, Florence, 2014
- (2014) Proc. of ICASSP
- Mitra, V.¹ Franco, H.² Graciarena, M.³ Vergyri, D.⁴

25
- 0019075685
- Some observations on oral air flow during phonation
- H. Teager, "Some observations on oral air flow during phonation," in IEEE Trans. ASSP, pp. 599-601, 1980
- (1980) IEEE Trans. ASSP , pp. 599-601
- Teager, H.¹

26
- 84964561760
- Improving robustness against reverberation for large-vocabulary continuous speech recognition
- V. Mitra, J. Van Hout, W. Wang, M. Graciarena, M. McLaren, H. Franco, and D. Vergyri, "Improving Robustness against Reverberation for Large-Vocabulary Continuous Speech Recognition," submitted to ASRU 2015
- (2015) Submitted to ASRU
- Mitra, V.¹ Van Hout, J.² Wang, W.³ Graciarena, M.⁴ McLaren, M.⁵ Franco, H.⁶ Vergyri, D.⁷

27
- 84964515036
- The automatic speech recognition in reverberant environments (ASpIRE) challenge
- M. Harper, "The Automatic Speech recognition In Reverberant Environments (ASpIRE) Challenge," Proc. of ASRU, 2015
- (2015) Proc. of ASRU
- Harper, M.¹

28
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," Proc. of SLT, 2012
- (2012) Proc. of SLT
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.