SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

2014, Pages 905-909

Robust CNN-based speech recognition with Gabor filter kernels

Authors (2): Chang, Shuo Yiin a,b; Morgan, Nelson a,b

a UNIVERSITY OF CALIFORNIA (United States)

b INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

Author keywords

Aurora 4; Convolutional neural network; Gabor filter; Speech recognition

Indexed keywords

BACKPROPAGATION; CONVOLUTION; GABOR FILTERS; NETWORK ARCHITECTURE; NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION;

ACOUSTIC FEATURES; AUDITORY PROCESSING; AURORA 4; CONVOLUTIONAL NEURAL NETWORK; FILTER COEFFICIENTS; MULTIPLE FEATURES; NEURAL NETWORK FEATURES; PROPOSED ARCHITECTURES;

SPEECH RECOGNITION;

EID: 84910036228 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: 75

References (26)

1
- 0003573244
- H. Bourlard and N. Morgan, "Connectionist Speech Recognition: A Hybrid Approach", Kluwer Press, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

2
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- H. Hermansky, D. P. W. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2000, vol. 3, pp. 1635-1638.
- (2000) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process , vol.3 , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.P.W.² Sharma, S.³

3
- 85009097225
- On using MLP features in LVCSR
- Q. Zhu, B. Chen, N. Morgan, and A. Stolcke, "On using MLP features in LVCSR," in Proc. Interspeech, 2004, pp. 921-924.
- (2004) Proc. Interspeech , pp. 921-924
- Zhu, Q.¹ Chen, B.² Morgan, N.³ Stolcke, A.⁴

4
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 14-22, Jan. 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

5
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, and T. N. Sainath, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

6
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- M. Seltzer, D. Yu and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition" in Proc. ICASSP, pp. 7398-7402, 2013.
- (2013) Proc. ICASSP , pp. 7398-7402
- Seltzer, M.¹ Yu, D.² Wang, Y.³

7
- 84906214784
- Exploring convolutional neural network structures and optimization for speech recognition
- O. Abdel-Hamid, L. Deng, and D. Yu, "Exploring convolutional neural network structures and optimization for speech recognition," Proc. Interspeech, 2013.
- (2013) Proc. Interspeech
- Abdel-Hamid, O.¹ Deng, L.² Yu, D.³

8
- 84978970240
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. Automatic Speech Recognition and Understanding (ASRU), 2013, pp. 315-320.
- (2013) Automatic Speech Recognition and Understanding (ASRU) , pp. 315-320
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

9
- 84858971297
- Convolutive bottleneck network features for LVCSR
- K. Vesely, M. Karafiat, and F. Grezl, "Convolutive bottleneck network features for LVCSR," in Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on. IEEE, 2011, pp. 42-47.
- (2011) Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on , pp. 42-47
- Vesely, K.¹ Karafiat, M.² Grezl, F.³

10
- 84867602181
- Easy does it: Robust spectro-temporal many-stream asr without fine tuning streams
- S. V. Ravuri and N. Morgan, "Easy does it: Robust spectro-temporal many-stream ASR without fine tuning streams," Proc. ICASSP 2012, pp. 4309-4312.
- (2012) Proc. ICASSP , pp. 4309-4312
- Ravuri, S.V.¹ Morgan, N.²

11
- 70349223037
- An auditory-based feature for robust speech recognition
- Y. Shao, Z. Jin, D. Wang, and S. Srinivasan, "An auditory-based feature for robust speech recognition," Proc. Interspeech 2009, Sep. 2009, pp. 4625-4628.
- (2009) Proc. Interspeech , pp. 4625-4628
- Shao, Y.¹ Jin, Z.² Wang, D.³ Srinivasan, S.⁴

12
- 70349194599
- Noise adaptive training using a vector Taylor series approach for noise robust automatic speech recognition
- O. Kalinli, M. L. Seltzer, and A. Acero, "Noise adaptive training using a vector Taylor series approach for noise robust automatic speech recognition," in Proc. ICASSP, 2009, pp. 3825-3828.
- (2009) Proc. ICASSP , pp. 3825-3828
- Kalinli, O.¹ Seltzer, M.L.² Acero, A.³

13
- 84867611164
- Factor analysis based VTS discriminative adaptive training
- F. Flego and M. J. F. Gales, "Factor analysis based VTS discriminative adaptive training," Proc. ICASSP. IEEE, 2012, pp. 4669-4672.
- (2012) Proc. ICASSP , pp. 4669-4672
- Flego, F.¹ Gales, M.J.F.²

14
- 84867589420
- Normalized amplitude modulation features for large vocabulary noise-robust speech recognition
- H. Franco, M. Graciarena, and A. Mandal, "Normalized amplitude modulation features for large vocabulary noise-robust speech recognition", Proc. ICASSP 2012, March 2012, pp. 4117-4120.
- (2012) Proc. ICASSP 2012 , pp. 4117-4120
- Franco, H.¹ Graciarena, M.² Mandal, A.³

15
- 0442317754
- ETSI ES 202 050 Ver. 1.1.5
- Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Adv. Front-end Feature Extraction Algorithm; Compression Algorithms, ETSI ES 202 050 Ver. 1.1.5, 2007.
- (2007) Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Adv. Front-end Feature Extraction Algorithm; Compression Algorithms

16
- 78049398950
- Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring
- C. Kim and R. M. Stern, "Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring", in Proc. ICASSP, pp. 4574-4577, 2010.
- (2010) Proc. ICASSP , pp. 4574-4577
- Kim, C.¹ Stern, R.M.²

17
- 85009227802
- Localized spectro-temporal features for automatic speech recognition
- M. Kleinschmidt, "Localized spectro-temporal features for automatic speech recognition," in Proc. of Eurospeech, Sep. 2003, pp. 2573-2576.
- (2003) Proc. of Eurospeech, 2003 , pp. 2573-2576
- Kleinschmidt, M.¹

18
- 0032658253
- Temporal patterns (TRAPs) in ASR of noisy speech
- H. Hermansky and S. Sharma, "Temporal patterns (TRAPs) in ASR of noisy speech," Proc. ICASSP 1999, March 1999, vol. 1, pp. 289-292.
- (1999) Proc. ICASSP 1999 , vol.1 , pp. 289-292
- Hermansky, H.¹ Sharma, S.²

19
- 33646799825
- A neural network for learning long-term temporal features for speech recognition
- B. Y. Chen, Q. Zhu, and N. Morgan, "A neural network for learning long-term temporal features for speech recognition," Proc. ICASSP 2005, March 2005, pp. 945-948.
- (2005) Proc. ICASSP 2005 , pp. 945-948
- Chen, B.Y.¹ Zhu, Q.² Morgan, N.³

20
- 84906221944
- Informative spectro-temporal bottleneck features for noise-robust speech recognition
- S. Y. Chang and N. Morgan, "Informative spectro-temporal bottleneck features for noise-robust speech recognition," Proc. Interspeech 2013.
- (2013) Proc. Interspeech
- Chang, S.Y.¹ Morgan, N.²

21
- 84890543873
- Investigating deep neural network based transforms of robust audio features for LVCSR
- E. Bocchieri and D. Dimitriadis "Investigating deep neural network based transforms of robust audio features for LVCSR" in Proc. ICASSP, pp. 6709-6713, 2013.
- (2013) Proc. ICASSP , pp. 6709-6713
- Bocchieri, E.¹ Dimitriadis, D.²

22
- 0032203257
- Gradient-based learning applied to document recognition
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
- (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
- Lecun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

23
- 67651242353
- Performance analysis of the Aurora large vocabulary baseline system
- N. Parihar, J. Picone, D. Pearce, and H. G. Hirsch, "Performance analysis of the Aurora large vocabulary baseline system," Proceedings of the European Signal Processing Conference, Vienna, Austria, 2004.
- (2004) Proceedings of the European Signal Processing Conference
- Parihar, N.¹ Picone, J.² Pearce, D.³ Hirsch, H.G.⁴

24
- 84910089405
- "Renoiser web page," http://labrosa.ee.columbia.edu/projects/renoiser/create-wsj.html.
- Renoiser Web Page

25
- 84873310339
- The rats radio traffic collection system
- K. Walker and S. Strassel, "The RATS radio traffic collection system," in Proc. of ISCA Odyssey, 2012.
- (2012) Proc. of ISCA Odyssey
- Walker, K.¹ Strassel, S.²

26
- 78650474133
- A practical guide to training restricted Boltzmann machines
- G. Hinton, "A practical guide to training restricted Boltzmann machines," Tech. Rep. UTML TR 2010-003, University of Toronto, 2010.
- (2010) Tech. Rep. UTML TR 2010-003
- Hinton, G.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.