SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 4989-4993

An analysis of convolutional neural networks for speech recognition

(3) Huang, Jui Ting (a); Li, Jinyu (a); Gong, Yifan (a)

(a) Microsoft (United States)

Author keywords

Convolutional neural networks; DNN; low footprint models; maxout units

Indexed keywords

AUDIO SIGNAL PROCESSING; CONVOLUTION; DEEP NEURAL NETWORKS; NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION;

CONVOLUTIONAL NEURAL NETWORK; DISTANT SPEECH RECOGNITION; EDGE DETECTORS; MAXOUT UNITS; NOISE ROBUSTNESS; SMALL FOOTPRINTS; TEST CONDITION; WORD ERROR RATE REDUCTIONS;

SPEECH RECOGNITION;

EID: 84946086402
PISSN: 15206149
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178920
Document Type: Conference Paper

Times cited: 125

References (24)

1
- 84055163920
- Roles of pretraining and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
- D. Yu, L. Deng, and G. Dahl, "Roles of pretraining and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition," in Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2010.
- (2010) Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning
- Yu, D.¹ Deng, L.² Dahl, G.³

2
- 84858972572
- Making deep belief networks effective for large vocabulary continuous speech recognition
- T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in Proc. Workshop on Automatic Speech Recognition and Understanding, pp. 30-35, 2011.
- (2011) Proc. Workshop on Automatic Speech Recognition and Understanding , pp. 30-35
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.⁶

3
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. on Audio, Speech and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. on Audio, Speech and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

4
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

5
- 84890491198
- Recent advances in deep learning for speech research at Microsoft
- L. Deng, J. Li, J.-T. Huang, et al., "Recent advances in deep learning for speech research at Microsoft," in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Deng, L.¹ Li, J.² Huang, J.-T.³

6
- 84911473441
- Convolutional neural networks for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu, "Convolutional neural networks for speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 22, no. 10, pp. 1533-1545, 2014.
- (2014) IEEE Transactions on Audio, Speech, and Language Processing , vol.22 , Issue.10 , pp. 1533-1545
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Deng, L.⁴ Penn, G.⁵ Yu, D.⁶

7
- 84890525984
- Deep convolutional neural networks for LVCSR
- T.N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. IEEE ICASSP, 2013.
- (2013) Proc. IEEE ICASSP
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

8
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- T.N. Sainath, B. Kingsbury, A. Mohamed, G.E. Dahl, G. Saon, H. Soltau, T. Beran, A.Y. Aravkin, and B. Ramabhadran, "Improvements to deep convolutional neural networks for LVCSR," in Proc. IEEE ASRU, 2013.
- (2013) Proc. IEEE ASRU
- Sainath, T.N.¹ Kingsbury, B.² Mohamed, A.³ Dahl, G.E.⁴ Saon, G.⁵ Soltau, H.⁶ Beran, T.⁷ Aravkin, A.Y.⁸ Ramabhadran, B.⁹

9
- 84910028405
- Improving language-universal feature extraction with deep maxout and convolutional neural networks
- Y. Miao and F. Metze, "Improving language-universal feature extraction with deep maxout and convolutional neural networks," in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Miao, Y.¹ Metze, F.²

10
- 84901999583
- Convolutional neural networks for distant speech recognition
- P. Swietojanski, A. Ghoshal, and S. Renals, "Convolutional neural networks for distant speech recognition," IEEE Signal Process. Letters, 2014.
- (2014) IEEE Signal Process. Letters
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

11
- 85083953021
- Feature learning in deep neural networks - studies on speech recognition tasks
- D. Yu, M. Seltzer, J. Li, J.-T. Huang, and F. Seide, "Feature learning in deep neural networks - studies on speech recognition tasks," ICLR, 2013.
- (2013) ICLR
- Yu, D.¹ Seltzer, M.² Li, J.³ Huang, J.-T.⁴ Seide, F.⁵

12
- 84906257050
- Neural network acoustic models for the DARPA RATS program
- H. Soltau, H-K. Kuo, L. Mangu, G. Saon, T. Beran, "Neural network acoustic models for the DARPA RATS program," in Proc. Interspeech, 2013.
- (2013) Proc. Interspeech
- Soltau, H.¹ Kuo, H.-K.² Mangu, L.³ Saon, G.⁴ Beran, T.⁵

13
- 85048545369
- Measuring invariances in deep networks
- I. Goodfellow, H. Lee, Q. Le, A. Saxe, A. Ng, "Measuring invariances in deep networks," in Proc. NIPS, 2009.
- (2009) Proc. NIPS
- Goodfellow, I.¹ Lee, H.² Le, Q.³ Saxe, A.⁴ Ng, A.⁵

14
- 84906251664
- Accurate and compact large vocabulary speech recognition on mobile devices
- X. Lei, A. Senior, A. Gruenstein, and J. Sorensen, "Accurate and compact large vocabulary speech recognition on mobile devices," in Proc. Interspeech, 2013.
- (2013) Proc. Interspeech
- Lei, X.¹ Senior, A.² Gruenstein, A.³ Sorensen, J.⁴

15
- 84897543523
- Maxout networks
- I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, "Maxout networks," in Proc. ICML, 2013.
- (2013) Proc. ICML
- Goodfellow, I.J.¹ Warde-Farley, D.² Mirza, M.³ Courville, A.⁴ Bengio, Y.⁵

16
- 84890471125
- On rectified linear units for speech processing
- M.D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q.V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, et al., "On rectified linear units for speech processing," in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Zeiler, M.D.¹ Ranzato, M.² Monga, R.³ Mao, M.⁴ Yang, K.⁵ Le, Q.V.⁶ Nguyen, P.⁷ Senior, A.⁸ Vanhoucke, V.⁹ Dean, J.¹⁰

17
- 84905270524
- Investigation of maxout networks for speech recognition
- P. Swietojanski, J. Li, and J.-T. Huang, "Investigation of maxout networks for speech recognition," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Swietojanski, P.¹ Li, J.² Huang, J.-T.³

18
- 67651044226
- Spectro-temporal analysis of speech using 2-d Gabor filters
- T. Ezzat, J. Bouvrie, and T. Poggio, "Spectro-temporal analysis of speech using 2-d Gabor filters," in Proc. Interspeech, 2007.
- (2007) Proc. Interspeech
- Ezzat, T.¹ Bouvrie, J.² Poggio, T.³

19
- 84910036228
- Robust CNN-based speech recognition with Gabor filter kernels
- S.-Y. Chang and N. Morgan, "Robust CNN-based speech recognition with Gabor filter kernels," in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Chang, S.-Y.¹ Morgan, N.²

20
- 56149125973
- Aurora working group: DSR front end LVCSR evaluation AU/384/02
- N. Parihar and J. Picone, "Aurora working group: DSR front end LVCSR evaluation AU/384/02," Tech. Rep., Institute for Signal and Information Processing, Mississippi State Univ., 2002.
- (2002) Tech. Rep., Institute for Signal and Information Processing, Mississippi State Univ.
- Parihar, N.¹ Picone, J.²

21
- 84874415293
- Microphone array processing for distant speech recognition: Towards real-world deployment
- K. Kumatani, T. Arakawa, K. Yamamoto, J. McDonough, B. Raj, R. Singh, and I. Tashev, "Microphone array processing for distant speech recognition: Towards real-world deployment," APSIPA Annual Summit and Conference, 2012.
- (2012) APSIPA Annual Summit and Conference
- Kumatani, K.¹ Arakawa, T.² Yamamoto, K.³ McDonough, J.⁴ Raj, B.⁵ Singh, R.⁶ Tashev, I.⁷

22
- 84910035297
- Learning small-size DNN with output-distribution-based criteria
- J. Li, R. Zhao, J.-T. Huang, and Y. Gong, "Learning small-size DNN with output-distribution-based criteria," in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Li, J.¹ Zhao, R.² Huang, J.-T.³ Gong, Y.⁴

23
- 84910069623
- Convolutional deep maxout networks for phone recognition
- L. Toth, "Convolutional deep maxout networks for phone recognition," in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Toth, L.¹

24
- 84910046405
- Long short-term memory recurrent neural network architectures for large scale acoustic modeling
- H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in Interspeech, 2014, pp. 338-342.
- (2014) Interspeech , pp. 338-342
- Sak, H.¹ Senior, A.² Beaufays, F.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.