SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

2014, Pages 1915-1919

Ensemble deep learning for speech recognition

Author keywords

Deep learning; Ensemble learning; Log linear system combination; Speech recognition; Stacking

Indexed keywords

CONVEX OPTIMIZATION; LEARNING SYSTEMS; LINEAR SYSTEMS; OPTIMIZATION; RECURRENT NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION;

CONVEX OPTIMIZATION PROBLEMS; DEEP LEARNING; ENSEMBLE LEARNING; HIERARCHICAL FEATURES; POSTERIOR PROBABILITY; RECOGNITION ACCURACY; STACKING; SYSTEM COMBINATION;

SPEECH RECOGNITION;

EID: 84910048046 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (230)

References (40)

1
- 85032751593
- Research developments and directions in speech recognition and understanding
- May
- Baker, J., Deng, L., Glass, J., Khudanpur, S., Lee, C.-H., Morgan, N., and O'Shaughnessy, D. "Research developments and directions in speech recognition and understanding," IEEE Sig. Proc. Mag., vol. 26, no. 3, May 2009, pp. 75-80.
- (2009) IEEE Sig. Proc. Mag. , vol.26 , Issue.3 , pp. 75-80
- Baker, J.¹ Deng, L.² Glass, J.³ Khudanpur, S.⁴ Lee, C.-H.⁵ Morgan, N.⁶ O'Shaughnessy, D.⁷

2
- 84879854889
- Representation learning: A review and new perspectives
- Bengio, Y., Courville, A., and Vincent, P. "Representation learning: A review and new perspectives," IEEE Trans. PAMI, vol. 35, pp. 1798-1828, 2013.
- (2013) IEEE Trans. PAMI , vol.35 , pp. 1798-1828
- Bengio, Y.¹ Courville, A.² Vincent, P.³

3
- 84890543516
- Advances in optimizing recurrent networks
- Bengio, Y., Boulanger, N., and Pascanu, R. "Advances in optimizing recurrent networks," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Bengio, Y.¹ Boulanger, N.² Pascanu, R.³

4
- 0030196364
- Stacked regressions
- Breiman, L. "Stacked regressions," Machine Learning, vol. 24, pp. 49-64, 1996.
- (1996) Machine Learning , vol.24 , pp. 49-64
- Breiman, L.¹

5
- 0003573244
- Connectionist Speech Recognition: A Hybrid Approach
- Kluwer
- Bourlard, H. and Morgan, N., Connectionist Speech Recognition: A Hybrid Approach, Kluwer, 1993.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

6
- 85083950550
- A primal-dual method for training recurrent neural networks constrained by the echo-state property
- April
- Chen, J. and Deng, L. "A primal-dual method for training recurrent neural networks constrained by the echo-state property," Proc. Int. Conf. Learning Representations, April 2014.
- (2014) Proc. Int. Conf. Learning Representations
- Chen, J.¹ Deng, L.²

7
- 84055222005
- Context-dependent, pre-trained deep neural networks for large vocabulary speech recognition
- Dahl, G., Yu, D., Deng, L., and Acero, A. "Context-dependent, pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. Audio, Speech, & Language Proc., vol. 20, pp. 30-42, 2012.
- (2012) IEEE Trans. Audio, Speech, & Language Proc. , vol.20 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

8
- 84905280906
- Sequence classification using the high-level features extracted from deep neural networks
- Deng, L. and Chen, J. "Sequence classification using the high-level features extracted from deep neural networks," Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Deng, L.¹ Chen, J.²

9
- 84890545163
- A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
- Deng, L., Abdel-Hamid, O., and Yu, D. "A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Deng, L.¹ Abdel-Hamid, O.² Yu, D.³

10
- 84890491198
- Recent advances in deep learning for speech research at Microsoft
- Deng, L., Li, J., Huang, J.-T., Yao, K., Yu, D., Seide, F., Seltzer, M., Zweig, G., He, X., Williams, J., Gong, Y., and Acero, A. "Recent advances in deep learning for speech research at Microsoft," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Deng, L.¹ Li, J.² Huang, J.-T.³ Yao, K.⁴ Yu, D.⁵ Seide, F.⁶ Seltzer, M.⁷ Zweig, G.⁸ He, X.⁹ Williams, J.¹⁰ Gong, Y.¹¹ Acero, A.¹²

11
- 84890526837
- New types of deep neural network learning for speech recognition and related applications: An overview
- Deng, L., Hinton, G., and Kingsbury, B. "New types of deep neural network learning for speech recognition and related applications: An overview," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Deng, L.¹ Hinton, G.² Kingsbury, B.³

12
- 84890468916
- Deep learning for speech recognition and related applications
- Deng, L., Yu, D., and Hinton, G. "Deep Learning for Speech Recognition and Related Applications," NIPS Workshop, 2009.
- (2009) NIPS Workshop
- Deng, L.¹ Yu, D.² Hinton, G.³

13
- 84867614591
- Scalable stacking and learning for building deep architectures
- Deng, L., Yu, D., and Platt, J. "Scalable stacking and learning for building deep architectures," Proc. ICASSP, 2012.
- (2012) Proc. ICASSP
- Deng, L.¹ Yu, D.² Platt, J.³

14
- 84890543083
- Speech recognition with deep recurrent neural networks
- Graves, A., Mohamed, A., and Hinton, G. "Speech recognition with deep recurrent neural networks," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Graves, A.¹ Mohamed, A.² Hinton, G.³

15
- 84893701254
- Hybrid speech recognition with deep bidirectional LSTM
- Graves, A., Jaitly, N., and Mohamed, A. "Hybrid speech recognition with deep bidirectional LSTM," Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Graves, A.¹ Jaitly, N.² Mohamed, A.³

16
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T., and Kingsbury, B. "Deep Neural Networks for Acoustic Modeling in Speech Recognition," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

17
- 33746600649
- Reducing the dimensionality of data with neural networks
- July
- Hinton, G. and Salakhutdinov, R. "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, July 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.¹ Salakhutdinov, R.²

18
- 84878539964
- Application of pre-trained deep neural networks to large vocabulary speech recognition
- Jaitly, N., Nguyen, P., and Vanhoucke, V. "Application of pre-trained deep neural networks to large vocabulary speech recognition," Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Jaitly, N.¹ Nguyen, P.² Vanhoucke, V.³

19
- 84878379108
- Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
- Kingsbury, B., Sainath, T., and Soltau, H. "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Kingsbury, B.¹ Sainath, T.² Soltau, H.³

20
- 84878409063
- Recurrent neural networks for noise reduction in robust ASR
- Maas, A., Le, Q., O'Neil, T., Vinyals, O., Nguyen, P., and Ng, A. "Recurrent neural networks for noise reduction in robust ASR," Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Maas, A.¹ Le, Q.² O'Neil, T.³ Vinyals, O.⁴ Nguyen, P.⁵ Ng, A.⁶

21
- 80053451847
- Learning recurrent neural networks with Hessian-free optimization
- Martens, J. and Sutskever, I. "Learning recurrent neural networks with Hessian-free optimization," Proc. ICML, 2011.
- (2011) Proc. ICML
- Martens, J.¹ Sutskever, I.²

22
- 84858966958
- Strategies for training large scale neural network language models
- Mikolov, T., Deoras, A., Povey, D., Burget, L., and Cernocky, J. "Strategies for training large scale neural network language models," Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Mikolov, T.¹ Deoras, A.² Povey, D.³ Burget, L.⁴ Cernocky, J.⁵

23
- 79959829092
- Recurrent neural network based language model
- Mikolov, T., Karafiat, M., Burget, L., Cernocky, J., and Khudanpur, S. "Recurrent neural network based language model," Proc. Interspeech, 2010, pp. 1045-1048.
- (2010) Proc. Interspeech , pp. 1045-1048
- Mikolov, T.¹ Karafiat, M.² Burget, L.³ Cernocky, J.⁴ Khudanpur, S.⁵

24
- 84055211743
- Acoustic modeling using deep belief networks
- January
- Mohamed, A., Dahl, G., and Hinton, G. "Acoustic modeling using deep belief networks," IEEE Trans. Audio, Speech, and Language Proc., vol. 20, January 2012.
- (2012) IEEE Trans. Audio, Speech, and Language Proc. , vol.20
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

25
- 79959840616
- Investigation of full-sequence training of deep belief networks for speech recognition
- Mohamed, A., Yu, D., and Deng, L. "Investigation of full-sequence training of deep belief networks for speech recognition," Proc. Interspeech, 2010.
- (2010) Proc. Interspeech
- Mohamed, A.¹ Yu, D.² Deng, L.³

26
- 84255177123
- Deep and wide: Multiple layers in automatic speech recognition
- January
- Morgan, N. "Deep and wide: Multiple layers in automatic speech recognition," IEEE Trans. Audio, Speech, and Language Processing, vol. 20, no. 1, January 2012.
- (2012) IEEE Trans. Audio, Speech, and Language Processing , vol.20 , Issue.1
- Morgan, N.¹

27
- 84897497795
- On the difficulty of training recurrent neural networks
- Pascanu, R., Mikolov, T., and Bengio, Y. "On the difficulty of training recurrent neural networks," Proc. ICML, 2013.
- (2013) Proc. ICML
- Pascanu, R.¹ Mikolov, T.² Bengio, Y.³

28
- 0028392167
- An application of recurrent nets to phone probability estimation
- Robinson, A. "An application of recurrent nets to phone probability estimation," IEEE Trans. Neural Networks, vol. 5, pp. 298-305, 1994.
- (1994) IEEE Trans. Neural Networks , vol.5 , pp. 298-305
- Robinson, A.¹

29
- 84886829539
- Optimization techniques to improve training speed of deep neural networks for large speech tasks
- Nov
- Sainath, T., Kingsbury, B., Soltau, H., and Ramabhadran, B. "Optimization Techniques to Improve Training Speed of Deep Neural Networks for Large Speech Tasks," IEEE Trans. Audio, Speech, and Language Processing, vol. 21, no. 11, pp. 2267-2276, Nov. 2013.
- (2013) IEEE Trans. Audio, Speech, and Language Processing , vol.21 , Issue.11 , pp. 2267-2276
- Sainath, T.¹ Kingsbury, B.² Soltau, H.³ Ramabhadran, B.⁴

30
- 84890525984
- Convolutional neural networks for LVCSR
- Sainath, T., Mohamed, A., Kingsbury, B., and Ramabhadran, B. "Convolutional neural networks for LVCSR," Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Sainath, T.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

31
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- Sainath, T., Kingsbury, B., Mohamed, A., Dahl, G., Saon, G., Soltau, H., Beran, T., Aravkin, A., and Ramabhadran, B. "Improvements to deep convolutional neural networks for LVCSR," Proc. ASRU, 2013.
- (2013) Proc. ASRU
- Sainath, T.¹ Kingsbury, B.² Mohamed, A.³ Dahl, G.⁴ Saon, G.⁵ Soltau, H.⁶ Beran, T.⁷ Aravkin, A.⁸ Ramabhadran, B.⁹

32
- 84858972572
- Making deep belief networks effective for large vocabulary continuous speech recognition
- Sainath, T., Kingsbury, B., Ramabhadran, B., Novak, P., and Mohamed, A. "Making deep belief networks effective for large vocabulary continuous speech recognition," Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Sainath, T.¹ Kingsbury, B.² Ramabhadran, B.³ Novak, P.⁴ Mohamed, A.⁵

33
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- Seide, F., Li, G., and Yu, D. "Conversational speech transcription using context-dependent deep neural networks," Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Seide, F.¹ Li, G.² Yu, D.³

34
- 80053459857
- Generating text with recurrent neural networks
- Sutskever, I., Martens, J., and Hinton, G. "Generating text with recurrent neural networks," Proc. ICML, 2011.
- (2011) Proc. ICML
- Sutskever, I.¹ Martens, J.² Hinton, G.³

35
- 84886714036
- Acoustic modeling with hierarchical reservoirs
- Nov
- Triefenbach, F., Jalalvand, A., Demuynck, K., and Martens, J.-P. "Acoustic modeling with hierarchical reservoirs," IEEE Trans. Audio, Speech, and Language Processing, vol. 21, no. 11, pp. 2439-2450, Nov. 2013.
- (2013) IEEE Trans. Audio, Speech, and Language Processing , vol.21 , Issue.11 , pp. 2439-2450
- Triefenbach, F.¹ Jalalvand, A.² Demuynck, K.³ Martens, J.-P.⁴

36
- 0024634603
- Phoneme recognition using time-delay neural networks
- Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., and Lang, K. "Phoneme recognition using time-delay neural networks," IEEE Trans. Acoust. Speech, and Signal Proc., vol. 37, pp. 328-339, 1989.
- (1989) IEEE Trans. Acoust. Speech, and Signal Proc. , vol.37 , pp. 328-339
- Waibel, A.¹ Hanazawa, T.² Hinton, G.³ Shikano, K.⁴ Lang, K.⁵

37
- 0026692226
- Stacked generalization
- Wolpert, D. "Stacked generalization," Neural Networks, vol. 5, no. 2, pp. 241-259, 1992.
- (1992) Neural Networks , vol.5 , Issue.2 , pp. 241-259
- Wolpert, D.¹

38
- 84904483474
- Recurrent neural networks for language understanding
- Yao, K., Zweig, G., Hwang, M., Shi, Y., and Yu, D. "Recurrent Neural Networks for Language Understanding," Proc. Interspeech, 2013.
- (2013) Proc. Interspeech
- Yao, K.¹ Zweig, G.² Hwang, M.³ Shi, Y.⁴ Yu, D.⁵

39
- 84871387302
- The deep tensor neural network with applications to large vocabulary speech recognition
- Yu, D., Deng, L., and Seide, F. "The deep tensor neural network with applications to large vocabulary speech recognition," IEEE Trans. Audio, Speech, and Language Processing, vol. 21, no. 2, pp. 388-396, 2013.
- (2013) IEEE Trans. Audio, Speech, and Language Processing , vol.21 , Issue.2 , pp. 388-396
- Yu, D.¹ Deng, L.² Seide, F.³

40
- 84865713025
- Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
- Yu, D., Deng, L., and Dahl, G.E. "Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition," NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2010.
- (2010) NIPS Workshop on Deep Learning and Unsupervised Feature Learning
- Yu, D.¹ Deng, L.² Dahl, G.E.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.