SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 08-12-September-2016, 2016, Pages 3429-3433

Advances in very deep convolutional neural networks for LVCSR

Authors (2): Sercu, Tom (a); Goel, Vaibhava (a)

(a) IBM T J WATSON RESEARCH CENTER (United States)

Author keywords

Acoustic modeling; Convolutional networks; Neural networks; Speech recognition

Indexed keywords

CONVOLUTION; NEURAL NETWORKS; SPEECH COMMUNICATION; SPEECH PROCESSING; STATISTICAL TESTS;

ACOUSTIC MODEL; CONVOLUTIONAL NETWORKS; CONVOLUTIONAL NEURAL NETWORK; SEQUENCE DATA; SPEECH RECOGNITION SYSTEMS; SYSTEM COMBINATION; TIME DIMENSION; WORD ERROR RATE;

SPEECH RECOGNITION;

EID: 84994207081; PISSN: 2308457X; EISSN: 19909772; Source Type: Conference Proceeding
DOI: 10.21437/Interspeech.2016-1033; Document Type: Conference Paper

Times cited: 25

References (35)

1
- 84973324686
- Very deep multilingual convolutional neural networks for LVCSR
- T. Sercu, C. Puhrsch, B. Kingsbury, and Y. LeCun, "Very deep multilingual convolutional neural networks for LVCSR," Proc. ICASSP, 2016.
- (2016) Proc. ICASSP
- Sercu, T.¹ Puhrsch, C.² Kingsbury, B.³ LeCun, Y.⁴

2
- 84994298763
- "IARPA Babel," http://www.iarpa.gov/index.php/researchprograms/babel.
- IARPA Babel

3
- 84925410541
- CoRR arXiv:1409.1556
- K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR arXiv:1409.1556, 2014.
- (2014) Very Deep Convolutional Networks for Large-scale Image Recognition
- Simonyan, K.¹ Zisserman, A.²

4
- 84969584486
- Batch normalization: Accelerating deep network training by reducing internal covariate shift
- S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," Proc. ICML, 2015.
- (2015) Proc ICML
- Ioffe, S.¹ Szegedy, C.²

5
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A.-R. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 8614-8618.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 8614-8618
- Sainath, T.N.¹ Mohamed, A.-R.² Kingsbury, B.³ Ramabhadran, B.⁴

6
- 84905265980
- Joint training of convolutional and non-convolutional neural networks
- H. Soltau, G. Saon, and T. N. Sainath, "Joint training of convolutional and non-convolutional neural networks," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Soltau, H.¹ Saon, G.² Sainath, T.N.³

7
- 84959129849
- The IBM 2015 English conversational telephone speech recognition system
- G. Saon, H.-K. J. Kuo, S. Rennie, and M. Picheny, "The IBM 2015 English conversational telephone speech recognition system," Proc. Interspeech, 2015.
- (2015) Proc. Interspeech
- Saon, G.¹ Kuo, H.-K.J.² Rennie, S.³ Picheny, M.⁴

8
- 84994201246
- G. Saon, T. Sercu, S. Rennie, and H.-K. J. Kuo, "The IBM 2016 English conversational telephone speech recognition system," -, 2016.
- (2016) The IBM 2016 English Conversational Telephone Speech Recognition System
- Saon, G.¹ Sercu, T.² Rennie, S.³ Kuo, H.-K.J.⁴

9
- 11144321031
- Convolutional networks for images, speech, and time series
- Y. LeCun and Y. Bengio, "Convolutional networks for images, speech, and time series," The handbook of brain theory and neural networks, vol. 3361, no. 10, p. 1995, 1995.
- (1995) The Handbook of Brain Theory and Neural Networks , vol.3361 , Issue.10 , pp. 1995
- LeCun, Y.¹ Bengio, Y.²

10
- 0032203257
- Gradient-based learning applied to document recognition
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
- (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
- LeCun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

11
- 84990044091
- Torch7: A Matlab-like environment for machine learning
- R. Collobert, K. Kavukcuoglu, and C. Farabet, "Torch7: A Matlab-like environment for machine learning," in BigLearn, NIPS Workshop, no. EPFL-CONF-192376, 2011.
- (2011) BigLearn, NIPS Workshop, No. EPFL-CONF-192376
- Collobert, R.¹ Kavukcuoglu, K.² Farabet, C.³

12
- 84897510162
- On the importance of initialization and momentum in deep learning
- I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," in Proc. ICML, 2013, pp. 1139-1147.
- (2013) Proc ICML , pp. 1139-1147
- Sutskever, I.¹ Martens, J.² Dahl, G.³ Hinton, G.⁴

13
- 84890543852
- Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription
- H. Su, G. Li, D. Yu, and F. Seide, "Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 6664-6668.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 6664-6668
- Su, H.¹ Li, G.² Yu, D.³ Seide, F.⁴

14
- 84876231242
- ImageNet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in neural information processing systems, 2012, pp. 1097-1105.
- (2012) Advances in Neural Information Processing Systems , pp. 1097-1105
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

15
- 84906347546
- arXiv preprint arXiv:1312.6229
- P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, "OverFeat: Integrated recognition, localization and detection using convolutional networks," arXiv preprint arXiv:1312.6229, 2013.
- (2013) Overfeat: Integrated Recognition, Localization and Detection Using Convolutional Networks
- Sermanet, P.¹ Eigen, D.² Zhang, X.³ Mathieu, M.⁴ Fergus, R.⁵ LeCun, Y.⁶

16
- 84876258641
- Learning hierarchical features for scene labeling
- C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 35, no. 8, pp. 1915-1929, 2013.
- (2013) Pattern Analysis and Machine Intelligence, IEEE Transactions on , vol.35 , Issue.8 , pp. 1915-1929
- Farabet, C.¹ Couprie, C.² Najman, L.³ LeCun, Y.⁴

17
- 84867605836
- Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A.-R. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on. IEEE, 2012, pp. 4277-4280.
- (2012) Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference On. IEEE , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.-R.² Jiang, H.³ Penn, G.⁴

18
- 84990890375
- arXiv preprint arXiv:1509.01626
- X. Zhang, J. Zhao, and Y. LeCun, "Character-level convolutional networks for text classification," arXiv preprint arXiv:1509.01626, 2015.
- (2015) Character-level Convolutional Networks for Text Classification
- Zhang, X.¹ Zhao, J.² LeCun, Y.³

19
- 84978952442
- arXiv preprint arXiv:1508.06615
- Y. Kim, Y. Jernite, D. Sontag, and A. M. Rush, "Character-aware neural language models," arXiv preprint arXiv:1508.06615, 2015.
- (2015) Character-aware Neural Language Models
- Kim, Y.¹ Jernite, Y.² Sontag, D.³ Rush, A.M.⁴

20
- 84939821074
- arXiv preprint arXiv:1502.03044
- K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio, "Show, attend and tell: Neural image caption generation with visual attention," arXiv preprint arXiv:1502.03044, 2015.
- (2015) Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
- Xu, K.¹ Ba, J.² Kiros, R.³ Courville, A.⁴ Salakhutdinov, R.⁵ Zemel, R.⁶ Bengio, Y.⁷

21
- 84964588182
- Fast R-CNN
- R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440-1448.
- (2015) Proceedings of the IEEE International Conference on Computer Vision , pp. 1440-1448
- Girshick, R.¹

22
- 84937144752
- arXiv preprint arXiv:1411.4038
- J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," arXiv preprint arXiv:1411.4038, 2014.
- (2014) Fully Convolutional Networks for Semantic Segmentation
- Long, J.¹ Shelhamer, E.² Darrell, T.³

23
- 0024634603
- Phoneme recognition using time-delay neural networks
- A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang, "Phoneme recognition using time-delay neural networks," Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 37, no. 3, pp. 328-339, 1989.
- (1989) Acoustics, Speech and Signal Processing, IEEE Transactions on , vol.37 , Issue.3 , pp. 328-339
- Waibel, A.¹ Hanazawa, T.² Hinton, G.³ Shikano, K.⁴ Lang, K.J.⁵

24
- 84951162898
- Deep convolutional neural networks for large-scale speech tasks
- T. N. Sainath, B. Kingsbury, G. Saon, H. Soltau, A.-R. Mohamed, G. Dahl, and B. Ramabhadran, "Deep convolutional neural networks for large-scale speech tasks," Neural Networks, 2014.
- (2014) Neural Networks
- Sainath, T.N.¹ Kingsbury, B.² Saon, G.³ Soltau, H.⁴ Mohamed, A.-R.⁵ Dahl, G.⁶ Ramabhadran, B.⁷

25
- 84959078561
- Convolutional neural networks for small-footprint keyword spotting
- T. Sainath and C. Parada, "Convolutional neural networks for small-footprint keyword spotting," in Proc. Interspeech, 2015.
- (2015) Proc. Interspeech
- Sainath, T.¹ Parada, C.²

26
- 84994264999
- arXiv preprint arXiv:1304.1018
- D. Palaz, R. Collobert, and M. M. Doss, "Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks," arXiv preprint arXiv:1304.1018, 2013.
- (2013) Estimating Phoneme Class Conditional Probabilities from Raw Speech Signal Using Convolutional Neural Networks
- Palaz, D.¹ Collobert, R.² Doss, M.M.³

27
- 84971463350
- CoRR arXiv:1512.02595
- D. Amodei, R. Anubhai, E. Battenberg, C. Case, J. Casper, B. Catanzaro, J. Chen, M. Chrzanowski, A. Coates, G. Diamos et al., "Deep speech 2: End-to-end speech recognition in English and Mandarin," CoRR arXiv:1512.02595, 2015.
- (2015) Deep Speech 2: End-to-end Speech Recognition in English and Mandarin
- Amodei, D.¹ Anubhai, R.² Battenberg, E.³ Case, C.⁴ Casper, J.⁵ Catanzaro, B.⁶ Chen, J.⁷ Chrzanowski, M.⁸ Coates, A.⁹ Diamos, G.¹⁰

28
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in Proc. ICASSP. IEEE, 2009, pp. 3761-3764.
- (2009) Proc. ICASSP IEEE , pp. 3761-3764
- Kingsbury, B.¹

29
- 84878379108
- Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
- B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," in Thirteenth Annual Conference of the International Speech Communication Association, 2012.
- (2012) Thirteenth Annual Conference of the International Speech Communication Association
- Kingsbury, B.¹ Sainath, T.N.² Soltau, H.³

30
- 84905233897
- Mean-normalized stochastic gradient for large-scale deep learning
- S. Wiesler, A. Richard, R. Schluter, and H. Ney, "Mean-normalized stochastic gradient for large-scale deep learning," in Proc. ICASSP. IEEE, 2014, pp. 180-184.
- (2014) Proc. ICASSP IEEE , pp. 180-184
- Wiesler, S.¹ Richard, A.² Schluter, R.³ Ney, H.⁴

31
- 84990032289
- CoRR arXiv:1512.00567
- C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the Inception architecture for computer vision," CoRR arXiv:1512.00567, 2015.
- (2015) Rethinking the Inception Architecture for Computer Vision
- Szegedy, C.¹ Vanhoucke, V.² Ioffe, S.³ Shlens, J.⁴ Wojna, Z.⁵

32
- 84958589374
- CoRR arXiv:1512.03385
- K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," CoRR arXiv:1512.03385, 2015.
- (2015) Deep Residual Learning for Image Recognition
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

33
- 84973326024
- Batch normalized recurrent neural networks
- C. Laurent, G. Pereyra, P. Brakel, Y. Zhang, and Y. Bengio, "Batch normalized recurrent neural networks," Proc. ICASSP, 2016.
- (2016) Proc. ICASSP
- Laurent, C.¹ Pereyra, G.² Brakel, P.³ Zhang, Y.⁴ Bengio, Y.⁵

34
- 84931584163
- arXiv preprint arXiv:1412.4526
- H. Li, R. Zhao, and X. Wang, "Highly efficient forward and backward propagation of convolutional neural networks for pixelwise classification," arXiv preprint arXiv:1412.4526, 2014.
- (2014) Highly Efficient Forward and Backward Propagation of Convolutional Neural Networks for Pixelwise Classification
- Li, H.¹ Zhao, R.² Wang, X.³

35
- 84980036754
- arXiv preprint arXiv:1511.07122
- F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," arXiv preprint arXiv:1511.07122, 2015.
- (2015) Multi-scale Context Aggregation by Dilated Convolutions
- Yu, F.¹ Koltun, V.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.