SCOPUS Information Search Platform

31st International Conference on Machine Learning, ICML 2014

Volume 5, 2014, Pages 3771-3779

Towards end-to-end speech recognition with recurrent neural networks

(2) Graves, Alex (a); Jaitly, Navdeep (b)

a DEEPMIND (United Kingdom)

b UNIVERSITY OF TORONTO (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; CHARACTER RECOGNITION; CLASSIFICATION (OF INFORMATION); COMPUTATIONAL LINGUISTICS; LEARNING SYSTEMS; NETWORK ARCHITECTURE; RECURRENT NEURAL NETWORKS; TRANSCRIPTION;

BASELINE SYSTEMS; LINGUISTIC INFORMATION; OBJECTIVE FUNCTIONS; PHONETIC REPRESENTATION; SPEECH RECOGNITION SYSTEMS; TEMPORAL CLASSIFICATION; WALL STREET JOURNAL; WORD ERROR RATE;

SPEECH RECOGNITION;

EID: 84919832465 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (786)

References (22)

1
- 0022890536
- Maximum mutual information estimation of hidden Markov model parameters for speech recognition
- Apr
- Bahl, L., Brown, P., De Souza, P.V., and Mercer, R. Maximum mutual information estimation of hidden Markov model parameters for speech recognition. In Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86., volume 11, pp. 49-52, Apr 1986. doi: 10.1109/ICASSP.1986.1169179.
- (1986) Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86. , vol.11 , pp. 49-52
- Bahl, L.¹ Brown, P.² De Souza, P.V.³ Mercer, R.⁴

2
- 33745202406
- Open vocabulary speech recognition with flat hybrid models
- Bisani, Maximilian and Ney, Hermann. Open vocabulary speech recognition with flat hybrid models. In INTERSPEECH, pp. 725-728, 2005.
- (2005) INTERSPEECH , pp. 725-728
- Bisani, M.¹ Ney, H.²

3
- 0003573244
- Kluwer Academic Publishers, Norwell, MA, USA
- Bourlard, Herve A. and Morgan, Nelson. Connectionist Speech Recognition: A Hybrid Approach. Kluwer Academic Publishers, Norwell, MA, USA, 1993. ISBN 0792393961.
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.A.¹ Morgan, N.²

4
- 80054740693
- A committee of neural networks for traffic sign classification
- IEEE
- Ciresan, Dan C, Meier, Ueli, Masci, Jonathan, and Schmidhuber, Jürgen. A committee of neural networks for traffic sign classification. In IJCNN, pp. 1918-1921. IEEE, 2011.
- (2011) IJCNN , pp. 1918-1921
- Ciresan, D.C.¹ Meier, U.² Masci, J.³ Schmidhuber, J.⁴

5
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- August
- Davis, S. and Mermelstein, P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech and Signal Processing, 28(4):357-366, August 1980.
- (1980) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.28 , Issue.4 , pp. 357-366
- Davis, S.¹ Mermelstein, P.²

6
- 77949404053
- From speech to letters - Using a novel neural network architecture for grapheme based ASR
- IEEE. 13.-17.12.2009
- Eyben, F., Wöllmer, M., Schuller, B., and Graves, A. From speech to letters - using a novel neural network architecture for grapheme based ASR. In Proc. Automatic Speech Recognition and Understanding Workshop (ASRU 2009), Merano, Italy. IEEE, 2009. 13.-17.12.2009.
- (2009) Proc. Automatic Speech Recognition and Understanding Workshop (ASRU 2009), Merano, Italy
- Eyben, F.¹ Wöllmer, M.² Schuller, B.³ Graves, A.⁴

7
- 85009227775
- Recognition of out-of-vocabulary words with sub-lexical language models
- Galescu, Lucian. Recognition of out-of-vocabulary words with sub-lexical language models. In INTERSPEECH, 2003.
- (2003) INTERSPEECH
- Galescu, L.¹

8
- 0041965934
- Learning precise timing with LSTM recurrent networks
- Gers, F., Schraudolph, N., and Schmidhuber, J. Learning Precise Timing with LSTM Recurrent Networks. Journal of Machine Learning Research, 3:115-143, 2002.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 115-143
- Gers, F.¹ Schraudolph, N.² Schmidhuber, J.³

9
- 27744588611
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures
- June/July
- Graves, A. and Schmidhuber, J. Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures. Neural Networks, 18(5-6):602-610, June/July 2005.
- (2005) Neural Networks , vol.18 , Issue.5-6 , pp. 602-610
- Graves, A.¹ Schmidhuber, J.²

10
- 33749259827
- Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
- Pittsburgh, USA
- Graves, A., Fernandez, S., Gomez, F., and Schmidhuber, J. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. In ICML, Pittsburgh, USA, 2006.
- (2006) ICML
- Graves, A.¹ Fernandez, S.² Gomez, F.³ Schmidhuber, J.⁴

11
- 84890543083
- Speech recognition with deep recurrent neural networks
- Vancouver, Canada, May
- Graves, A., Mohamed, A., and Hinton, G. Speech recognition with deep recurrent neural networks. In Proc ICASSP 2013, Vancouver, Canada, May 2013.
- (2013) Proc ICASSP 2013
- Graves, A.¹ Mohamed, A.² Hinton, G.³

12
- 70349284484
- Springer
- Graves, Alex. Supervised Sequence Labelling with Recurrent Neural Networks, volume 385 of Studies in Computational Intelligence. Springer, 2012.
- (2012) Supervised Sequence Labelling with Recurrent Neural Networks, Volume 385 of Studies in Computational Intelligence
- Graves, A.¹

13
- 33746600649
- Reducing the dimensionality of data with neural networks
- May
- Hinton, G. E. and Salakhutdinov, R. R. Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786):504-507, May 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

14
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- Hinton, Geoffrey, Deng, Li, Yu, Dong, Dahl, George, Mohamed, Abdel-rahman, Jaitly, Navdeep, Senior, Andrew, Vanhoucke, Vincent, Nguyen, Patrick, Sainath, Tara, and Kingsbury, Brian. Deep neural networks for acoustic modeling in speech recognition. Signal Processing Magazine, 2012.
- (2012) Signal Processing Magazine
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

15
- 0031573117
- Long short-term memory
- Hochreiter, S. and Schmidhuber, J. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

16
- 80051609011
- Learning a better representation of speech soundwaves using restricted Boltzmann machines
- Jaitly, Navdeep and Hinton, Geoffrey E. Learning a better representation of speech soundwaves using restricted Boltzmann machines. In ICASSP, pp. 5884-5887, 2011.
- (2011) ICASSP , pp. 5884-5887
- Jaitly, N.¹ Hinton, G.E.²

17
- 84878539964
- Application of pretrained deep neural networks to large vocabulary speech recognition
- Jaitly, Navdeep, Nguyen, Patrick, Senior, Andrew W, and Vanhoucke, Vincent. Application of pretrained deep neural networks to large vocabulary speech recognition. In INTERSPEECH, 2012.
- (2012) INTERSPEECH
- Jaitly, N.¹ Nguyen, P.² Senior, A.W.³ Vanhoucke, V.⁴

18
- 84876231242
- ImageNet classification with deep convolutional neural networks
- Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 2012.
- (2012) Advances in Neural Information Processing Systems
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

19
- 0031647824
- A frequency warping approach to speaker normalization
- Jan
- Lee, Li and Rose, R. A frequency warping approach to speaker normalization. Speech and Audio Processing, IEEE Transactions on, 6(1):49-60, Jan 1998.
- (1998) Speech and Audio Processing, IEEE Transactions on , vol.6 , Issue.1 , pp. 49-60
- Lee, L.¹ Rose, R.²

20
- 44949241322
- Reinforcement learning of motor skills with policy gradients
- Peters, J. and Schaal, S. Reinforcement learning of motor skills with policy gradients. In Neural Networks, number 4, pp. 682-97, 2008.
- (2008) Neural Networks , Issue.4 , pp. 682-697
- Peters, J.¹ Schaal, S.²

21
- 84874281338
- The Kaldi speech recognition toolkit
- IEEE Signal Processing Society, December
- Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovsky, J., Stemmer, G., and Vesely, K. The Kaldi speech recognition toolkit. In IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society, December 2011.
- (2011) IEEE 2011 Workshop on Automatic Speech Recognition and Understanding
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovsky, J.¹¹ Stemmer, G.¹² Vesely, K.¹³

22
- 0031268931
- Bidirectional recurrent neural networks
- Schuster, M. and Paliwal, K. K. Bidirectional Recurrent Neural Networks. IEEE Transactions on Signal Processing, 45:2673-2681, 1997.
- (1997) IEEE Transactions on Signal Processing , vol.45 , pp. 2673-2681
- Schuster, M.¹ Paliwal, K.K.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.