SCOPUS Information Retrieval Platform

2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

Volume , Issue , 2016, Pages 78-83

Deep bi-directional recurrent networks over spectral windows

(7) Mohamed, Abdel-rahman a; Seide, Frank a; Yu, Dong a; Droppo, Jasha a; Stolcke, Andreas a; Zweig, Geoffrey a; Penn, Gerald b

a MICROSOFT RESEARCH (United States)

b UNIVERSITY OF TORONTO (Canada)

Author keywords

acoustic modeling; Deep learning; LSTM; Recurrent networks

Indexed keywords

LINGUISTICS; RANDOM PROCESSES; RECURRENT NEURAL NETWORKS; SPEECH TRANSMISSION; TRANSCRIPTION;

ACOUSTIC MODEL; CONVERSATIONAL SPEECH; DEEP LEARNING; FASTER CONVERGENCE; LONG SHORT TERM MEMORY; LSTM; RECURRENT NETWORKS; SPECTRAL WINDOWS; SPEECH RECOGNITION;

EID: 84964507635 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ASRU.2015.7404777 Document Type: Conference Paper

Times cited : (36)

References (22)

1
- 84865713025
- Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
- Dong Yu, Li Deng, and George E. Dahl, "Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition," in NIPS 2010 workshop on Deep Learning and Unsupervised Feature Learning, 2010
- (2010) NIPS 2010 Workshop on Deep Learning and Unsupervised Feature Learning
- Yu, D.¹ Deng, L.² Dahl, G.E.³

2
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- Frank Seide, Gang Li, and Dong Yu, "Conversational speech transcription using context-dependent deep neural networks," in Interspeech 2011
- (2011) Interspeech
- Seide, F.¹ Li, G.² Yu, D.³

3
- 84890543852
- Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription
- Hang Su, Gang Li, Dong Yu, and Frank Seide, "Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription," in ICASSP 2013
- (2013) ICASSP
- Su, H.¹ Li, G.² Yu, D.³ Seide, F.⁴

4
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," Signal Processing Magazine, 2012
- (2012) Signal Processing Magazine
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

5
- 0001592322
- The use of recurrent networks in continuous speech recognition
- T. Robinson, M. Hochberg, and S. Renals, "The use of recurrent networks in continuous speech recognition," pp. 233-258. Kluwer Academic Publishers, 1996
- (1996) The Use of Recurrent Networks in Continuous Speech Recognition , pp. 233-258
- Robinson, T.¹ Hochberg, M.² Renals, S.³

6
- 84890543083
- Speech recognition with deep recurrent neural networks
- Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton, "Speech recognition with deep recurrent neural networks," in ICASSP 2013
- (2013) ICASSP
- Graves, A.¹ Mohamed, A.-R.² Hinton, G.³

7
- 84962892645
- Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition
- Hasim Sak, Andrew W. Senior, and Francoise Beaufays, "Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition," CoRR, vol. abs/1402.1128, 2014
- (2014) CoRR
- Sak, H.¹ Senior, A.W.² Beaufays, F.³

8
- 70349227947
- The application of hidden Markov models in speech recognition
- Mark Gales and Steve Young, "The application of hidden Markov models in speech recognition," Found. Trends Signal Process., vol. 1, no. 3, 2007
- (2007) Found. Trends Signal Process , vol.1 , Issue.3
- Gales, M.¹ Young, S.²

9
- 84964441421
- Connectionist Speech Recognition: A Hybrid Approach
- Herve A. Bourlard and Nelson Morgan, Connectionist Speech Recognition: A Hybrid Approach, Kluwer Academic Publishers, 1993
- (1993) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.A.¹ Morgan, N.²

10
- 77956502334
- Unsupervised feature learning for audio classification using convolutional deep belief networks
- Honglak Lee, Peter Pham, Yan Largman, and Andrew Y. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Advances in Neural Information Processing Systems 22, 2009
- (2009) Advances in Neural Information Processing Systems , vol.22
- Lee, H.¹ Pham, P.² Largman, Y.³ Ng, A.Y.⁴

11
- 84911473441
- Convolutional neural networks for speech recognition
- Ossama Abdel-Hamid, Abdel-Rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn, and Dong Yu, "Convolutional neural networks for speech recognition," IEEE/ACM Trans. Audio, Speech and Lang. Proc., vol. 22, no. 10, pp. 1533-1545, Oct. 2014
- (2014) IEEE/ACM Trans. Audio, Speech and Lang. Proc , vol.22 , Issue.10 , pp. 1533-1545
- Abdel-Hamid, O.¹ Mohamed, A.-R.² Jiang, H.³ Deng, L.⁴ Penn, G.⁵ Yu, D.⁶

12
- 84893701254
- Hybrid speech recognition with deep bidirectional LSTM
- Alex Graves, Navdeep Jaitly, and Abdel-rahman Mohamed, "Hybrid speech recognition with deep bidirectional LSTM," in ASRU 2013
- (2013) ASRU
- Graves, A.¹ Jaitly, N.² Mohamed, A.-R.³

13
- 0031268931
- Bidirectional recurrent neural networks
- M. Schuster and K.K. Paliwal, "Bidirectional recurrent neural networks," Trans. Sig. Proc., vol. 45, no. 11, pp. 2673-2681, Nov. 1997
- (1997) Trans. Sig. Proc , vol.45 , Issue.11 , pp. 2673-2681
- Schuster, M.¹ Paliwal, K.K.²

14
- 84890545163
- A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
- Li Deng, Ossama Abdel-Hamid, and Dong Yu, "A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013
- (2013) IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
- Deng, L.¹ Abdel-Hamid, O.² Yu, D.³

15
- 84964537525
- The IBM 2015 English conversational telephone speech recognition system
- George Saon, Hong-Kwang Jeff Kuo, Steven J. Rennie, and Michael Picheny, "The IBM 2015 English conversational telephone speech recognition system," CoRR, vol. abs/1505.05899, 2015
- (2015) CoRR
- Saon, G.¹ Kuo, H.-K.J.² Rennie, S.J.³ Picheny, M.⁴

16
- 0031573117
- Long short-term memory
- Sepp Hochreiter and Jurgen Schmidhuber, "Long short-term memory," Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

17
- 84910069984
- 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs
- Frank Seide, Hao Fu, Jasha Droppo, Gang Li, and Dong Yu, "1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs," in INTERSPEECH 2014
- (2014) INTERSPEECH
- Seide, F.¹ Fu, H.² Droppo, J.³ Li, G.⁴ Yu, D.⁵

18
- 84905269646
- On parallelizability of stochastic gradient descent for speech DNNs
- Frank Seide, Hao Fu, Jasha Droppo, Gang Li, and Dong Yu, "On parallelizability of stochastic gradient descent for speech DNNs," in ICASSP 2014
- (2014) ICASSP
- Seide, F.¹ Fu, H.² Droppo, J.³ Li, G.⁴ Yu, D.⁵

19
- 84959076031
- Training deep bidirectional LSTM acoustic models for LVCSR by a context-sensitive-chunk BPTT approach
- Kai Chen, Zhi-Jie Yan, and Qiang Huo, "Training deep bidirectional LSTM acoustic models for LVCSR by a context-sensitive-chunk BPTT approach," in Interspeech 2015
- (2015) Interspeech
- Chen, K.¹ Yan, Z.-J.² Huo, Q.³

20
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- Brian Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in ICASSP 2009
- (2009) ICASSP
- Kingsbury, B.¹

21
- 84906264325
- Efficient estimation of maximum entropy language models with N-gram features: An SRILM extension
- Tanel Alumae and Mikko Kurimo, "Efficient estimation of maximum entropy language models with N-gram features: An SRILM extension," in Interspeech 2012
- (2012) Interspeech
- Alumae, T.¹ Kurimo, M.²

22
- 37849007170
- Web resources for language modeling in conversational speech recognition
- Ivan Bulyko, Mari Ostendorf, Manhung Siu, Tim Ng, Andreas Stolcke, and Özgür Çetin, "Web resources for language modeling in conversational speech recognition," ACM Transactions on Speech and Language Processing, vol. 5, no. 1, 2007
- (2007) ACM Transactions on Speech and Language Processing , vol.5 , Issue.1
- Bulyko, I.¹ Ostendorf, M.² Siu, M.³ Ng, T.⁴ Stolcke, A.⁵ Çetin, Ö.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.