SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 4575-4579

Modeling long temporal contexts in convolutional neural network-based phone recognition

(1) Toth, Laszlo a

a UNIVERSITY OF SZEGED (Hungary)

Author keywords

convolutional neural network; Deep neural network; maxout; split temporal context; TIMIT

Indexed keywords

AUDIO SIGNAL PROCESSING; CONVOLUTION; NEURAL NETWORKS; SPEECH COMMUNICATION; SPEECH RECOGNITION; TELEPHONE SETS;

CONVOLUTIONAL NEURAL NETWORK; MAXOUT; PHONE ERROR RATE; PHONE RECOGNITION; RELATIVE ERROR RATES; SPEECH RECOGNIZER; SPLIT TEMPORAL CONTEXT; TIMIT;

DEEP NEURAL NETWORKS;

EID: 84946020232 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178837 Document Type: Conference Paper

Times cited: 9

References (26)

1
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, Deep convolutional neural networks for LVCSR, in Proc. ICASSP, 2013, pp. 8614-8618
- (2013) Proc. ICASSP , pp. 8614-8618
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

2
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. E. Dahl, and G. Hinton, Acoustic modeling using deep belief networks, IEEE Trans. ASLP, vol. 20, no. 1, pp. 14-22, 2012
- (2012) IEEE Trans. ASLP , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

3
- 84890471125
- On rectified linear units for speech processing
- M. D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q. V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, and G. E. Hinton, On rectified linear units for speech processing, in Proc. ICASSP, 2013, pp. 3517-3521
- (2013) Proc. ICASSP , pp. 3517-3521
- Zeiler, M.D.¹ Ranzato, M.² Monga, R.³ Mao, M.⁴ Yang, K.⁵ Le, Q.V.⁶ Nguyen, P.⁷ Senior, A.⁸ Vanhoucke, V.⁹ Dean, J.¹⁰ Hinton, G.E.¹¹

4
- 84255177123
- Deep and wide: Multiple layers in automatic speech recognition
- N. Morgan, Deep and wide: Multiple layers in automatic speech recognition, IEEE Trans. ASLP, vol. 20, no. 1, pp. 7-13, 2012
- (2012) IEEE Trans. ASLP , vol.20 , Issue.1 , pp. 7-13
- Morgan, N.¹

5
- 84874485803
- Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling
- J. Pan, C. Liu, Z. G. Wang, Y. Hu, and H. Jiang, Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling, in Proc. ISCSLP, 2012, pp. 301-305
- (2012) Proc. ISCSLP , pp. 301-305
- Pan, J.¹ Liu, C.² Wang, Z.G.³ Hu, Y.⁴ Jiang, H.⁵

6
- 84994350739
- Multi-stream speech recognition: Ready for prime time?
- A. Janin, D. Ellis, and N. Morgan, Multi-stream speech recognition: Ready for prime time?, in Proc. Eurospeech, 1999, pp. 591-594
- (1999) Proc. Eurospeech , pp. 591-594
- Janin, A.¹ Ellis, D.² Morgan, N.³

7
- 84905252069
- Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition
- L. Tóth, Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition, in Proc. ICASSP, 2014, pp. 190-194
- (2014) Proc. ICASSP , pp. 190-194
- Tóth, L.¹

8
- 84867605836
- Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition, in Proc. ICASSP, 2012, pp. 4277-4280
- (2012) Proc. ICASSP , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

9
- 0003573244
- Kluwer
- H. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach, Kluwer, 1994
- (1994) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

10
- 85009216422
- Using mutual information to design class-specific phone recognizers
- P. Scanlon, D. P. W. Ellis, and R. Reilly, Using mutual information to design class-specific phone recognizers, in Proc. Interspeech, 2003, pp. 857-860
- (2003) Proc. Interspeech , pp. 857-860
- Scanlon, P.¹ Ellis, D.P.W.² Reilly, R.³

11
- 78049251448
- Analysis of MLP based hierarchical phoneme posterior probability estimator
- J. Pinto et al., Analysis of MLP based hierarchical phoneme posterior probability estimator, IEEE Trans. ASLP, vol. 19, no. 2, pp. 225-241, 2010
- (2010) IEEE Trans. ASLP , vol.19 , Issue.2 , pp. 225-241
- Pinto, J.¹

12
- 84941300175
- Springer
- D. Vasquez, R. Gruhn, and W. Minker, Hierarchical Neural Network Structures for Phoneme Recognition, Springer, 2013
- (2013) Hierarchical Neural Network Structures for Phoneme Recognition
- Vasquez, D.¹ Gruhn, R.² Minker, W.³

13
- 84910083918
- Language ID-based training of multilingual stacked bottleneck features
- Y. Zhang, E. Chuangsuwanich, and J. Glass, Language ID-based training of multilingual stacked bottleneck features, in Proc. Interspeech, 2014, pp. 1-5
- (2014) Proc. Interspeech , pp. 1-5
- Zhang, Y.¹ Chuangsuwanich, E.² Glass, J.³

14
- 84858971297
- Convolutive bottleneck network features for LVCSR
- K. Veselý, M. Karafiát, and F. Grézl, Convolutive bottleneck network features for LVCSR, in Proc. ASRU, 2011, pp. 42-47
- (2011) Proc. ASRU , pp. 42-47
- Veselý, K.¹ Karafiát, M.² Grézl, F.³

15
- 84906276981
- Convolutional deep rectifier neural nets for phone recognition
- L. Tóth, Convolutional deep rectifier neural nets for phone recognition, in Proc. Interspeech, 2013, pp. 1722-1726
- (2013) Proc. Interspeech , pp. 1722-1726
- Tóth, L.¹

16
- 33947620115
- Hierarchical structures of neural networks for phoneme recognition
- P. Schwarz, P. Matějka, and J. Černocký, Hierarchical structures of neural networks for phoneme recognition, in Proc. ICASSP, 2006, pp. 325-328
- (2006) Proc. ICASSP , pp. 325-328
- Schwarz, P.¹ Matějka, P.² Černocký, J.³

17
- 84873303660
- Speech recognition using long-span temporal patterns in a deep network model
- S. M. Siniscalchi, D. Yu, L. Deng, and C.-H. Lee, Speech recognition using long-span temporal patterns in a deep network model, IEEE Signal Processing Letters, vol. 20, no. 3, pp. 201-204, 2013
- (2013) IEEE Signal Processing Letters , vol.20 , Issue.3 , pp. 201-204
- Siniscalchi, S.M.¹ Yu, D.² Deng, L.³ Lee, C.-H.⁴

18
- 84910087218
- Modeling long temporal contexts for robust DNN-based speech recognition
- B. Li and K. C. Sim, Modeling long temporal contexts for robust DNN-based speech recognition, in Proc. Interspeech, 2014, pp. 353-357
- (2014) Proc. Interspeech , pp. 353-357
- Li, B.¹ Sim, K.C.²

19
- 84905283576
- Deep learning of split temporal context for automatic speech recognition
- M. Baccouche, B. Besset, P. Collen, and O. Le Blouch, Deep learning of split temporal context for automatic speech recognition, in Proc. ICASSP, 2014, pp. 5459-5463
- (2014) Proc. ICASSP , pp. 5459-5463
- Baccouche, M.¹ Besset, B.² Collen, P.³ Le Blouch, O.⁴

20
- 84906214784
- Exploring convolutional neural network structures and optimization techniques for speech recognition
- O. Abdel-Hamid, L. Deng, and D. Yu, Exploring convolutional neural network structures and optimization techniques for speech recognition, in Proc. Interspeech, 2013, pp. 3366-3370
- (2013) Proc. Interspeech , pp. 3366-3370
- Abdel-Hamid, O.¹ Deng, L.² Yu, D.³

21
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- T. N. Sainath, B. Kingsbury, A. Mohamed, B. Ramabhadran, et al., Improvements to deep convolutional neural networks for LVCSR, in Proc. ASRU, 2013, pp. 315-320
- (2013) Proc. ASRU , pp. 315-320
- Sainath, T.N.¹ Kingsbury, B.² Mohamed, A.³ Ramabhadran, B.⁴

22
- 84910069623
- Convolutional deep maxout networks for phone recognition
- L. Tóth, Convolutional deep maxout networks for phone recognition, in Proc. Interspeech, 2014, pp. 1078-1082
- (2014) Proc. Interspeech , pp. 1078-1082
- Tóth, L.¹

23
- 77955803591
- Enhanced phone posteriors for improving speech recognition systems
- H. Ketabdar and H. Bourlard, Enhanced phone posteriors for improving speech recognition systems, IEEE Trans. ASLP, vol. 18, no. 6, pp. 1094-1106, 2010
- (2010) IEEE Trans. ASLP , vol.18 , Issue.6 , pp. 1094-1106
- Ketabdar, H.¹ Bourlard, H.²

24
- 84890466217
- Improving neural networks by preventing co-adaptation of feature detectors
- abs/1207.0580
- G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Improving neural networks by preventing co-adaptation of feature detectors, CoRR, vol. abs/1207.0580, 2012
- (2012) CoRR
- Hinton, G.E.¹ Srivastava, N.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

25
- 84905247926
- Deep scattering spectrum with deep neural networks
- V. Peddinti, T. N. Sainath, S. Maymon, B. Ramabhadran, D. Nahamoo, and V. Goel, Deep scattering spectrum with deep neural networks, in Proc. ICASSP, 2014, pp. 210-214
- (2014) Proc. ICASSP , pp. 210-214
- Peddinti, V.¹ Sainath, T.N.² Maymon, S.³ Ramabhadran, B.⁴ Nahamoo, D.⁵ Goel, V.⁶

26
- 84890543083
- Speech recognition with deep recurrent neural networks
- A. Graves, A. Mohamed, and G. E. Hinton, Speech recognition with deep recurrent neural networks, in Proc. ICASSP, 2013, pp. 6645-6649
- (2013) Proc. ICASSP , pp. 6645-6649
- Graves, A.¹ Mohamed, A.² Hinton, G.E.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.