SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2014, Pages 190-194

Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition

(1) Toth, Laszlo a

a UNIVERSITY OF SZEGED (Hungary)

Author keywords

convolutional neural network; Deep neural network; rectified linear unit; speech recognition; TIMIT

Indexed keywords

CONVOLUTION; IMAGE RECOGNITION; NETWORK ARCHITECTURE; NEURAL NETWORKS; SPEECH RECOGNITION;

CONVOLUTIONAL NEURAL NETWORK; DEEP NEURAL NETWORKS; FULLY CONNECTED NETWORKS; LINEAR UNITS; SPECTRAL REPRESENTATIONS; TIME AND FREQUENCIES; TIME-DOMAIN CONVOLUTION; TIMIT;

SIGNAL PROCESSING;

EID: 84905252069 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6853584 Document Type: Conference Paper

Times cited : (71)

References (21)

1
- 0002263996
- Convolutional networks for images, speech and time series
- Michael A. Arbib, Ed., MIT Press
- Y. Lecun and Y. Bengio, "Convolutional networks for images, speech and time series," in The Handbook of Brain Theory and Neural Networks, Michael A. Arbib, Ed. 1995, pp. 255-258, MIT Press.
- (1995) The Handbook of Brain Theory and Neural Networks , pp. 255-258
- Lecun, Y.¹ Bengio, Y.²

2
- 84867605836
- Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition," in Proc. ICASSP, 2012, pp. 4277-4280.
- (2012) Proc. ICASSP , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.² Jiang, H.³ Penn, G.⁴

3
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. ICASSP, 2013, pp. 8614-8618.
- (2013) Proc. ICASSP , pp. 8614-8618
- Sainath, T.N.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

4
- 0024634603
- Phoneme recognition using time-delay neural networks
- A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang, "Phoneme recognition using time-delay neural networks," IEEE Trans. ASSP, vol. 37, no. 3, pp. 328-339, 1989.
- (1989) IEEE Trans. ASSP , vol.37 , Issue.3 , pp. 328-339
- Waibel, A.¹ Hanazawa, T.² Hinton, G.³ Shikano, K.⁴ Lang, K.J.⁵

5
- 84906214784
- Exploring convolutional neural network structures and optimization techniques for speech recognition
- O. Abdel-Hamid, L. Deng, and D. Yu, "Exploring convolutional neural network structures and optimization techniques for speech recognition," in Proc. Interspeech, 2013, pp. 3366-3370.
- (2013) Proc. Interspeech , pp. 3366-3370
- Abdel-Hamid, O.¹ Deng, L.² Yu, D.³

6
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- accepted, in print
- T. N. Sainath, B. Kingsbury, A. Mohamed, and B. Ramabhadran, "Improvements to deep convolutional neural networks for LVCSR," in Proc. ASRU. 2013, accepted, in print.
- (2013) Proc. ASRU.
- Sainath, T.N.¹ Kingsbury, B.² Mohamed, A.³ Ramabhadran, B.⁴

7
- 84858971297
- Convolutive bottleneck network features for LVCSR
- K. Veselý, M. Karafiát, and F. Grézl, "Convolutive bottleneck network features for LVCSR," in Proc. ASRU, 2011, pp. 42-47.
- (2011) Proc. ASRU , pp. 42-47
- Veselý, K.¹ Karafiát, M.² Grézl, F.³

8
- 84890451371
- Phone recognition with deep sparse rectifier neural networks
- L. Tóth, "Phone recognition with deep sparse rectifier neural networks," in Proc. ICASSP, 2013, pp. 6985-6989.
- (2013) Proc. ICASSP , pp. 6985-6989
- Tóth, L.¹

9
- 84890527827
- Improving deep neural networks for LVCSR using rectified linear units and dropout
- G. E. Dahl, T. N. Sainath, and G. E. Hinton, "Improving deep neural networks for LVCSR using rectified linear units and dropout," in Proc. ICASSP, 2013, pp. 8609-8613.
- (2013) Proc. ICASSP , pp. 8609-8613
- Dahl, G.E.¹ Sainath, T.N.² Hinton, G.E.³

10
- 84890471125
- On rectified linear units for speech processing
- M. D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q. V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, and G. E. Hinton, "On rectified linear units for speech processing," in Proc. ICASSP, 2013, pp. 3517-3521.
- (2013) Proc. ICASSP , pp. 3517-3521
- Zeiler, M.D.¹ Ranzato, M.² Monga, R.³ Mao, M.⁴ Yang, K.⁵ Le, Q.V.⁶ Nguyen, P.⁷ Senior, A.⁸ Vanhoucke, V.⁹ Dean, J.¹⁰ Hinton, G.E.¹¹

11
- 84893676344
- Rectifier nonlinearities improve neural network acoustic models
- A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. ICML, 2013.
- (2013) Proc. ICML
- Maas, A.L.¹ Hannun, A.Y.² Ng, A.Y.³

12
- 84906276981
- Convolutional deep rectifier neural nets for phone recognition
- L. Tóth, "Convolutional deep rectifier neural nets for phone recognition," in Proc. Interspeech, 2013, pp. 1722-1726.
- (2013) Proc. Interspeech , pp. 1722-1726
- Tóth, L.¹

13
- 77955803591
- Enhanced phone posteriors for improving speech recognition systems
- H. Ketabdar and H. Bourlard, "Enhanced phone posteriors for improving speech recognition systems," IEEE Trans. ASLP, vol. 18, no. 6, pp. 1094-1106, 2010.
- (2010) IEEE Trans. ASLP , vol.18 , Issue.6 , pp. 1094-1106
- Ketabdar, H.¹ Bourlard, H.²

14
- 78049251448
- Analysis of MLP based hierarchical phoneme posterior probability estimator
- J. Pinto et al., "Analysis of MLP based hierarchical phoneme posterior probability estimator," IEEE Trans. ASLP, vol. 19, no. 2, pp. 225-241, 2010.
- (2010) IEEE Trans. ASLP , vol.19 , Issue.2 , pp. 225-241
- Pinto, J.¹

15
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. ASLP, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) IEEE Trans. ASLP , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.³

16
- 84890545163
- A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
- L. Deng, O. Abdel-Hamid, and D. Yu, "A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion," in Proc. ICASSP, 2013, pp. 6669-6673.
- (2013) Proc. ICASSP , pp. 6669-6673
- Deng, L.¹ Abdel-Hamid, O.² Yu, D.³

17
- 84890543083
- Speech recognition with deep recurrent neural networks
- A. Graves, A. Mohamed, and G. E. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. ICASSP, 2013, pp. 6645-6649.
- (2013) Proc. ICASSP , pp. 6645-6649
- Graves, A.¹ Mohamed, A.² Hinton, G.E.³

18
- 84867720412
- Improving neural networks by preventing co-adaptation of feature detectors
- G.E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," CoRR, vol. abs/1207.0580, 2012.
- (2012) CoRR , vol.abs/1207.0580
- Hinton, G.E.¹ Srivastava, N.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

19
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in Proc. ICASSP, 2009, pp. 3761-3764.
- (2009) Proc. ICASSP , pp. 3761-3764
- Kingsbury, B.¹

20
- 79959840616
- Investigation of full-sequence training of deep belief networks for speech recognition
- A. Mohamed, D. Yu, and L. Deng, "Investigation of full-sequence training of deep belief networks for speech recognition," in Proc. Interspeech, 2010, pp. 2846-2849.
- (2010) Proc. Interspeech , pp. 2846-2849
- Mohamed, A.¹ Yu, D.² Deng, L.³

21
- 84906274730
- Sequence-discriminative training of deep neural networks
- K. Veselý, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in Proc. Interspeech, 2013, pp. 2345-2349.
- (2013) Proc. Interspeech , pp. 2345-2349
- Veselý, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.