Volume , Issue , 2014, Pages 190-194

Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition

Author keywords

convolutional neural network; Deep neural network; rectified linear unit; speech recognition; TIMIT

Indexed keywords

CONVOLUTION; IMAGE RECOGNITION; NETWORK ARCHITECTURE; NEURAL NETWORKS; SPEECH RECOGNITION;

EID: 84905252069     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2014.6853584     Document Type: Conference Paper
Times cited: 71

References (21)
  • 1
    • 0002263996
    • Convolutional networks for images, speech and time series
    • Michael A. Arbib, Ed., MIT Press
    • Y. Lecun and Y. Bengio, "Convolutional networks for images, speech and time series," in The Handbook of Brain Theory and Neural Networks, Michael A. Arbib, Ed. 1995, pp. 255-258, MIT Press.
    • (1995) The Handbook of Brain Theory and Neural Networks , pp. 255-258
    • Lecun, Y.1    Bengio, Y.2
  • 2
    • 84867605836
    • Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition
    • O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition," in Proc. ICASSP, 2012, pp. 4277-4280.
    • (2012) Proc. ICASSP , pp. 4277-4280
    • Abdel-Hamid, O.1    Mohamed, A.2    Jiang, H.3    Penn, G.4
  • 4
    • 0024634603
    • Phoneme recognition using time-delay neural networks
    • A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang, "Phoneme recognition using time-delay neural networks," IEEE Trans. ASSP, vol. 37, no. 3, pp. 328-339, 1989.
    • (1989) IEEE Trans. ASSP , vol.37 , Issue.3 , pp. 328-339
    • Waibel, A.1    Hanazawa, T.2    Hinton, G.3    Shikano, K.4    Lang, K.J.5
  • 5
    • 84906214784
    • Exploring convolutional neural network structures and optimization techniques for speech recognition
    • O. Abdel-Hamid, L. Deng, and D. Yu, "Exploring convolutional neural network structures and optimization techniques for speech recognition," in Proc. Interspeech, 2013, pp. 3366-3370.
    • (2013) Proc. Interspeech , pp. 3366-3370
    • Abdel-Hamid, O.1    Deng, L.2    Yu, D.3
  • 6
    • 84893654379
    • Improvements to deep convolutional neural networks for LVCSR
    • accepted, in print
    • T. N. Sainath, B. Kingsbury, A. Mohamed, and B. Ramabhadran, "Improvements to deep convolutional neural networks for LVCSR," in Proc. ASRU. 2013, accepted, in print.
    • (2013) Proc. ASRU.
    • Sainath, T.N.1    Kingsbury, B.2    Mohamed, A.3    Ramabhadran, B.4
  • 7
    • 84858971297
    • Convolutive bottleneck network features for LVCSR
    • K. Veselý, M. Karafiát, and F. Grézl, "Convolutive bottleneck network features for LVCSR," in Proc. ASRU, 2011, pp. 42-47.
    • (2011) Proc. ASRU , pp. 42-47
    • Veselý, K.1    Karafiát, M.2    Grézl, F.3
  • 8
    • 84890451371
    • Phone recognition with deep sparse rectifier neural networks
    • L. Tóth, "Phone recognition with deep sparse rectifier neural networks," in Proc. ICASSP, 2013, pp. 6985-6989.
    • (2013) Proc. ICASSP , pp. 6985-6989
    • Tóth, L.1
  • 9
    • 84890527827
    • Improving deep neural networks for LVCSR using rectified linear units and dropout
    • G. E. Dahl, T. N. Sainath, and G. E. Hinton, "Improving deep neural networks for LVCSR using rectified linear units and dropout," in Proc. ICASSP, 2013, pp. 8609-8613.
    • (2013) Proc. ICASSP , pp. 8609-8613
    • Dahl, G.E.1    Sainath, T.N.2    Hinton, G.E.3
  • 11
    • 84893676344
    • Rectifier nonlinearities improve neural network acoustic models
    • A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. ICML, 2013.
    • (2013) Proc. ICML
    • Maas, A.L.1    Hannun, A.Y.2    Ng, A.Y.3
  • 12
    • 84906276981
    • Convolutional deep rectifier neural nets for phone recognition
    • L. Tóth, "Convolutional deep rectifier neural nets for phone recognition," in Proc. Interspeech, 2013, pp. 1722-1726.
    • (2013) Proc. Interspeech , pp. 1722-1726
    • Tóth, L.1
  • 13
    • 77955803591
    • Enhanced phone posteriors for improving speech recognition systems
    • H. Ketabdar and H. Bourlard, "Enhanced phone posteriors for improving speech recognition systems," IEEE Trans. ASLP, vol. 18, no. 6, pp. 1094-1106, 2010.
    • (2010) IEEE Trans. ASLP , vol.18 , Issue.6 , pp. 1094-1106
    • Ketabdar, H.1    Bourlard, H.2
  • 14
    • 78049251448
    • Analysis of MLP based hierarchical phoneme posterior probability estimator
    • J. Pinto et al., "Analysis of MLP based hierarchical phoneme posterior probability estimator," IEEE Trans. ASLP, vol. 19, no. 2, pp. 225-241, 2010.
    • (2010) IEEE Trans. ASLP , vol.19 , Issue.2 , pp. 225-241
    • Pinto, J.1
  • 15
    • 84055211743
    • Acoustic modeling using deep belief networks
    • A. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. ASLP, vol. 20, no. 1, pp. 14-22, 2012.
    • (2012) IEEE Trans. ASLP , vol.20 , Issue.1 , pp. 14-22
    • Mohamed, A.1    Dahl, G.E.2    Hinton, G.3
  • 16
    • 84890545163
    • A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
    • L. Deng, O. Abdel-Hamid, and D. Yu, "A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion," in Proc. ICASSP, 2013, pp. 6669-6673.
    • (2013) Proc. ICASSP , pp. 6669-6673
    • Deng, L.1    Abdel-Hamid, O.2    Yu, D.3
  • 17
    • 84890543083
    • Speech recognition with deep recurrent neural networks
    • A. Graves, A. Mohamed, and G. E. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. ICASSP, 2013, pp. 6645-6649.
    • (2013) Proc. ICASSP , pp. 6645-6649
    • Graves, A.1    Mohamed, A.2    Hinton, G.E.3
  • 19
    • 70349213445
    • Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
    • B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in Proc. ICASSP, 2009, pp. 3761-3764.
    • (2009) Proc. ICASSP , pp. 3761-3764
    • Kingsbury, B.1
  • 20
    • 79959840616
    • Investigation of full-sequence training of deep belief networks for speech recognition
    • A. Mohamed, D. Yu, and L. Deng, "Investigation of full-sequence training of deep belief networks for speech recognition," in Proc. Interspeech, 2010, pp. 2846-2849.
    • (2010) Proc. Interspeech , pp. 2846-2849
    • Mohamed, A.1    Yu, D.2    Deng, L.3
  • 21
    • 84906274730
    • Sequence-discriminative training of deep neural networks
    • K. Veselý, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in Proc. Interspeech, 2013, pp. 2345-2349.
    • (2013) Proc. Interspeech , pp. 2345-2349
    • Veselý, K.1    Ghoshal, A.2    Burget, L.3    Povey, D.4


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.