Volume 2015-August, 2015, Pages 4575-4579

Modeling long temporal contexts in convolutional neural network-based phone recognition

Author keywords

convolutional neural network; deep neural network; maxout; split temporal context; TIMIT

Indexed keywords

AUDIO SIGNAL PROCESSING; CONVOLUTION; NEURAL NETWORKS; SPEECH COMMUNICATION; SPEECH RECOGNITION; TELEPHONE SETS

EID: 84946020232; PISSN: 15206149; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178837; Document Type: Conference Paper
Times cited: 9

References (26)
  • 2
    • 84055211743
    • Acoustic modeling using deep belief networks
    • A. Mohamed, G. E. Dahl, and G. Hinton, Acoustic modeling using deep belief networks, IEEE Trans. ASLP, vol. 20, no. 1, pp. 14-22, 2012
    • (2012) IEEE Trans. ASLP, vol. 20, no. 1, pp. 14-22
    • Mohamed, A.; Dahl, G.E.; Hinton, G.
  • 4
    • 84255177123
    • Deep and wide: Multiple layers in automatic speech recognition
    • N. Morgan, Deep and wide: Multiple layers in automatic speech recognition, IEEE Trans. ASLP, vol. 20, no. 1, pp. 7-13, 2012
    • (2012) IEEE Trans. ASLP, vol. 20, no. 1, pp. 7-13
    • Morgan, N.
  • 5
    • 84874485803
    • Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling
    • J. Pan, C. Liu, Z. G. Wang, Y. Hu, and H. Jiang, Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling, in Proc. ISCSLP, 2012, pp. 301-305
    • (2012) Proc. ISCSLP, pp. 301-305
    • Pan, J.; Liu, C.; Wang, Z.G.; Hu, Y.; Jiang, H.
  • 6
    • 84994350739
    • Multi-stream speech recognition: Ready for prime time?
    • A. Janin, D. Ellis, and N. Morgan, Multi-stream speech recognition: Ready for prime time?, in Proc. Eurospeech, 1999, pp. 591-594
    • (1999) Proc. Eurospeech, pp. 591-594
    • Janin, A.; Ellis, D.; Morgan, N.
  • 7
    • 84905252069
    • Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition
    • L. Tóth, Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition, in Proc. ICASSP, 2014, pp. 190-194
    • (2014) Proc. ICASSP, pp. 190-194
    • Tóth, L.
  • 8
    • 84867605836
    • Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition
    • O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition, in Proc. ICASSP, 2012, pp. 4277-4280
    • (2012) Proc. ICASSP, pp. 4277-4280
    • Abdel-Hamid, O.; Mohamed, A.; Jiang, H.; Penn, G.
  • 10
    • 85009216422
    • Using mutual information to design class-specific phone recognizers
    • P. Scanlon, D. P. W. Ellis, and R. Reilly, Using mutual information to design class-specific phone recognizers, in Proc. Interspeech, 2003, pp. 857-860
    • (2003) Proc. Interspeech, pp. 857-860
    • Scanlon, P.; Ellis, D.P.W.; Reilly, R.
  • 11
    • 78049251448
    • Analysis of MLP based hierarchical phoneme posterior probability estimator
    • J. Pinto et al., Analysis of MLP based hierarchical phoneme posterior probability estimator, IEEE Trans. ASLP, vol. 19, no. 2, pp. 225-241, 2010
    • (2010) IEEE Trans. ASLP, vol. 19, no. 2, pp. 225-241
    • Pinto, J.
  • 13
    • 84910083918
    • Language ID-based training of multilingual stacked bottleneck features
    • Y. Zhang, E. Chuangsuwanich, and J. Glass, Language ID-based training of multilingual stacked bottleneck features, in Proc. Interspeech, 2014, pp. 1-5
    • (2014) Proc. Interspeech, pp. 1-5
    • Zhang, Y.; Chuangsuwanich, E.; Glass, J.
  • 14
    • 84858971297
    • Convolutive bottleneck network features for LVCSR
    • K. Veselý, M. Karafiát, and F. Grézl, Convolutive bottleneck network features for LVCSR, in Proc. ASRU, 2011, pp. 42-47
    • (2011) Proc. ASRU, pp. 42-47
    • Veselý, K.; Karafiát, M.; Grézl, F.
  • 15
    • 84906276981
    • Convolutional deep rectifier neural nets for phone recognition
    • L. Tóth, Convolutional deep rectifier neural nets for phone recognition, in Proc. Interspeech, 2013, pp. 1722-1726
    • (2013) Proc. Interspeech, pp. 1722-1726
    • Tóth, L.
  • 16
    • 33947620115
    • Hierarchical structures of neural networks for phoneme recognition
    • P. Schwarz, P. Matějka, and J. Černocký, Hierarchical structures of neural networks for phoneme recognition, in Proc. ICASSP, 2006, pp. 325-328
    • (2006) Proc. ICASSP, pp. 325-328
    • Schwarz, P.; Matějka, P.; Černocký, J.
  • 17
    • 84873303660
    • Speech recognition using long-span temporal patterns in a deep network model
    • S. M. Siniscalchi, D. Yu, L. Deng, and C.-H. Lee, Speech recognition using long-span temporal patterns in a deep network model, IEEE Signal Processing Letters, vol. 20, no. 3, pp. 201-204, 2013
    • (2013) IEEE Signal Processing Letters, vol. 20, no. 3, pp. 201-204
    • Siniscalchi, S.M.; Yu, D.; Deng, L.; Lee, C.-H.
  • 18
    • 84910087218
    • Modeling long temporal contexts for robust DNN-based speech recognition
    • B. Li and K. C. Sim, Modeling long temporal contexts for robust DNN-based speech recognition, in Proc. Interspeech, 2014, pp. 353-357
    • (2014) Proc. Interspeech, pp. 353-357
    • Li, B.; Sim, K.C.
  • 19
    • 84905283576
    • Deep learning of split temporal context for automatic speech recognition
    • M. Baccouche, B. Besset, P. Collen, and O. Le Blouch, Deep learning of split temporal context for automatic speech recognition, in Proc. ICASSP, 2014, pp. 5459-5463
    • (2014) Proc. ICASSP, pp. 5459-5463
    • Baccouche, M.; Besset, B.; Collen, P.; Le Blouch, O.
  • 20
    • 84906214784
    • Exploring convolutional neural network structures and optimization techniques for speech recognition
    • O. Abdel-Hamid, L. Deng, and D. Yu, Exploring convolutional neural network structures and optimization techniques for speech recognition, in Proc. Interspeech, 2013, pp. 3366-3370
    • (2013) Proc. Interspeech, pp. 3366-3370
    • Abdel-Hamid, O.; Deng, L.; Yu, D.
  • 21
    • 84893654379
    • Improvements to deep convolutional neural networks for LVCSR
    • T. N. Sainath, B. Kingsbury, A. Mohamed, B. Ramabhadran, et al., Improvements to deep convolutional neural networks for LVCSR, in Proc. ASRU, 2013, pp. 315-320
    • (2013) Proc. ASRU, pp. 315-320
    • Sainath, T.N.; Kingsbury, B.; Mohamed, A.; Ramabhadran, B.
  • 22
    • 84910069623
    • Convolutional deep maxout networks for phone recognition
    • L. Tóth, Convolutional deep maxout networks for phone recognition, in Proc. Interspeech, 2014, pp. 1078-1082
    • (2014) Proc. Interspeech, pp. 1078-1082
    • Tóth, L.
  • 23
    • 77955803591
    • Enhanced phone posteriors for improving speech recognition systems
    • H. Ketabdar and H. Bourlard, Enhanced phone posteriors for improving speech recognition systems, IEEE Trans. ASLP, vol. 18, no. 6, pp. 1094-1106, 2010
    • (2010) IEEE Trans. ASLP, vol. 18, no. 6, pp. 1094-1106
    • Ketabdar, H.; Bourlard, H.
  • 24
    • 84890466217
    • Improving neural networks by preventing co-adaptation of feature detectors
    • G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Improving neural networks by preventing co-adaptation of feature detectors, CoRR, vol. abs/1207.0580, 2012
    • (2012) CoRR, vol. abs/1207.0580
    • Hinton, G.E.; Srivastava, N.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R.
  • 26
    • 84890543083
    • Speech recognition with deep recurrent neural networks
    • A. Graves, A. Mohamed, and G. E. Hinton, Speech recognition with deep recurrent neural networks, in Proc. ICASSP, 2013, pp. 6645-6649
    • (2013) Proc. ICASSP, pp. 6645-6649
    • Graves, A.; Mohamed, A.; Hinton, G.E.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.