Volume , Issue , 2012, Pages 4277-4280

Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition

Author keywords

acoustic modeling; local filtering; max pooling; neural networks; speech recognition

Indexed keywords

ACOUSTIC MODELING; CNN MODELS; CONVOLUTIONAL NEURAL NETWORK; DATA SETS; FREQUENCY DOMAINS; HIDDEN LAYERS; MAX-POOLING; RELATIVE ERRORS; SPEAKER-INDEPENDENT SPEECH RECOGNITION; SPECTRAL VARIATION; SPEECH RECOGNITION PERFORMANCE; SPEECH SIGNALS; TEST SETS; TRANSLATION INVARIANCE

EID: 84867605836     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2012.6288864     Document Type: Conference Paper
Times cited : (853)

References (11)
  • 2
    • 84858972572
    • Making deep belief networks effective for large vocabulary continuous speech recognition
    • T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in ASRU, 2011.
    • (2011) ASRU
    • Sainath, T.N.1    Kingsbury, B.2    Ramabhadran, B.3    Fousek, P.4    Novak, P.5    Mohamed, A.6
  • 3
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • G. Li, F. Seide, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Interspeech, 2011.
    • (2011) Interspeech
    • Li, G.1    Seide, F.2    Yu, D.3
  • 5
    • 5044231640
    • Learning methods for generic object recognition with invariance to pose and lighting
    • IEEE Press
    • Y. LeCun, F. Huang, and L. Bottou, "Learning methods for generic object recognition with invariance to pose and lighting," in Proceedings of CVPR'04. 2004, IEEE Press.
    • (2004) Proceedings of CVPR'04
    • LeCun, Y.1    Huang, F.2    Bottou, L.3
  • 6
    • 0002263996
    • Convolutional networks for images, speech, and time-series
    • M. A. Arbib, Ed. MIT Press
    • Y. LeCun and Y. Bengio, "Convolutional networks for images, speech, and time-series," in The Handbook of Brain Theory and Neural Networks, M. A. Arbib, Ed. 1995, MIT Press.
    • (1995) The Handbook of Brain Theory and Neural Networks
    • LeCun, Y.1    Bengio, Y.2
  • 7
    • 84863380535
    • Unsupervised feature learning for audio classification using convolutional deep belief networks
    • H. Lee, P. Pham, Y. Largman, and A. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Advances in Neural Information Processing Systems 22, pp. 1096-1104. 2009.
    • (2009) Advances in Neural Information Processing Systems , vol.22 , pp. 1096-1104
    • Lee, H.1    Pham, P.2    Largman, Y.3    Ng, A.4
  • 9
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • M.J.F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, vol. 12, pp. 75-98, 1998.
    • (1998) Computer Speech and Language , vol.12 , pp. 75-98
    • Gales, M.J.F.1
  • 10
    • 0024768209
    • Speaker-independent phone recognition using hidden Markov models
    • November
    • K. F. Lee and H. W. Hon, "Speaker-independent phone recognition using hidden Markov models," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 11, pp. 1641-1648, November 1989.
    • (1989) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.37 , Issue.11 , pp. 1641-1648
    • Lee, K.F.1    Hon, H.W.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.