



Volume 22, Issue 10, 2014, Pages 1533-1545

Convolutional neural networks for speech recognition

Author keywords

Convolution; Convolutional neural networks; Limited Weight Sharing (LWS) scheme; Pooling

Indexed keywords

COMPLEX NETWORKS; CONVOLUTION; HIDDEN MARKOV MODELS; IMAGE SEGMENTATION; NEURAL NETWORKS; SPEECH; TRELLIS CODES

EID: 84911473441     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASLP.2014.2339736     Document Type: Article
Times cited: 2129

References (47)
  • 1
    • 77950917809
    • Discriminative training for automatic speech recognition: A survey
    • H. Jiang, "Discriminative training for automatic speech recognition: A survey," Comput. Speech, Lang., vol. 24, no. 4, pp. 589-608, 2010.
    • (2010) Comput. Speech, Lang. , vol.24 , Issue.4 , pp. 589-608
    • Jiang, H.1
  • 2
    • 85032750905
    • Discriminative learning in sequential pattern recognition - A unifying review for optimization-oriented speech recognition
    • X. He, L. Deng, and W. Chou, "Discriminative learning in sequential pattern recognition - A unifying review for optimization-oriented speech recognition," IEEE Signal Process. Mag., vol. 25, no. 5, pp. 14-36, Sep. 2008.
    • (2008) IEEE Signal Process. Mag. , vol.25 , Issue.5 , pp. 14-36
    • He, X.1    Deng, L.2    Chou, W.3
  • 3
    • 84876672166
    • Machine learning paradigms for speech recognition: An overview
    • L. Deng and X. Li, "Machine learning paradigms for speech recognition: An overview," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 5, pp. 1060-1089, May 2013.
    • (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , Issue.5 , pp. 1060-1089
    • Deng, L.1    Li, X.2
  • 9
    • 84255177123
    • Deep and wide: Multiple layers in automatic speech recognition
    • N. Morgan, "Deep and wide: Multiple layers in automatic speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 7-13, Jan. 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.1 , pp. 7-13
    • Morgan, N.1
  • 11
    • 79959840616
    • Investigation of full-sequence training of deep belief networks for speech recognition
    • A. Mohamed, D. Yu, and L. Deng, "Investigation of full-sequence training of deep belief networks for speech recognition," in Proc. Interspeech, 2010, pp. 2846-2849.
    • Proc. Interspeech, 2010 , pp. 2846-2849
    • Mohamed, A.1    Yu, D.2    Deng, L.3
  • 13
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 30-42, Jan. 2012.
    • (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 14
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. Interspeech, 2011, pp. 437-440.
    • Proc. Interspeech, 2011 , pp. 437-440
    • Seide, F.1    Li, G.2    Yu, D.3
  • 16
    • 84874485803
    • Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling
    • J. Pan, C. Liu, Z. Wang, Y. Hu, and H. Jiang, "Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling," in Proc. ISCSLP, 2012.
    • Proc. ISCSLP, 2012
    • Pan, J.1    Liu, C.2    Wang, Z.3    Hu, Y.4    Jiang, H.5
  • 20
    • 24144487688
    • Tandem connectionist feature extraction for conversational speech recognition
    • Q. Zhu, B. Chen, N. Morgan, and A. Stolcke, "Tandem connectionist feature extraction for conversational speech recognition," in Machine Learning for Multimodal Interaction. Berlin/Heidelberg, Germany: Springer, 2005, vol. 3361, pp. 223-231.
    • (2005) Machine Learning for Multimodal Interaction , vol.3361 , pp. 223-231
    • Zhu, Q.1    Chen, B.2    Morgan, N.3    Stolcke, A.4
  • 25
    • 84911475287
    • A Pipelined Neural Network Architecture For Speech Recognition
    • D. Zhang, L. Deng, and M. Elmasry, A Pipelined Neural Network Architecture For Speech Recognition, In Book: VLSI Artificial Neural Networks Engineering. Norwell, MA, USA: Kluwer, 1994.
    • (1994) VLSI Artificial Neural Networks Engineering
    • Zhang, D.1    Deng, L.2    Elmasry, M.3
  • 26
    • 0028256706
    • Analysis of correlation structure for a neural predictive model with applications to speech recognition
    • L. Deng, K. Hassanein, and M. Elmasry, "Analysis of correlation structure for a neural predictive model with applications to speech recognition," Neural Netw., vol. 7, no. 2, pp. 331-339, 1994.
    • (1994) Neural Netw. , vol.7 , Issue.2 , pp. 331-339
    • Deng, L.1    Hassanein, K.2    Elmasry, M.3
  • 28
    • 0013344078
    • Training products of experts by minimizing contrastive divergence
    • G. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Comput., vol. 14, pp. 1771-1800, 2002.
    • (2002) Neural Comput. , vol.14 , pp. 1771-1800
    • Hinton, G.1
  • 31
    • 0002263996
    • Convolutional networks for images, speech, and time-series
    • Y. LeCun and Y. Bengio, "Convolutional networks for images, speech, and time-series," in The Handbook of Brain Theory and Neural Networks, M. A. Arbib, Ed. Cambridge, MA, USA: MIT Press, 1995.
    • (1995) The Handbook of Brain Theory and Neural Networks
    • LeCun, Y.1    Bengio, Y.2
  • 32
    • 0019152630
    • Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position
    • K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biol. Cybern., vol. 36, pp. 193-202, 1980.
    • (1980) Biol. Cybern. , vol.36 , pp. 193-202
    • Fukushima, K.1
  • 33
    • 84863380535
    • Unsupervised feature learning for audio classification using convolutional deep belief networks
    • H. Lee, P. Pham, Y. Largman, and A. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Proc. Adv. Neural Inf. Process. Syst. 22, 2009, pp. 1096-1104.
    • (2009) Proc. Adv. Neural Inf. Process. Syst. , vol.22 , pp. 1096-1104
    • Lee, H.1    Pham, P.2    Largman, Y.3    Ng, A.4
  • 38
    • 84906214784
    • Exploring convolutional neural network structures and optimization techniques for speech recognition
    • O. Abdel-Hamid, L. Deng, and D. Yu, "Exploring convolutional neural network structures and optimization techniques for speech recognition," in Proc. Interspeech, 2013.
    • Proc. Interspeech, 2013
    • Abdel-Hamid, O.1    Deng, L.2    Yu, D.3
  • 45
    • 71149119164
    • Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
    • H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations," in Proc. 26th Annu. Int. Conf. Mach. Learn., 2009, pp. 609-616.
    • Proc. 26th Annu. Int. Conf. Mach. Learn., 2009 , pp. 609-616
    • Lee, H.1    Grosse, R.2    Ranganath, R.3    Ng, A.Y.4
  • 47
    • 0024768209
    • Speaker-independent phone recognition using hidden Markov models
    • K. F. Lee and H. W. Hon, "Speaker-independent phone recognition using hidden Markov models," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 11, pp. 1641-1648, Nov. 1989.
    • (1989) IEEE Trans. Acoust., Speech, Signal Process. , vol.37 , Issue.11 , pp. 1641-1648
    • Lee, K.F.1    Hon, H.W.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.