Volume 7, Issue 3-4, 2013, Pages 197-387

Deep learning: Methods and applications

Author keywords

[No Author keywords available]

Indexed keywords

SPEECH RECOGNITION; TEXT PROCESSING;

EID: 84903724014     PISSN: 19328346     EISSN: 19328354     Source Type: Journal    
DOI: 10.1561/2000000039     Document Type: Review
Times cited: 3127

References (446)
  • 1
    • 84906214784
    • Exploring convolutional neural network structures and optimization for speech recognition
    • O. Abdel-Hamid, L. Deng, and D. Yu. Exploring convolutional neural network structures and optimization for speech recognition. Proceedings of Interspeech, 2013.
    • (2013) Proceedings of Interspeech
    • Abdel-Hamid, O.1    Deng, L.2    Yu, D.3
  • 7
    • 77958488310
    • Deep machine learning - A new frontier in artificial intelligence
    • November
    • I. Arel, C. Rose, and T. Karnowski. Deep machine learning - a new frontier in artificial intelligence. IEEE Computational Intelligence Magazine, 5:13-18, November 2010.
    • (2010) IEEE Computational Intelligence Magazine , vol.5 , pp. 13-18
    • Arel, I.1    Rose, C.2    Karnowski, T.3
  • 19
    • 79959407847
    • Neural net language models
    • Y. Bengio. Neural net language models. Scholarpedia, 3, 2008.
    • (2008) Scholarpedia , vol.3
    • Bengio, Y.1
  • 22
    • 84883201530
    • Deep learning of representations: Looking forward
    • Springer
    • Y. Bengio. Deep learning of representations: Looking forward. In Statistical Language and Speech Processing, pages 1-37. Springer, 2013.
    • (2013) Statistical Language and Speech Processing , pp. 1-37
    • Bengio, Y.1
  • 35
    • 85032752364
    • Graphical model architectures for speech recognition
    • J. Bilmes and C. Bartels. Graphical model architectures for speech recognition. IEEE Signal Processing Magazine, 22:89-100, 2005.
    • (2005) IEEE Signal Processing Magazine , vol.22 , pp. 89-100
    • Bilmes, J.1    Bartels, C.2
  • 36
    • 84877727208
    • A semantic matching energy function for learning with multi-relational data - Application to word-sense disambiguation
    • May
    • A. Bordes, X. Glorot, J. Weston, and Y. Bengio. A semantic matching energy function for learning with multi-relational data - application to word-sense disambiguation. Machine Learning, May 2013.
    • (2013) Machine Learning
    • Bordes, A.1    Glorot, X.2    Weston, J.3    Bengio, Y.4
  • 38
    • 84890014676
    • From machine learning to machine reasoning: An essay
    • L. Bottou. From machine learning to machine reasoning: An essay. Journal of Machine Learning Research, 14:3207-3260, 2013.
    • (2013) Journal of Machine Learning Research , vol.14 , pp. 3207-3260
    • Bottou, L.1
  • 44
    • 0030196364
    • Stacked regression
    • L. Breiman. Stacked regression. Machine Learning, 24:49-64, 1996.
    • (1996) Machine Learning , vol.24 , pp. 49-64
    • Breiman, L.1
  • 47
    • 0031189914
    • Multitask learning
    • R. Caruana. Multitask learning. Machine Learning, 28:41-75, 1997.
    • (1997) Machine Learning , vol.28 , pp. 41-75
    • Caruana, R.1
  • 50
    • 0031146514
    • HMM-based speech recognition using state-dependent, discriminatively derived transforms on Mel-warped DFT features
    • R. Chengalvarayan and L. Deng. HMM-based speech recognition using state-dependent, discriminatively derived transforms on Mel-warped DFT features. IEEE Transactions on Speech and Audio Processing, pages 243-256, 1997.
    • (1997) IEEE Transactions on Speech and Audio Processing , pp. 243-256
    • Chengalvarayan, R.1    Deng, L.2
  • 52
    • 0032206267
    • Speech trajectory discrimination using the minimum classification error learning
    • R. Chengalvarayan and L. Deng. Speech trajectory discrimination using the minimum classification error learning. IEEE Transactions on Speech and Audio Processing, 6(6):505-515, 1998.
    • (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.6 , pp. 505-515
    • Chengalvarayan, R.1    Deng, L.2
  • 68
    • 84055222005
    • Context-dependent, pre-trained deep neural networks for large vocabulary speech recognition
    • January
    • G. Dahl, D. Yu, L. Deng, and A. Acero. Context-dependent, pre-trained deep neural networks for large vocabulary speech recognition. IEEE Transactions on Audio, Speech, & Language Processing, 20(1):30-42, January 2012.
    • (2012) IEEE Transactions on Audio Speech, & Language Processing , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 71
    • 0026854213
    • A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal
    • L. Deng. A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal. Signal Processing, 27(1):65-78, 1992.
    • (1992) Signal Processing , vol.27 , Issue.1 , pp. 65-78
    • Deng, L.1
  • 72
    • 0027678649
    • A stochastic model of speech incorporating hierarchical nonstationarity
    • L. Deng. A stochastic model of speech incorporating hierarchical nonstationarity. IEEE Transactions on Speech and Audio Processing, 1(4):471-475, 1993.
    • (1993) IEEE Transactions on Speech and Audio Processing , vol.1 , Issue.4 , pp. 471-475
    • Deng, L.1
  • 73
    • 0032119268
    • A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition
    • L. Deng. A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition. Speech Communication, 24(4):299-323, 1998.
    • (1998) Speech Communication , vol.24 , Issue.4 , pp. 299-323
    • Deng, L.1
  • 74
    • 0039503389
    • Computational models for speech production
    • Springer Verlag
    • L. Deng. Computational models for speech production. In Computational Models of Speech Pattern Processing, pages 199-213. Springer Verlag, 1999.
    • (1999) Computational Models of Speech Pattern Processing , pp. 199-213
    • Deng, L.1
  • 75
    • 33744966595
    • Switching dynamic system models for speech articulation and acoustics
    • Springer-Verlag, New York
    • L. Deng. Switching dynamic system models for speech articulation and acoustics. In Mathematical Foundations of Speech and Language Processing, pages 115-134. Springer-Verlag, New York, 2003.
    • (2003) Mathematical Foundations of Speech and Language Processing , pp. 115-134
    • Deng, L.1
  • 78
    • 85032752689
    • The MNIST database of handwritten digit images for machine learning research
    • November
    • L. Deng. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), November 2012.
    • (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6
    • Deng, L.1
  • 83
    • 0031185482
    • Speaker-independent phonetic classification using hidden markov models with state-conditioned mixtures of trend functions
    • L. Deng and M. Aksmanovic. Speaker-independent phonetic classification using hidden markov models with state-conditioned mixtures of trend functions. IEEE Transactions on Speech and Audio Processing, 5:319-324, 1997.
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , pp. 319-324
    • Deng, L.1    Aksmanovic, M.2
  • 84
    • 0028516022
    • Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states
    • L. Deng, M. Aksmanovic, D. Sun, and J. Wu. Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states. IEEE Transactions on Speech and Audio Processing, 2(4):507-520, 1994.
    • (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.4 , pp. 507-520
    • Deng, L.1    Aksmanovic, M.2    Sun, D.3    Wu, J.4
  • 86
    • 0026458724
    • Structural design of a hidden Markov model based speech recognizer using multi-valued phonetic features: Comparison with segmental speech units
    • L. Deng and K. Erler. Structural design of a hidden Markov model based speech recognizer using multi-valued phonetic features: Comparison with segmental speech units. Journal of the Acoustical Society of America, 92(6):3058-3067, 1992.
    • (1992) Journal of the Acoustical Society of America , vol.92 , Issue.6 , pp. 3058-3067
    • Deng, L.1    Erler, K.2
  • 87
    • 0028256706
    • Analysis of correlation structure for a neural predictive model with application to speech recognition
    • L. Deng, K. Hassanein, and M. Elmasry. Analysis of correlation structure for a neural predictive model with application to speech recognition. Neural Networks, 7(2):331-339, 1994.
    • (1994) Neural Networks , vol.7 , Issue.2 , pp. 331-339
    • Deng, L.1    Hassanein, K.2    Elmasry, M.3
  • 92
    • 0026189555
    • Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition
    • L. Deng, M. Lennig, V. Gupta, F. Seitz, P. Mermelstein, and P. Kenny. Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition. IEEE Transactions on Signal Processing, 39(7):1677-1681, 1991.
    • (1991) IEEE Transactions on Signal Processing , vol.39 , Issue.7 , pp. 1677-1681
    • Deng, L.1    Lennig, M.2    Gupta, V.3    Seitz, F.4    Mermelstein, P.5    Kenny, P.6
  • 93
    • 10244257175
    • Large vocabulary word recognition using context-dependent allophonic hidden Markov models
    • L. Deng, M. Lennig, F. Seitz, and P. Mermelstein. Large vocabulary word recognition using context-dependent allophonic hidden Markov models. Computer Speech and Language, 4(4):345-357, 1990.
    • (1990) Computer Speech and Language , vol.4 , Issue.4 , pp. 345-357
    • Deng, L.1    Lennig, M.2    Seitz, F.3    Mermelstein, P.4
  • 95
    • 84876672166
    • Machine learning paradigms in speech recognition: An overview
    • May
    • L. Deng and X. Li. Machine learning paradigms in speech recognition: An overview. IEEE Transactions on Audio, Speech, & Language, 21:1060-1089, May 2013.
    • (2013) IEEE Transactions on Audio, Speech, & Language , vol.21 , pp. 1060-1089
    • Deng, L.1    Li, X.2
  • 96
    • 0033623527
    • Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics
    • L. Deng and J. Ma. Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics. Journal of the Acoustical Society of America, 108:3036-3048, 2000.
    • (2000) Journal of the Acoustical Society of America , vol.108 , pp. 3036-3048
    • Deng, L.1    Ma, J.2
  • 98
    • 0031198059
    • Production models as a structural basis for automatic speech recognition
    • August
    • L. Deng, G. Ramsay, and D. Sun. Production models as a structural basis for automatic speech recognition. Speech Communication, 33(2-3):93-111, August 1997.
    • (1997) Speech Communication , vol.33 , Issue.2-3 , pp. 93-111
    • Deng, L.1    Ramsay, G.2    Sun, D.3
  • 99
    • 0030190520
    • Transitional speech units and their representation by regressive Markov states: Applications to speech recognition
    • July
    • L. Deng and H. Sameti. Transitional speech units and their representation by regressive Markov states: Applications to speech recognition. IEEE Transactions on Speech and Audio Processing, 4(4):301-306, July 1996.
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.4 , pp. 301-306
    • Deng, L.1    Sameti, H.2
  • 101
    • 0028234947
    • A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features
    • L. Deng and D. Sun. A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features. Journal of the Acoustical Society of America, 85(5):2702-2719, 1994.
    • (1994) Journal of the Acoustical Society of America , vol.85 , Issue.5 , pp. 2702-2719
    • Deng, L.1    Sun, D.2
  • 104
    • 18744401086
    • Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
    • L. Deng, J. Wu, J. Droppo, and A. Acero. Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion. IEEE Transactions on Speech and Audio Processing, 13(3):412-421, 2005.
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.3 , pp. 412-421
    • Deng, L.1    Wu, J.2    Droppo, J.3    Acero, A.4
  • 106
    • 84865768819
    • Deep convex network: A scalable architecture for speech pattern classification
    • L. Deng and D. Yu. Deep convex network: A scalable architecture for speech pattern classification. In Proceedings of Interspeech. 2011.
    • (2011) Proceedings of Interspeech
    • Deng, L.1    Yu, D.2
  • 107
    • 33744966561
    • A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition
    • January
    • L. Deng, D. Yu, and A. Acero. A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition. IEEE Transactions on Audio and Speech Processing, 14(1):256-265, January 2006.
    • (2006) IEEE Transactions on Audio and Speech Processing , vol.14 , Issue.1 , pp. 256-265
    • Deng, L.1    Yu, D.2    Acero, A.3
  • 116
    • 0032119668
    • The hierarchical hidden Markov model: Analysis and applications
    • S. Fine, Y. Singer, and N. Tishby. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32:41-62, 1998.
    • (1998) Machine Learning , vol.32 , pp. 41-62
    • Fine, S.1    Singer, Y.2    Tishby, N.3
  • 118
    • 44849099965
    • Phone-discriminating minimum classification error (p-mce) training for phonetic recognition
    • Q. Fu, X. He, and L. Deng. Phone-discriminating minimum classification error (p-mce) training for phonetic recognition. In Proceedings of Interspeech. 2007.
    • (2007) Proceedings of Interspeech
    • Fu, Q.1    He, X.2    Deng, L.3
  • 127
    • 77955783938
    • Error approximation and minimum phone error acoustic model estimation
    • August
    • M. Gibson and T. Hain. Error approximation and minimum phone error acoustic model estimation. IEEE Transactions on Audio, Speech, and Language Processing, 18(6):1269-1279, August 2010.
    • (2010) IEEE Transactions on Audio Speech, and Language Processing , vol.18 , Issue.6 , pp. 1269-1279
    • Gibson, M.1    Hain, T.2
  • 139
    • 84857892556
    • Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics
    • M. Gutmann and A. Hyvarinen. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. Journal of Machine Learning Research, 13:307-361, 2012.
    • (2012) Journal of Machine Learning Research , vol.13 , pp. 307-361
    • Gutmann, M.1    Hyvarinen, A.2
  • 144
    • 85032751114
    • Speech recognition, machine translation, and speech translation - A unifying discriminative framework
    • November
    • X. He and L. Deng. Speech recognition, machine translation, and speech translation - a unifying discriminative framework. IEEE Signal Processing Magazine, 28, November 2011.
    • (2011) IEEE Signal Processing Magazine , vol.28
    • He, X.1    Deng, L.2
  • 146
    • 84876669905
    • Speech-centric information processing: An optimization-oriented approach
    • X. He and L. Deng. Speech-centric information processing: An optimization-oriented approach. In Proceedings of the IEEE. 2013.
    • (2013) Proceedings of the IEEE
    • He, X.1    Deng, L.2
  • 147
    • 85032750905
    • Discriminative learning in sequential pattern recognition - A unifying review for optimization-oriented speech recognition
    • X. He, L. Deng, and W. Chou. Discriminative learning in sequential pattern recognition - a unifying review for optimization-oriented speech recognition. IEEE Signal Processing Magazine, 25:14-36, 2008.
    • (2008) IEEE Signal Processing Magazine , vol.25 , pp. 14-36
    • He, X.1    Deng, L.2    Chou, W.3
  • 149
    • 84887376734
    • Investigations on an EM-style optimization algorithm for discriminative training of HMMs
    • December
    • G. Heigold, H. Ney, and R. Schluter. Investigations on an EM-style optimization algorithm for discriminative training of HMMs. IEEE Transactions on Audio, Speech, and Language Processing, 21(12):2616-2626, December 2013.
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.12 , pp. 2616-2626
    • Heigold, G.1    Ney, H.2    Schluter, R.3
  • 156
    • 0025519204
    • Mapping part-whole hierarchies into connectionist networks
    • G. Hinton. Mapping part-whole hierarchies into connectionist networks. Artificial Intelligence, 46:47-75, 1990.
    • (1990) Artificial Intelligence , vol.46 , pp. 47-75
    • Hinton, G.1
  • 157
    • 0009438133
    • Preface to the special issue on connectionist symbol processing
    • G. Hinton. Preface to the special issue on connectionist symbol processing. Artificial Intelligence, 46:1-4, 1990.
    • (1990) Artificial Intelligence , vol.46 , pp. 1-4
    • Hinton, G.1
  • 158
    • 0037327724
    • The ups and downs of Hebb synapses
    • G. Hinton. The ups and downs of Hebb synapses. Canadian Psychology, 44:10-13, 2003.
    • (2003) Canadian Psychology , vol.44 , pp. 10-13
    • Hinton, G.1
  • 163
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G. Hinton, S. Osindero, and Y. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554, 2006.
    • (2006) Neural Computation , vol.18 , pp. 1527-1554
    • Hinton, G.1    Osindero, S.2    Teh, Y.3
  • 164
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • July
    • G. Hinton and R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507, July 2006.
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.1    Salakhutdinov, R.2
  • 165
    • 79961245273
    • Discovering binary codes for documents by learning deep generative models
    • G. Hinton and R. Salakhutdinov. Discovering binary codes for documents by learning deep generative models. Topics in Cognitive Science, pages 1-18, 2010.
    • (2010) Topics in Cognitive Science , pp. 1-18
    • Hinton, G.1    Salakhutdinov, R.2
  • 174
    • 77956280276
    • Hierarchical bayesian language models for conversational speech recognition
    • November
    • S. Huang and S. Renals. Hierarchical bayesian language models for conversational speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 18(8):1941-1954, November 2010.
    • (2010) IEEE Transactions on Audio, Speech, and Language Processing , vol.18 , Issue.8 , pp. 1941-1954
    • Huang, S.1    Renals, S.2
  • 176
    • 84906218045
    • Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration
    • Y. Huang, D. Yu, Y. Gong, and C. Liu. Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration. In Proceedings of Interspeech, pages 2360-2364. 2013.
    • (2013) Proceedings of Interspeech , pp. 2360-2364
    • Huang, Y.1    Yu, D.2    Gong, Y.3    Liu, C.4
  • 186
    • 85032751120
    • Parameter estimation of statistical models using convex optimization: An advanced method of discriminative training for speech and language processing
    • H. Jiang and X. Li. Parameter estimation of statistical models using convex optimization: An advanced method of discriminative training for speech and language processing. IEEE Signal Processing Magazine, 27(3):115-127, 2010.
    • (2010) IEEE Signal Processing Magazine , vol.27 , Issue.3 , pp. 115-127
    • Jiang, H.1    Li, X.2
  • 187
    • 0022691022
    • Maximum likelihood estimation for multivariate mixture observations of Markov chains
    • B. Juang, S. Levinson, and M. Sondhi. Maximum likelihood estimation for multivariate mixture observations of Markov chains. IEEE Transactions on Information Theory, 32:307-309, 1986.
    • (1986) IEEE Transactions on Information Theory , vol.32 , pp. 307-309
    • Juang, B.1    Levinson, S.2    Sondhi, M.3
  • 195
    • 84878379108
    • Scalable minimum bayes risk training of deep neural network acoustic models using distributed hessian-free optimization
    • B. Kingsbury, T. Sainath, and H. Soltau. Scalable minimum bayes risk training of deep neural network acoustic models using distributed hessian-free optimization. In Proceedings of Interspeech. 2012.
    • (2012) Proceedings of Interspeech
    • Kingsbury, B.1    Sainath, T.2    Soltau, H.3
  • 199
    • 84878534913
    • Integrating deep neural networks into structural classification approach based on weighted finite-state transducers
    • Y. Kubo, T. Hori, and A. Nakamura. Integrating deep neural networks into structural classification approach based on weighted finite-state transducers. In Proceedings of Interspeech. 2012.
    • (2012) Proceedings of Interspeech
    • Kubo, Y.1    Hori, T.2    Nakamura, A.3
  • 201
    • 84887376692
    • Cross-lingual automatic speech recognition using tandem features
    • December
    • P. Lal and S. King. Cross-lingual automatic speech recognition using tandem features. IEEE Transactions on Audio, Speech, and Language Processing, 21(12):2506-2515, December 2013.
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.12 , pp. 2506-2515
    • Lal, P.1    King, S.2
  • 202
    • 0025254722
    • A time-delay neural network architecture for isolated word recognition
    • K. Lang, A. Waibel, and G. Hinton. A time-delay neural network architecture for isolated word recognition. Neural Networks, 3(1):23-43, 1990.
    • (1990) Neural Networks , vol.3 , Issue.1 , pp. 23-43
    • Lang, K.1    Waibel, A.2    Hinton, G.3
  • 211
    • 0002263996
    • Convolutional networks for images, speech, and time series
    • In M. Arbib, editor, MIT Press, Cambridge, Massachusetts
    • Y. LeCun and Y. Bengio. Convolutional networks for images, speech, and time series. In M. Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 255-258. MIT Press, Cambridge, Massachusetts, 1995.
    • (1995) The Handbook of Brain Theory and Neural Networks , pp. 255-258
    • Lecun, Y.1    Bengio, Y.2
  • 212
  • 214
    • 85009128804
    • From knowledge-ignorant to knowledge-rich modeling: A new speech research paradigm for next-generation automatic speech recognition
    • C.-H. Lee. From knowledge-ignorant to knowledge-rich modeling: A new speech research paradigm for next-generation automatic speech recognition. In Proceedings of International Conference on Spoken Language Processing (ICSLP), pages 109-111. 2004.
    • (2004) Proceedings of International Conference on Spoken Language Processing (ICSLP) , pp. 109-111
    • Lee, C.-H.1
  • 227
    • 84901237776
    • Modeling spectral envelopes using restricted boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Z. Ling, L. Deng, and D. Yu. Modeling spectral envelopes using restricted boltzmann machines and deep belief networks for statistical parametric speech synthesis. IEEE Transactions on Audio Speech Language Processing, 21(10):2129-2139, 2013.
    • (2013) IEEE Transactions on Audio Speech Language Processing , vol.21 , Issue.10 , pp. 2129-2139
    • Ling, Z.1    Deng, L.2    Yu, D.3
  • 229
    • 84869440340
    • Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression
    • January
    • Z. Ling, K. Richmond, and J. Yamagishi. Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression. IEEE Transactions on Audio, Speech, and Language Processing, 21, January 2013.
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21
    • Ling, Z.1    Richmond, K.2    Yamagishi, J.3
  • 231
    • 0001523807
    • A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamical model of speech
    • J. Ma and L. Deng. A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamical model of speech. Computer, Speech and Language, 2000.
    • (2000) Computer, Speech and Language
    • Ma, J.1    Deng, L.2
  • 232
    • 0347968275
    • Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model
    • J. Ma and L. Deng. Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model. IEEE Transactions on Speech and Audio Processing, 11(6):590-602, 2003.
    • (2003) IEEE Transactions on Speech and Audio Processing , vol.11 , Issue.6 , pp. 590-602
    • Ma, J.1    Deng, L.2
  • 233
    • 0742307392
    • Target-directed mixture dynamic models for spontaneous speech recognition
    • J. Ma and L. Deng. Target-directed mixture dynamic models for spontaneous speech recognition. IEEE Transactions on Speech and Audio Processing, 12(1):47-58, 2004.
    • (2004) IEEE Transactions on Speech and Audio Processing , vol.12 , Issue.1 , pp. 47-58
    • Ma, J.1    Deng, L.2
  • 237
    • 84903700854
    • Scientists see promise in deep-learning programs
    • November 24
    • J. Markoff. Scientists see promise in deep-learning programs. New York Times, November 24 2012.
    • (2012) New York Times
    • Markoff, J.1
  • 242
    • 84906237242
    • Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding
    • G. Mesnil, X. He, L. Deng, and Y. Bengio. Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding. In Proceedings of Interspeech. 2013.
    • (2013) Proceedings of Interspeech
    • Mesnil, G.1    He, X.2    Deng, L.3    Bengio, Y.4
  • 243
    • 84906273501
    • Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training
    • Y. Miao and F. Metze. Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training. In Proceedings of Interspeech. 2013.
    • (2013) Proceedings of Interspeech
    • Miao, Y.1    Metze, F.2
  • 260
    • 79959840616
    • Investigation of full-sequence training of deep belief networks for speech recognition
    • A. Mohamed, D. Yu, and L. Deng. Investigation of full-sequence training of deep belief networks for speech recognition. In Proceedings of Interspeech. 2010.
    • (2010) Proceedings of Interspeech
    • Mohamed, A.1    Yu, D.2    Deng, L.3
  • 271
    • 4944221356
    • Layered representations for learning and inferring office activity from multiple sensory channels
    • N. Oliver, A. Garg, and E. Horvitz. Layered representations for learning and inferring office activity from multiple sensory channels. Computer Vision and Image Understanding, 96:163-180, 2004.
    • (2004) Computer Vision and Image Understanding , vol.96 , pp. 163-180
    • Oliver, N.1    Garg, A.2    Horvitz, E.3
  • 274
    • 0030245363
    • From HMMs to segment models: A unified view of stochastic modeling for speech recognition
    • September
    • M. Ostendorf, V. Digalakis, and O. Kimball. From HMMs to segment models: A unified view of stochastic modeling for speech recognition. IEEE Transactions on Speech and Audio Processing, 4(5), September 1996.
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.5
    • Ostendorf, M.1    Digalakis, V.2    Kimball, O.3
  • 286
    • 0029310084
    • Holographic reduced representations
    • May
    • T. Plate. Holographic reduced representations. IEEE Transactions on Neural Networks, 6(3):623-641, May 1995.
    • (1995) IEEE Transactions on Neural Networks , vol.6 , Issue.3 , pp. 623-641
    • Plate, T.1
  • 287
    • 84903722546
    • How the brain might work: The role of information and learning in understanding and replicating intelligence
    • In G. Jacovitt, A. Pettorossi, R. Consolo, and V. Senni, editors, Lateran University Press
    • T. Poggio. How the brain might work: The role of information and learning in understanding and replicating intelligence. In G. Jacovitt, A. Pettorossi, R. Consolo, and V. Senni, editors, Information: Science and Technology for the New Century, pages 45-61. Lateran University Press, 2007.
    • (2007) Information: Science and Technology for the New Century , pp. 45-61
    • Poggio, T.1
  • 288
    • 0025519291
    • Recursive distributed representations
    • J. Pollack. Recursive distributed representations. Artificial Intelligence, 46:77-105, 1990.
    • (1990) Artificial Intelligence , vol.46 , pp. 77-105
    • Pollack, J.1
  • 292
    • 0031003679
    • Optimality: From neural networks to universal grammar
    • A. Prince and P. Smolensky. Optimality: From neural networks to universal grammar. Science, 275:1604-1610, 1997.
    • (1997) Science , vol.275 , pp. 1604-1610
    • Prince, A.1    Smolensky, P.2
  • 293
    • 0024610919
    • A tutorial on hidden markov models and selected applications in speech recognition
    • L. Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. In Proceedings of the IEEE, pages 257-286. 1989.
    • (1989) Proceedings of the IEEE , pp. 257-286
    • Rabiner, L.1
  • 299
    • 0030419718
    • Construction of state-dependent dynamic parameters by maximum likelihood: Applications to speech recognition
    • C. Rathinavalu and L. Deng. Construction of state-dependent dynamic parameters by maximum likelihood: Applications to speech recognition. Signal Processing, 55(2):149-165, 1997.
    • (1997) Signal Processing , vol.55 , Issue.2 , pp. 149-165
    • Rathinavalu, C.1    Deng, L.2
  • 301
    • 85032751986
    • Single-channel multi-talker speech recognition - Graphical modeling approaches
    • S. Rennie, H. Hershey, and P. Olsen. Single-channel multi-talker speech recognition - graphical modeling approaches. IEEE Signal Processing Magazine, 33:66-80, 2010.
    • (2010) IEEE Signal Processing Magazine , vol.33 , pp. 66-80
    • Rennie, S.1    Hershey, H.2    Olsen, P.3
  • 304
    • 0028392167
    • An application of recurrent nets to phone probability estimation
    • A. Robinson. An application of recurrent nets to phone probability estimation. IEEE Transactions on Neural Networks, 5:298-305, 1994.
    • (1994) IEEE Transactions on Neural Networks , vol.5 , pp. 298-305
    • Robinson, A.1
  • 320
    • 84905273821
    • Continuous space translation models for phrase-based statistical machine translation
    • H. Schwenk. Continuous space translation models for phrase-based statistical machine translation. In Proceedings of Computational Linguistics. 2012.
    • (2012) Proceedings of Computational Linguistics
    • Schwenk, H.1
  • 324
    • 84865801985
    • Conversational speech transcription using context-dependent deep neural networks
    • F. Seide, G. Li, and D. Yu. Conversational speech transcription using context-dependent deep neural networks. In Proceedings of Interspeech, pages 437-440. 2011.
    • (2011) Proceedings of Interspeech , pp. 437-440
    • Seide, F.1    Li, G.2    Yu, D.3
  • 327
    • 0028195651
    • Waveform-based speech recognition using hidden filter models: Parameter selection and sensitivity to power normalization
    • H. Sheikhzadeh and L. Deng. Waveform-based speech recognition using hidden filter models: Parameter selection and sensitivity to power normalization. IEEE Transactions on Speech and Audio Processing, 2:80-91, 1994.
    • (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , pp. 80-91
    • Sheikhzadeh, H.1    Deng, L.2
  • 328
    • 84990946747
    • Learning semantic representations using convolutional neural networks for web search
    • Y. Shen, X. He, J. Gao, L. Deng, and G. Mesnil. Learning semantic representations using convolutional neural networks for web search. In Proceedings World Wide Web. 2014.
    • (2014) Proceedings World Wide Web
    • Shen, Y.1    He, X.2    Gao, J.3    Deng, L.4    Mesnil, G.5
  • 330
    • 84881054791
    • Hermitian polynomial for speaker adaptation of connectionist speech recognition systems
    • M. Siniscalchi, J. Li, and C. Lee. Hermitian polynomial for speaker adaptation of connectionist speech recognition systems. IEEE Transactions on Audio, Speech, and Language Processing, 21(10):2152-2161, 2013a.
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.10 , pp. 2152-2161
    • Siniscalchi, M.1    Li, J.2    Lee, C.3
  • 332
    • 84875405186
    • Exploiting deep neural networks for detection-based speech recognition
    • M. Siniscalchi, D. Yu, L. Deng, and C.-H. Lee. Exploiting deep neural networks for detection-based speech recognition. Neurocomputing, 106:148-157, 2013.
    • (2013) Neurocomputing , vol.106 , pp. 148-157
    • Siniscalchi, M.1    Yu, D.2    Deng, L.3    Lee, C.-H.4
  • 333
    • 84873303660
    • Speech recognition using long-span temporal patterns in a deep network model
    • March
    • M. Siniscalchi, D. Yu, L. Deng, and C.-H. Lee. Speech recognition using long-span temporal patterns in a deep network model. IEEE Signal Processing Letters, 20(3):201-204, March 2013.
    • (2013) IEEE Signal Processing Letters , vol.20 , Issue.3 , pp. 201-204
    • Siniscalchi, M.1    Yu, D.2    Deng, L.3    Lee, C.-H.4
  • 335
    • 0025516779
    • Tensor product variable binding and the representation of symbolic structures in connectionist systems
    • P. Smolensky. Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence, 46:159-216, 1990.
    • (1990) Artificial Intelligence , vol.46 , pp. 159-216
    • Smolensky, P.1
  • 339
    • 84905233165
    • Tutorial at Association for Computational Linguistics (ACL), 2012, and North American Chapter of the Association for Computational Linguistics (NAACL)
    • R. Socher, Y. Bengio, and C. Manning. Deep learning for NLP. Tutorial at Association for Computational Linguistics (ACL), 2012, and North American Chapter of the Association for Computational Linguistics (NAACL), 2013. http://www.socher.org/index.php/DeepLearning Tutorial.
    • (2013) Deep Learning for NLP
    • Socher, R.1    Bengio, Y.2    Manning, C.3
  • 341
  • 351
    • 85073226083
    • Preliminary investigation of Boltzmann machine classifiers for speaker recognition
    • T. Stafylakis, P. Kenny, M. Senoussaoui, and P. Dumouchel. Preliminary investigation of Boltzmann machine classifiers for speaker recognition. In Proceedings of Odyssey, pages 109-116. 2012.
    • (2012) Proceedings of Odyssey , pp. 109-116
    • Stafylakis, T.1    Kenny, P.2    Senoussaoui, M.3    Dumouchel, P.4
  • 355
    • 0036165806
    • An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition
    • J. Sun and L. Deng. An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition. Journal of the Acoustical Society of America, 111(2):1086-1101, 2002.
    • (2002) Journal of the Acoustical Society of America , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.1    Deng, L.2
  • 362
    • 84890474716
    • Deep neural network features and semi-supervised training for low resource speech recognition
    • S. Thomas, M. Seltzer, K. Church, and H. Hermansky. Deep neural network features and semi-supervised training for low resource speech recognition. In Proceedings of Interspeech. 2013.
    • (2013) Proceedings of Interspeech
    • Thomas, S.1    Seltzer, M.2    Church, K.3    Hermansky, H.4
  • 375
    • 79959575293
    • A connection between score matching and denoising autoencoder
    • P. Vincent. A connection between score matching and denoising autoencoder. Neural Computation, 23(7):1661-1674, 2011.
    • (2011) Neural Computation , vol.23 , Issue.7 , pp. 1661-1674
    • Vincent, P.1
  • 376
    • 79551480483
    • Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
    • P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11:3371-3408, 2010.
    • (2010) Journal of Machine Learning Research , vol.11 , pp. 3371-3408
    • Vincent, P.1    Larochelle, H.2    Lajoie, I.3    Bengio, Y.4    Manzagol, P.5
  • 388
    • 77955654853
    • Large scale image annotation: Learning to rank with joint word-image embeddings
    • J. Weston, S. Bengio, and N. Usunier. Large scale image annotation: Learning to rank with joint word-image embeddings. Machine Learning, 81(1):21-35, 2010.
    • (2010) Machine Learning , vol.81 , Issue.1 , pp. 21-35
    • Weston, J.1    Bengio, S.2    Usunier, N.3
  • 390
    • 84906237512
    • Investigations on Hessian-free optimization for cross-entropy training of deep neural networks
    • S. Wiesler, J. Li, and J. Xue. Investigations on Hessian-free optimization for cross-entropy training of deep neural networks. In Proceedings of Interspeech. 2013.
    • (2013) Proceedings of Interspeech
    • Wiesler, S.1    Li, J.2    Xue, J.3
  • 392
    • 0026692226
    • Stacked generalization
    • D. Wolpert. Stacked generalization. Neural Networks, 5(2):241-259, 1992.
    • (1992) Neural Networks , vol.5 , Issue.2 , pp. 241-259
    • Wolpert, D.1
  • 394
    • 85032751865
    • A geometric perspective of large-margin training of Gaussian models
    • November
    • L. Xiao and L. Deng. A geometric perspective of large-margin training of Gaussian models. IEEE Signal Processing Magazine, 27(6):118-123, November 2010.
    • (2010) IEEE Signal Processing Magazine , vol.27 , Issue.6 , pp. 118-123
    • Xiao, L.1    Deng, L.2
  • 395
    • 0037313081
    • Equivalence of backpropagation and contrastive Hebbian learning in a layered network
    • X. Xie and S. Seung. Equivalence of backpropagation and contrastive Hebbian learning in a layered network. Neural Computation, 15:441-454, 2003.
    • (2003) Neural Computation , vol.15 , pp. 441-454
    • Xie, X.1    Seung, S.2
  • 396
    • 84889257121
    • An experimental study on speech enhancement based on deep neural networks
    • Y. Xu, J. Du, L. Dai, and C. Lee. An experimental study on speech enhancement based on deep neural networks. IEEE Signal Processing Letters, 21(1):65-68, 2014.
    • (2014) IEEE Signal Processing Letters , vol.21 , Issue.1 , pp. 65-68
    • Xu, Y.1    Du, J.2    Dai, L.3    Lee, C.4
  • 397
    • 84906227589
    • Restructuring of deep neural network acoustic models with singular value decomposition
    • J. Xue, J. Li, and Y. Gong. Restructuring of deep neural network acoustic models with singular value decomposition. In Proceedings of Interspeech. 2013.
    • (2013) Proceedings of Interspeech
    • Xue, J.1    Li, J.2    Gong, Y.3
  • 399
    • 84906225757
    • A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
    • Z. Yan, Q. Huo, and J. Xu. A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR. In Proceedings of Interspeech. 2013.
    • (2013) Proceedings of Interspeech
    • Yan, Z.1    Huo, Q.2    Xu, J.3
  • 400
    • 84866881711
    • Combining a two-step CRF model and a joint source-channel model for machine transliteration
    • D. Yang and S. Furui. Combining a two-step CRF model and a joint source-channel model for machine transliteration. In Proceedings of Association for Computational Linguistics (ACL), pages 275-280. 2010.
    • (2010) Proceedings of Association for Computational Linguistics (ACL) , pp. 275-280
    • Yang, D.1    Furui, S.2
  • 401
    • 84903733224
    • A fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation
    • K. Yao, D. Yu, L. Deng, and Y. Gong. A fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation. Neurocomputing, 2013a.
    • (2013) Neurocomputing
    • Yao, K.1    Yu, D.2    Deng, L.3    Gong, Y.4
  • 406
    • 33644756784
    • On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates
    • L. Younes. On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates. Stochastics and Stochastic Reports, 65(3):177-228, 1999.
    • (1999) Stochastics and Stochastic Reports , vol.65 , Issue.3 , pp. 177-228
    • Younes, L.1
  • 409
    • 85032752267
    • Solving nonlinear estimation problems using splines
    • July
    • D. Yu and L. Deng. Solving nonlinear estimation problems using splines. IEEE Signal Processing Magazine, 26(4):86-90, July 2009.
    • (2009) IEEE Signal Processing Magazine , vol.26 , Issue.4 , pp. 86-90
    • Yu, D.1    Deng, L.2
  • 410
    • 79959828814
    • Deep-structured hidden conditional random fields for phonetic recognition
    • September
    • D. Yu and L. Deng. Deep-structured hidden conditional random fields for phonetic recognition. In Proceedings of Interspeech. September 2010.
    • (2010) Proceedings of Interspeech
    • Yu, D.1    Deng, L.2
  • 411
    • 84865770736
    • Accelerated parallelizable neural networks learning algorithms for speech recognition
    • D. Yu and L. Deng. Accelerated parallelizable neural networks learning algorithms for speech recognition. In Proceedings of Interspeech. 2011.
    • (2011) Proceedings of Interspeech
    • Yu, D.1    Deng, L.2
  • 412
    • 85032782045
    • Deep learning and its applications to signal and information processing
    • January
    • D. Yu and L. Deng. Deep learning and its applications to signal and information processing. IEEE Signal Processing Magazine, pages 145-154, January 2011.
    • (2011) IEEE Signal Processing Magazine , pp. 145-154
    • Yu, D.1    Deng, L.2
  • 413
    • 84862822032
    • Efficient and effective algorithms for training single-hidden-layer neural networks
    • D. Yu and L. Deng. Efficient and effective algorithms for training single-hidden-layer neural networks. Pattern Recognition Letters, 33:554-558, 2012.
    • (2012) Pattern Recognition Letters , vol.33 , pp. 554-558
    • Yu, D.1    Deng, L.2
  • 417
    • 42949105203
    • Large-margin minimum classification error training: A theoretical risk minimization perspective
    • October
    • D. Yu, L. Deng, X. He, and A. Acero. Large-margin minimum classification error training: A theoretical risk minimization perspective. Computer Speech and Language, 22(4):415-429, October 2008.
    • (2008) Computer Speech and Language , vol.22 , Issue.4 , pp. 415-429
    • Yu, D.1    Deng, L.2    He, X.3    Acero, A.4
  • 421
    • 84878405171
    • Large vocabulary speech recognition using deep tensor neural networks
    • D. Yu, L. Deng, and F. Seide. Large vocabulary speech recognition using deep tensor neural networks. In Proceedings of Interspeech. 2012c.
    • (2012) Proceedings of Interspeech
    • Yu, D.1    Deng, L.2    Seide, F.3
  • 422
    • 84871387302
    • The deep tensor neural network with applications to large vocabulary speech recognition
    • D. Yu, L. Deng, and F. Seide. The deep tensor neural network with applications to large vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 21(2):388-396, 2013.
    • (2013) IEEE Transactions on Audio, Speech, and Language Processing , vol.21 , Issue.2 , pp. 388-396
    • Yu, D.1    Deng, L.2    Seide, F.3
  • 425
    • 84865785753
    • Improved bottleneck features using pre-trained deep neural networks
    • D. Yu and M. Seltzer. Improved bottleneck features using pre-trained deep neural networks. In Proceedings of Interspeech. 2011.
    • (2011) Proceedings of Interspeech
    • Yu, D.1    Seltzer, M.2
  • 428
  • 444
    • 84865208051
    • Nonlinear compensation using the Gauss-Newton method for noise-robust speech recognition
    • Y. Zhao and B. Juang. Nonlinear compensation using the Gauss-Newton method for noise-robust speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 20(8):2191-2206, 2012.
    • (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.8 , pp. 2191-2206
    • Zhao, Y.1    Juang, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.