Volume 3, 2014

A tutorial survey of architectures, algorithms, and applications for deep learning

Author keywords

Algorithms; Deep learning; Information processing

Indexed keywords

ALGORITHMS; AUDIO SIGNAL PROCESSING; CLASSIFICATION (OF INFORMATION); DATA PROCESSING; DEEP NEURAL NETWORKS; LEARNING ALGORITHMS; LEARNING SYSTEMS; MODELING LANGUAGES; NATURAL LANGUAGE PROCESSING SYSTEMS; NETWORK ARCHITECTURE; RECURRENT NEURAL NETWORKS; SURVEYS

EID: 84906883759     PISSN: None     EISSN: 20487703     Source Type: Journal    
DOI: 10.1017/ATSIP.2013.99     Document Type: Review
Times cited: 540

References (222)
  • 2
    • 85032751686
    • Expanding the scope of signal processing
    • Deng, L. Expanding the scope of signal processing. IEEE Signal Process. Mag., 25 (3) (2008), 2-4.
    • (2008) IEEE Signal Process. Mag. , vol.25 , Issue.3 , pp. 2-4
    • Deng, L.1
  • 3
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • Hinton, G.; Osindero, S.; Teh, Y. A fast learning algorithm for deep belief nets. Neural Comput., 18 (2006), 1527-1554.
    • (2006) Neural Comput. , vol.18 , pp. 1527-1554
    • Hinton, G.1    Osindero, S.2    Teh, Y.3
  • 4
    • 69349090197
    • Learning deep architectures for AI
    • Bengio, Y. Learning deep architectures for AI. Found. Trends Mach. Learn., 2 (1) (2009), 1-127.
    • (2009) Found. Trends Mach. Learn. , vol.2 , Issue.1 , pp. 1-127
    • Bengio, Y.1
  • 6
    • 85032751458
    • Deep neural networks for acoustic modeling in speech recognition
    • Hinton, G. et al. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Process. Mag., 29 (6) (2012), 82-97.
    • (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 82-97
    • Hinton, G.1
  • 7
    • 85032782045
    • Deep learning and its applications to signal and information processing
    • Yu, D.; Deng, L. Deep learning and its applications to signal and information processing. IEEE Signal Process. Mag., 28 (2011), 145-154.
    • (2011) IEEE Signal Process. Mag. , vol.28 , pp. 145-154
    • Yu, D.1    Deng, L.2
  • 8
    • 77958488310
    • Deep machine learning - A new frontier in artificial intelligence
    • Arel, I.; Rose, C.; Karnowski, T. Deep machine learning - A new frontier in artificial intelligence, in IEEE Computational Intelligence Mag., 5 (2010), 13-18.
    • (2010) IEEE Computational Intelligence Mag. , vol.5 , pp. 13-18
    • Arel, I.1    Rose, C.2    Karnowski, T.3
  • 9
    • 84903700854
    • Scientists see promise in deep-learning programs
    • November 24
    • Markoff, J. Scientists See Promise in Deep-Learning Programs. New York Times, November 24, 2012.
    • (2012) New York Times
    • Markoff, J.1
  • 10
    • 78149327741
    • Kernel methods for deep learning
    • Cho, Y.; Saul, L. Kernel methods for deep learning. NIPS, 2009, 342-350.
    • (2009) NIPS , pp. 342-350
    • Cho, Y.1    Saul, L.2
  • 12
    • 84877777313
    • Learning with recursive perceptual representations
    • Vinyals, O.; Jia, Y.; Deng, L.; Darrell, T. Learning with recursive perceptual representations, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Vinyals, O.1    Jia, Y.2    Deng, L.3    Darrell, T.4
  • 13
    • 85032751593
    • Research developments and directions in speech recognition and understanding
    • Baker, J. et al. Research developments and directions in speech recognition and understanding. IEEE Signal Process. Mag., 26 (3) (2009), 75-80.
    • (2009) IEEE Signal Process. Mag. , vol.26 , Issue.3 , pp. 75-80
    • Baker, J.1
  • 14
    • 85032759066
    • Updated MINDS report on speech recognition and understanding
    • Baker, J. et al. Updated MINDS report on speech recognition and understanding. IEEE Signal Process. Mag., 26 (4) (2009), 78-85.
    • (2009) IEEE Signal Process. Mag. , vol.26 , Issue.4 , pp. 78-85
    • Baker, J.1
  • 15
    • 0039503389
    • Computational models for speech production
    • Springer-Verlag, Berlin, Heidelberg
    • Deng, L. Computational models for speech production, in Computational Models of Speech Pattern Processing, 199-213, Springer-Verlag, 1999, Berlin, Heidelberg.
    • (1999) Computational Models of Speech Pattern Processing , pp. 199-213
    • Deng, L.1
  • 16
    • 33744966595
    • Switching dynamic system models for speech articulation and acoustics
    • Springer, New York
    • Deng, L. Switching dynamic system models for speech articulation and acoustics, in Mathematical Foundations of Speech and Language Processing, 115-134, Springer, New York, 2003.
    • (2003) Mathematical Foundations of Speech and Language Processing , pp. 115-134
    • Deng, L.1
  • 19
    • 84903722546
    • How the brain might work: The role of information and learning in understanding and replicating intelligence
    • (G. Jacovitt, A. Pettorossi, R. Consolo, V. Senni, eds), Lateran University Press, Amsterdam, Netherlands
    • Poggio, T. How the brain might work: The role of information and learning in understanding and replicating intelligence, in Information Science and Technology for the New Century (G. Jacovitt, A. Pettorossi, R. Consolo, V. Senni, eds), 45-61, Lateran University Press, 2007, Amsterdam, Netherlands.
    • (2007) Information Science and Technology for the New Century , pp. 45-61
    • Poggio, T.1
  • 20
    • 79951563340
    • Understanding the difficulty of training deep feedforward neural networks
    • Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks, in Proc. AISTATS, 2010.
    • (2010) Proc. AISTATS
    • Glorot, X.1    Bengio, Y.2
  • 21
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • Hinton, G.; Salakhutdinov, R. Reducing the dimensionality of data with neural networks. Science, 313 (5786) (2006), 504-507.
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.1    Salakhutdinov, R.2
  • 22
    • 84905237729
    • Context-dependent DBN-HMMs in large vocabulary continuous speech recognition
    • Dahl, G.; Yu, D.; Deng, L.; Acero, A. Context-dependent DBN-HMMs in large vocabulary continuous speech recognition, in Proc. ICASSP, 2011.
    • (2011) Proc. ICASSP
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 23
    • 79959840616
    • Investigation of full-sequence training of deep belief networks for speech recognition
    • September
    • Mohamed, A.; Yu, D.; Deng, L. Investigation of full-sequence training of deep belief networks for speech recognition, in Proc. Interspeech, September 2010.
    • (2010) Proc. Interspeech
    • Mohamed, A.1    Yu, D.2    Deng, L.3
  • 25
    • 84055222005
    • Context-dependent DBN-HMMs in large vocabulary continuous speech recognition
    • Dahl, G.; Yu, D.; Deng, L.; Acero, A. Context-dependent DBN-HMMs in large vocabulary continuous speech recognition. IEEE Trans. Audio Speech, Lang. Process., 20 (1) (2012), 30-42.
    • (2012) IEEE Trans. Audio Speech, Lang. Process. , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 26
    • 84867585919
    • Understanding how deep belief networks perform acoustic modeling
    • Mohamed, A.; Hinton, G.; Penn, G. Understanding how deep belief networks perform acoustic modelling, in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Mohamed, A.1    Hinton, G.2    Penn, G.3
  • 27
    • 79551480483
    • Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
    • Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res., 11 (2010), 3371-3408.
    • (2010) J. Mach. Learn. Res. , vol.11 , pp. 3371-3408
    • Vincent, P.1    Larochelle, H.2    Lajoie, I.3    Bengio, Y.4    Manzagol, P.5
  • 28
    • 80053460450
    • Contractive autoencoders: Explicit invariance during feature extraction
    • Rifai, S.; Vincent, P.; Muller, X.; Glorot, X.; Bengio, Y. Contractive autoencoders: Explicit invariance during feature extraction, in Proc. ICML, 2011, 833-840.
    • (2011) Proc. ICML , pp. 833-840
    • Rifai, S.1    Vincent, P.2    Muller, X.3    Glorot, X.4    Bengio, Y.5
  • 29
    • 70049094447
    • Sparse feature learning for deep belief networks
    • Ranzato, M.; Boureau, Y.; LeCun, Y. Sparse feature learning for deep belief networks, in Proc. NIPS, 2007.
    • (2007) Proc. NIPS
    • Ranzato, M.1    Boureau, Y.2    LeCun, Y.3
  • 32
    • 0003573244
    • Connectionist speech recognition: A hybrid approach
    • Norwell, MA
    • Bourlard, H.; Morgan, N. Connectionist Speech Recognition: A Hybrid Approach, Kluwer, Norwell, MA, 1993.
    • (1993) Kluwer
    • Bourlard, H.1    Morgan, N.2
  • 33
    • 84255177123
    • Deep and wide: Multiple layers in automatic speech recognition
    • Morgan, N. Deep and wide: Multiple layers in automatic speech recognition. IEEE Trans. Audio Speech, Lang. Process., 20 (1) (2012), 7-13.
    • (2012) IEEE Trans. Audio Speech, Lang. Process. , vol.20 , Issue.1 , pp. 7-13
    • Morgan, N.1
  • 34
    • 84876672166
    • Machine learning paradigms in speech recognition: An overview
    • Deng, L.; Li, X. Machine learning paradigms in speech recognition: An overview. IEEE Trans. Audio Speech, Lang., 21 (2013), 1060-1089.
    • (2013) IEEE Trans. Audio Speech, Lang. , vol.21 , pp. 1060-1089
    • Deng, L.1    Li, X.2
  • 36
    • 85112276587
    • Efficient learning of sparse representations with an energy-based model
    • Ranzato, M.; Poultney, C.; Chopra, S.; LeCun, Y. Efficient learning of sparse representations with an energy-based model, in Proc. NIPS, 2006.
    • (2006) Proc. NIPS
    • Ranzato, M.1    Poultney, C.2    Chopra, S.3    LeCun, Y.4
  • 41
    • 84877755914
    • A better way to pretrain deep Boltzmann machines
    • Salakhutdinov, R.; Hinton, G. A better way to pretrain deep Boltzmann machines, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Salakhutdinov, R.1    Hinton, G.2
  • 42
    • 84877724347
    • Multimodal learning with deep Boltzmann machines
    • Srivastava, N.; Salakhutdinov, R. Multimodal learning with deep Boltzmann machines, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Srivastava, N.1    Salakhutdinov, R.2
  • 43
    • 85162069624
    • Phone recognition with the mean-covariance restricted Boltzmann machine
    • Dahl, G.; Ranzato, M.; Mohamed, A.; Hinton, G. Phone recognition with the mean-covariance restricted Boltzmann machine. Proc. NIPS, 23 (2010), 469-477.
    • (2010) Proc. NIPS , vol.23 , pp. 469-477
    • Dahl, G.1    Ranzato, M.2    Mohamed, A.3    Hinton, G.4
  • 45
    • 84877731706
    • Discriminative learning of sum-product networks
    • Gens, R.; Domingos, P. Discriminative learning of sum-product networks. Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Gens, R.1    Domingos, P.2
  • 46
  • 47
    • 84865683125
    • Deep learning with Hessian-free optimization
    • Martens, J. Deep learning with Hessian-free optimization, in Proc. ICML, 2010.
    • (2010) Proc. ICML
    • Martens, J.1
  • 48
    • 80053451847
    • Learning recurrent neural networks with Hessian-free optimization
    • Martens, J.; Sutskever, I. Learning recurrent neural networks with Hessian-free optimization, in Proc. ICML, 2011.
    • (2011) Proc. ICML
    • Martens, J.1    Sutskever, I.2
  • 52
    • 84906237242
    • Investigation of recurrent neural-network architectures and learning methods for spoken language understanding
    • Mesnil, G.; He, X.; Deng, L.; Bengio, Y. Investigation of recurrent neural-network architectures and learning methods for spoken language understanding, in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Mesnil, G.1    He, X.2    Deng, L.3    Bengio, Y.4
  • 54
    • 0026854213
    • A generalized hidden Markov model with state conditioned trend functions of time for the speech signal
    • Deng, L. A generalized hidden Markov model with state conditioned trend functions of time for the speech signal. Signal Process., 27 (1) (1992), 65-78.
    • (1992) Signal Process. , vol.27 , Issue.1 , pp. 65-78
    • Deng, L.1
  • 55
    • 0027678649
    • A stochastic model of speech incorporating hierarchical nonstationarity
    • Deng, L. A stochastic model of speech incorporating hierarchical nonstationarity. IEEE Trans. Speech Audio Process., 1 (4) (1993), 471-475.
    • (1993) IEEE Trans. Speech Audio Process. , vol.1 , Issue.4 , pp. 471-475
    • Deng, L.1
  • 56
    • 0028516022
    • Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states
    • Deng, L.; Aksmanovic, M.; Sun, D.; Wu, J. Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states. IEEE Trans. Speech Audio Process., 2 (4) (1994), 507-520.
    • (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.4 , pp. 507-520
    • Deng, L.1    Aksmanovic, M.2    Sun, D.3    Wu, J.4
  • 57
    • 0030245363
    • From HMM's to segment models: A unified view of stochastic modeling for speech recognition
    • Ostendorf, M.; Digalakis, V.; Kimball, O. From HMM's to segment models: A unified view of stochastic modeling for speech recognition. IEEE Trans. Speech Audio Process., 4 (5) (1996), 360-378.
    • (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.5 , pp. 360-378
    • Ostendorf, M.1    Digalakis, V.2    Kimball, O.3
  • 58
    • 0030190520
    • Transitional speech units and their representation by regressive Markov states: Applications to speech recognition
    • Deng, L.; Sameti, H. Transitional speech units and their representation by regressive Markov states: Applications to speech recognition. IEEE Trans. Speech Audio Process., 4 (4) (1996), 301-306.
    • (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.4 , pp. 301-306
    • Deng, L.1    Sameti, H.2
  • 59
    • 0031185482
    • Speaker-independent phonetic classification using hidden Markov models with state-conditioned mixtures of trend functions
    • Deng, L.; Aksmanovic, M. Speaker-independent phonetic classification using hidden Markov models with state-conditioned mixtures of trend functions. IEEE Trans. Speech Audio Process., 5 (1997), 319-324.
    • (1997) IEEE Trans. Speech Audio Process. , vol.5 , pp. 319-324
    • Deng, L.1    Aksmanovic, M.2
  • 60
    • 85032752267
    • Solving nonlinear estimation problems using splines
    • Yu, D.; Deng, L. Solving nonlinear estimation problems using Splines. IEEE Signal Process. Mag., 26 (4) (2009), 86-90.
    • (2009) IEEE Signal Process. Mag. , vol.26 , Issue.4 , pp. 86-90
    • Yu, D.1    Deng, L.2
  • 61
    • 68549140008
    • A novel framework and training algorithm for variable-parameter hidden Markov models
    • Yu, D.; Deng, L.; Gong, Y.; Acero, A. A novel framework and training algorithm for variable-parameter hidden Markov models. IEEE Trans. Audio Speech Lang. Process., 17 (7) (2009), 1348-1360.
    • (2009) IEEE Trans. Audio Speech Lang. Process. , vol.17 , Issue.7 , pp. 1348-1360
    • Yu, D.1    Deng, L.2    Gong, Y.3    Acero, A.4
  • 62
    • 78149260085
    • Continuous stochastic feature mapping based on trajectory HMMs
    • Zen, H.; Nankaku, Y.; Tokuda, K. Continuous stochastic feature mapping based on trajectory HMMs. IEEE Trans. Audio Speech, Lang. Process., 19 (2) (2011), 417-430.
    • (2011) IEEE Trans. Audio Speech, Lang. Process. , vol.19 , Issue.2 , pp. 417-430
    • Zen, H.1    Nankaku, Y.2    Tokuda, K.3
  • 64
    • 84869440340
    • Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression
    • Ling, Z.; Richmond, K.; Yamagishi, J. Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression. IEEE Trans. Audio Speech Lang. Process., 21 (2013), 207-219.
    • (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 207-219
    • Ling, Z.1    Richmond, K.2    Yamagishi, J.3
  • 65
    • 84890447002
    • Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis
    • Ling, Z.; Deng, L.; Yu, D. Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis, in ICASSP, 2013, 7825-7829.
    • (2013) ICASSP , pp. 7825-7829
    • Ling, Z.1    Deng, L.2    Yu, D.3
  • 66
    • 84872190545
    • Autoregressive models for statistical parametric speech synthesis
    • Shannon, M.; Zen, H.; Byrne, W. Autoregressive models for statistical parametric speech synthesis. IEEE Trans. Audio Speech Lang. Process., 21 (3) (2013), 587-597.
    • (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , Issue.3 , pp. 587-597
    • Shannon, M.1    Zen, H.2    Byrne, W.3
  • 67
    • 0031198059
    • Production models as a structural basis for automatic speech recognition
    • Deng, L.; Ramsay, G.; Sun, D. Production models as a structural basis for automatic speech recognition. Speech Commun., 33 (2-3) (1997), 93-111.
    • (1997) Speech Commun. , vol.33 , Issue.2-3 , pp. 93-111
    • Deng, L.1    Ramsay, G.2    Sun, D.3
  • 68
    • 0001853667
    • An investigation of segmental hidden dynamic models of speech coarticulation for automatic speech recognition
    • Johns Hopkins
    • Bridle, J. et al. An investigation of segmental hidden dynamic models of speech coarticulation for automatic speech recognition. Final Report for 1998 Workshop on Language Engineering, CLSP, Johns Hopkins, 1998.
    • (1998) Final Report for 1998 Workshop on Language Engineering, CLSP
    • Bridle, J.1
  • 69
    • 0032639922
    • Initial evaluation of hidden dynamic models on conversational speech
    • Picone, P. et al.: Initial evaluation of hidden dynamic models on conversational speech, in Proc. ICASSP, 1999.
    • (1999) Proc. ICASSP
    • Picone, P.1
  • 70
    • 0036293703
    • A recognition method with parametric trajectory synthesized using direct relations between static and dynamic feature vector time series
    • Minami, Y.; McDermott, E.; Nakamura, A.; Katagiri, S.: A recognition method with parametric trajectory synthesized using direct relations between static and dynamic feature vector time series, in Proc. ICASSP, 2002, 957-960.
    • (2002) Proc. ICASSP , pp. 957-960
    • Minami, Y.1    McDermott, E.2    Nakamura, A.3    Katagiri, S.4
  • 71
    • 4243109553
    • Challenges in adopting speech recognition
    • Deng, L.; Huang, X.D.: Challenges in adopting speech recognition. Commun. ACM, 47 (1) (2004), 11-13.
    • (2004) Commun. ACM , vol.47 , Issue.1 , pp. 11-13
    • Deng, L.1    Huang, X.D.2
  • 72
    • 0347968275
    • Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model
    • Ma, J.; Deng, L.: Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model. IEEE Trans. Speech Audio Process., 11 (6) (2003), 590-602.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.6 , pp. 590-602
    • Ma, J.1    Deng, L.2
  • 73
    • 0742307392
    • Target-directed mixture dynamic models for spontaneous speech recognition
    • Ma, J.; Deng, L.: Target-directed mixture dynamic models for spontaneous speech recognition. IEEE Trans. Speech Audio Process., 12 (1) (2004), 47-58.
    • (2004) IEEE Trans. Speech Audio Process. , vol.12 , Issue.1 , pp. 47-58
    • Ma, J.1    Deng, L.2
  • 75
    • 33744966561
    • A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition
    • Deng, L.; Yu, D.; Acero, A.: A bidirectional target filtering model of speech coarticulation: two-stage implementation for phonetic recognition. IEEE Trans. Audio Speech Process., 14 (1) (2006a), 256-265.
    • (2006) IEEE Trans. Audio Speech Process. , vol.14 , Issue.1 , pp. 256-265
    • Deng, L.1    Yu, D.2    Acero, A.3
  • 76
    • 34547551709
    • Use of differential cepstra as acoustic features in hidden trajectory modeling for phonetic recognition
    • April
    • Deng, L.; Yu, D.: Use of differential cepstra as acoustic features in hidden trajectory modeling for phonetic recognition, in Proc. ICASSP, April 2007.
    • (2007) Proc. ICASSP
    • Deng, L.1    Yu, D.2
  • 77
    • 85032752364
    • Graphical model architectures for speech recognition
    • Bilmes, J.; Bartels, C.: Graphical model architectures for speech recognition. IEEE Signal Process. Mag., 22 (2005), 89-100.
    • (2005) IEEE Signal Process. Mag. , vol.22 , pp. 89-100
    • Bilmes, J.1    Bartels, C.2
  • 78
    • 85032751937
    • Dynamic graphical models
    • Bilmes, J.: Dynamic graphical models. IEEE Signal Process. Mag., 33 (2010), 29-42.
    • (2010) IEEE Signal Process. Mag. , vol.33 , pp. 29-42
    • Bilmes, J.1
  • 79
    • 85032751986
    • Single-channel multitalker speech recognition - graphical modeling approaches
    • Rennie, S.; Hershey, H.; Olsen, P.: Single-channel multitalker speech recognition - graphical modeling approaches. IEEE Signal Process. Mag., 33 (2010), 66-80.
    • (2010) IEEE Signal Process. Mag. , vol.33 , pp. 66-80
    • Rennie, S.1    Hershey, H.2    Olsen, P.3
  • 80
    • 79951599228
    • A probabilistic interaction model for multipitch tracking with factorial hidden Markov model
    • Wohlmayr, M.; Stark, M.; Pernkopf, F.: A probabilistic interaction model for multipitch tracking with factorial hidden Markov model. IEEE Trans. Audio Speech, Lang. Process., 19 (4) (2011).
    • (2011) IEEE Trans. Audio Speech, Lang. Process. , vol.19 , Issue.4
    • Wohlmayr, M.1    Stark, M.2    Pernkopf, F.3
  • 81
    • 84862270634
    • Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure
    • Stoyanov, V.; Ropson, A.; Eisner, J.: Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure, in Proc. AISTATS, 2011.
    • (2011) Proc. AISTATS
    • Stoyanov, V.1    Ropson, A.2    Eisner, J.3
  • 83
    • 0032119668
    • The hierarchical hidden Markov model: Analysis and applications
    • Fine, S.; Singer, Y.; Tishby, N.: The Hierarchical Hidden Markov Model: Analysis and applications. Mach. Learn., 32 (1998), 41-62.
    • (1998) Mach. Learn. , vol.32 , pp. 41-62
    • Fine, S.1    Singer, Y.2    Tishby, N.3
  • 84
    • 4944221356
    • Layered representations for learning and inferring office activity from multiple sensory channels
    • Oliver, N.; Garg, A.; Horvitz, E.: Layered representations for learning and inferring office activity from multiple sensory channels. Comput. Vis. Image Understand., 96 (2004), 163-180.
    • (2004) Comput. Vis. Image Understand. , vol.96 , pp. 163-180
    • Oliver, N.1    Garg, A.2    Horvitz, E.3
  • 85
    • 84864026688
    • Modeling human motion using binary latent variables
    • Taylor, G.; Hinton, G.E.; Roweis, S.: Modeling human motion using binary latent variables, in Proc. NIPS, 2007.
    • (2007) Proc. NIPS
    • Taylor, G.1    Hinton, G.E.2    Roweis, S.3
  • 86
    • 84866842186
    • Learning continuous phrase representations and syntactic parsing with recursive neural networks
    • Socher, R.; Lin, C.; Ng, A.; Manning, C.: Learning continuous phrase representations and syntactic parsing with recursive neural networks, in Proc. ICML, 2011.
    • (2011) Proc. ICML
    • Socher, R.1    Lin, C.2    Ng, A.3    Manning, C.4
  • 87
    • 0031139839
    • Minimum classification error rate methods for speech recognition
    • Juang, B.-H.; Chou, W.; Lee, C.-H.: Minimum classification error rate methods for speech recognition. IEEE Trans. Speech Audio Process., 5 (1997), 257-265.
    • (1997) IEEE Trans. Speech Audio Process. , vol.5 , pp. 257-265
    • Juang, B.-H.1    Chou, W.2    Lee, C.-H.3
  • 88
    • 0032206267
    • Speech trajectory discrimination using the minimum classification error learning
    • Chengalvarayan, R.; Deng, L.: Speech trajectory discrimination using the minimum classification error learning. IEEE Trans. Speech Audio Process., 6 (6) (1998), 505-515.
    • (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.6 , pp. 505-515
    • Chengalvarayan, R.1    Deng, L.2
  • 89
    • 0036296863
    • Minimum phone error and i-smoothing for improved discriminative training
    • Povey, D.; Woodland, P.: Minimum phone error and i-smoothing for improved discriminative training, in Proc. ICASSP, 2002, 105-108.
    • (2002) Proc. ICASSP , pp. 105-108
    • Povey, D.1    Woodland, P.2
  • 90
    • 85032750905
    • Discriminative learning in sequential pattern recognition - A unifying review for optimization-oriented speech recognition
    • He, X.; Deng, L.; Chou, W.: Discriminative learning in sequential pattern recognition - A unifying review for optimization-oriented speech recognition. IEEE Signal Process. Mag., 25 (2008), 14-36.
    • (2008) IEEE Signal Process. Mag. , vol.25 , pp. 14-36
    • He, X.1    Deng, L.2    Chou, W.3
  • 91
    • 85032751120
    • Parameter estimation of statistical models using convex optimization: An advanced method of discriminative training for speech and language processing
    • Jiang, H.; Li, X.: Parameter estimation of statistical models using convex optimization: An advanced method of discriminative training for speech and language processing. IEEE Signal Process. Mag., 27 (3) (2010), 115-127.
    • (2010) IEEE Signal Process. Mag. , vol.27 , Issue.3 , pp. 115-127
    • Jiang, H.1    Li, X.2
  • 92
    • 34547526577
    • Large-margin minimum classification error training for large-scale speech recognition tasks
    • Yu, D.; Deng, L.; He, X.; Acero, A.: Large-margin minimum classification error training for large-scale speech recognition tasks, in Proc. ICASSP, 2007.
    • (2007) Proc. ICASSP
    • Yu, D.1    Deng, L.2    He, X.3    Acero, A.4
  • 93
    • 85032751865
    • A geometric perspective of large-margin training of Gaussian models
    • Xiao, L.; Deng, L.: A geometric perspective of large-margin training of Gaussian models. IEEE Signal Process. Mag., 27 (6) (2010), 118-123.
    • (2010) IEEE Signal Process. Mag. , vol.27 , Issue.6 , pp. 118-123
    • Xiao, L.1    Deng, L.2
  • 94
    • 77955783938
    • Error approximation and minimum phone error acoustic model estimation
    • Gibson, M.; Hain, T.: Error approximation and minimum phone error acoustic model estimation. IEEE Trans. Audio Speech, Lang. Process., 18 (6) (2010), 1269-1279.
    • (2010) IEEE Trans. Audio Speech, Lang. Process. , vol.18 , Issue.6 , pp. 1269-1279
    • Gibson, M.1    Hain, T.2
  • 95
    • 84866881711
    • Combining a two-step CRF model and a joint source channel model for machine transliteration
    • Uppsala, Sweden
    • Yang, D.; Furui, S.: Combining a two-step CRF model and a joint source channel model for machine transliteration, in Proc. ACL, Uppsala, Sweden, 2010, 275-280.
    • (2010) Proc. ACL , pp. 275-280
    • Yang, D.1    Furui, S.2
  • 96
    • 78649308591
    • Sequential labeling using deep-structured conditional random fields
    • Yu, D.; Wang, S.; Deng, L.: Sequential labeling using deep-structured conditional random fields. J. Sel. Top. Signal Process., 4 (2010), 965-973.
    • (2010) J. Sel. Top. Signal Process. , vol.4 , pp. 965-973
    • Yu, D.1    Wang, S.2    Deng, L.3
  • 97
    • 70350435251
    • Speech recognition using augmented conditional random fields
    • Hifny, Y.; Renals, S.: Speech recognition using augmented conditional random fields. IEEE Trans. Audio Speech Lang. Process., 17 (2) (2009), 354-365.
    • (2009) IEEE Trans. Audio Speech Lang. Process. , vol.17 , Issue.2 , pp. 354-365
    • Hifny, Y.1    Renals, S.2
  • 98
    • 69249105007
    • Discriminative input stream combination for conditional random field phone recognition
    • Heintz, I.; Fosler-Lussier, E.; Brew, C.: Discriminative input stream combination for conditional random field phone recognition. IEEE Trans. Audio Speech Lang. Process., 17 (8) (2009), 1533-1546.
    • (2009) IEEE Trans. Audio Speech Lang. Process. , vol.17 , Issue.8 , pp. 1533-1546
    • Heintz, I.1    Fosler-Lussier, E.2    Brew, C.3
  • 99
    • 77949370075
    • A segmental CRF approach to large vocabulary continuous speech recognition
    • Zweig, G.; Nguyen, P.: A segmental CRF approach to large vocabulary continuous speech recognition, in Proc. ASRU, 2009.
    • (2009) Proc. ASRU
    • Zweig, G.1    Nguyen, P.2
  • 100
  • 102
    • 79959828814
    • Deep-structured hidden conditional random fields for phonetic recognition
    • September
    • Yu, D.; Deng, L.: Deep-structured hidden conditional random fields for phonetic recognition, in Proc. Interspeech, September 2010.
    • (2010) Proc. Interspeech
    • Yu, D.1    Deng, L.2
  • 103
    • 78049409409
    • Language recognition using deep-structured conditional random fields
    • Yu, D.; Wang, S.; Karam, Z.; Deng, L.: Language recognition using deep-structured conditional random fields, in Proc. ICASSP, 2010, 5030-5033.
    • (2010) Proc. ICASSP , pp. 5030-5033
    • Yu, D.1    Wang, S.2    Karam, Z.3    Deng, L.4
  • 105
    • 77955803591
    • Enhanced phone posteriors for improving speech recognition systems
    • Ketabdar, H.; Bourlard, H.: Enhanced phone posteriors for improving speech recognition systems. IEEE Trans. Audio Speech Lang. Process., 18 (6) (2010), 1094-1106.
    • (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.6 , pp. 1094-1106
    • Ketabdar, H.1    Bourlard, H.2
  • 106
    • 85032751546
    • Pushing the envelope - Aside [speech recognition]
    • Morgan, N. et al.: Pushing the envelope - Aside [speech recognition]. IEEE Signal Process. Mag., 22 (5) (2005), 81-88.
    • (2005) IEEE Signal Process. Mag. , vol.22 , Issue.5 , pp. 81-88
    • Morgan, N.1
  • 107
    • 84865768819
    • Deep Convex Network: A scalable architecture for speech pattern classification
    • Deng, L.; Yu, D.: Deep Convex Network: A scalable architecture for speech pattern classification, in Proc. Interspeech, 2011.
    • (2011) Proc. Interspeech
    • Deng, L.1    Yu, D.2
  • 108
    • 84867614591 scopus 로고    scopus 로고
    • Scalable stacking and learning for building deep architectures
    • Deng, L.; Yu, D.; Platt, J.: Scalable stacking and learning for building deep architectures, in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Deng, L.1    Yu, D.2    Platt, J.3
  • 109
    • 84867605416 scopus 로고    scopus 로고
    • Towards deep understanding: Deep convex networks for semantic utterance classification
    • Tur, G.; Deng, L.; Hakkani-Tür, D.; He, X.: Towards deep understanding: deep convex networks for semantic utterance classification, in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Tur, G.1    Deng, L.2    Hakkani-Tür, D.3    He, X.4
  • 110
    • 84877785043 scopus 로고    scopus 로고
    • Deep spatiotemporal architectures and learning for protein structure prediction
    • Lena, P.; Nagata, K.; Baldi, P.: Deep spatiotemporal architectures and learning for protein structure prediction, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Lena, P.1    Nagata, K.2    Baldi, P.3
  • 111
    • 84867606917 scopus 로고    scopus 로고
    • A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition
    • Hutchinson, B.; Deng, L.; Yu, D.: A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition, in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Hutchinson, B.1    Deng, L.2    Yu, D.3
  • 113
    • 0028256706 scopus 로고
    • Analysis of correlation structure for a neural predictive model with application to speech recognition
    • Deng, L.; Hassanein, K.; Elmasry, M.: Analysis of correlation structure for a neural predictive model with application to speech recognition. Neural Netw., 7 (2) (1994a), 331-339.
    • (1994) Neural Netw. , vol.7 , Issue.2 , pp. 331-339
    • Deng, L.1    Hassanein, K.2    Elmasry, M.3
  • 114
    • 0028392167 scopus 로고
    • An application of recurrent nets to phone probability estimation
    • Robinson, A.: An application of recurrent nets to phone probability estimation. IEEE Trans. Neural Netw., 5 (1994), 298-305.
    • (1994) IEEE Trans. Neural Netw. , vol.5 , pp. 298-305
    • Robinson, A.1
  • 115
    • 33749259827 scopus 로고    scopus 로고
    • Connectionist temporal classification: Labeling unsegmented sequence data with recurrent neural networks
    • Graves, A.; Fernandez, S.; Gomez, F.; Schmidhuber, J.: Connectionist temporal classification: labeling unsegmented sequence data with recurrent neural networks, in Proc. ICML, 2006.
    • (2006) Proc. ICML
    • Graves, A.1    Fernandez, S.2    Gomez, F.3    Schmidhuber, J.4
  • 116
    • 84890543083 scopus 로고    scopus 로고
    • Speech recognition with deep recurrent neural networks
    • Graves, A.; Mohamed, A.; Hinton, G.: Speech recognition with deep recurrent neural networks, in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Graves, A.1    Mohamed, A.2    Hinton, G.3
  • 118
    • 0032203257 scopus 로고    scopus 로고
    • Gradient-based learning applied to document recognition
    • LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE, 86 (1998), 2278-2324.
    • (1998) Proc. IEEE , vol.86 , pp. 2278-2324
    • LeCun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 119
    • 84877789057 scopus 로고    scopus 로고
    • Deep neural networks segment neuronal membranes in electron microscopy images
    • Ciresan, D.; Giusti, A.; Gambardella, L.; Schmidhuber, J.: Deep neural networks segment neuronal membranes in electron microscopy images, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Ciresan, D.1    Giusti, A.2    Gambardella, L.3    Schmidhuber, J.4
  • 120
    • 84877760312 scopus 로고    scopus 로고
    • Large scale distributed deep networks
    • Dean, J. et al.: Large scale distributed deep networks, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Dean, J.1
  • 121
    • 84876231242 scopus 로고    scopus 로고
    • ImageNet classification with deep convolutional neural networks
    • Krizhevsky, A.; Sutskever, I.; Hinton, G.: ImageNet classification with deep convolutional neural networks, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.3
  • 122
    • 84867605836 scopus 로고    scopus 로고
    • Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
    • Abdel-Hamid, O.; Mohamed, A.; Jiang, H.; Penn, G.: Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition, in ICASSP, 2012.
    • (2012) ICASSP
    • Abdel-Hamid, O.1    Mohamed, A.2    Jiang, H.3    Penn, G.4
  • 123
    • 84906214784 scopus 로고    scopus 로고
    • Exploring convolutional neural network structures and optimization for speech recognition
    • Abdel-Hamid, O.; Deng, L.; Yu, D.: Exploring convolutional neural network structures and optimization for speech recognition, in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Abdel-Hamid, O.1    Deng, L.2    Yu, D.3
  • 126
    • 84890545163 scopus 로고    scopus 로고
    • A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
    • Deng, L.; Abdel-Hamid, O.; Yu, D.: A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion, in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Deng, L.1    Abdel-Hamid, O.2    Yu, D.3
  • 127
    • 0025254722 scopus 로고
    • A time-delay neural network architecture for isolated word recognition
    • Lang, K.; Waibel, A.; Hinton, G.: A time-delay neural network architecture for isolated word recognition. Neural Netw., 3 (1) (1990), 23-43.
    • (1990) Neural Netw. , vol.3 , Issue.1 , pp. 23-43
    • Lang, K.1    Waibel, A.2    Hinton, G.3
  • 128
    • 84875923598 scopus 로고    scopus 로고
    • On Intelligence: How a New Understanding of the Brain will lead to the Creation of Truly Intelligent Machines
    • New York
    • Hawkins, J.; Blakeslee, S.: On Intelligence: How a New Understanding of the Brain will lead to the Creation of Truly Intelligent Machines, Times Books, New York, 2004.
    • (2004) Times Books
    • Hawkins, J.1    Blakeslee, S.2
  • 130
    • 84855358050 scopus 로고    scopus 로고
    • Hierarchical temporal memory including HTM cortical learning algorithms
    • December 10
    • Hawkins, J.; Ahmad, S.; Dubinsky, D.: Hierarchical Temporal Memory including HTM Cortical Learning Algorithms. Numenta Technical Report, December 10, 2010.
    • (2010) Numenta Technical Report
    • Hawkins, J.1    Ahmad, S.2    Dubinsky, D.3
  • 131
    • 33744917190 scopus 로고    scopus 로고
    • From knowledge-ignorant to knowledge-rich modeling: A new speech research paradigm for next-generation automatic speech recognition
    • Lee, C.-H.: From knowledge-ignorant to knowledge-rich modeling: A new speech research paradigm for next-generation automatic speech recognition, in Proc. ICSLP, 2004, 109-111.
    • (2004) Proc. ICSLP , pp. 109-111
    • Lee, C.-H.1
  • 132
    • 84867329143 scopus 로고    scopus 로고
    • Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition
    • Yu, D.; Siniscalchi, S.; Deng, L.; Lee, C.: Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition, in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Yu, D.1    Siniscalchi, S.2    Deng, L.3    Lee, C.4
  • 133
    • 84875405186 scopus 로고    scopus 로고
    • Exploiting deep neural networks for detection-based speech recognition
    • Siniscalchi, M.; Yu, D.; Deng, L.; Lee, C.-H.: Exploiting deep neural networks for detection-based speech recognition. Neurocomputing, 106 (2013), 148-157.
    • (2013) Neurocomputing , vol.106 , pp. 148-157
    • Siniscalchi, M.1    Yu, D.2    Deng, L.3    Lee, C.-H.4
  • 134
    • 84872967500 scopus 로고    scopus 로고
    • A bottom-up modular search approach to large vocabulary continuous speech recognition
    • Siniscalchi, M.; Svendsen, T.; Lee, C.-H.: A bottom-up modular search approach to large vocabulary continuous speech recognition. IEEE Trans. Audio Speech Lang. Process., 21 (2013), 786-797.
    • (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 786-797
    • Siniscalchi, M.1    Svendsen, T.2    Lee, C.-H.3
  • 135
    • 84867606668 scopus 로고    scopus 로고
    • Exploiting sparseness in deep neural networks for large vocabulary speech recognition
    • Yu, D.; Seide, F.; Li, G.; Deng, L.: Exploiting sparseness in deep neural networks for large vocabulary speech recognition, in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • Yu, D.1    Seide, F.2    Li, G.3    Deng, L.4
  • 136
    • 0028234947 scopus 로고
    • A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features
    • Deng, L.; Sun, D.: A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features. J. Acoust. Soc. Am., 95 (5) (1994), 2702-2719.
    • (1994) J. Acoust. Soc. Am. , vol.95 , Issue.5 , pp. 2702-2719
    • Deng, L.1    Sun, D.2
  • 137
    • 0036165806 scopus 로고    scopus 로고
    • An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition
    • Sun, J.; Deng, L.: An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition. J. Acoust. Soc. Am., 111 (2) (2002), 1086-1101.
    • (2002) J. Acoust. Soc. Am. , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.1    Deng, L.2
  • 138
    • 84886806651 scopus 로고    scopus 로고
    • Improving training time of deep belief networks through hybrid pre-training and larger batch sizes
    • December
    • Sainath, T.; Kingsbury, B.; Ramabhadran, B.: Improving training time of deep belief networks through hybrid pre-training and larger batch sizes, in Proc. NIPS Workshop on Log-linear Models, December 2012.
    • (2012) Proc. NIPS Workshop on Log-linear Models
    • Sainath, T.1    Kingsbury, B.2    Ramabhadran, B.3
  • 140
    • 70349213445 scopus 로고    scopus 로고
    • Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
    • Kingsbury, B.: Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling, in Proc. ICASSP, 2009.
    • (2009) Proc. ICASSP
    • Kingsbury, B.1
  • 141
    • 84878379108 scopus 로고    scopus 로고
    • Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
    • Kingsbury, B.; Sainath, T.; Soltau, H.: Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization, in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Kingsbury, B.1    Sainath, T.2    Soltau, H.3
  • 142
    • 56449110012 scopus 로고    scopus 로고
    • Classification using discriminative restricted Boltzmann machines
    • Larochelle, H.; Bengio, Y.: Classification using discriminative restricted Boltzmann machines, in Proc. ICML, 2008.
    • (2008) Proc. ICML
    • Larochelle, H.1    Bengio, Y.2
  • 143
    • 80053540444 scopus 로고    scopus 로고
    • Unsupervised learning of hierarchical representations with convolutional deep belief networks
    • October
    • Lee, H.; Grosse, R.; Ranganath, R.; Ng, A.: Unsupervised learning of hierarchical representations with convolutional deep belief networks, Communications of the ACM, Vol. 54, No. 10, October 2011, pp. 95-103.
    • (2011) Communications of the ACM , vol.54 , Issue.10 , pp. 95-103
    • Lee, H.1    Grosse, R.2    Ranganath, R.3    Ng, A.4
  • 144
    • 71149119164 scopus 로고    scopus 로고
    • Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
    • Lee, H.; Grosse, R.; Ranganath, R.; Ng, A.: Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations, Proc. ICML, 2009.
    • (2009) Proc. ICML
    • Lee, H.1    Grosse, R.2    Ranganath, R.3    Ng, A.4
  • 145
    • 77956502334 scopus 로고    scopus 로고
    • Unsupervised feature learning for audio classification using convolutional deep belief networks
    • Lee, H.; Largman, Y.; Pham, P.; Ng, A.: Unsupervised feature learning for audio classification using convolutional deep belief networks, Proc. NIPS, 2010.
    • (2010) Proc. NIPS
    • Lee, H.1    Largman, Y.2    Pham, P.3    Ng, A.4
  • 146
    • 80052877144 scopus 로고    scopus 로고
    • On deep generative models with applications to recognition
    • Ranzato, M.; Susskind, J.; Mnih, V.; Hinton, G.: On deep generative models with applications to recognition, in Proc. CVPR, 2011.
    • (2011) Proc. CVPR
    • Ranzato, M.1    Susskind, J.2    Mnih, V.3    Hinton, G.4
  • 147
    • 0032654483 scopus 로고    scopus 로고
    • Speech translation: Coupling of recognition and translation
    • Ney, H.: Speech translation: coupling of recognition and translation, in Proc. ICASSP, 1999.
    • (1999) Proc. ICASSP
    • Ney, H.1
  • 148
    • 85032751114 scopus 로고    scopus 로고
    • Speech recognition, machine translation, and speech translation - A unifying discriminative framework
    • He, X.; Deng, L.: Speech recognition, machine translation, and speech translation - A unifying discriminative framework. IEEE Signal Process. Mag., 28 (2011), 126-133.
    • (2011) IEEE Signal Process. Mag. , vol.28 , pp. 126-133
    • He, X.1    Deng, L.2
  • 149
    • 66149085249 scopus 로고    scopus 로고
    • An integrative and discriminative technique for spoken utterance classification
    • Yamin, S.; Deng, L.; Wang, Y.; Acero, A.: An integrative and discriminative technique for spoken utterance classification. IEEE Trans. Audio Speech Lang. Process., 16 (2008), 1207-1214.
    • (2008) IEEE Trans. Audio Speech Lang. Process. , vol.16 , pp. 1207-1214
    • Yamin, S.1    Deng, L.2    Wang, Y.3    Acero, A.4
  • 150
    • 84867608216 scopus 로고    scopus 로고
    • Optimization in speech-centric information processing: Criteria and techniques
    • He, X.; Deng, L.: Optimization in speech-centric information processing: criteria and techniques, in Proc. ICASSP, 2012.
    • (2012) Proc. ICASSP
    • He, X.1    Deng, L.2
  • 151
    • 84876669905 scopus 로고    scopus 로고
    • Speech-centric information processing: An optimization-oriented approach
    • He, X.; Deng, L.: Speech-centric information processing: An optimization-oriented approach, in Proc. IEEE, 2013.
    • (2013) Proc. IEEE
    • He, X.1    Deng, L.2
  • 152
    • 84890494546 scopus 로고    scopus 로고
    • Deep stacking networks for information retrieval
    • Deng, L.; He, X.; Gao, J.: Deep stacking networks for information retrieval, in Proc. ICASSP, 2013a.
    • (2013) Proc. ICASSP
    • Deng, L.1    He, X.2    Gao, J.3
  • 153
    • 84890486619 scopus 로고    scopus 로고
    • Multi-style adaptive training for robust cross-lingual spoken language understanding
    • He, X.; Deng, L.; Tur, G.; Hakkani-Tur, D.: Multi-style adaptive training for robust cross-lingual spoken language understanding, in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • He, X.1    Deng, L.2    Tur, G.3    Hakkani-Tur, D.4
  • 155
    • 84865801985 scopus 로고    scopus 로고
    • Conversational speech transcription using context-dependent deep neural networks
    • Seide, F.; Li, G.; Yu, D.: Conversational speech transcription using context-dependent deep neural networks. Proc. Interspeech, (2011), 437-440.
    • (2011) Proc. Interspeech , pp. 437-440
    • Seide, F.1    Li, G.2    Yu, D.3
  • 156
    • 84906225757 scopus 로고    scopus 로고
    • A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
    • Yan, Z.; Huo, Q.; Xu, J.: A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR, in Proc. Interspeech, 2013.
    • (2013) Proc. Interspeech
    • Yan, Z.1    Huo, Q.2    Xu, J.3
  • 157
    • 84899000641 scopus 로고    scopus 로고
    • Exponential family harmoniums with an application to information retrieval
    • Welling, M.; Rosen-Zvi, M.; Hinton, G.: Exponential family harmoniums with an application to information retrieval. Proc. NIPS, vol. 20 (2005).
    • (2005) Proc. NIPS , vol.20
    • Welling, M.1    Rosen-Zvi, M.2    Hinton, G.3
  • 158
    • 78650474133 scopus 로고    scopus 로고
    • A practical guide to training restricted Boltzmann machines
    • University of Toronto, August
    • Hinton, G.: A practical guide to training restricted Boltzmann machines. UTML Technical Report 2010-003, University of Toronto, August 2010.
    • (2010) UTML Technical Report 2010-003
    • Hinton, G.1
  • 159
    • 0026692226 scopus 로고
    • Stacked generalization
    • Wolpert, D.: Stacked generalization. Neural Netw., 5 (2) (1992), 241-259.
    • (1992) Neural Netw. , vol.5 , Issue.2 , pp. 241-259
    • Wolpert, D.1
  • 160
    • 84880708659 scopus 로고    scopus 로고
    • Stacked sequential learning
    • Cohen, W.; de Carvalho, R.V.: Stacked sequential learning, in Proc. IJCAI, 2005, 671-676.
    • (2005) Proc. IJCAI , pp. 671-676
    • Cohen, W.1    De Carvalho, R.V.2
  • 162
    • 84897497795 scopus 로고    scopus 로고
    • On the difficulty of training recurrent neural networks
    • Pascanu, R.; Mikolov, T.; Bengio, Y.: On the difficulty of training recurrent neural networks, in Proc. ICML, 2013.
    • (2013) Proc. ICML
    • Pascanu, R.1    Mikolov, T.2    Bengio, Y.3
  • 163
    • 0033623527 scopus 로고    scopus 로고
    • Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics
    • Deng, L.; Ma, J.: Spontaneous speech recognition using a statistical coarticulatory model for the vocal tract resonance dynamics. J. Acoust. Soc. Am., 108 (2000), 3036-3048.
    • (2000) J. Acoust. Soc. Am. , vol.108 , pp. 3036-3048
    • Deng, L.1    Ma, J.2
  • 164
    • 0344443787 scopus 로고    scopus 로고
    • Joint state and parameter estimation for a target-directed nonlinear dynamic system model
    • Togneri, R.; Deng, L.: Joint state and parameter estimation for a target-directed nonlinear dynamic system model. IEEE Trans. Signal Process., 51 (12) (2003), 3061-3070.
    • (2003) IEEE Trans. Signal Process. , vol.51 , Issue.12 , pp. 3061-3070
    • Togneri, R.1    Deng, L.2
  • 167
    • 84878534913 scopus 로고    scopus 로고
    • Integrating deep neural networks into structural classification approach based on weight finite-state transducers
    • Kubo, Y.; Hori, T.; Nakamura, A.: Integrating deep neural networks into structural classification approach based on weight finite-state transducers, in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Kubo, Y.1    Hori, T.2    Nakamura, A.3
  • 168
    • 84890491198 scopus 로고    scopus 로고
    • Recent advances in deep learning for speech research at Microsoft
    • Deng, L. et al.: Recent advances in deep learning for speech research at Microsoft, in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Deng, L.1
  • 169
    • 84890526837 scopus 로고    scopus 로고
    • New types of deep neural network learning for speech recognition and related applications: An overview
    • Deng, L.; Hinton, G.; Kingsbury, B.: New types of deep neural network learning for speech recognition and related applications: An overview, in Proc. ICASSP, 2013.
    • (2013) Proc. ICASSP
    • Deng, L.1    Hinton, G.2    Kingsbury, B.3
  • 170
    • 0022691022 scopus 로고
    • Maximum likelihood estimation for multivariate mixture observations of Markov chains
    • Juang, B.; Levinson, S.; Sondhi, M.: Maximum likelihood estimation for multivariate mixture observations of Markov chains. IEEE Trans. Inf. Theory, 32 (1986), 307-309.
    • (1986) IEEE Trans. Inf. Theory , vol.32 , pp. 307-309
    • Juang, B.1    Levinson, S.2    Sondhi, M.3
  • 171
    • 10244257175 scopus 로고
    • Large vocabulary word recognition using context-dependent allophonic hidden Markov models
    • Deng, L.; Lennig, M.; Seitz, F.; Mermelstein, P.: Large vocabulary word recognition using context-dependent allophonic hidden Markov models. Comput. Speech Lang., 4 (4) (1990), 345-357.
    • (1990) Comput. Speech Lang. , vol.4 , Issue.4 , pp. 345-357
    • Deng, L.1    Lennig, M.2    Seitz, F.3    Mermelstein, P.4
  • 172
    • 0026189555 scopus 로고
    • Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition
    • Deng, L.; Kenny, P.; Lennig, M.; Gupta, V.; Seitz, F.; Mermelstein, P.: Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition. IEEE Trans. Signal Process, 39 (7) (1991), 1677-1681.
    • (1991) IEEE Trans. Signal Process , vol.39 , Issue.7 , pp. 1677-1681
    • Deng, L.1    Kenny, P.2    Lennig, M.3    Gupta, V.4    Seitz, F.5    Mermelstein, P.6
  • 173
    • 0028195651 scopus 로고
    • Waveform-based speech recognition using hidden filter models: Parameter selection and sensitivity to power normalization
    • Sheikhzadeh, H.; Deng, L.: Waveform-based speech recognition using hidden filter models: parameter selection and sensitivity to power normalization. IEEE Trans. Speech Audio Process., 2 (1994), 80-91.
    • (1994) IEEE Trans. Speech Audio Process. , vol.2 , pp. 80-91
    • Sheikhzadeh, H.1    Deng, L.2
  • 174
    • 80051609011 scopus 로고    scopus 로고
    • Learning a better representation of speech sound waves using restricted Boltzmann machines
    • Jaitly, N.; Hinton, G.: Learning a better representation of speech sound waves using restricted Boltzmann machines, in Proc. ICASSP, 2011.
    • (2011) Proc. ICASSP
    • Jaitly, N.1    Hinton, G.2
  • 175
    • 84858972572 scopus 로고    scopus 로고
    • Making deep belief networks effective for large vocabulary continuous speech recognition
    • Sainath, T.; Kingsbury, B.; Ramabhadran, B.; Novak, P.; Mohamed, A.: Making deep belief networks effective for large vocabulary continuous speech recognition, in Proc. IEEE ASRU, 2011.
    • (2011) Proc. IEEE ASRU
    • Sainath, T.1    Kingsbury, B.2    Ramabhadran, B.3    Novak, P.4    Mohamed, A.5
  • 176
    • 84878539964 scopus 로고    scopus 로고
    • Application of pre-trained deep neural networks to large vocabulary speech recognition
    • Jaitly, N.; Nguyen, P.; Vanhoucke, V.: Application of pre-trained deep neural networks to large vocabulary speech recognition, in Proc. Interspeech, 2012.
    • (2012) Proc. Interspeech
    • Jaitly, N.1    Nguyen, P.2    Vanhoucke, V.3
  • 178
    • 84055163920 scopus 로고    scopus 로고
    • Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition
    • Yu, D.; Deng, L.; Dahl, G.: Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition, in Proc. NIPS Workshop, 2010.
    • (2010) Proc. NIPS Workshop
    • Yu, D.1    Deng, L.2    Dahl, G.3
  • 179
    • 85008521116 scopus 로고    scopus 로고
    • Calibration of confidence measures in speech recognition
    • Yu, D.; Li, J.-Y.; Deng, L.: Calibration of confidence measures in speech recognition. IEEE Trans. Audio Speech Lang., 19 (2010), 2461-2473.
    • (2010) IEEE Trans. Audio Speech Lang. , vol.19 , pp. 2461-2473
    • Yu, D.1    Li, J.-Y.2    Deng, L.3
  • 181
    • 84901237776 scopus 로고    scopus 로고
    • Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis
    • Ling, Z.; Deng, L.; Yu, D.: Modeling spectral envelopes using restricted Boltzmann machines and deep belief networks for statistical parametric speech synthesis. IEEE Trans. Audio Speech Lang. Process., 21 (10) (2013), 2129-2139.
    • (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , Issue.10 , pp. 2129-2139
    • Ling, Z.1    Deng, L.2    Yu, D.3
  • 182
    • 84890527090 scopus 로고    scopus 로고
    • Multi-distribution deep belief network for speech synthesis
    • Kang, S.; Qian, X.; Meng, H.: Multi-distribution deep belief network for speech synthesis, in Proc. ICASSP, 2013, 8012-8016.
    • (2013) Proc. ICASSP , pp. 8012-8016
    • Kang, S.1    Qian, X.2    Meng, H.3
  • 183
    • 84890490547 scopus 로고    scopus 로고
    • Statistical parametric speech synthesis using deep neural networks
    • Zen, H.; Senior, A.; Schuster, M.: Statistical parametric speech synthesis using deep neural networks, in Proc. ICASSP, 2013, 7962-7966.
    • (2013) Proc. ICASSP , pp. 7962-7966
    • Zen, H.1    Senior, A.2    Schuster, M.3
  • 184
    • 84890522099 scopus 로고    scopus 로고
    • F0 contour prediction with a deep belief network-Gaussian process hybrid model
    • Fernandez, R.; Rendel, A.; Ramabhadran, B.; Hoory, R.: F0 contour prediction with a deep belief network-Gaussian process hybrid Model, in Proc. ICASSP, 2013, 6885-6889.
    • (2013) Proc. ICASSP , pp. 6885-6889
    • Fernandez, R.1    Rendel, A.2    Ramabhadran, B.3    Hoory, R.4
  • 185
    • 84873453413 scopus 로고    scopus 로고
    • Moving beyond feature design: Deep architectures and automatic feature learning in music informatics
    • Humphrey, E.; Bello, J.; LeCun, Y.: Moving beyond feature design: deep architectures and automatic feature learning in music informatics, in Proc. ISMIR, 2012.
    • (2012) Proc. ISMIR
    • Humphrey, E.1    Bello, J.2    LeCun, Y.3
  • 186
    • 84873426072 scopus 로고    scopus 로고
    • Analyzing drum patterns using conditional deep belief networks
    • Battenberg, E.; Wessel, D.: Analyzing drum patterns using conditional deep belief networks, in Proc. ISMIR, 2012.
    • (2012) Proc. ISMIR
    • Battenberg, E.1    Wessel, D.2
  • 188
    • 78149306047 scopus 로고    scopus 로고
    • 3-d object recognition with deep belief nets
    • Nair, V.; Hinton, G.: 3-d object recognition with deep belief nets, in Proc. NIPS, 2009.
    • (2009) Proc. NIPS
    • Nair, V.1    Hinton, G.2
  • 189
    • 0002263996 scopus 로고
    • Convolutional networks for images, speech, and time series
    • (M. A. Arbib, ed.), MIT Press, Cambridge, Massachusetts
    • LeCun, Y.; Bengio, Y.: Convolutional networks for images, speech, and time series, in The Handbook of Brain Theory and Neural Networks (M. A. Arbib, ed.), 255-258, MIT Press, Cambridge, Massachusetts, 1995.
    • (1995) The Handbook of Brain Theory and Neural Networks , pp. 255-258
    • LeCun, Y.1    Bengio, Y.2
  • 191
    • 85083954484 scopus 로고    scopus 로고
    • Stochastic pooling for regularization of deep convolutional neural networks
    • Zeiler, M.; Fergus, R.: Stochastic pooling for regularization of deep convolutional neural networks, in Proc. ICLR, 2013.
    • (2013) Proc. ICLR
    • Zeiler, M.1    Fergus, R.2
  • 192
    • 84873600957 scopus 로고    scopus 로고
    • Learning invariant feature hierarchies
    • LeCun, Y.: Learning invariant feature hierarchies, in Proc. ECCV, 2012.
    • (2012) Proc. ECCV
    • LeCun, Y.1
  • 194
    • 69849103259 scopus 로고    scopus 로고
    • Adaptive multimodal fusion by uncertainty compensation with application to audiovisual speech recognition
    • Papandreou, G.; Katsamanis, A.; Pitsikalis, V.; Maragos, P.: Adaptive multimodal fusion by uncertainty compensation with application to audiovisual speech recognition. IEEE Trans. Audio Speech Lang. Process., 17 (3) (2009), 423-435.
    • (2009) IEEE Trans. Audio Speech Lang. Process. , vol.17 , Issue.3 , pp. 423-435
    • Papandreou, G.1    Katsamanis, A.2    Pitsikalis, V.3    Maragos, P.4
  • 195
    • 18744401086 scopus 로고    scopus 로고
    • Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
    • Deng, L.; Wu, J.; Droppo, J.; Acero, A.: Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion. IEEE Trans. Speech Audio Process., 13 (3) (2005), 412-421.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.3 , pp. 412-421
    • Deng, L.1    Wu, J.2    Droppo, J.3    Acero, A.4
  • 198
    • 34547970628 scopus 로고    scopus 로고
    • Three new graphical models for statistical language modeling
    • Mnih, A.; Hinton, G.: Three new graphical models for statistical language modeling, in Proc. ICML, 2007, 641-648.
    • (2007) Proc. ICML , pp. 641-648
    • Mnih, A.1    Hinton, G.2
  • 199
    • 84858779990 scopus 로고    scopus 로고
    • A scalable hierarchical distributed language model
    • Mnih, A.; Hinton, G.: A scalable hierarchical distributed language model, in Proc. NIPS, 2008, 1081-1088.
    • (2008) Proc. NIPS , pp. 1081-1088
    • Mnih, A.1    Hinton, G.2
  • 200
    • 80053276362 scopus 로고    scopus 로고
    • Training continuous space language models: Some practical issues
    • Le, H.; Allauzen, A.; Wisniewski, G.; Yvon, F.: Training continuous space language models: some practical issues, in Proc. EMNLP, 2010, 778-788.
    • (2010) Proc. EMNLP , pp. 778-788
    • Le, H.1    Allauzen, A.2    Wisniewski, G.3    Yvon, F.4
  • 204
    • 77956280276 scopus 로고    scopus 로고
    • Hierarchical Bayesian language models for conversational speech recognition
    • Huang, S.; Renals, S.: Hierarchical Bayesian language models for conversational speech recognition. IEEE Trans. Audio Speech Lang. Process., 18 (8) (2010), 1941-1954.
    • (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.8 , pp. 1941-1954
    • Huang, S.1    Renals, S.2
  • 205
    • 56449095373 scopus 로고    scopus 로고
    • A unified architecture for natural language processing: Deep neural networks with multitask learning
    • Collobert, R.; Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning, in Proc. ICML, 2008.
    • (2008) Proc. ICML
    • Collobert, R.1    Weston, J.2
  • 209
    • 84878180089 scopus 로고    scopus 로고
    • Improving word representations via global context and multiple word prototypes
    • Huang, E.; Socher, R.; Manning, C.; Ng, A.: Improving word representations via global context and multiple word prototypes, in Proc. ACL, 2012.
    • (2012) Proc. ACL
    • Huang, E.1    Socher, R.2    Manning, C.3    Ng, A.4
  • 210
    • 84926285904 scopus 로고    scopus 로고
    • Bilingual word embeddings for phrase-based machine translation
    • Zou, W.; Socher, R.; Cer, D.; Manning, C.: Bilingual word embeddings for phrase-based machine translation, in Proc. EMNLP, 2013.
    • (2013) Proc. EMNLP
    • Zou, W.1    Socher, R.2    Cer, D.3    Manning, C.4
  • 212
    • 80053261327 scopus 로고    scopus 로고
    • Semisupervised recursive autoencoders for predicting sentiment distributions
    • Socher, R.; Pennington, J.; Huang, E.; Ng, A.; Manning, C.: Semisupervised recursive autoencoders for predicting sentiment distributions, in Proc. EMNLP, 2011.
    • (2011) Proc. EMNLP
    • Socher, R.1    Pennington, J.2    Huang, E.3    Ng, A.4    Manning, C.5
  • 213
    • 85162476102 scopus 로고    scopus 로고
    • Dynamic pooling and unfolding recursive autoencoders for paraphrase detection
    • Socher, R.; Pennington, J.; Huang, E.; Ng, A.; Manning, C.: Dynamic pooling and unfolding recursive autoencoders for paraphrase detection, in Proc. NIPS, 2011.
    • (2011) Proc. NIPS
    • Socher, R.1    Pennington, J.2    Huang, E.3    Ng, A.4    Manning, C.5
  • 215
    • 84871387302 scopus 로고    scopus 로고
    • The deep tensor neural network with applications to large vocabulary speech recognition
    • Yu, D.; Deng, L.; Seide, F.: The deep tensor neural network with applications to large vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process., 21 (2013), 388-396.
    • (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , pp. 388-396
    • Yu, D.1    Deng, L.2    Seide, F.3
  • 217
    • 79961245273 scopus 로고    scopus 로고
    • Discovering binary codes for documents by learning deep generative models
    • Hinton, G.; Salakhutdinov, R.: Discovering binary codes for documents by learning deep generative models. Top. Cognit. Sci., (2010), 1-18.
    • (2010) Top. Cognit. Sci. , pp. 1-18
    • Hinton, G.1    Salakhutdinov, R.2
  • 220
    • 84899022736 scopus 로고    scopus 로고
    • Large scale online learning
    • Bottou, L.; LeCun, Y.: Large scale online learning, in Proc. NIPS, 2004.
    • (2004) Proc. NIPS
    • Bottou, L.1    LeCun, Y.2
  • 221
    • 84857855190 scopus 로고    scopus 로고
    • Random search for hyper-parameter optimization
    • Bergstra, J.; Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res., 13 (2012), 281-305.
    • (2012) J. Mach. Learn. Res. , vol.13 , pp. 281-305
    • Bergstra, J.1    Bengio, Y.2
  • 222
    • 84869201485 scopus 로고    scopus 로고
    • Practical Bayesian optimization of machine learning algorithms
    • Snoek, J.; Larochelle, H.; Adams, R.: Practical Bayesian optimization of machine learning algorithms, in Proc. NIPS, 2012.
    • (2012) Proc. NIPS
    • Snoek, J.1    Larochelle, H.2    Adams, R.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.