2015, Pages 153-195

Deep dynamic models for learning hidden representations of speech features

Author keywords

[No Author keywords available]

Indexed keywords


EID: 84944075741     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.1007/978-1-4939-1456-2_6     Document Type: Book
Times cited : (16)

References (112)
  • 2
    • 0040856612
    • Stochastic modeling for automatic speech recognition
    • ed. by D. Reddy (Academic, New York)
    • J. Baker, Stochastic modeling for automatic speech recognition, in Speech Recognition, ed. by D. Reddy (Academic, New York, 1976)
    • (1976) Speech Recognition
    • Baker, J.1
  • 5
    • 0000342467
    • Statistical inference for probabilistic functions of finite state Markov chains
    • L. Baum, T. Petrie, Statistical inference for probabilistic functions of finite state Markov chains. Ann. Math. Stat. 37(6), 1554–1563 (1966)
    • (1966) Ann. Math. Stat , vol.37 , Issue.6 , pp. 1554-1563
    • Baum, L.1    Petrie, T.2
  • 8
    • 0038021376
    • Buried Markov models: A graphical modeling approach to automatic speech recognition
    • J. Bilmes, Buried Markov models: a graphical modeling approach to automatic speech recognition. Comput. Speech Lang. 17, 213–231 (2003)
    • (2003) Comput. Speech Lang , vol.17 , pp. 213-231
    • Bilmes, J.1
  • 13
    • 85083950550
    • A primal-dual method for training recurrent neural networks constrained by the echo-state property
    • J. Chen, L. Deng, A primal-dual method for training recurrent neural networks constrained by the echo-state property, in Proceedings of ICLR (2014)
    • (2014) Proceedings of ICLR
    • Chen, J.1    Deng, L.2
  • 16
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • G. Dahl, D. Yu, L. Deng, A. Acero, Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20(1), 30–42 (2012)
    • (2012) IEEE Trans. Audio Speech Lang. Process , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 17
    • 0002629270
    • Maximum-likelihood from incomplete data via the EM algorithm
    • A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum-likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B. 39, 1–38 (1977)
    • (1977) J. R. Stat. Soc. Ser. B , vol.39 , pp. 1-38
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 18
    • 0026854213
    • A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal
    • L. Deng, A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal. Signal Process. 27(1), 65–78 (1992)
    • (1992) Signal Process , vol.27 , Issue.1 , pp. 65-78
    • Deng, L.1
  • 19
    • 0032119268
    • A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition
    • L. Deng, A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition. Speech Commun. 24(4), 299–323 (1998)
    • (1998) Speech Commun , vol.24 , Issue.4 , pp. 299-323
    • Deng, L.1
  • 20
    • 0039503389
    • Articulatory features and associated production models in statistical speech recognition
    • Springer, New York
    • L. Deng, Articulatory features and associated production models in statistical speech recognition, in Computational Models of Speech Pattern Processing (Springer, New York, 1999), pp. 214–224
    • (1999) Computational Models of Speech Pattern Processing , pp. 214-224
    • Deng, L.1
  • 21
    • 0039503389
    • Computational models for speech production
    • Springer, New York
    • L. Deng, Computational models for speech production, in Computational Models of Speech Pattern Processing (Springer, New York, 1999), pp. 199–213
    • (1999) Computational Models of Speech Pattern Processing , pp. 199-213
    • Deng, L.1
  • 22
    • 33744966595
    • Switching dynamic system models for speech articulation and acoustics
    • Springer, New York
    • L. Deng, Switching dynamic system models for speech articulation and acoustics, in Mathematical Foundations of Speech and Language Processing (Springer, New York, 2003), pp. 115–134
    • (2003) Mathematical Foundations of Speech and Language Processing , pp. 115-134
    • Deng, L.1
  • 24
    • 0028516022
    • Speech recognition using hidden Markov models with polynomial regression functions as non-stationary states
    • L. Deng, M. Aksmanovic, D. Sun, J. Wu, Speech recognition using hidden Markov models with polynomial regression functions as non-stationary states. IEEE Trans. Acoust. Speech Signal Process. 2(4), 101–119 (1994)
    • (1994) IEEE Trans. Acoust. Speech Signal Process , vol.2 , Issue.4 , pp. 101-119
    • Deng, L.1    Aksmanovic, M.2    Sun, D.3    Wu, J.4
  • 25
    • 84905280906
    • Sequence classification using high-level features extracted from deep neural networks
    • L. Deng, J. Chen, Sequence classification using high-level features extracted from deep neural networks, in Proceedings of ICASSP (2014)
    • (2014) Proceedings of ICASSP
    • Deng, L.1    Chen, J.2
  • 27
    • 0028256706
    • Analysis of the correlation structure for a neural predictive model with application to speech recognition
    • L. Deng, K. Hassanein, M. Elmasry, Analysis of the correlation structure for a neural predictive model with application to speech recognition. Neural Netw. 7(2), 331–339 (1994)
    • (1994) Neural Netw , vol.7 , Issue.2 , pp. 331-339
    • Deng, L.1    Hassanein, K.2    Elmasry, M.3
  • 28
    • 84890526837
    • New types of deep neural network learning for speech recognition and related applications: An overview
    • L. Deng, G. Hinton, B. Kingsbury, New types of deep neural network learning for speech recognition and related applications: an overview, in Proceedings of IEEE ICASSP, Vancouver, 2013
    • (2013) Proceedings of IEEE ICASSP, Vancouver
    • Deng, L.1    Hinton, G.2    Kingsbury, B.3
  • 29
    • 84890468916
    • Deep learning for speech recognition and related applications
    • L. Deng, G. Hinton, D. Yu, Deep learning for speech recognition and related applications, in NIPS Workshop, Whistler, 2009
    • (2009) NIPS Workshop, Whistler
    • Deng, L.1    Hinton, G.2    Yu, D.3
  • 30
    • 0026189555
    • Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition
    • L. Deng, P. Kenny, M. Lennig, V. Gupta, F. Seitz, P. Mermelstein, Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition. IEEE Trans. Acoust. Speech Signal Process. 39(7), 1677–1681 (1991)
    • (1991) IEEE Trans. Acoust. Speech Signal Process , vol.39 , Issue.7 , pp. 1677-1681
    • Deng, L.1    Kenny, P.2    Lennig, M.3    Gupta, V.4    Seitz, F.5    Mermelstein, P.6
  • 31
    • 34547517867
    • Adaptive Kalman filtering and smoothing for tracking vocal tract resonances using a continuous-valued hidden dynamic model
    • L. Deng, L. Lee, H. Attias, A. Acero, Adaptive Kalman filtering and smoothing for tracking vocal tract resonances using a continuous-valued hidden dynamic model. IEEE Trans. Audio Speech Lang. Process. 15(1), 13–23 (2007)
    • (2007) IEEE Trans. Audio Speech Lang. Process , vol.15 , Issue.1 , pp. 13-23
    • Deng, L.1    Lee, L.2    Attias, H.3    Acero, A.4
  • 32
    • 10244257175
    • Large vocabulary word recognition using context-dependent allophonic hidden Markov models
    • L. Deng, M. Lennig, F. Seitz, P. Mermelstein, Large vocabulary word recognition using context-dependent allophonic hidden Markov models. Comput. Speech Lang. 4, 345–357 (1991)
    • (1991) Comput. Speech Lang , vol.4 , pp. 345-357
    • Deng, L.1    Lennig, M.2    Seitz, F.3    Mermelstein, P.4
  • 33
    • 84876672166
    • Machine learning paradigms in speech recognition: An overview
    • L. Deng, X. Li, Machine learning paradigms in speech recognition: an overview. IEEE Trans. Audio Speech Lang. Process. 21(5), 1060–1089 (2013)
    • (2013) IEEE Trans. Audio Speech Lang. Process , vol.21 , Issue.5 , pp. 1060-1089
    • Deng, L.1    Li, X.2
  • 34
    • 0003911245
    • A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
    • L. Deng, J. Ma, A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics, in EUROSPEECH (1999), pp. 1499–1502
    • (1999) EUROSPEECH , pp. 1499-1502
    • Deng, L.1    Ma, J.2
  • 35
    • 0033623527
    • Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
    • L. Deng, J. Ma, Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics. J. Acoust. Soc. Am. 108, 3036–3048 (2000)
    • (2000) J. Acoust. Soc. Am , vol.108 , pp. 3036-3048
    • Deng, L.1    Ma, J.2
  • 37
    • 0031198059
    • Production models as a structural basis for automatic speech recognition
    • L. Deng, G. Ramsay, D. Sun, Production models as a structural basis for automatic speech recognition. Speech Commun. 33(2–3), 93–111 (1997)
    • (1997) Speech Commun , vol.33 , Issue.2-3 , pp. 93-111
    • Deng, L.1    Ramsay, G.2    Sun, D.3
  • 39
    • 33744966561
    • A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition
    • L. Deng, D. Yu, A. Acero, A bidirectional target filtering model of speech coarticulation: two-stage implementation for phonetic recognition. IEEE Trans. Speech Audio Process. 14, 256–265 (2006)
    • (2006) IEEE Trans. Speech Audio Process , vol.14 , pp. 256-265
    • Deng, L.1    Yu, D.2    Acero, A.3
  • 43
    • 85032752250
    • Bayesian nonparametric methods for learning Markov switching processes
    • E. Fox, E. Sudderth, M. Jordan, A. Willsky, Bayesian nonparametric methods for learning Markov switching processes. IEEE Signal Process. Mag. 27(6), 43–54 (2010)
    • (2010) IEEE Signal Process. Mag , vol.27 , Issue.6 , pp. 43-54
    • Fox, E.1    Sudderth, E.2    Jordan, M.3    Willsky, A.4
  • 44
    • 85009074657
    • Algonquin: Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
    • B. Frey, L. Deng, A. Acero, T. Kristjansson, Algonquin: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition, in Proceedings of Eurospeech (2000)
    • (2000) Proceedings of Eurospeech
    • Frey, B.1    Deng, L.2    Acero, A.3    Kristjansson, T.4
  • 45
    • 0030245128
    • Robust continuous speech recognition using parallel model combination
    • M. Gales, S. Young, Robust continuous speech recognition using parallel model combination. IEEE Trans. Speech Audio Process. 4(5), 352–359 (1996)
    • (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.5 , pp. 352-359
    • Gales, M.1    Young, S.2
  • 46
    • 0034170950
    • Variational learning for switching state-space models
    • Z. Ghahramani, G.E. Hinton, Variational learning for switching state-space models. Neural Comput. 12, 831–864 (2000)
    • (2000) Neural Comput , vol.12 , pp. 831-864
    • Ghahramani, Z.1    Hinton, G.E.2
  • 50
    • 78650474133
    • A practical guide to training restricted Boltzmann machines
    • Machine Learning Group, University of Toronto, 2010
    • G. E. Hinton, “A practical guide to training restricted Boltzmann machines,” in Technical report 2010-003, Machine Learning Group, University of Toronto, 2010.
    • Technical Report 2010-003
    • Hinton, G.E.1
  • 52
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G. Hinton, S. Osindero, Y. Teh, A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)
    • (2006) Neural Comput , vol.18 , pp. 1527-1554
    • Hinton, G.1    Osindero, S.2    Teh, Y.3
  • 53
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • G. Hinton, R. Salakhutdinov, Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.1    Salakhutdinov, R.2
  • 54
    • 0032673963
    • Probabilistic-trajectory segmental HMMs
    • W. Holmes, M. Russell, Probabilistic-trajectory segmental HMMs. Comput. Speech Lang. 13, 3–37 (1999)
    • (1999) Comput. Speech Lang , vol.13 , pp. 3-37
    • Holmes, W.1    Russell, M.2
  • 56
    • 33749833931
    • Tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the “echo state network” approach. GMD Report 159
    • H. Jaeger, Tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the “echo state network” approach. GMD Report 159, GMD - German National Research Institute for Computer Science (2002)
    • (2002) GMD - German National Research Institute for Computer Science
    • Jaeger, H.1
  • 57
    • 0016939124
    • Continuous speech recognition by statistical methods
    • F. Jelinek, Continuous speech recognition by statistical methods. Proc. IEEE 64(4), 532–557 (1976)
    • (1976) Proc. IEEE , vol.64 , Issue.4 , pp. 532-557
    • Jelinek, F.1
  • 58
    • 0022691022
    • Maximum likelihood estimation for mixture multivariate stochastic observations of Markov chains
    • B.-H. Juang, S.E. Levinson, M.M. Sondhi, Maximum likelihood estimation for mixture multivariate stochastic observations of Markov chains. IEEE Trans. Inf. Theory 32(2), 307–309 (1986)
    • (1986) IEEE Trans. Inf. Theory , vol.32 , Issue.2 , pp. 307-309
    • Juang, B.-H.1    Levinson, S.E.2    Sondhi, M.M.3
  • 59
    • 84878379108
    • Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
    • B. Kingsbury, T. Sainath, H. Soltau, Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization, in Proceedings of Interspeech (2012)
    • (2012) Proceedings of Interspeech
    • Kingsbury, B.1    Sainath, T.2    Soltau, H.3
  • 62
    • 0034842603
    • A functional articulatory dynamic model for speech production
    • L.J. Lee, P. Fieguth, L. Deng, A functional articulatory dynamic model for speech production, in Proceedings of ICASSP, Salt Lake City, vol. 2, 2001, pp. 797–800
    • (2001) Proceedings of ICASSP, Salt Lake City , vol.2 , pp. 797-800
    • Lee, L.J.1    Fieguth, P.2    Deng, L.3
  • 63
    • 84897953008
    • Temporally varying weight regression: A semi-parametric trajectory model for automatic speech recognition
    • S. Liu, K. Sim, Temporally varying weight regression: a semi-parametric trajectory model for automatic speech recognition. IEEE Trans. Audio Speech Lang. Process. 22(1), 151–160 (2014)
    • (2014) IEEE Trans. Audio Speech Lang. Process , vol.22 , Issue.1 , pp. 151-160
    • Liu, S.1    Sim, K.2
  • 64
    • 84875405186
    • Exploiting deep neural networks for detection-based speech recognition
    • S.M. Siniscalchi, D. Yu, L. Deng, C.-H. Lee, Exploiting deep neural networks for detection-based speech recognition. Neurocomputing 106, 148–157 (2013)
    • (2013) Neurocomputing , vol.106 , pp. 148-157
    • Siniscalchi, S.M.1    Yu, D.2    Deng, L.3    Lee, C.-H.4
  • 65
    • 0001523807
    • A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech
    • J. Ma, L. Deng, A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech. Comput. Speech Lang. 14, 101–104 (2000)
    • (2000) Comput. Speech Lang , vol.14 , pp. 101-104
    • Ma, J.1    Deng, L.2
  • 66
    • 0347968275
    • Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model
    • J. Ma, L. Deng, Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model. IEEE Trans. Audio Speech Process. 11(6), 590–602 (2003)
    • (2003) IEEE Trans. Audio Speech Process , vol.11 , Issue.6 , pp. 590-602
    • Ma, J.1    Deng, L.2
  • 67
    • 0347968275
    • Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model
    • J. Ma, L. Deng, Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model. IEEE Trans. Audio Speech Lang. Process 11(6), 590–602 (2004)
    • (2004) IEEE Trans. Audio Speech Lang. Process , vol.11 , Issue.6 , pp. 590-602
    • Ma, J.1    Deng, L.2
  • 68
    • 0742307392
    • Target-directed mixture dynamic models for spontaneous speech recognition
    • J. Ma, L. Deng, Target-directed mixture dynamic models for spontaneous speech recognition. IEEE Trans. Audio Speech Process. 12(1), 47–58 (2004)
    • (2004) IEEE Trans. Audio Speech Process , vol.12 , Issue.1 , pp. 47-58
    • Ma, J.1    Deng, L.2
  • 70
    • 80053451847
    • Learning recurrent neural networks with Hessian-free optimization
    • J. Martens, I. Sutskever, Learning recurrent neural networks with Hessian-free optimization, in Proceedings of ICML, Bellevue, 2011, pp. 1033–1040
    • (2011) Proceedings of ICML, Bellevue , pp. 1033-1040
    • Martens, J.1    Sutskever, I.2
  • 71
    • 84906237242
    • Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding
    • G. Mesnil, X. He, L. Deng, Y. Bengio, Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding, in Proceedings of INTERSPEECH, Lyon, 2013
    • (2013) Proceedings of INTERSPEECH, Lyon
    • Mesnil, G.1    He, X.2    Deng, L.3    Bengio, Y.4
  • 72
    • 54349106040
    • Switching linear dynamical systems for noise robust speech recognition
    • B. Mesot, D. Barber, Switching linear dynamical systems for noise robust speech recognition. IEEE Trans. Audio Speech Lang. Process. 15(6), 1850–1858 (2007)
    • (2007) IEEE Trans. Audio Speech Lang. Process , vol.15 , Issue.6 , pp. 1850-1858
    • Mesot, B.1    Barber, D.2
  • 74
    • 84858966958
    • Strategies for training large scale neural network language models
    • IEEE, Honolulu
    • T. Mikolov, A. Deoras, D. Povey, L. Burget, J. Cernocky, Strategies for training large scale neural network language models, in Proceedings of IEEE ASRU (IEEE, Honolulu, 2011), pp. 196–201
    • (2011) Proceedings of IEEE ASRU , pp. 196-201
    • Mikolov, T.1    Deoras, A.2    Povey, D.3    Burget, L.4    Cernocky, J.5
  • 80
    • 84255177123
    • Deep and wide: Multiple layers in automatic speech recognition
    • N. Morgan, Deep and wide: multiple layers in automatic speech recognition. IEEE Trans. Audio Speech Lang. Process. 20(1), 7–13 (2012)
    • (2012) IEEE Trans. Audio Speech Lang. Process , vol.20 , Issue.1 , pp. 7-13
    • Morgan, N.1
  • 81
    • 0030245363
    • From HMM’s to segment models: A unified view of stochastic modeling for speech recognition
    • M. Ostendorf, V. Digalakis, O. Kimball, From HMM’s to segment models: a unified view of stochastic modeling for speech recognition. IEEE Trans. Speech Audio Process. 4(5), 360–378 (1996)
    • (1996) IEEE Trans. Speech Audio Process , vol.4 , Issue.5 , pp. 360-378
    • Ostendorf, M.1    Digalakis, V.2    Kimball, O.3
  • 83
    • 69249099357
    • Dynamic speech spectrum representation and tracking variable number of vocal tract resonance frequencies with time-varying Dirichlet process mixture models
    • E. Ozkan, I. Ozbek, M. Demirekler, Dynamic speech spectrum representation and tracking variable number of vocal tract resonance frequencies with time-varying Dirichlet process mixture models. IEEE Trans. Audio Speech Lang. Process. 17(8), 1518–1532 (2009)
    • (2009) IEEE Trans. Audio Speech Lang. Process , vol.17 , Issue.8 , pp. 1518-1532
    • Ozkan, E.1    Ozbek, I.2    Demirekler, M.3
  • 85
    • 0141698849
    • Variational learning in mixed-state dynamic graphical models
    • V. Pavlovic, B. Frey, T. Huang, Variational learning in mixed-state dynamic graphical models, in Proceedings of UAI, Stockholm, 1999, pp. 522–530
    • (1999) Proceedings of UAI, Stockholm , pp. 522-530
    • Pavlovic, V.1    Frey, B.2    Huang, T.3
  • 87
    • 0028401031
    • Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks
    • G. Puskorius, L. Feldkamp, Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks. IEEE Trans. Neural Netw. 5(2), 279–297 (1998)
    • (1998) IEEE Trans. Neural Netw , vol.5 , Issue.2 , pp. 279-297
    • Puskorius, G.1    Feldkamp, L.2
  • 89
    • 85032751986
    • Single-channel multitalker speech recognition—graphical modeling approaches
    • S. Rennie, J. Hershey, P. Olsen, Single-channel multitalker speech recognition—graphical modeling approaches. IEEE Signal Process. Mag. 33, 66–80 (2010)
    • (2010) IEEE Signal Process. Mag , vol.33 , pp. 66-80
    • Rennie, S.1    Hershey, J.2    Olsen, P.3
  • 90
    • 0028392167
    • An application of recurrent nets to phone probability estimation
    • A.J. Robinson, An application of recurrent nets to phone probability estimation. IEEE Trans. Neural Netw. 5(2), 298–305 (1994)
    • (1994) IEEE Trans. Neural Netw , vol.5 , Issue.2 , pp. 298-305
    • Robinson, A.J.1
  • 92
    • 10844250035
    • A multiple-level linear/linear segmental HMM with a formant-based intermediate layer
    • M. Russell, P. Jackson, A multiple-level linear/linear segmental HMM with a formant-based intermediate layer. Comput. Speech Lang. 19, 205–225 (2005)
    • (2005) Comput. Speech Lang , vol.19 , pp. 205-225
    • Russell, M.1    Jackson, P.2
  • 93
    • 84886829539
    • Optimization techniques to improve training speed of deep neural networks for large speech tasks
    • T. Sainath, B. Kingsbury, H. Soltau, B. Ramabhadran, Optimization techniques to improve training speed of deep neural networks for large speech tasks. IEEE Trans. Audio Speech Lang. Process. 21(11), 2267–2276 (2013)
    • (2013) IEEE Trans. Audio Speech Lang. Process , vol.21 , Issue.11 , pp. 2267-2276
    • Sainath, T.1    Kingsbury, B.2    Soltau, H.3    Ramabhadran, B.4
  • 95
    • 0031074957
    • Maximum likelihood in statistical estimation of dynamical systems: Decomposition algorithm and simulation results
    • X. Shen, L. Deng, Maximum likelihood in statistical estimation of dynamical systems: decomposition algorithm and simulation results. Signal Process. 57, 65–79 (1997)
    • (1997) Signal Process , vol.57 , pp. 65-79
    • Shen, X.1    Deng, L.2
  • 97
    • 84883148756
    • Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure
    • V. Stoyanov, A. Ropson, J. Eisner, Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure, in Proceedings of AISTAT (2011)
    • (2011) Proceedings of AISTAT
    • Stoyanov, V.1    Ropson, A.2    Eisner, J.3
  • 101
    • 0344443787
    • Joint state and parameter estimation for a target-directed nonlinear dynamic system model
    • R. Togneri, L. Deng, Joint state and parameter estimation for a target-directed nonlinear dynamic system model. IEEE Trans. Signal Process. 51(12), 3061–3070 (2003)
    • (2003) IEEE Trans. Signal Process , vol.51 , Issue.12 , pp. 3061-3070
    • Togneri, R.1    Deng, L.2
  • 102
    • 33745373922
    • A state-space model with neural-network prediction for recovering vocal tract resonances in fluent speech from mel-cepstral coefficients
    • R. Togneri, L. Deng, A state-space model with neural-network prediction for recovering vocal tract resonances in fluent speech from mel-cepstral coefficients. Speech Commun. 48(8), 971–988 (2006)
    • (2006) Speech Commun , vol.48 , Issue.8 , pp. 971-988
    • Togneri, R.1    Deng, L.2
  • 105
    • 3242679207
    • A generalized mean field algorithm for variational inference in exponential families
    • X. Xing, M. Jordan, S. Russell, A generalized mean field algorithm for variational inference in exponential families, in Proceedings of UAI (2003)
    • (2003) Proceedings of UAI
    • Xing, X.1    Jordan, M.2    Russell, S.3
  • 106
    • 33749541517
    • Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation
    • D. Yu, L. Deng, Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation. Comput. Speech Lang. 27, 72–87 (2007)
    • (2007) Comput. Speech Lang , vol.27 , pp. 72-87
    • Yu, D.1    Deng, L.2
  • 111
    • 85133439657
    • An introduction of trajectory model into HMM-based speech synthesis
    • H. Zen, K. Tokuda, T. Kitamura, An introduction of trajectory model into HMM-based speech synthesis, in Proceedings of ISCA SSW5 (2004), pp. 191–196
    • (2004) Proceedings of ISCA SSW5 , pp. 191-196
    • Zen, H.1    Tokuda, K.2    Kitamura, T.3
  • 112
    • 67650153217
    • Acoustic-articulatory modelling with the trajectory HMM
    • L. Zhang, S. Renals, Acoustic-articulatory modelling with the trajectory HMM. IEEE Signal Process. Lett. 15, 245–248 (2008)
    • (2008) IEEE Signal Process. Lett , vol.15 , pp. 245-248
    • Zhang, L.1    Renals, S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.