-
1
-
-
85009113852
-
HMM adaptation using vector Taylor series for noisy speech recognition
-
A. Acero, L. Deng, T. Kristjansson, J. Zhang, HMM adaptation using vector Taylor series for noisy speech recognition, in Proceedings of International Conference on Spoken Language Processing (2000), pp. 869–872
-
(2000)
Proceedings of International Conference on Spoken Language Processing
, pp. 869-872
-
-
Acero, A.1
Deng, L.2
Kristjansson, T.3
Zhang, J.4
-
2
-
-
0040856612
-
Stochastic modeling for automatic speech recognition
-
Academic, New York
-
J. Baker, Stochastic modeling for automatic speech recognition, in Speech Recognition, ed. by D. Reddy (Academic, New York, 1976)
-
(1976)
Speech Recognition
-
-
Baker, J.1
-
3
-
-
85032751593
-
Research developments and directions in speech recognition and understanding, part I
-
J. Baker, L. Deng, J. Glass, S. Khudanpur, C.-H. Lee, N. Morgan, D. O’Shaughnessy, Research developments and directions in speech recognition and understanding, part I. IEEE Signal Process. Mag. 26(3), 75–80 (2009)
-
(2009)
IEEE Signal Process. Mag
, vol.26
, Issue.3
, pp. 75-80
-
-
Baker, J.1
Deng, L.2
Glass, J.3
Khudanpur, S.4
Lee, C.-H.5
Morgan, N.6
O’Shaughnessy, D.7
-
4
-
-
85032759066
-
Updated MINDS report on speech recognition and understanding
-
J. Baker, L. Deng, J. Glass, S. Khudanpur, C.-H. Lee, N. Morgan, D. O’Shaughnessy, Updated MINDS report on speech recognition and understanding. IEEE Signal Process. Mag. 26(4), 78–85 (2009)
-
(2009)
IEEE Signal Process. Mag
, vol.26
, Issue.4
, pp. 78-85
-
-
Baker, J.1
Deng, L.2
Glass, J.3
Khudanpur, S.4
Lee, C.-H.5
Morgan, N.6
O’Shaughnessy, D.7
-
5
-
-
0000342467
-
Statistical inference for probabilistic functions of finite state Markov chains
-
L. Baum, T. Petrie, Statistical inference for probabilistic functions of finite state Markov chains. Ann. Math. Stat. 37(6), 1554–1563 (1966)
-
(1966)
Ann. Math. Stat
, vol.37
, Issue.6
, pp. 1554-1563
-
-
Baum, L.1
Petrie, T.2
-
6
-
-
84890543516
-
Advances in optimizing recurrent networks
-
Y. Bengio, N. Boulanger, R. Pascanu, Advances in optimizing recurrent networks, in Proceedings of ICASSP, Vancouver, 2013
-
(2013)
Proceedings of ICASSP, Vancouver
-
-
Bengio, Y.1
Boulanger, N.2
Pascanu, R.3
-
7
-
-
84890543516
-
Advances in optimizing recurrent networks
-
Y. Bengio, N. Boulanger-Lewandowski, R. Pascanu, Advances in optimizing recurrent networks, in Proceedings of ICASSP, Vancouver, 2013
-
(2013)
Proceedings of ICASSP, Vancouver
-
-
Bengio, Y.1
Boulanger-Lewandowski, N.2
Pascanu, R.3
-
8
-
-
0038021376
-
Buried Markov models: A graphical modeling approach to automatic speech recognition
-
J. Bilmes, Buried Markov models: a graphical modeling approach to automatic speech recognition. Comput. Speech Lang. 17, 213–231 (2003)
-
(2003)
Comput. Speech Lang
, vol.17
, pp. 213-231
-
-
Bilmes, J.1
-
12
-
-
84944117044
-
-
Final Report for 1998 Workshop on Language Engineering, CLSP (Johns Hopkins
-
J. Bridle, L. Deng, J. Picone, H. Richards, J. Ma, T. Kamm, M. Schuster, S. Pike, R. Reagan, An investigation of segmental hidden dynamic models of speech coarticulation for automatic speech recognition. Final Report for 1998 Workshop on Language Engineering, CLSP (Johns Hopkins, 1998)
-
(1998)
An Investigation of Segmental Hidden Dynamic Models of Speech Coarticulation for Automatic Speech Recognition
-
-
Bridle, J.1
Deng, L.2
Picone, J.3
Richards, H.4
Ma, J.5
Kamm, T.6
Schuster, M.7
Pike, S.8
Reagan, R.9
-
13
-
-
85083950550
-
A primal-dual method for training recurrent neural networks constrained by the echo-state property
-
J. Chen, L. Deng, A primal-dual method for training recurrent neural networks constrained by the echo-state property, in Proceedings of ICLR (2014)
-
(2014)
Proceedings of ICLR
-
-
Chen, J.1
Deng, L.2
-
15
-
-
80051616844
-
Large vocabulary continuous speech recognition with context-dependent DBN-HMMs
-
G. Dahl, D. Yu, L. Deng, A. Acero, Large vocabulary continuous speech recognition with context-dependent DBN-HMMs, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (2011)
-
(2011)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
-
-
Dahl, G.1
Yu, D.2
Deng, L.3
Acero, A.4
-
16
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
G. Dahl, D. Yu, L. Deng, A. Acero, Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20(1), 30–42 (2012)
-
(2012)
IEEE Trans. Audio Speech Lang. Process
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.1
Yu, D.2
Deng, L.3
Acero, A.4
-
18
-
-
0026854213
-
A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal
-
L. Deng, A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal. Signal Process. 27(1), 65–78 (1992)
-
(1992)
Signal Process
, vol.27
, Issue.1
, pp. 65-78
-
-
Deng, L.1
-
19
-
-
0032119268
-
A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition
-
L. Deng, A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition. Speech Commun. 24(4), 299–323 (1998)
-
(1998)
Speech Commun
, vol.24
, Issue.4
, pp. 299-323
-
-
Deng, L.1
-
20
-
-
0039503389
-
Articulatory features and associated production models in statistical speech recognition
-
Springer, New York
-
L. Deng, Articulatory features and associated production models in statistical speech recognition, in Computational Models of Speech Pattern Processing (Springer, New York, 1999), pp. 214–224
-
(1999)
Computational Models of Speech Pattern Processing
, pp. 214-224
-
-
Deng, L.1
-
21
-
-
0039503389
-
Computational models for speech production
-
Springer, New York
-
L. Deng, Computational models for speech production, in Computational Models of Speech Pattern Processing (Springer, New York, 1999), pp. 199–213
-
(1999)
Computational Models of Speech Pattern Processing
, pp. 199-213
-
-
Deng, L.1
-
22
-
-
33744966595
-
Switching dynamic system models for speech articulation and acoustics
-
Springer, New York
-
L. Deng, Switching dynamic system models for speech articulation and acoustics, in Mathematical Foundations of Speech and Language Processing (Springer, New York, 2003), pp. 115–134
-
(2003)
Mathematical Foundations of Speech and Language Processing
, pp. 115-134
-
-
Deng, L.1
-
24
-
-
0028516022
-
Speech recognition using hidden Markov models with polynomial regression functions as non-stationary states
-
L. Deng, M. Aksmanovic, D. Sun, J. Wu, Speech recognition using hidden Markov models with polynomial regression functions as non-stationary states. IEEE Trans. Acoust. Speech Signal Process. 2(4), 101–119 (1994)
-
(1994)
IEEE Trans. Acoust. Speech Signal Process
, vol.2
, Issue.4
, pp. 101-119
-
-
Deng, L.1
Aksmanovic, M.2
Sun, D.3
Wu, J.4
-
25
-
-
84905280906
-
Sequence classification using high-level features extracted from deep neural networks
-
L. Deng, J. Chen, Sequence classification using high-level features extracted from deep neural networks, in Proceedings of ICASSP (2014)
-
(2014)
Proceedings of ICASSP
-
-
Deng, L.1
Chen, J.2
-
26
-
-
0036299277
-
A Bayesian approach to speech feature enhancement using the dynamic cepstral prior
-
L. Deng, J. Droppo, A. Acero, A Bayesian approach to speech feature enhancement using the dynamic cepstral prior, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1 (2002), pp. I-829–I-832
-
(2002)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
, vol.1
-
-
Deng, L.1
Droppo, J.2
Acero, A.3
-
27
-
-
0028256706
-
Analysis of the correlation structure for a neural predictive model with application to speech recognition
-
L. Deng, K. Hassanein, M. Elmasry, Analysis of the correlation structure for a neural predictive model with application to speech recognition. Neural Netw. 7(2), 331–339 (1994)
-
(1994)
Neural Netw
, vol.7
, Issue.2
, pp. 331-339
-
-
Deng, L.1
Hassanein, K.2
Elmasry, M.3
-
28
-
-
84890526837
-
New types of deep neural network learning for speech recognition and related applications: An overview
-
L. Deng, G. Hinton, B. Kingsbury, New types of deep neural network learning for speech recognition and related applications: an overview, in Proceedings of IEEE ICASSP, Vancouver, 2013
-
(2013)
Proceedings of IEEE ICASSP, Vancouver
-
-
Deng, L.1
Hinton, G.2
Kingsbury, B.3
-
29
-
-
84890468916
-
Deep learning for speech recognition and related applications
-
L. Deng, G. Hinton, D. Yu, Deep learning for speech recognition and related applications, in NIPS Workshop, Whistler, 2009
-
(2009)
NIPS Workshop, Whistler
-
-
Deng, L.1
Hinton, G.2
Yu, D.3
-
30
-
-
0026189555
-
Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition
-
L. Deng, P. Kenny, M. Lennig, V. Gupta, F. Seitz, P. Mermelstein, Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition. IEEE Trans. Acoust. Speech Signal Process. 39(7), 1677–1681 (1991)
-
(1991)
IEEE Trans. Acoust. Speech Signal Process
, vol.39
, Issue.7
, pp. 1677-1681
-
-
Deng, L.1
Kenny, P.2
Lennig, M.3
Gupta, V.4
Seitz, F.5
Mermelstein, P.6
-
31
-
-
34547517867
-
Adaptive Kalman filtering and smoothing for tracking vocal tract resonances using a continuous-valued hidden dynamic model
-
L. Deng, L. Lee, H. Attias, A. Acero, Adaptive Kalman filtering and smoothing for tracking vocal tract resonances using a continuous-valued hidden dynamic model. IEEE Trans. Audio Speech Lang. Process. 15(1), 13–23 (2007)
-
(2007)
IEEE Trans. Audio Speech Lang. Process
, vol.15
, Issue.1
, pp. 13-23
-
-
Deng, L.1
Lee, L.2
Attias, H.3
Acero, A.4
-
32
-
-
10244257175
-
Large vocabulary word recognition using context-dependent allophonic hidden Markov models
-
L. Deng, M. Lennig, F. Seitz, P. Mermelstein, Large vocabulary word recognition using context-dependent allophonic hidden Markov models. Comput. Speech Lang. 4, 345–357 (1991)
-
(1991)
Comput. Speech Lang
, vol.4
, pp. 345-357
-
-
Deng, L.1
Lennig, M.2
Seitz, F.3
Mermelstein, P.4
-
33
-
-
84876672166
-
Machine learning paradigms in speech recognition: An overview
-
L. Deng, X. Li, Machine learning paradigms in speech recognition: an overview. IEEE Trans. Audio Speech Lang. Process. 21(5), 1060–1089 (2013)
-
(2013)
IEEE Trans. Audio Speech Lang. Process
, vol.21
, Issue.5
, pp. 1060-1089
-
-
Deng, L.1
Li, X.2
-
34
-
-
0003911245
-
A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
-
L. Deng, J. Ma, A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics, in EUROSPEECH (1999), pp. 1499–1502
-
(1999)
EUROSPEECH
, pp. 1499-1502
-
-
Deng, L.1
Ma, J.2
-
35
-
-
0033623527
-
Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
-
L. Deng, J. Ma, Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics. J. Acoust. Soc. Am. 108, 3036–3048 (2000)
-
(2000)
J. Acoust. Soc. Am
, vol.108
, pp. 3036-3048
-
-
Deng, L.1
Ma, J.2
-
37
-
-
0031198059
-
Production models as a structural basis for automatic speech recognition
-
L. Deng, G. Ramsay, D. Sun, Production models as a structural basis for automatic speech recognition. Speech Commun. 33(2–3), 93–111 (1997)
-
(1997)
Speech Commun
, vol.33
, Issue.2-3
, pp. 93-111
-
-
Deng, L.1
Ramsay, G.2
Sun, D.3
-
39
-
-
33744966561
-
A bidirectional target filtering model of speech coarticulation: Two-stage implementation for phonetic recognition
-
L. Deng, D. Yu, A. Acero, A bidirectional target filtering model of speech coarticulation: two-stage implementation for phonetic recognition. IEEE Trans. Speech Audio Process. 14, 256–265 (2006)
-
(2006)
IEEE Trans. Speech Audio Process
, vol.14
, pp. 256-265
-
-
Deng, L.1
Yu, D.2
Acero, A.3
-
42
-
-
4544236840
-
Noise robust speech recognition with a switching linear dynamic model
-
J. Droppo, A. Acero, Noise robust speech recognition with a switching linear dynamic model, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1 (2004), pp. I-953–I-956
-
(2004)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
, vol.1
-
-
Droppo, J.1
Acero, A.2
-
43
-
-
85032752250
-
Bayesian nonparametric methods for learning Markov switching processes
-
E. Fox, E. Sudderth, M. Jordan, A. Willsky, Bayesian nonparametric methods for learning Markov switching processes. IEEE Signal Process. Mag. 27(6), 43–54 (2010)
-
(2010)
IEEE Signal Process. Mag
, vol.27
, Issue.6
, pp. 43-54
-
-
Fox, E.1
Sudderth, E.2
Jordan, M.3
Willsky, A.4
-
44
-
-
85009074657
-
Algonquin: Iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
-
B. Frey, L. Deng, A. Acero, T. Kristjansson, Algonquin: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition, in Proceedings of Eurospeech (2000)
-
(2000)
Proceedings of Eurospeech
-
-
Frey, B.1
Deng, L.2
Acero, A.3
Kristjansson, T.4
-
45
-
-
0030245128
-
Robust continuous speech recognition using parallel model combination
-
M. Gales, S. Young, Robust continuous speech recognition using parallel model combination. IEEE Trans. Speech Audio Process. 4(5), 352–359 (1996)
-
(1996)
IEEE Trans. Speech Audio Process
, vol.4
, Issue.5
, pp. 352-359
-
-
Gales, M.1
Young, S.2
-
46
-
-
0034170950
-
Variational learning for switching state-space models
-
Z. Ghahramani, G.E. Hinton, Variational learning for switching state-space models. Neural Comput. 12, 831–864 (2000)
-
(2000)
Neural Comput
, vol.12
, pp. 831-864
-
-
Ghahramani, Z.1
Hinton, G.E.2
-
49
-
-
84890543083
-
Speech recognition with deep recurrent neural networks
-
A. Graves, A. Mohamed, G. Hinton, Speech recognition with deep recurrent neural networks, in Proceedings of ICASSP, Vancouver, 2013
-
(2013)
Proceedings of ICASSP, Vancouver
-
-
Graves, A.1
Mohamed, A.2
Hinton, G.3
-
50
-
-
78650474133
-
A practical guide to training restricted Boltzmann machines
-
Machine Learning Group, University of Toronto, 2010
-
G.E. Hinton, A practical guide to training restricted Boltzmann machines. Technical Report 2010-003, Machine Learning Group, University of Toronto (2010)
-
Technical Report 2010-003
-
-
Hinton, G.E.1
-
51
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition
-
G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, B. Kingsbury, Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Process. Mag. 29(6), 82–97 (2012)
-
(2012)
IEEE Signal Process. Mag
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.10
Kingsbury, B.11
-
52
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
G. Hinton, S. Osindero, Y. Teh, A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)
-
(2006)
Neural Comput
, vol.18
, pp. 1527-1554
-
-
Hinton, G.1
Osindero, S.2
Teh, Y.3
-
53
-
-
33746600649
-
Reducing the dimensionality of data with neural networks
-
G. Hinton, R. Salakhutdinov, Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
-
(2006)
Science
, vol.313
, Issue.5786
, pp. 504-507
-
-
Hinton, G.1
Salakhutdinov, R.2
-
54
-
-
0032673963
-
Probabilistic-trajectory segmental HMMs
-
W. Holmes, M. Russell, Probabilistic-trajectory segmental HMMs. Comput. Speech Lang. 13, 3–37 (1999)
-
(1999)
Comput. Speech Lang
, vol.13
, pp. 3-37
-
-
Holmes, W.1
Russell, M.2
-
55
-
-
0004056285
-
-
(Upper Saddle River, New Jersey 07458)
-
X. Huang, A. Acero, H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development (Upper Saddle River, New Jersey 07458)
-
Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
-
-
Huang, X.1
Acero, A.2
Hon, H.-W.3
-
56
-
-
33749833931
-
Tutorial on training recurrent neural networks, covering BPTT, RTRL, EKF and the “echo state network” approach. GMD Report 159
-
H. Jaeger, Tutorial on training recurrent neural networks, covering BPTT, RTRL, EKF and the “echo state network” approach. GMD Report 159, GMD - German National Research Institute for Computer Science (2002)
-
(2002)
GMD - German National Research Institute for Computer Science
-
-
Jaeger, H.1
-
57
-
-
0016939124
-
Continuous speech recognition by statistical methods
-
F. Jelinek, Continuous speech recognition by statistical methods. Proc. IEEE 64(4), 532–557 (1976)
-
(1976)
Proc. IEEE
, vol.64
, Issue.4
, pp. 532-557
-
-
Jelinek, F.1
-
58
-
-
0022691022
-
Maximum likelihood estimation for mixture multivariate stochastic observations of Markov chains
-
B.-H. Juang, S.E. Levinson, M.M. Sondhi, Maximum likelihood estimation for mixture multivariate stochastic observations of Markov chains. IEEE Trans. Inf. Theory 32(2), 307–309 (1986)
-
(1986)
IEEE Trans. Inf. Theory
, vol.32
, Issue.2
, pp. 307-309
-
-
Juang, B.-H.1
Levinson, S.E.2
Sondhi, M.M.3
-
59
-
-
84878379108
-
Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
-
B. Kingsbury, T. Sainath, H. Soltau, Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization, in Proceedings of Interspeech (2012)
-
(2012)
Proceedings of Interspeech
-
-
Kingsbury, B.1
Sainath, T.2
Soltau, H.3
-
60
-
-
56449110012
-
Classification using discriminative restricted Boltzmann machines
-
ACM, New York
-
H. Larochelle, Y. Bengio, Classification using discriminative restricted Boltzmann machines, in Proceedings of the 25th International Conference on Machine Learning (ACM, New York, 2008), pp. 536–543
-
(2008)
Proceedings of the 25th International Conference on Machine Learning
, pp. 536-543
-
-
Larochelle, H.1
Bengio, Y.2
-
61
-
-
0141813573
-
Variational inference and learning for segmental switching state space models of hidden speech dynamics
-
L. Lee, H. Attias, L. Deng, Variational inference and learning for segmental switching state space models of hidden speech dynamics, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1 (2003), pp. I-872–I-875
-
(2003)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
, vol.1
-
-
Lee, L.1
Attias, H.2
Deng, L.3
-
62
-
-
0034842603
-
A functional articulatory dynamic model for speech production
-
L.J. Lee, P. Fieguth, L. Deng, A functional articulatory dynamic model for speech production, in Proceedings of ICASSP, Salt Lake City, vol. 2, 2001, pp. 797–800
-
(2001)
Proceedings of ICASSP, Salt Lake City
, vol.2
, pp. 797-800
-
-
Lee, L.J.1
Fieguth, P.2
Deng, L.3
-
63
-
-
84897953008
-
Temporally varying weight regression: A semi-parametric trajectory model for automatic speech recognition
-
S. Liu, K. Sim, Temporally varying weight regression: a semi-parametric trajectory model for automatic speech recognition. IEEE Trans. Audio Speech Lang. Process. 22(1) 151–160 (2014)
-
(2014)
IEEE Trans. Audio Speech Lang. Process
, vol.22
, Issue.1
, pp. 151-160
-
-
Liu, S.1
Sim, K.2
-
64
-
-
84875405186
-
Exploiting deep neural networks for detection-based speech recognition
-
S.M. Siniscalchi, D. Yu, L. Deng, C.-H. Lee, Exploiting deep neural networks for detection-based speech recognition. Neurocomputing 106, 148–157 (2013)
-
(2013)
Neurocomputing
, vol.106
, pp. 148-157
-
-
Siniscalchi, S.M.1
Yu, D.2
Deng, L.3
Lee, C.-H.4
-
65
-
-
0001523807
-
A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech
-
J. Ma, L. Deng, A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech. Comput. Speech Lang. 14, 101–104 (2000)
-
(2000)
Comput. Speech Lang
, vol.14
, pp. 101-104
-
-
Ma, J.1
Deng, L.2
-
66
-
-
0347968275
-
Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model
-
J. Ma, L. Deng, Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model. IEEE Trans. Audio Speech Process. 11(6), 590–602 (2003)
-
(2003)
IEEE Trans. Audio Speech Process
, vol.11
, Issue.6
, pp. 590-602
-
-
Ma, J.1
Deng, L.2
-
67
-
-
0347968275
-
Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model
-
J. Ma, L. Deng, Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model. IEEE Trans. Audio Speech Lang. Process 11(6), 590–602 (2004)
-
(2004)
IEEE Trans. Audio Speech Lang. Process
, vol.11
, Issue.6
, pp. 590-602
-
-
Ma, J.1
Deng, L.2
-
68
-
-
0742307392
-
Target-directed mixture dynamic models for spontaneous speech recognition
-
J. Ma, L. Deng, Target-directed mixture dynamic models for spontaneous speech recognition. IEEE Trans. Audio Speech Process. 12(1), 47–58 (2004)
-
(2004)
IEEE Trans. Audio Speech Process
, vol.12
, Issue.1
, pp. 47-58
-
-
Ma, J.1
Deng, L.2
-
69
-
-
84878409063
-
Recurrent neural networks for noise reduction in robust ASR
-
A.L. Maas, Q. Le, T.M. O’Neil, O. Vinyals, P. Nguyen, A.Y. Ng, Recurrent neural networks for noise reduction in robust ASR, in Proceedings of INTERSPEECH, Portland, 2012
-
(2012)
Proceedings of INTERSPEECH, Portland
-
-
Maas, A.L.1
Le, Q.2
O’Neil, T.M.3
Vinyals, O.4
Nguyen, P.5
Ng, A.Y.6
-
70
-
-
80053451847
-
Learning recurrent neural networks with Hessian-free optimization
-
J. Martens, I. Sutskever, Learning recurrent neural networks with Hessian-free optimization, in Proceedings of ICML, Bellevue, 2011, pp. 1033–1040
-
(2011)
Proceedings of ICML, Bellevue
, pp. 1033-1040
-
-
Martens, J.1
Sutskever, I.2
-
71
-
-
84906237242
-
Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding
-
G. Mesnil, X. He, L. Deng, Y. Bengio, Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding, in Proceedings of INTERSPEECH, Lyon, 2013
-
(2013)
Proceedings of INTERSPEECH, Lyon
-
-
Mesnil, G.1
He, X.2
Deng, L.3
Bengio, Y.4
-
72
-
-
54349106040
-
Switching linear dynamical systems for noise robust speech recognition
-
B. Mesot, D. Barber, Switching linear dynamical systems for noise robust speech recognition. IEEE Trans. Audio Speech Lang. Process. 15(6), 1850–1858 (2007)
-
(2007)
IEEE Trans. Audio Speech Lang. Process
, vol.15
, Issue.6
, pp. 1850-1858
-
-
Mesot, B.1
Barber, D.2
-
74
-
-
84858966958
-
Strategies for training large scale neural network language models
-
IEEE, Honolulu
-
T. Mikolov, A. Deoras, D. Povey, L. Burget, J. Cernocky, Strategies for training large scale neural network language models, in Proceedings of IEEE ASRU (IEEE, Honolulu, 2011), pp. 196–201
-
(2011)
Proceedings of IEEE ASRU
, pp. 196-201
-
-
Mikolov, T.1
Deoras, A.2
Povey, D.3
Burget, L.4
Cernocky, J.5
-
75
-
-
79959829092
-
Recurrent neural network based language model
-
T. Mikolov, M. Karafiát, L. Burget, J. Cernocky, S. Khudanpur, Recurrent neural network based language model, in Proceedings of INTERSPEECH, Makuhari, 2010, pp. 1045–1048
-
(2010)
Proceedings of INTERSPEECH, Makuhari
, pp. 1045-1048
-
-
Mikolov, T.1
Karafiát, M.2
Burget, L.3
Cernocky, J.4
Khudanpur, S.5
-
76
-
-
80051643236
-
Extensions of recurrent neural network language model
-
T. Mikolov, S. Kombrink, L. Burget, J. Cernocky, S. Khudanpur, Extensions of recurrent neural network language model, in Proceedings of IEEE ICASSP, Prague, 2011, pp. 5528–5531
-
(2011)
Proceedings of IEEE ICASSP, Prague
, pp. 5528-5531
-
-
Mikolov, T.1
Kombrink, S.2
Burget, L.3
Cernocky, J.4
Khudanpur, S.5
-
77
-
-
84055211743
-
Acoustic modeling using deep belief networks
-
A. Mohamed, G. Dahl, G. Hinton, Acoustic modeling using deep belief networks. IEEE Trans. Audio Speech Lang. Process. 20(1), 14–22 (2012)
-
(2012)
IEEE Trans. Audio Speech Lang. Process
, vol.20
, Issue.1
, pp. 14-22
-
-
Mohamed, A.1
Dahl, G.2
Hinton, G.3
-
79
-
-
80051654263
-
Deep belief networks using discriminative features for phone recognition
-
A. Mohamed, T. Sainath, G. Dahl, B. Ramabhadran, G. Hinton, M. Picheny, Deep belief networks using discriminative features for phone recognition, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (2011), pp. 5060–5063
-
(2011)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
, pp. 5060-5063
-
-
Mohamed, A.1
Sainath, T.2
Dahl, G.3
Ramabhadran, B.4
Hinton, G.5
Picheny, M.6
-
80
-
-
84255177123
-
Deep and wide: Multiple layers in automatic speech recognition
-
N. Morgan, Deep and wide: multiple layers in automatic speech recognition. IEEE Trans. Audio Speech Lang. Process. 20(1), 7–13 (2012)
-
(2012)
IEEE Trans. Audio Speech Lang. Process
, vol.20
, Issue.1
, pp. 7-13
-
-
Morgan, N.1
-
81
-
-
0030245363
-
From HMM’s to segment models: A unified view of stochastic modeling for speech recognition
-
M. Ostendorf, V. Digalakis, O. Kimball, From HMM’s to segment models: a unified view of stochastic modeling for speech recognition. IEEE Trans. Speech Audio Process. 4(5), 360–378 (1996)
-
(1996)
IEEE Trans. Speech Audio Process
, vol.4
, Issue.5
, pp. 360-378
-
-
Ostendorf, M.1
Digalakis, V.2
Kimball, O.3
-
83
-
-
69249099357
-
Dynamic speech spectrum representation and tracking variable number of vocal tract resonance frequencies with time-varying Dirichlet process mixture models
-
E. Ozkan, I. Ozbek, M. Demirekler, Dynamic speech spectrum representation and tracking variable number of vocal tract resonance frequencies with time-varying Dirichlet process mixture models. IEEE Trans. Audio Speech Lang. Process. 17(8), 1518–1532 (2009)
-
(2009)
IEEE Trans. Audio Speech Lang. Process
, vol.17
, Issue.8
, pp. 1518-1532
-
-
Ozkan, E.1
Ozbek, I.2
Demirekler, M.3
-
84
-
-
84897497795
-
On the difficulty of training recurrent neural networks
-
R. Pascanu, T. Mikolov, Y. Bengio, On the difficulty of training recurrent neural networks, in Proceedings of ICML, Atlanta, 2013
-
(2013)
Proceedings of ICML, Atlanta
-
-
Pascanu, R.1
Mikolov, T.2
Bengio, Y.3
-
85
-
-
0141698849
-
Variational learning in mixed-state dynamic graphical models
-
V. Pavlovic, B. Frey, T. Huang, Variational learning in mixed-state dynamic graphical models, in Proceedings of UAI, Stockholm, 1999, pp. 522–530
-
(1999)
Proceedings of UAI, Stockholm
, pp. 522-530
-
-
Pavlovic, V.1
Frey, B.2
Huang, T.3
-
86
-
-
0032639922
-
Initial evaluation of hidden dynamic models on conversational speech
-
J. Picone, S. Pike, R. Regan, T. Kamm, J. Bridle, L. Deng, Z. Ma, H. Richards, M. Schuster, Initial evaluation of hidden dynamic models on conversational speech, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (1999)
-
(1999)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
-
-
Picone, J.1
Pike, S.2
Regan, R.3
Kamm, T.4
Bridle, J.5
Deng, L.6
Ma, Z.7
Richards, H.8
Schuster, M.9
-
87
-
-
0028401031
-
Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks
-
G. Puskorius, L. Feldkamp, Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks. IEEE Trans. Neural Netw. 5(2), 279–297 (1998)
-
(1998)
IEEE Trans. Neural Netw
, vol.5
, Issue.2
, pp. 279-297
-
-
Puskorius, G.1
Feldkamp, L.2
-
89
-
-
85032751986
-
Single-channel multitalker speech recognition—graphical modeling approaches
-
S. Rennie, J. Hershey, P. Olsen, Single-channel multitalker speech recognition—graphical modeling approaches. IEEE Signal Process. Mag. 33, 66–80 (2010)
-
(2010)
IEEE Signal Process. Mag
, vol.33
, pp. 66-80
-
-
Rennie, S.1
Hershey, J.2
Olsen, P.3
-
90
-
-
0028392167
-
An application of recurrent nets to phone probability estimation
-
A.J. Robinson, An application of recurrent nets to phone probability estimation. IEEE Trans. Neural Netw. 5(2), 298–305 (1994)
-
(1994)
IEEE Trans. Neural Netw
, vol.5
, Issue.2
, pp. 298-305
-
-
Robinson, A.J.1
-
91
-
-
4544302569
-
Rao-Blackwellised Gibbs sampling for switching linear dynamical systems
-
A. Rosti, M. Gales, Rao-Blackwellised Gibbs sampling for switching linear dynamical systems, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1 (2004), pp. I-809–I-812
-
(2004)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
, vol.1
-
-
Rosti, A.1
Gales, M.2
-
92
-
-
10844250035
-
A multiple-level linear/linear segmental HMM with a formant-based intermediate layer
-
M. Russell, P. Jackson, A multiple-level linear/linear segmental HMM with a formant-based intermediate layer. Comput. Speech Lang. 19, 205–225 (2005)
-
(2005)
Comput. Speech Lang
, vol.19
, pp. 205-225
-
-
Russell, M.1
Jackson, P.2
-
93
-
-
84886829539
-
Optimization techniques to improve training speed of deep neural networks for large speech tasks
-
T. Sainath, B. Kingsbury, H. Soltau, B. Ramabhadran, Optimization techniques to improve training speed of deep neural networks for large speech tasks. IEEE Trans. Audio Speech Lang. Process. 21(11), 2267–2276 (2013)
-
(2013)
IEEE Trans. Audio Speech Lang. Process
, vol.21
, Issue.11
, pp. 2267-2276
-
-
Sainath, T.1
Kingsbury, B.2
Soltau, H.3
Ramabhadran, B.4
-
94
-
-
84858976070
-
Feature engineering in context-dependent deep neural networks for conversational speech transcription
-
Waikoloa, HI, USA
-
F. Seide, G. Li, X. Chen, D. Yu, Feature engineering in context-dependent deep neural networks for conversational speech transcription, in IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2011 (Waikoloa, HI, USA), pp. 24–29
-
IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2011
, pp. 24-29
-
-
Seide, F.1
Li, G.2
Chen, X.3
Yu, D.4
-
95
-
-
0031074957
-
Maximum likelihood in statistical estimation of dynamical systems: Decomposition algorithm and simulation results
-
X. Shen, L. Deng, Maximum likelihood in statistical estimation of dynamical systems: decomposition algorithm and simulation results. Signal Process. 57, 65–79 (1997)
-
(1997)
Signal Process
, vol.57
, pp. 65-79
-
-
Shen, X.1
Deng, L.2
-
97
-
-
84883148756
-
Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure
-
V. Stoyanov, A. Ropson, J. Eisner, Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure, in Proceedings of AISTATS (2011)
-
(2011)
Proceedings of AISTATS
-
-
Stoyanov, V.1
Ropson, A.2
Eisner, J.3
-
100
-
-
80053459857
-
Generating text with recurrent neural networks
-
I. Sutskever, J. Martens, G.E. Hinton, Generating text with recurrent neural networks, in Proceedings of ICML, Bellevue, 2011, pp. 1017–1024
-
(2011)
Proceedings of ICML, Bellevue
, pp. 1017-1024
-
-
Sutskever, I.1
Martens, J.2
Hinton, G.E.3
-
101
-
-
0344443787
-
Joint state and parameter estimation for a target-directed nonlinear dynamic system model
-
R. Togneri, L. Deng, Joint state and parameter estimation for a target-directed nonlinear dynamic system model. IEEE Trans. Signal Process. 51(12), 3061–3070 (2003)
-
(2003)
IEEE Trans. Signal Process
, vol.51
, Issue.12
, pp. 3061-3070
-
-
Togneri, R.1
Deng, L.2
-
102
-
-
33745373922
-
A state-space model with neural-network prediction for recovering vocal tract resonances in fluent speech from mel-cepstral coefficients
-
R. Togneri, L. Deng, A state-space model with neural-network prediction for recovering vocal tract resonances in fluent speech from mel-cepstral coefficients. Speech Commun. 48(8), 971–988 (2006)
-
(2006)
Speech Commun
, vol.48
, Issue.8
, pp. 971-988
-
-
Togneri, R.1
Deng, L.2
-
103
-
-
84886714036
-
Acoustic modeling with hierarchical reservoirs
-
F. Triefenbach, A. Jalalvand, K. Demuynck, J.-P. Martens, Acoustic modeling with hierarchical reservoirs. IEEE Trans. Audio Speech Lang. Process. 21(11), 2439–2450 (2013)
-
(2013)
IEEE Trans. Audio Speech Lang. Process
, vol.21
, Issue.11
, pp. 2439-2450
-
-
Triefenbach, F.1
Jalalvand, A.2
Demuynck, K.3
Martens, J.-P.4
-
104
-
-
84887037596
-
Optimization algorithms and applications for speech and language processing
-
S. Wright, D. Kanevsky, L. Deng, X. He, G. Heigold, H. Li, Optimization algorithms and applications for speech and language processing. IEEE Trans. Audio Speech Lang. Process. 21(11), 2231–2243 (2013)
-
(2013)
IEEE Trans. Audio Speech Lang. Process
, vol.21
, Issue.11
, pp. 2231-2243
-
-
Wright, S.1
Kanevsky, D.2
Deng, L.3
He, X.4
Heigold, G.5
Li, H.6
-
105
-
-
3242679207
-
A generalized mean field algorithm for variational inference in exponential families
-
X. Xing, M. Jordan, S. Russell, A generalized mean field algorithm for variational inference in exponential families, in Proceedings of UAI (2003)
-
(2003)
Proceedings of UAI
-
-
Xing, X.1
Jordan, M.2
Russell, S.3
-
106
-
-
33749541517
-
Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation
-
D. Yu, L. Deng, Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation. Comput. Speech Lang. 27, 72–87 (2007)
-
(2007)
Comput. Speech Lang
, vol.27
, pp. 72-87
-
-
Yu, D.1
Deng, L.2
-
109
-
-
84867606668
-
Exploiting sparseness in deep neural networks for large vocabulary speech recognition
-
D. Yu, F. Seide, G. Li, L. Deng, Exploiting sparseness in deep neural networks for large vocabulary speech recognition, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (2012), pp. 4409–4412
-
(2012)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
, pp. 4409-4412
-
-
Yu, D.1
Seide, F.2
Li, G.3
Deng, L.4
-
110
-
-
84867329143
-
Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition
-
D. Yu, S. Siniscalchi, L. Deng, C. Lee, Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition, in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (2012)
-
(2012)
Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing
-
-
Yu, D.1
Siniscalchi, S.2
Deng, L.3
Lee, C.4
-
111
-
-
85133439657
-
An introduction of trajectory model into HMM-based speech synthesis
-
H. Zen, K. Tokuda, T. Kitamura, An introduction of trajectory model into HMM-based speech synthesis, in Proceedings of ISCA SSW5 (2004), pp. 191–196
-
(2004)
Proceedings of ISCA SSW5
, pp. 191-196
-
-
Zen, H.1
Tokuda, K.2
Kitamura, T.3
-
112
-
-
67650153217
-
Acoustic-articulatory modelling with the trajectory HMM
-
L. Zhang, S. Renals, Acoustic-articulatory modelling with the trajectory HMM. IEEE Signal Process. Lett. 15, 245–248 (2008)
-
(2008)
IEEE Signal Process. Lett
, vol.15
, pp. 245-248
-
-
Zhang, L.1
Renals, S.2