-
1
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
Signal Processing Magazine, IEEE
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.-R.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
3
-
-
0029308753
-
Neural networks for statistical recognition of continuous speech
-
N. Morgan and H. A. Bourlard, "Neural networks for statistical recognition of continuous speech," Proceedings of the IEEE, vol. 83, no. 5, pp. 742-772, 1995.
-
(1995)
Proceedings of the IEEE
, vol.83
, Issue.5
, pp. 742-772
-
-
Morgan, N.1
Bourlard, H.A.2
-
4
-
-
0028194709
-
Connectionist probability estimators in HMM speech recognition
-
S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco, "Connectionist probability estimators in HMM speech recognition," IEEE Transactions on Speech and Audio Processing, vol. 2, no. 1, pp. 161-174, 1994.
-
(1994)
IEEE Transactions on Speech and Audio Processing
, vol.2
, Issue.1
, pp. 161-174
-
-
Renals, S.1
Morgan, N.2
Bourlard, H.3
Cohen, M.4
Franco, H.5
-
5
-
-
84055211743
-
Acoustic modeling using deep belief networks
-
A.-r. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 14-22, 2012.
-
(2012)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.20
, Issue.1
, pp. 14-22
-
-
Mohamed, A.-R.1
Dahl, G.E.2
Hinton, G.3
-
6
-
-
84865801985
-
Conversational speech transcription using context-dependent deep neural networks
-
F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Interspeech, 2011, pp. 437-440.
-
(2011)
Interspeech
, pp. 437-440
-
-
Seide, F.1
Li, G.2
Yu, D.3
-
7
-
-
0000329355
-
A recurrent error propagation network speech recognition system
-
T. Robinson and F. Fallside, "A recurrent error propagation network speech recognition system," Computer Speech and Language, vol. 5, pp. 259-274, 1991.
-
(1991)
Computer Speech and Language
, vol.5
, pp. 259-274
-
-
Robinson, T.1
Fallside, F.2
-
8
-
-
0001592322
-
The use of recurrent networks in continuous speech recognition
-
C. Lee, K. Paliwal, and F. Soong, Eds. Kluwer Academic Publishers
-
T. Robinson, M. Hochberg, and S. Renals, "The use of recurrent networks in continuous speech recognition," in Automatic Speech and Speaker Recognition-Advanced Topics, C. Lee, K. Paliwal, and F. Soong, Eds. Kluwer Academic Publishers, 1996, pp. 233-258.
-
(1996)
Automatic Speech and Speaker Recognition-Advanced Topics
, pp. 233-258
-
-
Robinson, T.1
Hochberg, M.2
Renals, S.3
-
9
-
-
0036567797
-
Connectionist speech recognition of broadcast news
-
A. Robinson, G. Cook, D. Ellis, E. Fosler-Lussier, S. Renals, and D. Williams, "Connectionist speech recognition of broadcast news," Speech Communication, vol. 37, pp. 27-45, 2002.
-
(2002)
Speech Communication
, vol.37
, pp. 27-45
-
-
Robinson, A.1
Cook, G.2
Ellis, D.3
Fosler-Lussier, E.4
Renals, S.5
Williams, D.6
-
10
-
-
0031268931
-
Bidirectional recurrent neural networks
-
M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," Signal Processing, IEEE Transactions on, vol. 45, no. 11, pp. 2673-2681, 1997.
-
(1997)
Signal Processing, IEEE Transactions on
, vol.45
, Issue.11
, pp. 2673-2681
-
-
Schuster, M.1
Paliwal, K.K.2
-
11
-
-
84910072094
-
Sequence discriminative distributed training of long short-term memory recurrent neural networks
-
H. Sak, O. Vinyals, G. Heigold, A. Senior, E. McDermott, R. Monga, and M. Mao, "Sequence discriminative distributed training of long short-term memory recurrent neural networks," in Proc. INTERSPEECH, 2014.
-
(2014)
Proc. INTERSPEECH
-
-
Sak, H.1
Vinyals, O.2
Heigold, G.3
Senior, A.4
McDermott, E.5
Monga, R.6
Mao, M.7
-
12
-
-
0016663359
-
The DRAGON system-an overview
-
J. Baker, "The DRAGON system-an overview," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 23, pp. 24-29, 1975.
-
(1975)
IEEE Transactions on Acoustics, Speech, and Signal Processing
, vol.23
, pp. 24-29
-
-
Baker, J.1
-
13
-
-
0020719320
-
A maximum likelihood approach to speech recognition
-
L. Bahl, F. Jelinek, and R. Mercer, "A maximum likelihood approach to speech recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 5, pp. 179-190, 1983.
-
(1983)
IEEE Transactions on Pattern Analysis and Machine Intelligence
, vol.5
, pp. 179-190
-
-
Bahl, L.1
Jelinek, F.2
Mercer, R.3
-
14
-
-
0024610919
-
A tutorial on hidden Markov models and selected applications in speech recognition
-
L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
-
(1989)
Proceedings of the IEEE
, vol.77
, Issue.2
, pp. 257-286
-
-
Rabiner, L.1
-
15
-
-
33645791324
-
What HMMs can do
-
J. A. Bilmes, "What HMMs can do," IEICE Transactions on Information and Systems, vol. 89, no. 3, pp. 869-891, 2006.
-
(2006)
IEICE Transactions on Information and Systems
, vol.89
, Issue.3
, pp. 869-891
-
-
Bilmes, J.A.1
-
16
-
-
0036460907
-
Weighted finite-state transducers in speech recognition
-
M. Mohri, F. Pereira, and M. Riley, "Weighted finite-state transducers in speech recognition," Computer Speech & Language, vol. 16, pp. 69-88, 2002.
-
(2002)
Computer Speech & Language
, vol.16
, pp. 69-88
-
-
Mohri, M.1
Pereira, F.2
Riley, M.3
-
18
-
-
84878379108
-
Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization
-
B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," in INTERSPEECH, 2012.
-
(2012)
INTERSPEECH
-
-
Kingsbury, B.1
Sainath, T.N.2
Soltau, H.3
-
19
-
-
84906274730
-
Sequence-discriminative training of deep neural networks
-
K. Veselý, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in Proc. INTERSPEECH, 2013.
-
(2013)
Proc. INTERSPEECH
-
-
Veselý, K.1
Ghoshal, A.2
Burget, L.3
Povey, D.4
-
20
-
-
0030245363
-
From HMM's to segment models: A unified view of stochastic modeling for speech recognition
-
M. Ostendorf, V. Digalakis, and O. Kimball, "From HMM's to segment models: A unified view of stochastic modeling for speech recognition," IEEE Transactions on Speech and Audio Processing, pp. 360-378, 1996.
-
(1996)
IEEE Transactions on Speech and Audio Processing
, pp. 360-378
-
-
Ostendorf, M.1
Digalakis, V.2
Kimball, O.3
-
22
-
-
33745185781
-
Hidden conditional random fields for phone classification
-
A. Gunawardana, M. Mahajan, A. Acero, and J. C. Platt, "Hidden conditional random fields for phone classification," in INTERSPEECH, 2005, pp. 1117-1120.
-
(2005)
INTERSPEECH
, pp. 1117-1120
-
-
Gunawardana, A.1
Mahajan, M.2
Acero, A.3
Platt, J.C.4
-
23
-
-
45549086638
-
Template-based continuous speech recognition
-
M. De Wachter, M. Matton, K. Demuynck, P. Wambacq, R. Cools, and D. Van Compernolle, "Template-based continuous speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 4, pp. 1377-1390, 2007.
-
(2007)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.15
, Issue.4
, pp. 1377-1390
-
-
De Wachter, M.1
Matton, M.2
Demuynck, K.3
Wambacq, P.4
Cools, R.5
Van Compernolle, D.6
-
24
-
-
70350435251
-
Speech recognition using augmented conditional random fields
-
Y. Hifny and S. Renals, "Speech recognition using augmented conditional random fields," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 17, no. 2, pp. 354-365, 2009.
-
(2009)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.17
, Issue.2
, pp. 354-365
-
-
Hifny, Y.1
Renals, S.2
-
26
-
-
84928547704
-
Sequence to sequence learning with neural networks
-
I. Sutskever, O. Vinyals, and Q. V. Le, "Sequence to sequence learning with neural networks," in Advances in Neural Information Processing Systems, 2014, pp. 3104-3112.
-
(2014)
Advances in Neural Information Processing Systems
, pp. 3104-3112
-
-
Sutskever, I.1
Vinyals, O.2
Le, Q.V.3
-
27
-
-
84961291190
-
Learning phrase representations using RNN encoder-decoder for statistical machine translation
-
K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," Proc. EMNLP, 2014.
-
(2014)
Proc. EMNLP
-
-
Cho, K.1
Van Merrienboer, B.2
Gulcehre, C.3
Bougares, F.4
Schwenk, H.5
Bengio, Y.6
-
28
-
-
85083953689
-
Neural machine translation by jointly learning to align and translate
-
D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," in Proc. ICLR, 2015.
-
(2015)
Proc. ICLR
-
-
Bahdanau, D.1
Cho, K.2
Bengio, Y.3
-
29
-
-
84939821075
-
-
arXiv preprint arXiv:1411.4555
-
O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and tell: A neural image caption generator," arXiv preprint arXiv:1411.4555, 2014.
-
(2014)
Show and Tell: A Neural Image Caption Generator
-
-
Vinyals, O.1
Toshev, A.2
Bengio, S.3
Erhan, D.4
-
30
-
-
84939821074
-
-
arXiv preprint arXiv:1502.03044
-
K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio, "Show, attend and tell: Neural image caption generation with visual attention," arXiv preprint arXiv:1502.03044, 2015.
-
(2015)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
-
-
Xu, K.1
Ba, J.2
Kiros, R.3
Courville, A.4
Salakhutdinov, R.5
Zemel, R.6
Bengio, Y.7
-
31
-
-
84936143793
-
Towards end-to-end speech recognition with recurrent neural networks
-
A. Graves and N. Jaitly, "Towards end-to-end speech recognition with recurrent neural networks," in Proc. ICML, 2014, pp. 1764-1772.
-
(2014)
Proc. ICML
, pp. 1764-1772
-
-
Graves, A.1
Jaitly, N.2
-
32
-
-
84957716354
-
-
arXiv preprint arXiv:1408.2873
-
A. L. Maas, A. Y. Hannun, D. Jurafsky, and A. Y. Ng, "First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs," arXiv preprint arXiv:1408.2873, 2014.
-
(2014)
First-Pass Large Vocabulary Continuous Speech Recognition Using Bi-Directional Recurrent DNNs
-
-
Maas, A.L.1
Hannun, A.Y.2
Jurafsky, D.3
Ng, A.Y.4
-
33
-
-
84959066041
-
-
arXiv preprint arXiv:1412.1602
-
J. Chorowski, D. Bahdanau, K. Cho, and Y. Bengio, "End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results," arXiv preprint arXiv:1412.1602, 2014.
-
(2014)
End-to-end Continuous Speech Recognition Using Attention-based Recurrent NN: First Results
-
-
Chorowski, J.1
Bahdanau, D.2
Cho, K.3
Bengio, Y.4
-
34
-
-
84928545733
-
-
arXiv preprint arXiv:1412.5567
-
A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger et al., "Deep Speech: Scaling up end-to-end speech recognition," arXiv preprint arXiv:1412.5567, 2014.
-
(2014)
Deep Speech: Scaling Up End-to-end Speech Recognition
-
-
Hannun, A.1
Case, C.2
Casper, J.3
Catanzaro, B.4
Diamos, G.5
Elsen, E.6
Prenger, R.7
-
36
-
-
84887388950
-
An empirical study of learning rates in deep neural networks for speech recognition
-
A. Senior, G. Heigold, M. Ranzato, and K. Yang, "An empirical study of learning rates in deep neural networks for speech recognition," in Proc. ICASSP. IEEE, 2013, pp. 6724-6728.
-
(2013)
Proc. ICASSP. IEEE
, pp. 6724-6728
-
-
Senior, A.1
Heigold, G.2
Ranzato, M.3
Yang, K.4
-
37
-
-
0028392483
-
Learning long-term dependencies with gradient descent is difficult
-
Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," Neural Networks, IEEE Transactions on, vol. 5, no. 2, pp. 157-166, 1994.
-
(1994)
Neural Networks, IEEE Transactions on
, vol.5
, Issue.2
, pp. 157-166
-
-
Bengio, Y.1
Simard, P.2
Frasconi, P.3
-
38
-
-
0031573117
-
Long short-term memory
-
S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
39
-
-
85016587886
-
SWITCHBOARD: Telephone speech corpus for research and development
-
J. J. Godfrey, E. C. Holliman, and J. McDaniel, "SWITCHBOARD: Telephone speech corpus for research and development," in Proc. ICASSP. IEEE, 1992, pp. 517-520.
-
(1992)
Proc. ICASSP. IEEE
, pp. 517-520
-
-
Godfrey, J.J.1
Holliman, E.C.2
McDaniel, J.3
-
40
-
-
84959109176
-
-
arXiv preprint arXiv:1503.03535
-
C. Gulcehre, O. Firat, K. Xu, K. Cho, L. Barrault, H.-C. Lin, F. Bougares, H. Schwenk, and Y. Bengio, "On using monolingual corpora in neural machine translation," arXiv preprint arXiv:1503.03535, 2015.
-
(2015)
On Using Monolingual Corpora in Neural Machine Translation
-
-
Gulcehre, C.1
Firat, O.2
Xu, K.3
Cho, K.4
Barrault, L.5
Lin, H.-C.6
Bougares, F.7
Schwenk, H.8
Bengio, Y.9
-
41
-
-
0024634603
-
Phoneme recognition using time-delay neural networks
-
A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang, "Phoneme recognition using time-delay neural networks," Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 37, no. 3, pp. 328-339, 1989.
-
(1989)
Acoustics, Speech and Signal Processing, IEEE Transactions on
, vol.37
, Issue.3
, pp. 328-339
-
-
Waibel, A.1
Hanazawa, T.2
Hinton, G.3
Shikano, K.4
Lang, K.J.5