-
1
-
-
84858952478
-
Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition
-
IEEE
-
D. Gillick, L. Gillick, and S. Wegmann, "Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition," in Proc. ASRU. IEEE, 2011, pp. 71-76.
-
(2011)
Proc. ASRU
, pp. 71-76
-
-
Gillick, D.1
Gillick, L.2
Wegmann, S.3
-
2
-
-
0030245363
-
From HMM's to segment models: A unified view of stochastic modeling for speech recognition
-
M. Ostendorf, V. Digalakis, and O. Kimball, "From HMM's to segment models: A unified view of stochastic modeling for speech recognition," IEEE Transactions on Speech and Audio Processing, pp. 360-378, 1996.
-
(1996)
IEEE Transactions on Speech and Audio Processing
, pp. 360-378
-
-
Ostendorf, M.1
Digalakis, V.2
Kimball, O.3
-
4
-
-
33745185781
-
Hidden conditional random fields for phone classification
-
A. Gunawardana, M. Mahajan, A. Acero, and J. C. Platt, "Hidden conditional random fields for phone classification," in INTERSPEECH, 2005, pp. 1117-1120.
-
(2005)
INTERSPEECH
, pp. 1117-1120
-
-
Gunawardana, A.1
Mahajan, M.2
Acero, A.3
Platt, J.C.4
-
5
-
-
70350435251
-
Speech recognition using augmented conditional random fields
-
Y. Hifny and S. Renals, "Speech recognition using augmented conditional random fields," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 17, no. 2, pp. 354-365, 2009.
-
(2009)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.17
, Issue.2
, pp. 354-365
-
-
Hifny, Y.1
Renals, S.2
-
6
-
-
84936143793
-
Towards end-to-end speech recognition with recurrent neural networks
-
A. Graves and N. Jaitly, "Towards end-to-end speech recognition with recurrent neural networks," in Proc. ICML, 2014, pp. 1764-1772.
-
(2014)
Proc. ICML
, pp. 1764-1772
-
-
Graves, A.1
Jaitly, N.2
-
7
-
-
84928545733
-
-
arXiv preprint arXiv:1412.5567
-
A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger et al., "Deep Speech: Scaling up end-to-end speech recognition," arXiv preprint arXiv:1412.5567, 2014.
-
(2014)
Deep Speech: Scaling Up End-to-end Speech Recognition
-
-
Hannun, A.1
Case, C.2
Casper, J.3
Catanzaro, B.4
Diamos, G.5
Elsen, E.6
Prenger, R.7
-
8
-
-
84959112739
-
Fast and accurate recurrent neural network acoustic models for speech recognition
-
H. Sak, A. Senior, K. Rao, and F. Beaufays, "Fast and accurate recurrent neural network acoustic models for speech recognition," in Proc. INTERSPEECH, 2015.
-
(2015)
Proc. INTERSPEECH
-
-
Sak, H.1
Senior, A.2
Rao, K.3
Beaufays, F.4
-
9
-
-
84964489732
-
EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding
-
Y. Miao, M. Gowayyed, and F. Metze, "EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding," in Proc. ASRU, 2015.
-
(2015)
Proc. ASRU
-
-
Miao, Y.1
Gowayyed, M.2
Metze, F.3
-
10
-
-
85083953689
-
Neural machine translation by jointly learning to align and translate
-
D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," in Proc. ICLR, 2015.
-
(2015)
Proc. ICLR
-
-
Bahdanau, D.1
Cho, K.2
Bengio, Y.3
-
11
-
-
84965139600
-
Attention-based models for speech recognition
-
J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, "Attention-based models for speech recognition," in Advances in Neural Information Processing Systems, 2015, pp. 577-585.
-
(2015)
Advances in Neural Information Processing Systems
, pp. 577-585
-
-
Chorowski, J.K.1
Bahdanau, D.2
Serdyuk, D.3
Cho, K.4
Bengio, Y.5
-
12
-
-
84959173420
-
A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition
-
L. Lu, X. Zhang, K. Cho, and S. Renals, "A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition," in Proc. INTERSPEECH, 2015.
-
(2015)
Proc. INTERSPEECH
-
-
Lu, L.1
Zhang, X.2
Cho, K.3
Renals, S.4
-
13
-
-
84994328213
-
-
arXiv preprint arXiv:1508.01211
-
W. Chan, N. Jaitly, Q. V. Le, and O. Vinyals, "Listen, attend and spell," arXiv preprint arXiv:1508.01211, 2015.
-
(2015)
Listen, Attend and Spell
-
-
Chan, W.1
Jaitly, N.2
Le, Q.V.3
Vinyals, O.4
-
16
-
-
0142192295
-
Conditional random fields: Probabilistic models for segmenting and labeling sequence data
-
J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," in Proc. ICML, 2001, pp. 282-289.
-
(2001)
Proc. ICML
, pp. 282-289
-
-
Lafferty, J.1
McCallum, A.2
Pereira, F.3
-
17
-
-
80051659716
-
Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 summer workshop
-
G. Zweig, P. Nguyen, D. Van Compernolle, K. Demuynck, L. Atlas, P. Clark et al., "Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 summer workshop," in Proc. ICASSP. IEEE, 2011, pp. 5044-5047.
-
(2011)
Proc. ICASSP. IEEE
, pp. 5044-5047
-
-
Zweig, G.1
Nguyen, P.2
Van Compernolle, D.3
Demuynck, K.4
Atlas, L.5
Clark, P.6
-
18
-
-
84876691724
-
Conditional random fields in speech, audio, and language processing
-
E. Fosler-Lussier, Y. He, P. Jyothi, and R. Prabhavalkar, "Conditional random fields in speech, audio, and language processing," Proceedings of the IEEE, vol. 101, no. 5, pp. 1054-1075, 2013.
-
(2013)
Proceedings of the IEEE
, vol.101
, Issue.5
, pp. 1054-1075
-
-
Fosler-Lussier, E.1
He, Y.2
Jyothi, P.3
Prabhavalkar, R.4
-
19
-
-
84906282118
-
Deep segmental neural networks for speech recognition
-
O. Abdel-Hamid, L. Deng, D. Yu, and H. Jiang, "Deep segmental neural networks for speech recognition," in Proc. INTERSPEECH, 2013, pp. 1849-1853.
-
(2013)
Proc. INTERSPEECH
, pp. 1849-1853
-
-
Abdel-Hamid, O.1
Deng, L.2
Yu, D.3
Jiang, H.4
-
20
-
-
84959175560
-
Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition
-
Y. He and E. Fosler-Lussier, "Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition," in Proc. INTERSPEECH, 2015.
-
(2015)
Proc. INTERSPEECH
-
-
He, Y.1
Fosler-Lussier, E.2
-
22
-
-
84858953642
-
The Kaldi speech recognition toolkit
-
D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, and K. Veselý, "The Kaldi speech recognition toolkit," in Proc. ASRU, 2011.
-
(2011)
Proc. ASRU
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
Silovský, J.11
Stemmer, G.12
Veselý, K.13
-
23
-
-
0031573117
-
Long short-term memory
-
S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
24
-
-
84904163933
-
Dropout: A simple way to prevent neural networks from overfitting
-
N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
-
(2014)
The Journal of Machine Learning Research
, vol.15
, Issue.1
, pp. 1929-1958
-
-
Srivastava, N.1
Hinton, G.2
Krizhevsky, A.3
Sutskever, I.4
Salakhutdinov, R.5
-
26
-
-
84867598637
-
Classification and recognition with direct segment models
-
IEEE
-
G. Zweig, "Classification and recognition with direct segment models," in Proc. ICASSP. IEEE, 2012, pp. 4161-4164.
-
(2012)
Proc. ICASSP
, pp. 4161-4164
-
-
Zweig, G.1
-
27
-
-
84878565391
-
Efficient segmental conditional random fields for phone recognition
-
Y. He and E. Fosler-Lussier, "Efficient segmental conditional random fields for phone recognition," in Proc. INTERSPEECH, 2012, pp. 1898-1901.
-
(2012)
Proc. INTERSPEECH
, pp. 1898-1901
-
-
He, Y.1
Fosler-Lussier, E.2
-
28
-
-
84964454407
-
Discriminative segmental cascades for feature-rich phone recognition
-
H. Tang, W. Wang, K. Gimpel, and K. Livescu, "Discriminative segmental cascades for feature-rich phone recognition," in Proc. ASRU, 2015.
-
(2015)
Proc. ASRU
-
-
Tang, H.1
Wang, W.2
Gimpel, K.3
Livescu, K.4
-
29
-
-
84890543083
-
Speech recognition with deep recurrent neural networks
-
IEEE
-
A. Graves, A.-R. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. ICASSP. IEEE, 2013, pp. 6645-6649.
-
(2013)
Proc. ICASSP
, pp. 6645-6649
-
-
Graves, A.1
Mohamed, A.-R.2
Hinton, G.3