-
1
-
-
70349213445
-
Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
-
apr
-
B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling," in Proceedings of ICASSP. IEEE, apr 2009, pp. 3761-3764.
-
(2009)
Proceedings of ICASSP. IEEE
, pp. 3761-3764
-
-
Kingsbury, B.1
-
2
-
-
84906274730
-
Sequence-discriminative training of deep neural networks
-
K. Veselý, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks," in Proceedings of INTERSPEECH, 2013, pp. 2345-2349.
-
(2013)
Proceedings of INTERSPEECH
, pp. 2345-2349
-
-
Veselý, K.1
Ghoshal, A.2
Burget, L.3
Povey, D.4
-
3
-
-
84890543852
-
Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription
-
may
-
H. Su, G. Li, D. Yu, and F. Seide, "Error back propagation for sequence training of Context-Dependent Deep Networks for conversational speech transcription," in Proceedings of ICASSP. IEEE, may 2013, pp. 6664-6668.
-
(2013)
Proceedings of ICASSP. IEEE
, pp. 6664-6668
-
-
Su, H.1
Li, G.2
Yu, D.3
Seide, F.4
-
5
-
-
33749259827
-
Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
-
ACM
-
A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks," in Proceedings of the 23rd international conference on Machine learning. ACM, 2006, pp. 369-376.
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning
, pp. 369-376
-
-
Graves, A.1
Fernández, S.2
Gomez, F.3
Schmidhuber, J.4
-
6
-
-
84946084790
-
Learning acoustic frame labeling for speech recognition with recurrent neural networks
-
H. Sak, A. Senior, K. Rao, O. Irsoy, A. Graves, F. Beaufays, and J. Schalkwyk, "Learning acoustic frame labeling for speech recognition with recurrent neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 4280-4284.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On. IEEE
, pp. 4280-4284
-
-
Sak, H.1
Senior, A.2
Rao, K.3
Irsoy, O.4
Graves, A.5
Beaufays, F.6
Schalkwyk, J.7
-
7
-
-
84928545733
-
-
arXiv preprint arXiv:1412.5567
-
A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger, S. Satheesh, S. Sengupta, A. Coates et al., "Deep speech: Scaling up end-to-end speech recognition," arXiv preprint arXiv:1412.5567, 2014.
-
(2014)
Deep Speech: Scaling Up End-to-end Speech Recognition
-
-
Hannun, A.1
Case, C.2
Casper, J.3
Catanzaro, B.4
Diamos, G.5
Elsen, E.6
Prenger, R.7
Satheesh, S.8
Sengupta, S.9
Coates, A.10
-
8
-
-
84959112739
-
Fast and accurate recurrent neural network acoustic models for speech recognition
-
H. Sak, A. Senior, K. Rao, and F. Beaufays, "Fast and accurate recurrent neural network acoustic models for speech recognition," in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
-
(2015)
Sixteenth Annual Conference of the International Speech Communication Association
-
-
Sak, H.1
Senior, A.2
Rao, K.3
Beaufays, F.4
-
9
-
-
84994286302
-
Acoustic modelling with CD-CTC-sMBR LSTM RNNs
-
A. Senior, H. Sak, F. de Chaumont Quitry, T. N. Sainath, and K. Rao, "Acoustic modelling with CD-CTC-sMBR LSTM RNNs," in ASRU, 2015.
-
(2015)
ASRU
-
-
Senior, A.1
Sak, H.2
de Chaumont Quitry, F.3
Sainath, T.N.4
Rao, K.5
-
10
-
-
34047266376
-
Advances in speech transcription at IBM under the DARPA EARS program
-
sep
-
S. Chen, B. Kingsbury, L. Mangu, D. Povey, G. Saon, H. Soltau, and G. Zweig, "Advances in speech transcription at IBM under the DARPA EARS program," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 5, pp. 1596-1608, sep 2006.
-
(2006)
IEEE Transactions on Audio, Speech and Language Processing
, vol.14
, Issue.5
, pp. 1596-1608
-
-
Chen, S.1
Kingsbury, B.2
Mangu, L.3
Povey, D.4
Saon, G.5
Soltau, H.6
Zweig, G.7
-
11
-
-
85075929453
-
Speech recognition with weighted finite-state transducers
-
M. Mohri, F. Pereira, and M. Riley, "Speech recognition with weighted finite-state transducers," in Springer Handbook of Speech Processing. Springer, 2008, pp. 559-584.
-
(2008)
Springer Handbook of Speech Processing. Springer
, pp. 559-584
-
-
Mohri, M.1
Pereira, F.2
Riley, M.3
-
12
-
-
0002197352
-
An n log n algorithm for minimizing states in a finite automaton
-
J. Hopcroft, "An n log n algorithm for minimizing states in a finite automaton," Theory of Machines and Computations, pp. 189-196, 1971.
-
(1971)
Theory of Machines and Computations
, pp. 189-196
-
-
Hopcroft, J.1
-
13
-
-
84867616340
-
Generating exact lattices in the WFST framework
-
D. Povey, M. Hannemann, G. Boulianne, L. Burget, A. Ghoshal, M. Janda, M. Karafiát, S. Kombrink, P. Motlicek, Y. Qian et al., "Generating exact lattices in the WFST framework," in Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on. IEEE, 2012, pp. 4213-4216.
-
(2012)
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference On. IEEE
, pp. 4213-4216
-
-
Povey, D.1
Hannemann, M.2
Boulianne, G.3
Burget, L.4
Ghoshal, A.5
Janda, M.6
Karafiát, M.7
Kombrink, S.8
Motlicek, P.9
Qian, Y.10
-
14
-
-
84959115289
-
A time delay neural network architecture for efficient modeling of long temporal contexts
-
V. Peddinti, D. Povey, and S. Khudanpur, "A time delay neural network architecture for efficient modeling of long temporal contexts," in Proceedings of INTERSPEECH, 2015.
-
(2015)
Proceedings of INTERSPEECH
-
-
Peddinti, V.1
Povey, D.2
Khudanpur, S.3
-
16
-
-
84959118622
-
Audio augmentation for speech recognition
-
T. Ko, V. Peddinti, D. Povey, and S. Khudanpur, "Audio augmentation for speech recognition," in Proceedings of INTERSPEECH, 2015.
-
(2015)
Proceedings of INTERSPEECH
-
-
Ko, T.1
Peddinti, V.2
Povey, D.3
Khudanpur, S.4
-
17
-
-
84893691530
-
Speaker adaptation of neural network acoustic models using i-vectors
-
Dec
-
G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on, Dec 2013, pp. 55-59.
-
(2013)
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
, pp. 55-59
-
-
Saon, G.1
Soltau, H.2
Nahamoo, D.3
Picheny, M.4
-
18
-
-
84959101589
-
Pronunciation and silence probability modeling for ASR
-
G. Chen, H. Xu, M. Wu, D. Povey, and S. Khudanpur, "Pronunciation and silence probability modeling for ASR," in Proceedings of INTERSPEECH, 2015.
-
(2015)
Proceedings of INTERSPEECH
-
-
Chen, G.1
Xu, H.2
Wu, M.3
Povey, D.4
Khudanpur, S.5
-
20
-
-
84946076428
-
TED-LIUM: An automatic speech recognition dedicated corpus
-
A. Rousseau, P. Deléglise, and Y. Estève, "TED-LIUM: an automatic speech recognition dedicated corpus," in LREC, 2012, pp. 125-129.
-
(2012)
LREC
, pp. 125-129
-
-
Rousseau, A.1
Deléglise, P.2
Estève, Y.3
-
21
-
-
84994300353
-
Far-field ASR without parallel data
-
V. Peddinti, V. Manohar, Y. Wang, D. Povey, and S. Khudanpur, "Far-field ASR without parallel data," in Proceedings of Interspeech, 2016. [Online]. Available: http://www.danielpovey.com/files/2016_interspeech_ami.pdf
-
(2016)
Proceedings of Interspeech
-
-
Peddinti, V.1
Manohar, V.2
Wang, Y.3
Povey, D.4
Khudanpur, S.5
-
22
-
-
84964507635
-
Deep bi-directional recurrent networks over spectral windows
-
ASRU
-
A.-r. Mohamed, F. Seide, D. Yu, J. Droppo, A. Stolcke, G. Zweig, and G. Penn, "Deep bi-directional recurrent networks over spectral windows," in Proceedings of ASRU. ASRU, 2015.
-
(2015)
Proceedings of ASRU
-
-
Mohamed, A.-R.1
Seide, F.2
Yu, D.3
Droppo, J.4
Stolcke, A.5
Zweig, G.6
Penn, G.7