-
1
-
-
70349227947
-
The application of hidden Markov models in speech recognition
-
Mark Gales and Steve Young, "The application of hidden Markov models in speech recognition, " Foundations and Trends in Signal Processing, vol. 1, no. 3, pp. 195-304, 2008.
-
(2008)
Foundations and Trends in Signal Processing
, vol.1
, Issue.3
, pp. 195-304
-
-
Gales, M.1
Young, S.2
-
3
-
-
84946023646
-
Convolutional neural networks-based continuous speech recognition using raw speech signal
-
Dimitri Palaz, Mathew Magimai Doss, Ronan Collobert, "Convolutional neural networks-based continuous speech recognition using raw speech signal, " in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 4295-4299.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP) 2015 IEEE International Conference On. IEEE
, pp. 4295-4299
-
-
Palaz, D.1
Doss, M.M.2
Collobert, R.3
-
4
-
-
84959168440
-
Learning the speech front-end with raw waveform CLDNNs
-
Tara N Sainath, Ron J Weiss, Andrew W Senior, Kevin W Wilson, Oriol Vinyals, "Learning the speech front-end with raw waveform CLDNNs, " in Interspeech, 2015, vol. 2015.
-
(2015)
Interspeech
, vol.2015
-
-
Sainath, T.N.1
Weiss, R.J.2
Senior, A.W.3
Wilson, K.W.4
Vinyals, O.5
-
5
-
-
84994235770
-
Acoustic modelling from the signal domain using CNNs
-
Pegah Ghahremani, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur, "Acoustic modelling from the signal domain using CNNs, " in Interspeech, 2016, vol. 2016.
-
(2016)
Interspeech
, vol.2016
-
-
Ghahremani, P.1
Manohar, V.2
Povey, D.3
Khudanpur, S.4
-
6
-
-
0012330750
-
The design for the Wall Street Journal-based CSR corpus
-
HLT Association for Computational Linguistics
-
Douglas B. Paul and Janet M. Baker, "The design for the Wall Street Journal-based CSR corpus, " in Proceedings of the Workshop on Speech and Natural Language, Stroudsburg, PA, USA, 1992, HLT '91, pp. 357-362, Association for Computational Linguistics.
-
(1992)
Proceedings of the Workshop on Speech and Natural Language, Stroudsburg, PA, USA
, vol.91
, pp. 357-362
-
-
Paul, D.B.1
Baker, J.M.2
-
7
-
-
34250704813
-
Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
-
Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber, "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks, " in Proceedings of the 23rd International Conference on Machine learning. ACM, 2006, pp. 369-376.
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning. ACM
, pp. 369-376
-
-
Graves, A.1
Fernández, S.2
Gomez, F.3
Schmidhuber, J.4
-
8
-
-
84890543083
-
Speech recognition with deep recurrent neural networks
-
Alex Graves, Abdel Rahman Mohamed, Geoffrey Hinton, "Speech recognition with deep recurrent neural networks, " in Acoustics, Speech and Signal processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 6645-6649.
-
(2013)
Acoustics, Speech and Signal Processing (ICASSP) 2013 IEEE International Conference On. IEEE
, pp. 6645-6649
-
-
Graves, A.1
Rahman Mohamed, A.2
Hinton, G.3
-
9
-
-
84971463350
-
Deep speech 2: End-to-end speech recognition in English and Mandarin
-
Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, JingDong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, et al., "Deep speech 2: End-to-end speech recognition in English and Mandarin, " in Proceedings of The 33rd International Conference on Machine Learning, 2016, pp. 173-182.
-
(2016)
Proceedings of the 33rd International Conference on Machine Learning
, pp. 173-182
-
-
Amodei, D.1
Anubhai, R.2
Battenberg, E.3
Case, C.4
Casper, J.5
Catanzaro, B.6
Chen, J.7
Chrzanowski, M.8
Coates, A.9
Diamos, G.10
-
10
-
-
84959066041
-
-
Jan Chorowski, Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, "End-to-end continuous speech recognition using attention-based recurrent NN: First results, " arXiv preprint arXiv:1412.1602, 2014.
-
(2014)
End-to-end Continuous Speech Recognition Using Attention-based Recurrent NN: First Results
-
-
Chorowski, J.1
Bahdanau, D.2
Cho, K.3
Bengio, Y.4
-
11
-
-
84973351869
-
Listen, attend and spell: A neural network for large vocabulary conversational speech recognition
-
William Chan, Navdeep Jaitly, Quoc Le, Oriol Vinyals, "Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, " in Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on. IEEE, 2016, pp. 4960-4964.
-
(2016)
Acoustics, Speech and Signal Processing (ICASSP) 2016 IEEE International Conference On. IEEE
, pp. 4960-4964
-
-
Chan, W.1
Jaitly, N.2
Le, Q.3
Vinyals, O.4
-
15
-
-
0000359337
-
Backpropagation applied to handwritten zip code recognition
-
Yann LeCun, Bernhard Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne Hubbard, Lawrence D Jackel, "Backpropagation applied to handwritten zip code recognition, " Neural computation, vol. 1, no. 4, pp. 541-551, 1989.
-
(1989)
Neural Computation
, vol.1
, Issue.4
, pp. 541-551
-
-
LeCun, Y.1
Boser, B.2
Denker, J.S.3
Henderson, D.4
Howard, R.E.5
Hubbard, W.6
Jackel, L.D.7
-
17
-
-
0031573117
-
Long short-term memory
-
Sepp Hochreiter and Jürgen Schmidhuber, "Long short-term memory, " Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
18
-
-
84962006941
-
-
Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, Martin Riedmiller, "Striving for simplicity: The all convolutional net, " arXiv preprint arXiv:1412.6806, 2014.
-
(2014)
Striving for Simplicity: The All Convolutional Net
-
-
Springenberg, J.T.1
Dosovitskiy, A.2
Brox, T.3
Riedmiller, M.4
-
19
-
-
84858953642
-
The Kaldi speech recognition toolkit
-
Dec. 2011, IEEE Signal Processing Society, IEEE Catalog No.: CFP11SRW-USB
-
Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer, Karel Vesely, "The Kaldi speech recognition toolkit, " in IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. Dec. 2011, IEEE Signal Processing Society, IEEE Catalog No.: CFP11SRW-USB.
-
IEEE 2011 Workshop on Automatic Speech Recognition and Understanding
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
Silovsky, J.11
Stemmer, G.12
Vesely, K.13
-
20
-
-
84957716354
-
-
Awni Y Hannun, Andrew L Maas, Daniel Jurafsky, Andrew Y Ng, "First-pass large vocabulary continuous speech recognition using bi-directional recurrent DNNs, " arXiv preprint arXiv:1408.2873, 2014.
-
(2014)
First-pass Large Vocabulary Continuous Speech Recognition Using Bi-directional Recurrent DNNs
-
-
Hannun, A.Y.1
Maas, A.L.2
Jurafsky, D.3
Ng, A.Y.4
-
21
-
-
84893676344
-
Rectifier nonlinearities improve neural network acoustic models
-
Andrew L Maas, Awni Y Hannun, Andrew Y Ng, "Rectifier nonlinearities improve neural network acoustic models, " in ICML Workshop on Deep Learning for Audio, Speech and Language Processing, 2013.
-
(2013)
ICML Workshop on Deep Learning for Audio, Speech and Language Processing
-
-
Maas, A.L.1
Hannun, A.Y.2
Ng, A.Y.3
-
23
-
-
84973293705
-
End-to-end attention-based large vocabulary speech recognition
-
Dzmitry Bahdanau, Jan Chorowski, Dmitriy Serdyuk, Philemon Brakel, Yoshua Bengio, "End-to-end attention-based large vocabulary speech recognition, " in Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on. IEEE, 2016, pp. 4945-4949.
-
(2016)
Acoustics, Speech and Signal Processing (ICASSP) 2016 IEEE International Conference On. IEEE
, pp. 4945-4949
-
-
Bahdanau, D.1
Chorowski, J.2
Serdyuk, D.3
Brakel, P.4
Bengio, Y.5
-
25
-
-
85023752928
-
Joint CTC-attention based end-to-end speech recognition using multi-task learning
-
to appear
-
Suyoun Kim, Takaaki Hori, Shinji Watanabe, "Joint CTC-attention based end-to-end speech recognition using multi-task learning, " in Acoustics, Speech and Signal processing (ICASSP), 2017 IEEE International Conference on. IEEE, 2017, p. to appear.
-
(2017)
Acoustics, Speech and Signal Processing (ICASSP) 2017 IEEE International Conference On. IEEE
-
-
Kim, S.1
Hori, T.2
Watanabe, S.3
-
26
-
-
84879854889
-
Representation learning: A review and new perspectives
-
Yoshua Bengio, Aaron Courville, Pascal Vincent, "Representation learning: A review and new perspectives, " IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798-1828, 2013.
-
(2013)
IEEE Transactions on Pattern Analysis and Machine Intelligence
, vol.35
, Issue.8
, pp. 1798-1828
-
-
Bengio, Y.1
Courville, A.2
Vincent, P.3
-
27
-
-
84937508363
-
How transferable are features in deep neural networks?
-
Jason Yosinski, Jeff Clune, Yoshua Bengio, Hod Lipson, "How transferable are features in deep neural networks?, " in Advances in neural information processing systems, 2014, pp. 3320-3328.
-
(2014)
Advances in Neural Information Processing Systems
, pp. 3320-3328
-
-
Yosinski, J.1
Clune, J.2
Bengio, Y.3
Lipson, H.4
-
28
-
-
84959115246
-
Cross-lingual transfer learning during supervised training in low resource scenarios
-
Amit Das and Mark Hasegawa-Johnson, "Cross-lingual transfer learning during supervised training in low resource scenarios, " in INTERSPEECH, 2015, pp. 3531-3535.
-
(2015)
Interspeech
, pp. 3531-3535
-
-
Das, A.1
Hasegawa-Johnson, M.2
-
29
-
-
34547548235
-
Probabilistic and bottle-neck features for LVCSR of meetings
-
Frantisek Grézl, Martin Karafiát, Stanislav Kontár, Jan Cernocky, "Probabilistic and bottle-neck features for LVCSR of meetings, " in Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on. IEEE, 2007, vol. 4, pp. IV-757.
-
(2007)
Acoustics, Speech and Signal Processing 2007. ICASSP 2007. IEEE International Conference On. IEEE
, vol.4
, pp. IV-757
-
-
Grézl, F.1
Karafiát, M.2
Kontár, S.3
Cernocky, J.4
-
30
-
-
84906225757
-
A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
-
Z.J. Yan, Q. Huo, J. Xu, "A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR, " in Proc. INTERSPEECH, 2013.
-
(2013)
Proc. INTERSPEECH
-
-
Yan, Z.J.1
Huo, Q.2
Xu, J.3
-
31
-
-
84858971297
-
Convolutive bottleneck network features for LVCSR
-
K. Vesely, M. Karafiát, F. Grézl, "Convolutive bottleneck network features for LVCSR, " in Proc. ASRU, Waikoloa, USA, 2011, pp. 42-47.
-
(2011)
Proc. ASRU, Waikoloa, USA
, pp. 42-47
-
-
Vesely, K.1
Karafiát, M.2
Grézl, F.3