-
1
-
-
0038133939
-
Distance measures for speech recognition, psychological and instrumental
-
P. Mermelstein, "Distance measures for speech recognition, psychological and instrumental," Pattern Recognition and Artificial Intelligence, vol. 116, pp. 374-388, 1976.
-
(1976)
Pattern Recognition and Artificial Intelligence
, vol.116
, pp. 374-388
-
-
Mermelstein, P.1
-
2
-
-
0025041264
-
Perceptual linear predictive (PLP) analysis of speech
-
H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," Journal of the Acoustical Society of America, vol. 87, pp. 1738-1752, 1990.
-
(1990)
Journal of the Acoustical Society of America
, vol.87
, pp. 1738-1752
-
-
Hermansky, H.1
-
3
-
-
84893688455
-
Learning filter banks within a deep neural network framework
-
T. N. Sainath, B. Kingsbury, A.-R. Mohamed, and B. Ramabhadran, "Learning filter banks within a deep neural network framework," in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on. IEEE, 2013, pp. 297-302.
-
(2013)
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop On. IEEE
, pp. 297-302
-
-
Sainath, T.N.1
Kingsbury, B.2
Mohamed, A.-R.3
Ramabhadran, B.4
-
4
-
-
84910065702
-
Acoustic modeling with deep neural networks using raw time signal for LVCSR
-
Z. Tüske, P. Golik, R. Schlüter, and H. Ney, "Acoustic modeling with deep neural networks using raw time signal for LVCSR," in Proc. Interspeech, 2014.
-
(2014)
Proc. Interspeech
-
-
Tüske, Z.1
Golik, P.2
Schlüter, R.3
Ney, H.4
-
5
-
-
84946030537
-
Speech acoustic modeling from raw multichannel waveforms
-
Y. Hoshen, R. J. Weiss, and K. W. Wilson, "Speech acoustic modeling from raw multichannel waveforms," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 4624-4628.
-
(2015)
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On. IEEE
, pp. 4624-4628
-
-
Hoshen, Y.1
Weiss, R.J.2
Wilson, K.W.3
-
7
-
-
84959168440
-
Learning the speech front-end with raw waveform CLDNNs
-
T. N. Sainath, R. J. Weiss, A. Senior, K. W. Wilson, and O. Vinyals, "Learning the speech front-end with raw waveform CLDNNs," in Proc. Interspeech, 2015.
-
(2015)
Proc. Interspeech
-
-
Sainath, T.N.1
Weiss, R.J.2
Senior, A.3
Wilson, K.W.4
Vinyals, O.5
-
8
-
-
0024634603
-
Phoneme recognition using time-delay neural networks
-
A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang, "Phoneme recognition using time-delay neural networks," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 3, pp. 328-339, Mar 1989.
-
(1989)
IEEE Transactions on Acoustics, Speech, and Signal Processing
, vol.37
, Issue.3
, pp. 328-339
-
-
Waibel, A.1
Hanazawa, T.2
Hinton, G.3
Shikano, K.4
Lang, K.J.5
-
9
-
-
84994310412
-
Purely sequence-trained neural networks for ASR based on lattice-free MMI
-
D. Povey, V. Peddinti, D. Galvez, P. Ghahremani, V. Manohar, X. Na, Y. Wang, and S. Khudanpur, "Purely sequence-trained neural networks for ASR based on lattice-free MMI," in Submitted to Interspeech, 2016. [Online]. Available: http://www.danielpovey.com/files/2016_interspeech_mmi.pdf
-
(2016)
Submitted to Interspeech
-
-
Povey, D.1
Peddinti, V.2
Galvez, D.3
Ghahremani, P.4
Manohar, V.5
Na, X.6
Wang, Y.7
Khudanpur, S.8
-
10
-
-
85016587886
-
Switchboard: Telephone speech corpus for research and development
-
J. J. Godfrey et al., "Switchboard: Telephone speech corpus for research and development," in ICASSP, 1992.
-
(1992)
ICASSP
-
-
Godfrey, J.J.1
-
14
-
-
0030263447
-
Mean and variance adaptation within the MLLR framework
-
M. J. F. Gales and P. C. Woodland, "Mean and Variance Adaptation Within the MLLR Framework," Computer Speech and Language, vol. 10, pp. 249-264, 1996.
-
(1996)
Computer Speech and Language
, vol.10
, pp. 249-264
-
-
Gales, M.J.F.1
Woodland, P.C.2
-
15
-
-
79951609039
-
Front-end factor analysis for speaker verification
-
N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 19, no. 4, pp. 788-798, 2011.
-
(2011)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.19
, Issue.4
, pp. 788-798
-
-
Dehak, N.1
Kenny, P.2
Dehak, R.3
Dumouchel, P.4
Ouellet, P.5
-
16
-
-
84893691530
-
Speaker adaptation of neural network acoustic models using i-vectors
-
G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in ASRU, 2013, pp. 55-59.
-
(2013)
ASRU
, pp. 55-59
-
-
Saon, G.1
Soltau, H.2
Nahamoo, D.3
Picheny, M.4
-
17
-
-
84964483822
-
JHU ASpIRE system: Robust LVCSR with TDNNs, ivector Adaptation, and RNN-LMs
-
V. Peddinti, G. Chen, V. Manohar, T. Ko, D. Povey, and S. Khudanpur, "JHU ASpIRE system: Robust LVCSR with TDNNs, ivector Adaptation, and RNN-LMs," in ASRU, 2015.
-
(2015)
ASRU
-
-
Peddinti, V.1
Chen, G.2
Manohar, V.3
Ko, T.4
Povey, D.5
Khudanpur, S.6
-
18
-
-
84959142471
-
Robust i-vector based adaptation of DNN acoustic model for speech recognition
-
S. Garimella, A. Mandal, N. Strom, B. Hoffmeister, S. Matsoukas, and S. H. K. Parthasarathi, "Robust i-vector based adaptation of DNN acoustic model for speech recognition," in Proceedings of Interspeech, 2015.
-
(2015)
Proceedings of Interspeech
-
-
Garimella, S.1
Mandal, A.2
Strom, N.3
Hoffmeister, B.4
Matsoukas, S.5
Parthasarathi, S.H.K.6
-
19
-
-
84959118622
-
Audio augmentation for speech recognition
-
T. Ko, V. Peddinti, D. Povey, and S. Khudanpur, "Audio augmentation for speech recognition," in Proceedings of INTERSPEECH, 2015.
-
(2015)
Proceedings of INTERSPEECH
-
-
Ko, T.1
Peddinti, V.2
Povey, D.3
Khudanpur, S.4
-
20
-
-
84896734479
-
Deep scattering spectrum
-
J. Andén and S. Mallat, "Deep scattering spectrum," Signal Processing, IEEE Transactions on, vol. 62, no. 16, pp. 4114-4128, 2014.
-
(2014)
Signal Processing, IEEE Transactions on
, vol.62
, Issue.16
, pp. 4114-4128
-
-
Andén, J.1
Mallat, S.2
-
21
-
-
84905239342
-
Improving Deep Neural Network Acoustic Models using Generalized Maxout Networks
-
X. Zhang, J. Trmal, D. Povey, and S. Khudanpur, "Improving Deep Neural Network Acoustic Models using Generalized Maxout Networks," in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, May 2014, pp. 215-219.
-
(2014)
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
, pp. 215-219
-
-
Zhang, X.1
Trmal, J.2
Povey, D.3
Khudanpur, S.4
-
22
-
-
84959110637
-
Convolutional neural networks for acoustic modeling of raw time signal in LVCSR
-
P. Golik, Z. Tüske, R. Schlüter, and H. Ney, "Convolutional Neural Networks for Acoustic Modeling of Raw Time Signal in LVCSR," in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
-
(2015)
Sixteenth Annual Conference of the International Speech Communication Association
-
-
Golik, P.1
Tüske, Z.2
Schlüter, R.3
Ney, H.4
-
23
-
-
84959115289
-
A time delay neural network architecture for efficient modeling of long temporal contexts
-
V. Peddinti, D. Povey, and S. Khudanpur, "A time delay neural network architecture for efficient modeling of long temporal contexts," in Proceedings of INTERSPEECH, 2015.
-
(2015)
Proceedings of INTERSPEECH
-
-
Peddinti, V.1
Povey, D.2
Khudanpur, S.3
-
24
-
-
0012330750
-
The design for the Wall Street Journal-based CSR corpus
-
D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proceedings of the Workshop on Speech and Natural Language. Association for Computational Linguistics, 1992, pp. 357-362.
-
(1992)
Proceedings of the Workshop on Speech and Natural Language
, pp. 357-362
-
-
Paul, D.B.1
Baker, J.M.2
-
25
-
-
84858953642
-
The Kaldi speech recognition toolkit
-
D. Povey, A. Ghoshal et al., "The Kaldi Speech Recognition Toolkit," in Proc. ASRU, 2011.
-
(2011)
Proc. ASRU
-
-
Povey, D.1
Ghoshal, A.2