-
1
-
-
84867605836
-
Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition
-
Abdel-Hamid, O., Mohamed, A., Jiang, H., Penn, G. (2012). Applying convolutional neural network concepts to hybrid NN-HMM model for speech recognition, In Proc. ICASSP.
-
(2012)
Proc. ICASSP.
-
-
Abdel-Hamid, O.1
Mohamed, A.2
Jiang, H.3
Penn, G.4
-
3
-
-
84890527827
-
Improving deep neural networks for LVCSR using rectified linear units and dropout
-
Dahl, G., Sainath, T., Hinton, G. (2013), Improving deep neural networks for LVCSR using rectified linear units and dropout. In Proc. ICASSP.
-
(2013)
Proc. ICASSP.
-
-
Dahl, G.1
Sainath, T.2
Hinton, G.3
-
4
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
-
Dahl G., Yu D., Deng L., Acero A. Context-dependent pre-trained deep neural networks for large vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing 2012, 20(1):30-42.
-
(2012)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.1
Yu, D.2
Deng, L.3
Acero, A.4
-
5
-
-
84890545163
-
A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion
-
Deng, L., Abdel-Hamid, O., Yu, D. (2013). A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion. In Proc. ICASSP.
-
(2013)
Proc. ICASSP.
-
-
Deng, L.1
Abdel-Hamid, O.2
Yu, D.3
-
6
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-based speech recognition
-
Gales M. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech and Language 1998, 12(2):75-98.
-
(1998)
Computer Speech and Language
, vol.12
, Issue.2
, pp. 75-98
-
-
Gales, M.1
-
8
-
-
79951563340
-
Understanding the difficulty of training deep feedforward neural networks
-
Glorot, X., Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proc. AI Stats.
-
Proc. AI Stats.
-
-
Glorot, X.1
Bengio, Y.2
-
9
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition
-
Hinton G., Deng L., Yu D., Dahl G., Mohamed A., Jaitly N., Senior A., Vanhoucke V., Nguyen P., Sainath T.N., Kingsbury B. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine 2012, 29(6):82-97.
-
(2012)
IEEE Signal Processing Magazine
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
10
-
-
84890466217
-
Improving neural networks by preventing co-adaptation of feature detectors
-
Hinton G., Srivastava N., Krizhevsky A., Sutskever I., Salakhutdinov R. Improving neural networks by preventing co-adaptation of feature detectors. The Computing Research Repository (CoRR) 2012, arXiv:1207.0580.
-
(2012)
The Computing Research Repository (CoRR)
, arXiv:1207.0580
-
-
Hinton, G.1
Srivastava, N.2
Krizhevsky, A.3
Sutskever, I.4
Salakhutdinov, R.5
-
11
-
-
84878539964
-
Application of pretrained deep neural networks to large vocabulary speech recognition
-
Jaitly, N., Nguyen, P., Senior, A. W., Vanhoucke, V. (2012). Application of pretrained deep neural networks to large vocabulary speech recognition, In Proc. Interspeech.
-
(2012)
Proc. Interspeech
-
-
Jaitly, N.1
Nguyen, P.2
Senior, A.W.3
Vanhoucke, V.4
-
12
-
-
70349213445
-
Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
-
Kingsbury, B. (2009). Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling. In Proc. ICASSP.
-
(2009)
Proc. ICASSP.
-
-
Kingsbury, B.1
-
13
-
-
84878379108
-
Scalable minimum bayes risk training of deep neural network acoustic models using distributed hessian-free optimization
-
Kingsbury, B., Sainath, T. N., Soltau, H. (2012). Scalable minimum bayes risk training of deep neural network acoustic models using distributed hessian-free optimization. In Proc. Interspeech.
-
(2012)
Proc. Interspeech.
-
-
Kingsbury, B.1
Sainath, T.N.2
Soltau, H.3
-
15
-
-
0030737097
-
Face recognition: A convolutional neural-network approach
-
Lawrence S. Face recognition: A convolutional neural-network approach. IEEE Transactions on Neural Networks 1997, 8(1):98-113.
-
(1997)
IEEE Transactions on Neural Networks
, vol.8
, Issue.1
, pp. 98-113
-
-
Lawrence, S.1
-
18
-
-
5044231640
-
Learning methods for generic object recognition with invariance to pose and lighting
-
LeCun, Y., Huang, F., Bottou, L. (2004). Learning methods for generic object recognition with invariance to pose and lighting, Proc. CVPR.
-
(2004)
Proc. CVPR.
-
-
LeCun, Y.1
Huang, F.2
Bottou, L.3
-
19
-
-
0029747183
-
Speaker normalization using efficient frequency warping procedures
-
Lee, L., Rose, R. C. (1996). Speaker normalization using efficient frequency warping procedures, In Proc. ICASSP.
-
(1996)
Proc. ICASSP.
-
-
Lee, L.1
Rose, R.C.2
-
21
-
-
84867585919
-
Understanding how deep belief networks perform acoustic modelling
-
Mohamed, A., Hinton, G., Penn, G. (2012). Understanding how deep belief networks perform acoustic modelling, In ICASSP.
-
(2012)
ICASSP.
-
-
Mohamed, A.1
Hinton, G.2
Penn, G.3
-
22
-
-
51449120120
-
Boosted MMI for model and feature-space discriminative training
-
Povey, D., Kanevsky, D., Kingsbury, B., Ramabhadran, B., Saon, G., Visweswariah, K. (2008). Boosted MMI for model and feature-space discriminative training, In Proc. ICASSP (pp. 4057-4060).
-
(2008)
Proc. ICASSP
, pp. 4057-4060
-
-
Povey, D.1
Kanevsky, D.2
Kingsbury, B.3
Ramabhadran, B.4
Saon, G.5
Visweswariah, K.6
-
23
-
-
84890525984
-
Deep Convolutional Neural Networks for LVCSR
-
Sainath, T., Mohamed, A., Kingsbury, B., Ramabhadran, B. 2013 Deep Convolutional Neural Networks for LVCSR, In: Proc. ICASSP.
-
(2013)
Proc. ICASSP
-
-
Sainath, T.1
Mohamed, A.2
Kingsbury, B.3
Ramabhadran, B.4
-
24
-
-
84893654379
-
Improvements to Deep Convolutional Neural Networks for LVCSR
-
Sainath, T. N., Kingsbury, B., Mohamed, A., Dahl, G., Saon, G., Soltau, H., Beran, T., Aravkin, A. Y., Ramabhadran, B. 2013 Improvements to Deep Convolutional Neural Networks for LVCSR, In: Proc. ASRU.
-
(2013)
Proc. ASRU.
-
-
Sainath, T.N.1
Kingsbury, B.2
Mohamed, A.3
Dahl, G.4
Saon, G.5
Soltau, H.6
Beran, T.7
Aravkin, A.Y.8
Ramabhadran, B.9
-
25
-
-
84858972572
-
Making Deep Belief Networks Effective for Large Vocabulary Continuous Speech Recognition
-
Sainath, T. N., Kingsbury, B., Ramabhadran, B., Fousek, P., Novak, P., Mohamed, A. 2011 Making Deep Belief Networks Effective for Large Vocabulary Continuous Speech Recognition, In: Proc. ASRU.
-
(2011)
Proc. ASRU.
-
-
Sainath, T.N.1
Kingsbury, B.2
Ramabhadran, B.3
Fousek, P.4
Novak, P.5
Mohamed, A.6
-
26
-
-
84893659646
-
-
Sainath, T. N., Ramabhadran, B., Picheny, M., Nahamoo, D., Kanevsky, D. (2011). Exemplar-based sparse representation features: from TIMIT to LVCSR.
-
(2011)
Exemplar-based sparse representation features: from TIMIT to LVCSR
-
-
Sainath, T.N.1
Ramabhadran, B.2
Picheny, M.3
Nahamoo, D.4
Kanevsky, D.5
-
27
-
-
84893691530
-
Speaker adaptation of neural network acoustic models using I-vectors
-
(in preparation)
-
Saon, G., Soltau, H., Picheny, M., Nahamoo, D. 2013 Speaker adaptation of neural network acoustic models using I-vectors. In Proc. ASRU (in preparation).
-
(2013)
Proc. ASRU
-
-
Saon, G.1
Soltau, H.2
Picheny, M.3
Nahamoo, D.4
-
28
-
-
84865801985
-
Conversational Speech Transcription Using Context-Dependent Deep Neural Networks
-
Seide, F., Li, G., Yu, D. 2011 Conversational Speech Transcription Using Context-Dependent Deep Neural Networks, In: Proc. Interspeech.
-
(2011)
Proc. Interspeech.
-
-
Seide, F.1
Li, G.2
Yu, D.3
-
29
-
-
84874575248
-
Convolutional neural networks applied to house numbers digit classification
-
Sermanet, P., Chintala, S., LeCun, Y. 2012 Convolutional neural networks applied to house numbers digit classification, In: Pattern Recognition (ICPR), 2012 21st International Conference on.
-
(2012)
Pattern Recognition (ICPR), 2012 21st International Conference on
-
-
Sermanet, P.1
Chintala, S.2
LeCun, Y.3
-
31
-
-
0024634603
-
Phoneme recognition using time-delay neural networks
-
Waibel A., Hanazawa T., Hinton G., Shikano K., Lang K. Phoneme recognition using time-delay neural networks. IEEE Transactions on Acoustics, Speech and Signal Processing 1989, 37(3):328-339.
-
(1989)
IEEE Transactions on Acoustics, Speech and Signal Processing
, vol.37
, Issue.3
, pp. 328-339
-
-
Waibel, A.1
Hanazawa, T.2
Hinton, G.3
Shikano, K.4
Lang, K.5
-
32
-
-
0003571976
-
-
University of Cambridge
-
Young S.J., Evermann G., Gales M.J.F., Hain T., Kershaw D., Liu X., Moore G., Odell J., Ollason D., Povey D., Valtchev V., Woodland P.C. The HTK Book (for HTK Version 3.4) 2006, University of Cambridge.
-
(2006)
The HTK Book (for HTK Version 3.4)
-
-
Young, S.J.1
Evermann, G.2
Gales, M.J.F.3
Hain, T.4
Kershaw, D.5
Liu, X.6
Moore, G.7
Odell, J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.C.12
-
33
-
-
0002144369
-
Tree-based State Tying for High Accuracy Acoustic Modelling
-
Young, S. J., Odell, J., Woodland, P. 1994 Tree-based State Tying for High Accuracy Acoustic Modelling, In: Proc. HLT. pp. 307-312.
-
(1994)
Proc. HLT
, pp. 307-312
-
-
Young, S.J.1
Odell, J.2
Woodland, P.3