-
2
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
Geoffrey Hinton, Li Deng, Dong Yu, George E Dahl, Abdelrahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N Sainath, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
Signal Processing Magazine, IEEE
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
-
3
-
-
0028392483
-
Learning long-term dependencies with gradient descent is difficult
-
Yoshua Bengio, Patrice Simard, and Paolo Frasconi, "Learning long-term dependencies with gradient descent is difficult," Neural Networks, IEEE Transactions on, vol. 5, no. 2, pp. 157-166, 1994.
-
(1994)
Neural Networks, IEEE Transactions on
, vol.5
, Issue.2
, pp. 157-166
-
-
Bengio, Y.1
Simard, P.2
Frasconi, P.3
-
4
-
-
0031573117
-
Long short-term memory
-
Sepp Hochreiter and Jürgen Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
5
-
-
84910046405
-
Long short-term memory recurrent neural network architectures for large scale acoustic modeling
-
Hasim Sak, Andrew Senior, and Françoise Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in INTERSPEECH, 2014.
-
(2014)
INTERSPEECH
-
-
Sak, H.1
Senior, A.2
Beaufays, F.3
-
6
-
-
84946083498
-
Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition
-
Xiangang Li and Xihong Wu, "Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition," in ICASSP. IEEE, 2015, pp. 4520-4524.
-
(2015)
ICASSP. IEEE
, pp. 4520-4524
-
-
Li, X.1
Wu, X.2
-
7
-
-
84910027886
-
A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models
-
Yan Huang, Dong Yu, Chaojun Liu, and Yifan Gong, "A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models," in INTERSPEECH, 2014.
-
(2014)
INTERSPEECH
-
-
Huang, Y.1
Yu, D.2
Liu, C.3
Gong, Y.4
-
9
-
-
84858976070
-
Feature engineering in context-dependent deep neural networks for conversational speech transcription
-
Frank Seide, Gang Li, Xie Chen, and Dong Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in ASRU. IEEE, 2011, pp. 24-29.
-
(2011)
ASRU. IEEE
, pp. 24-29
-
-
Seide, F.1
Li, G.2
Chen, X.3
Yu, D.4
-
10
-
-
84874226579
-
Adaptation of context-dependent deep neural networks for automatic speech recognition
-
Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, and Yifan Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in SLT, 2012, pp. 366-369.
-
(2012)
SLT
, pp. 366-369
-
-
Yao, K.1
Yu, D.2
Seide, F.3
Su, H.4
Deng, L.5
Gong, Y.6
-
11
-
-
84890542079
-
KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
-
Dong Yu, Kaisheng Yao, Hang Su, Gang Li, and Frank Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in ICASSP. IEEE, 2013, pp. 7893-7897.
-
(2013)
ICASSP. IEEE
, pp. 7893-7897
-
-
Yu, D.1
Yao, K.2
Su, H.3
Li, G.4
Seide, F.5
-
12
-
-
84890452886
-
Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
-
Ossama Abdel-Hamid and Hui Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in ICASSP. IEEE, 2013, pp. 7942-7946.
-
(2013)
ICASSP. IEEE
, pp. 7942-7946
-
-
Abdel-Hamid, O.1
Jiang, H.2
-
13
-
-
84893691530
-
Speaker adaptation of neural network acoustic models using i-vectors
-
George Saon, Hagen Soltau, David Nahamoo, and Michael Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in ASRU. IEEE, 2013, pp. 55-59.
-
(2013)
ASRU. IEEE
, pp. 55-59
-
-
Saon, G.1
Soltau, H.2
Nahamoo, D.3
Picheny, M.4
-
14
-
-
84946692024
-
Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition
-
Romain Serizel and Diego Giuliani, "Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition," in Spoken Language Technology Workshop (SLT), 2014 IEEE. IEEE, 2014, pp. 135-140.
-
(2014)
Spoken Language Technology Workshop (SLT), 2014 IEEE. IEEE
, pp. 135-140
-
-
Serizel, R.1
Giuliani, D.2
-
15
-
-
84938725974
-
On speaker adaptation of long short-term memory recurrent neural networks
-
Yajie Miao and Florian Metze, "On speaker adaptation of long short-term memory recurrent neural networks," in INTERSPEECH, 2015.
-
(2015)
INTERSPEECH
-
-
Miao, Y.1
Metze, F.2
-
16
-
-
79951609039
-
Front-end factor analysis for speaker verification
-
Najim Dehak, Patrick Kenny, Réda Dehak, Pierre Dumouchel, and Pierre Ouellet, "Front-end factor analysis for speaker verification," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 19, no. 4, pp. 788-798, 2011.
-
(2011)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.19
, Issue.4
, pp. 788-798
-
-
Dehak, N.1
Kenny, P.2
Dehak, R.3
Dumouchel, P.4
Ouellet, P.5
-
17
-
-
84946035423
-
An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition
-
Hengguan Huang and Khe Chai Sim, "An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition," in ICASSP. IEEE, 2015, pp. 4610-4613.
-
(2015)
ICASSP. IEEE
, pp. 4610-4613
-
-
Huang, H.1
Sim, K.C.2
-
18
-
-
84959155988
-
Recurrent neural network language model adaptation for multi-genre broadcast speech recognition
-
Xie Chen, Tian Tan, Xunying Liu, Pierre Lanchantin, Moquan Wan, Mark Gales, and Phil Woodland, "Recurrent neural network language model adaptation for multi-genre broadcast speech recognition," in INTERSPEECH, 2015.
-
(2015)
INTERSPEECH
-
-
Chen, X.1
Tan, T.2
Liu, X.3
Lanchantin, P.4
Wan, M.5
Gales, M.6
Woodland, P.7
-
19
-
-
84874235486
-
Context dependent recurrent neural network language model
-
Tomas Mikolov and Geoffrey Zweig, "Context dependent recurrent neural network language model," in SLT, 2012, pp. 234-239.
-
(2012)
SLT
, pp. 234-239
-
-
Mikolov, T.1
Zweig, G.2
-
20
-
-
84910031119
-
Towards speaker adaptive training of deep neural network acoustic models
-
Yajie Miao, Hao Zhang, and Florian Metze, "Towards speaker adaptive training of deep neural network acoustic models," in INTERSPEECH, 2014.
-
(2014)
INTERSPEECH
-
-
Miao, Y.1
Zhang, H.2
Metze, F.3
-
21
-
-
84905252132
-
A novel scheme for speaker recognition using a phonetically-aware deep neural network
-
Yun Lei, Nicolas Scheffer, Luciana Ferrer, and Moray McLaren, "A novel scheme for speaker recognition using a phonetically-aware deep neural network," in ICASSP. IEEE, 2014, pp. 1695-1699.
-
(2014)
ICASSP. IEEE
, pp. 1695-1699
-
-
Lei, Y.1
Scheffer, N.2
Ferrer, L.3
McLaren, M.4
-
22
-
-
84905283791
-
Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition
-
Dongpeng Chen, Brian Mak, Cheung-Chi Leung, and Sunil Sivadas, "Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition," in ICASSP. IEEE, 2014, pp. 5592-5596.
-
(2014)
ICASSP. IEEE
, pp. 5592-5596
-
-
Chen, D.1
Mak, B.2
Leung, C.3
Sivadas, S.4
-
23
-
-
84959157262
-
Multi-task learning for text-dependent speaker verification
-
Nanxin Chen, Yanmin Qian, and Kai Yu, "Multi-task learning for text-dependent speaker verification," in INTERSPEECH, 2015.
-
(2015)
INTERSPEECH
-
-
Chen, N.1
Qian, Y.2
Yu, K.3
-
24
-
-
0028996973
-
On the effects of speech rate in large vocabulary speech recognition systems
-
Matthew Siegler, Richard M Stern, et al., "On the effects of speech rate in large vocabulary speech recognition systems," in ICASSP. IEEE, 1995, vol. 1, pp. 612-615.
-
(1995)
ICASSP. IEEE
, vol.1
, pp. 612-615
-
-
Siegler, M.1
Stern, R.M.2
-
25
-
-
85135173867
-
Speech recognition using on-line estimation of speaking rate
-
Nelson Morgan, Eric Fosler-Lussier, and Nikki Mirghafori, "Speech recognition using on-line estimation of speaking rate," in Eurospeech, 1997, vol. 97, pp. 2079-2082.
-
(1997)
Eurospeech
, vol.97
, pp. 2079-2082
-
-
Morgan, N.1
Fosler-Lussier, E.2
Mirghafori, N.3
-
26
-
-
78049388975
-
Speaking rate adaptation using continuous frame rate normalization
-
Stephen M Chu and Daniel Povey, "Speaking rate adaptation using continuous frame rate normalization," in ICASSP. IEEE, 2010, pp. 4306-4309.
-
(2010)
ICASSP. IEEE
, pp. 4306-4309
-
-
Chu, S.M.1
Povey, D.2
-
27
-
-
84959148444
-
Learning speech rate in speech recognition
-
Xiangyu Zeng, Shi Yin, and Dong Wang, "Learning speech rate in speech recognition," in INTERSPEECH, 2015, pp. 528-532.
-
(2015)
INTERSPEECH
, pp. 528-532
-
-
Zeng, X.1
Yin, S.2
Wang, D.3
-
28
-
-
85008520364
-
Transcribing meetings with the AMIDA systems
-
Thomas Hain, Lukáš Burget, John Dines, Philip N Garner, Frantisek Grézl, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiat, Mike Lincoln, and Vincent Wan, "Transcribing meetings with the AMIDA systems," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 2, pp. 486-498, 2012.
-
(2012)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.20
, Issue.2
, pp. 486-498
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garner, P.N.4
Grézl, F.5
El Hannani, A.6
Huijbregts, M.7
Karafiat, M.8
Lincoln, M.9
Wan, V.10
-
29
-
-
84893704659
-
Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
-
Pawel Swietojanski, Arnab Ghoshal, and Steve Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition," in ASRU. IEEE, 2013, pp. 285-290.
-
(2013)
ASRU. IEEE
, pp. 285-290
-
-
Swietojanski, P.1
Ghoshal, A.2
Renals, S.3
-
30
-
-
84893696682
-
-
Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukáš Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlíček, Yanmin Qian, Petr Schwarz, et al., "The Kaldi speech recognition toolkit," 2011.
-
(2011)
The Kaldi Speech Recognition Toolkit
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlíček, P.8
Qian, Y.9
Schwarz, P.10
-
31
-
-
84928146953
-
-
Tech. Rep., Microsoft Research
-
Dong Yu, Adam Eversole, Mike Seltzer, Kaisheng Yao, Zhiheng Huang, Brian Guenter, Oleksii Kuchaiev, Yu Zhang, Frank Seide, Huaming Wang, et al., "An introduction to computational networks and the computational network toolkit," Tech. Rep., Microsoft Research, 2014.
-
(2014)
An Introduction to Computational Networks and the Computational Network Toolkit
-
-
Yu, D.1
Eversole, A.2
Seltzer, M.3
Yao, K.4
Huang, Z.5
Guenter, B.6
Kuchaiev, O.7
Zhang, Y.8
Seide, F.9
Wang, H.10