-
1
-
-
84964518874
-
-
http://htk.eng.cam.ac.uk
-
-
-
-
2
-
-
84867605836
-
Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
-
Kyoto
-
O. Abdel-Hamid, A. Mohamed, H. Jiang, & G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition", Proc. ICASSP, Kyoto, 2012
-
(2012)
Proc. ICASSP
-
-
Abdel-Hamid, O.1
Mohamed, A.2
Jiang, H.3
Penn, G.4
-
3
-
-
0030362995
-
A compact model for speaker adaptive training
-
Philadelphia
-
T. Anastasakos, J. McDonough, R. Schwartz, & J. Makhoul, "A compact model for speaker adaptive training", Proc. ICSLP, Philadelphia, 1996
-
(1996)
Proc. ICSLP
-
-
Anastasakos, T.1
McDonough, J.2
Schwartz, R.3
Makhoul, J.4
-
4
-
-
85010742974
-
The MGB challenge: Evaluating multi-genre broadcast media transcription
-
Scottsdale
-
P. Bell, M.J.F. Gales, T. Hain, J. Kilgour, P. Lanchantin, X. Liu, A. McParland, S. Renals, O. Saz, M. Wester & P.C. Woodland, "The MGB challenge: Evaluating multi-genre broadcast media transcription", Proc. ASRU Workshop, Scottsdale, 2015
-
(2015)
Proc. ASRU Workshop
-
-
Bell, P.1
Gales, M.J.F.2
Hain, T.3
Kilgour, J.4
Lanchantin, P.5
Liu, X.6
McParland, A.7
Renals, S.8
Saz, O.9
Wester, M.10
Woodland, P.C.11
-
5
-
-
0028392483
-
Learning long-term dependencies with gradient descent is difficult
-
Y. Bengio, P. Simard, & P. Frasconi, "Learning long-term dependencies with gradient descent is difficult", IEEE Transactions on Neural Networks, vol. 5, pp. 157-166, 1994
-
(1994)
IEEE Transactions on Neural Networks
, vol.5
, pp. 157-166
-
-
Bengio, Y.1
Simard, P.2
Frasconi, P.3
-
6
-
-
41049105254
-
Joint-sequence models for grapheme-to-phoneme conversion
-
M. Bisani & H. Ney, "Joint-sequence models for grapheme-to-phoneme conversion", Speech Communication, vol. 50, no. 5, 2008
-
(2008)
Speech Communication
, vol.50
, Issue.5
-
-
Bisani, M.1
Ney, H.2
-
7
-
-
0141607824
-
Latent Dirichlet allocation
-
D.M. Blei, A. Ng, & M.I. Jordan, "Latent Dirichlet allocation", Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003
-
(2003)
Journal of Machine Learning Research
, vol.3
, pp. 993-1022
-
-
Blei, D.M.1
Ng, A.2
Jordan, M.I.3
-
8
-
-
4544253838
-
Improving broadcast news transcription by lightly supervised discriminative training
-
Montreal
-
H.Y. Chan & P.C. Woodland, "Improving broadcast news transcription by lightly supervised discriminative training", Proc. ICASSP, Montreal, 2004
-
(2004)
Proc. ICASSP
-
-
Chan, H.Y.1
Woodland, P.C.2
-
9
-
-
84910067710
-
Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch
-
Singapore
-
X. Chen, Y. Wang, X. Liu, M.J.F. Gales, & P.C. Woodland, "Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch", Proc. Interspeech, Singapore, 2014
-
(2014)
Proc. Interspeech
-
-
Chen, X.1
Wang, Y.2
Liu, X.3
Gales, M.J.F.4
Woodland, P.C.5
-
10
-
-
84959155988
-
Recurrent neural network language model adaptation for multi-genre broadcast speech recognition
-
Dresden
-
X. Chen, T. Tan, X. Liu, P. Lanchantin, M.J.F. Gales & P.C. Woodland, "Recurrent neural network language model adaptation for multi-genre broadcast speech recognition", Proc. Interspeech, Dresden, 2015
-
(2015)
Proc. Interspeech
-
-
Chen, X.1
Tan, T.2
Liu, X.3
Lanchantin, P.4
Gales, M.J.F.5
Woodland, P.C.6
-
11
-
-
4544253834
-
Posterior probability decoding, confidence estimation and system combination
-
College Park, MD
-
G. Evermann & P.C. Woodland, "Posterior probability decoding, confidence estimation and system combination", Proc. Speech Transcription Workshop, College Park, MD, 2000
-
(2000)
Proc. Speech Transcription Workshop
-
-
Evermann, G.1
Woodland, P.C.2
-
12
-
-
0030638031
-
A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
-
Santa Barbara
-
J. Fiscus, "A post-processing system to yield reduced word error rates: recogniser output voting error reduction (ROVER)", Proc. ASRU Workshop, Santa Barbara, 1997
-
(1997)
Proc. ASRU Workshop
-
-
Fiscus, J.1
-
13
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-based speech recognition
-
M.J.F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition", Computer Speech and Language, vol. 12, pp. 75-98, 1997
-
(1997)
Computer Speech and Language
, vol.12
, pp. 75-98
-
-
Gales, M.J.F.1
-
14
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
M.J.F. Gales, "Semi-tied covariance matrices for hidden Markov models", IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 272-281, 1999
-
(1999)
IEEE Transactions on Speech and Audio Processing
, vol.7
, Issue.3
, pp. 272-281
-
-
Gales, M.J.F.1
-
15
-
-
34047266379
-
Progress in the CU-HTK broadcast news transcription system
-
M.J.F. Gales, D.Y. Kim, P.C. Woodland, H.Y. Chan, D. Mrva, R. Sinha, & S.E. Tranter, "Progress in the CU-HTK broadcast news transcription system", IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 5, pp. 1513-1525, 2006
-
(2006)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.14
, Issue.5
, pp. 1513-1525
-
-
Gales, M.J.F.1
Kim, D.Y.2
Woodland, P.C.3
Chan, H.Y.4
Mrva, D.5
Sinha, R.6
Tranter, S.E.7
-
16
-
-
0036567851
-
The LIMSI broadcast news transcription system
-
J.L. Gauvain, L. Lamel, & G. Adda, "The LIMSI broadcast news transcription system", Speech Communication, vol. 37, no. 1, pp. 89-108, 2002
-
(2002)
Speech Communication
, vol.37
, Issue.1
, pp. 89-108
-
-
Gauvain, J.L.1
Lamel, L.2
Adda, G.3
-
17
-
-
84905252790
-
A pitch extraction algorithm tuned for automatic speech recognition
-
Florence
-
P. Ghahremani, B. BabaAli, D. Povey, K. Riedhammer, J. Trmal, & S. Khudanpur, "A pitch extraction algorithm tuned for automatic speech recognition", Proc. ICASSP, Florence, 2014
-
(2014)
Proc. ICASSP
-
-
Ghahremani, P.1
Babaali, B.2
Povey, D.3
Riedhammer, K.4
Trmal, J.5
Khudanpur, S.6
-
18
-
-
51449103447
-
Optimizing bottle-neck features for LVCSR
-
Las Vegas
-
F. Grezl & P. Fousek, "Optimizing bottle-neck features for LVCSR", Proc. ICASSP, Las Vegas, 2008
-
(2008)
Proc. ICASSP
-
-
Grezl, F.1
Fousek, P.2
-
19
-
-
78650474133
-
-
Technical Report, UTML TR 2010-003, Department of Computer Science, University of Toronto
-
G.E. Hinton, "A Practical Guide to Training Restricted Boltzmann Machines", Technical Report, UTML TR 2010-003, Department of Computer Science, University of Toronto, 2010
-
(2010)
A Practical Guide to Training Restricted Boltzmann Machines
-
-
Hinton, G.E.1
-
20
-
-
84959162419
-
I-vector estimation using informative priors for adaptation of deep neural networks
-
Dresden
-
P. Karanasou, M.J.F. Gales & P.C. Woodland, "I-vector estimation using informative priors for adaptation of deep neural networks", Proc. Interspeech, Dresden, 2015
-
(2015)
Proc. Interspeech
-
-
Karanasou, P.1
Gales, M.J.F.2
Woodland, P.C.3
-
21
-
-
84964556678
-
Speaker diarisation and longitudinal linking in multi-genre broadcast data
-
P. Karanasou, M.J.F. Gales, P. Lanchantin, X. Liu, Y. Qian, L. Wang, P.C. Woodland & C. Zhang, "Speaker diarisation and longitudinal linking in multi-genre broadcast data", Proc. ASRU Workshop, Scottsdale, 2015
-
(2015)
Proc. ASRU Workshop, Scottsdale
-
-
Karanasou, P.1
Gales, M.J.F.2
Lanchantin, P.3
Liu, X.4
Qian, Y.5
Wang, L.6
Woodland, P.C.7
Zhang, C.8
-
22
-
-
70349213445
-
Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
-
Taipei
-
B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling", Proc. ICASSP, Taipei, 2009
-
(2009)
Proc. ICASSP
-
-
Kingsbury, B.1
-
23
-
-
84893668957
-
Investigation of multilingual deep neural networks for spoken term detection
-
Olomouc
-
K.M. Knill, M.J.F. Gales, S.P. Rath, P.C. Woodland, C. Zhang, & S.-X. Zhang, "Investigation of multilingual deep neural networks for spoken term detection", Proc. ASRU Workshop, Olomouc, 2013
-
(2013)
Proc. ASRU Workshop
-
-
Knill, K.M.1
Gales, M.J.F.2
Rath, S.P.3
Woodland, P.C.4
Zhang, C.5
Zhang, S.-X.6
-
24
-
-
0036460908
-
Lightly supervised and unsupervised acoustic model training
-
L. Lamel, J.L. Gauvain, & G. Adda, "Lightly supervised and unsupervised acoustic model training", Computer Speech & Language, vol. 16, no. 1, pp. 115-129, 2002
-
(2002)
Computer Speech & Language
, vol.16
, Issue.1
, pp. 115-129
-
-
Lamel, L.1
Gauvain, J.L.2
Adda, G.3
-
25
-
-
84964513580
-
The development of the Cambridge university alignment systems for the multi-genre broadcast challenge
-
Scottsdale
-
P. Lanchantin, M.J.F. Gales, P. Karanasou, X. Liu, Y. Qian, L. Wang, P.C. Woodland & C. Zhang, "The development of the Cambridge University alignment systems for the Multi-Genre Broadcast challenge", Proc. ASRU Workshop, Scottsdale, 2015
-
(2015)
Proc. ASRU Workshop
-
-
Lanchantin, P.1
Gales, M.J.F.2
Karanasou, P.3
Liu, X.4
Qian, Y.5
Wang, L.6
Woodland, P.C.7
Zhang, C.8
-
27
-
-
84905240726
-
Efficient lattice rescoring using recurrent neural network language models
-
Florence
-
X. Liu, Y. Wang, X. Chen, M.J.F. Gales, & P.C. Woodland, "Efficient lattice rescoring using recurrent neural network language models", Proc. ICASSP, Florence, 2014
-
(2014)
Proc. ICASSP
-
-
Liu, X.1
Wang, Y.2
Chen, X.3
Gales, M.J.F.4
Woodland, P.C.5
-
28
-
-
84959109976
-
The Cambridge university 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation
-
Dresden
-
X. Liu, F. Flego, L. Wang, C. Zhang, M.J.F. Gales, & P.C. Woodland, "The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation", Proc. Interspeech, Dresden, 2015
-
(2015)
Proc. Interspeech
-
-
Liu, X.1
Flego, F.2
Wang, L.3
Zhang, C.4
Gales, M.J.F.5
Woodland, P.C.6
-
29
-
-
0034296009
-
Finding consensus in speech recognition: Word error minimization and other applications of confusion networks
-
L. Mangu, E. Brill, & A. Stolcke, "Finding consensus in speech recognition: word error minimization and other applications of confusion networks", Computer Speech and Language, vol. 14, no. 4, pp. 373-400, 2000
-
(2000)
Computer Speech and Language
, vol.14
, Issue.4
, pp. 373-400
-
-
Mangu, L.1
Brill, E.2
Stolcke, A.3
-
30
-
-
79959829092
-
Recurrent neural network based language model
-
Makuhari, Japan
-
T. Mikolov, M. Karafiat, L. Burget, J. Cernocky, & S. Khudanpur, "Recurrent neural network based language model", Proc. Interspeech, Makuhari, Japan, 2010
-
(2010)
Proc. Interspeech
-
-
Mikolov, T.1
Karafiat, M.2
Burget, L.3
Cernocky, J.4
Khudanpur, S.5
-
31
-
-
80051643236
-
Extensions of recurrent neural network language model
-
Prague
-
T. Mikolov, S. Kombrink, L. Burget, J.H. Cernocky, & S. Khudanpur, "Extensions of recurrent neural network language model", Proc. ICASSP, Prague, 2011
-
(2011)
Proc. ICASSP
-
-
Mikolov, T.1
Kombrink, S.2
Burget, L.3
Cernocky, J.H.4
Khudanpur, S.5
-
32
-
-
0036296863
-
Minimum phone error and I-smoothing for improved discriminative training
-
Orlando
-
D. Povey & P.C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training", Proc. ICASSP, Orlando, 2002
-
(2002)
Proc. ICASSP
-
-
Povey, D.1
Woodland, P.C.2
-
33
-
-
84858953642
-
The Kaldi speech recognition toolkit
-
Hawaii
-
D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlíček, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, & K. Vesely, "The Kaldi speech recognition toolkit", Proc. ASRU Workshop, Hawaii, 2011
-
(2011)
Proc. ASRU Workshop
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlíček, P.8
Qian, Y.9
Schwarz, P.10
Silovsky, J.11
Stemmer, G.12
Vesely, K.13
-
34
-
-
70450180978
-
Robust LTS rules with the Combilex speech technology lexicon
-
Brighton
-
K. Richmond, R. Clark & S. Fitt, "Robust LTS rules with the Combilex speech technology lexicon", Proc. Interspeech, Brighton, 2009
-
(2009)
Proc. Interspeech
-
-
Richmond, K.1
Clark, R.2
Fitt, S.3
-
35
-
-
79959836077
-
On generating Combilex pronunciations via morphological analysis
-
Makuhari, Japan
-
K. Richmond, R. Clark & S. Fitt, "On generating Combilex pronunciations via morphological analysis", Proc. Interspeech, Makuhari, Japan, 2010
-
(2010)
Proc. Interspeech
-
-
Richmond, K.1
Clark, R.2
Fitt, S.3
-
36
-
-
84910046405
-
Long short-term memory recurrent neural network architectures for large scale acoustic modeling
-
Singapore
-
H. Sak, A. Senior, & F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling", Proc. Interspeech, Singapore, 2014
-
(2014)
Proc. Interspeech
-
-
Sak, H.1
Senior, A.2
Beaufays, F.3
-
37
-
-
84893688455
-
Learning filter banks within a deep neural network framework
-
Olomouc
-
T.N. Sainath, B. Kingsbury, A. Mohamed, & B. Ramabhadran, "Learning filter banks within a deep neural network framework", Proc. ASRU Workshop, Olomouc, 2013
-
(2013)
Proc. ASRU Workshop
-
-
Sainath, T.N.1
Kingsbury, B.2
Mohamed, A.3
Ramabhadran, B.4
-
38
-
-
84946037134
-
Convolutional, long short-term memory, fully connected deep neural networks
-
Brisbane
-
T.N. Sainath, O. Vinyals, A. Senior, & H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks", Proc. ICASSP, Brisbane, 2015
-
(2015)
Proc. ICASSP
-
-
Sainath, T.N.1
Vinyals, O.2
Senior, A.3
Sak, H.4
-
39
-
-
84890446559
-
Feature engineering in context-dependent deep neural networks
-
Hawaii
-
F. Seide, G. Li, X. Chen, & D. Yu, "Feature engineering in context-dependent deep neural networks", Proc. ASRU Workshop, Hawaii, 2011
-
(2011)
Proc. ASRU Workshop
-
-
Seide, F.1
Li, G.2
Chen, X.3
Yu, D.4
-
40
-
-
84906240855
-
Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system
-
Lyon
-
Y. Si, Q. Zhang, T. Li, J. Pan, & Y. Yan, "Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system", Proc. Interspeech, Lyon, 2013
-
(2013)
Proc. Interspeech
-
-
Si, Y.1
Zhang, Q.2
Li, T.3
Pan, J.4
Yan, Y.5
-
41
-
-
33646357306
-
The Cambridge University March 2005 speaker diarisation system
-
R. Sinha, S.E. Tranter, M.J.F. Gales, & P.C. Woodland, "The Cambridge University March 2005 speaker diarisation system", Proc. Interspeech, 2005
-
(2005)
Proc. Interspeech
-
-
Sinha, R.1
Tranter, S.E.2
Gales, M.J.F.3
Woodland, P.C.4
-
42
-
-
84881054791
-
Hermitian polynomial for speaker adaptation of connectionist speech recognition systems
-
S.M. Siniscalchi, J.-Y. Li, & C.-H. Lee, "Hermitian polynomial for speaker adaptation of connectionist speech recognition systems", IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 2152-2161, 2013
-
(2013)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.21
, pp. 2152-2161
-
-
Siniscalchi, S.M.1
Li, J.-Y.2
Lee, C.-H.3
-
43
-
-
84891308106
-
SRILM: An extensible language modeling toolkit
-
Denver
-
A. Stolcke, "SRILM: An extensible language modeling toolkit", Proc. ICSLP, Denver, 2002
-
(2002)
Proc. ICSLP
-
-
Stolcke, A.1
-
44
-
-
84890492591
-
Revisiting hybrid and GMM-HMM system combination techniques
-
Vancouver
-
P. Swietojanski, A. Ghoshal, & S. Renals, "Revisiting hybrid and GMM-HMM system combination techniques", Proc. ICASSP, Vancouver, 2013
-
(2013)
Proc. ICASSP
-
-
Swietojanski, P.1
Ghoshal, A.2
Renals, S.3
-
45
-
-
84983119674
-
Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
-
Lake Tahoe
-
P. Swietojanski & S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models", Proc. IWSLT, Lake Tahoe, 2014
-
(2014)
Proc. IWSLT
-
-
Swietojanski, P.1
Renals, S.2
-
46
-
-
84946032695
-
Differentiable pooling for unsupervised speaker adaptation
-
Brisbane
-
P. Swietojanski & S. Renals, "Differentiable pooling for unsupervised speaker adaptation", Proc. ICASSP, Brisbane, 2015
-
(2015)
Proc. ICASSP
-
-
Swietojanski, P.1
Renals, S.2
-
48
-
-
84906274730
-
Sequence-discriminative training of deep neural networks
-
Lyon
-
K. Vesely, A. Ghoshal, L. Burget, & D. Povey, "Sequence-discriminative training of deep neural networks", Proc. Interspeech, Lyon, 2013
-
(2013)
Proc. Interspeech
-
-
Vesely, K.1
Ghoshal, A.2
Burget, L.3
Povey, D.4
-
49
-
-
0036567794
-
The development of the HTK broadcast news transcription system: An overview
-
P.C. Woodland, "The development of the HTK broadcast news transcription system: An overview", Speech Communication, vol. 37, no. 1, pp. 47-67, 2002
-
(2002)
Speech Communication
, vol.37
, Issue.1
, pp. 47-67
-
-
Woodland, P.C.1
-
50
-
-
79953250475
-
Minimum Bayes risk decoding and system combination based on a recursion for edit distance
-
H. Xu, D. Povey, L. Mangu, & J. Zhu, "Minimum Bayes risk decoding and system combination based on a recursion for edit distance", Computer Speech & Language, vol. 25, no. 4, pp. 802-828, 2011
-
(2011)
Computer Speech & Language
, vol.25
, Issue.4
, pp. 802-828
-
-
Xu, H.1
Povey, D.2
Mangu, L.3
Zhu, J.4
-
51
-
-
0003571976
-
-
Cambridge University Engineering Department
-
S.J. Young, G. Evermann, M.J.F. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J.J. Odell, D. Ollason, D. Povey, V. Valtchev, and P.C. Woodland, The HTK Book (for HTK version 3.4). Cambridge University Engineering Department, 2006
-
(2006)
The HTK Book (For HTK Version 3.4)
-
-
Young, S.J.1
Evermann, G.2
Gales, M.J.F.3
Hain, T.4
Kershaw, D.5
Liu, X.6
Moore, G.7
Odell, J.J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.C.12
-
52
-
-
84923929378
-
Fuse deep neural network and Gaussian mixture model systems
-
Springer, London
-
D. Yu & L. Deng, "Fuse deep neural network and Gaussian mixture model systems", Automatic Speech Recognition: A Deep Learning Approach, pp. 177-191. Springer, London, 2015
-
(2015)
Automatic Speech Recognition: A Deep Learning Approach
, pp. 177-191
-
-
Yu, D.1
Deng, L.2
-
53
-
-
84959142742
-
A general artificial neural network extension for HTK
-
Dresden
-
C. Zhang & P.C. Woodland, "A general artificial neural network extension for HTK", Proc. Interspeech, Dresden, 2015
-
(2015)
Proc. Interspeech
-
-
Zhang, C.1
Woodland, P.C.2
-
54
-
-
84959174678
-
Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling
-
Dresden
-
C. Zhang & P.C. Woodland, "Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling", Proc. Interspeech, Dresden, 2015
-
(2015)
Proc. Interspeech
-
-
Zhang, C.1
Woodland, P.C.2
-
55
-
-
84946061232
-
Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data
-
Brisbane
-
Y. Zhao, J.-Y. Li, J. Xue, & Y.-F. Gong, "Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data", Proc. ICASSP, Brisbane, 2015
-
(2015)
Proc. ICASSP
-
-
Zhao, Y.1
Li, J.-Y.2
Xue, J.3
Gong, Y.-F.4