-
1
-
-
84897943848
-
An overview of noise-robust automatic speech recognition
-
[1] Li, J., Deng, L., Gong, Y., Haeb-Umbach, R., An overview of noise-robust automatic speech recognition. IEEE Trans. Audio, Speech, Lang. Process. 22:4 (2014), 745–777.
-
(2014)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.22
, Issue.4
, pp. 745-777
-
-
Li, J.1
Deng, L.2
Gong, Y.3
Haeb-Umbach, R.4
-
2
-
-
0032140546
-
On stochastic feature and model compensation approaches to robust speech recognition
-
[2] Lee, C.-H., On stochastic feature and model compensation approaches to robust speech recognition. Speech Commun. 25:1–3 (1998), 29–47.
-
(1998)
Speech Commun.
, vol.25
, Issue.1-3
, pp. 29-47
-
-
Lee, C.-H.1
-
3
-
-
0035426931
-
Language independent and language adaptive acoustic modeling for speech recognition
-
[3] Schultz, T., Waibel, A., Language independent and language adaptive acoustic modeling for speech recognition. Speech Commun. 35 (2001), 31–51.
-
(2001)
Speech Commun.
, vol.35
, pp. 31-51
-
-
Schultz, T.1
Waibel, A.2
-
4
-
-
84862931515
-
Experiments on cross-language attribute detection and phone recognition with minimal target specific training data
-
[4] Siniscalchi, S.M., Lyu, D.-C., Svendsen, T., Lee, C.-H., Experiments on cross-language attribute detection and phone recognition with minimal target specific training data. IEEE Trans. Audio Speech Lang. Process. 20:3 (2012), 875–887.
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, Issue.3
, pp. 875-887
-
-
Siniscalchi, S.M.1
Lyu, D.-C.2
Svendsen, T.3
Lee, C.-H.4
-
5
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups
-
[5] Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T., Kingsbury, B., Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29:6 (2012), 82–97.
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.10
Kingsbury, B.11
-
6
-
-
0027683813
-
Shared-distribution hidden Markov models for speech recognition
-
[6] Hwang, M.-Y., Huang, X., Shared-distribution hidden Markov models for speech recognition. IEEE Trans. Speech Audio Process. 1:4 (1993), 414–420.
-
(1993)
IEEE Trans. Speech Audio Process.
, vol.1
, Issue.4
, pp. 414-420
-
-
Hwang, M.-Y.1
Huang, X.2
-
7
-
-
84858972572
-
Making deep belief networks effective for large vocabulary continuous speech recognition
-
[7] T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, A. Mohamed, Making deep belief networks effective for large vocabulary continuous speech recognition, in: Proc. ASRU, 2011, pp. 30–35.
-
(2011)
Proc. ASRU
, pp. 30-35
-
-
Sainath, T.N.1
Kingsbury, B.2
Ramabhadran, B.3
Fousek, P.4
Novak, P.5
Mohamed, A.6
-
8
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
[8] Dahl, G.E., Yu, D., Deng, L., Acero, A., Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio, Speech Lang. Proc. 20:1 (2012), 30–42.
-
(2012)
IEEE Trans. Audio, Speech Lang. Proc.
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
9
-
-
84906274730
-
Sequence-discriminative training of deep neural networks
-
[9] K. Veselý, A. Ghoshal, L. Burget, D. Povey, Sequence-discriminative training of deep neural networks, in: Proc. Interspeech, 2013, pp. 2345–2349.
-
(2013)
Proc. Interspeech
, pp. 2345-2349
-
-
Veselý, K.1
Ghoshal, A.2
Burget, L.3
Povey, D.4
-
10
-
-
0032923221
-
-
Catastrophic forgetting in connectionist networks: causes, consequences and solutions, Trends in Cognitive Sciences, vol. 3 (4).
-
[10] R.M. French, Catastrophic forgetting in connectionist networks: causes, consequences and solutions, Trends in Cognitive Sciences, vol. 3 (4).
-
-
-
French, R.M.1
-
11
-
-
84890542079
-
KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
-
[11] D. Yu, K. Yao, H. Su, G. Li, F. Seide, KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition, in: Proc. ICASSP, 2013, pp. 7893–7897.
-
(2013)
Proc. ICASSP
, pp. 7893-7897
-
-
Yu, D.1
Yao, K.2
Su, H.3
Li, G.4
Seide, F.5
-
12
-
-
34548012893
-
Linear hidden transformations for adaptation of hybrid ANN/HMM models
-
[12] Gemello, R., Mana, F., Scanzio, S., Laface, P., De Mori, R., Linear hidden transformations for adaptation of hybrid ANN/HMM models. Speech Commun. 49:10–11 (2007), 827–835.
-
(2007)
Speech Commun.
, vol.49
, Issue.10-11
, pp. 827-835
-
-
Gemello, R.1
Mana, F.2
Scanzio, S.3
Laface, P.4
De Mori, R.5
-
13
-
-
84876672166
-
Machine learning paradigms for speech recognition: an overview
-
[13] Deng, L., Li, X., Machine learning paradigms for speech recognition: an overview. IEEE Trans. Audio, Speech Lang. Process. 21 (2013), 1060–1089.
-
(2013)
IEEE Trans. Audio, Speech Lang. Process.
, vol.21
, pp. 1060-1089
-
-
Deng, L.1
Li, X.2
-
14
-
-
77956296425
-
Noise adaptive training for robust automatic speech recognition
-
[14] Kalinli, O., Seltzer, M.L., Droppo, J., Acero, A., Noise adaptive training for robust automatic speech recognition. IEEE Trans. Audio Speech Lang. Process. 18:8 (2010), 1889–1901.
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, Issue.8
, pp. 1889-1901
-
-
Kalinli, O.1
Seltzer, M.L.2
Droppo, J.3
Acero, A.4
-
16
-
-
0028419019
-
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
-
[16] Gauvain, J., Lee, C., Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans. Speech Audio Process. 2:2 (1994), 291–298.
-
(1994)
IEEE Trans. Speech Audio Process.
, vol.2
, Issue.2
, pp. 291-298
-
-
Gauvain, J.1
Lee, C.2
-
17
-
-
85121045899
-
Multitask learning: A knowledge-based source of inductive bias
-
[17] R. Caruana, Multitask learning: A knowledge-based source of inductive bias, in: Proc. ICML, 1993, pp. 41–48.
-
(1993)
Proc. ICML
, pp. 41-48
-
-
Caruana, R.1
-
18
-
-
85009167968
-
-
Multitask learning in connectionist robust ASR using recurrent neural networks, in: Proc. INTERSPEECH, 2003.
-
[18] S. Parveen, P. Green, Multitask learning in connectionist robust ASR using recurrent neural networks, in: Proc. INTERSPEECH, 2003.
-
-
-
Parveen, S.1
Green, P.2
-
19
-
-
84890458846
-
Multitask learning in connectionist speech recognition
-
[19] Y. Lu, F. Lu, S. Sehgal, S. Gupta, J. Du, C. Tham, P. Green, V. Wan, Multitask learning in connectionist speech recognition, in: Proc. Australian International Conference on Speech Science and Technology, 2004.
-
(2004)
Proc. Australian International Conference on Speech Science and Technology
-
-
Lu, Y.1
Lu, F.2
Sehgal, S.3
Gupta, S.4
Du, J.5
Tham, C.6
Green, P.7
Wan, V.8
-
20
-
-
84890545600
-
Multi-task learning in deep neural networks for improved phoneme recognition
-
[20] M. Seltzer, J. Droppo, Multi-task learning in deep neural networks for improved phoneme recognition, in: Proc. ICASSP, 2013, pp. 6965–6969.
-
(2013)
Proc. ICASSP
, pp. 6965-6969
-
-
Seltzer, M.1
Droppo, J.2
-
21
-
-
85043800698
-
-
Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement, 2015, submitted to INTERSPEECH.
-
[21] Y. Xu, J. Du, Z. Huang, L.-R. Dai, C.-H. Lee, Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement, 2015, submitted to INTERSPEECH.
-
-
-
Xu, Y.1
Du, J.2
Huang, Z.3
Dai, L.-R.4
Lee, C.-H.5
-
22
-
-
85079095310
-
The design for the Wall Street Journal-based CSR corpus
-
[22] D.B. Paul, J.M. Baker, The design for the Wall Street Journal-based CSR corpus, in: Proc. Workshop on Speech and Natural Language, Banff, Canada, 1992, pp. 899–902.
-
(1992)
Proc. Workshop on Speech and Natural Language, Banff, Canada
, pp. 899-902
-
-
Paul, D.B.1
Baker, J.M.2
-
23
-
-
84937854847
-
Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system
-
[23] J. Neto, L. Almeida, M. Hochberg, C. Martins, L. Nunes, S. Renals, T. Robinson, Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system, in: Proc. Eurospeech, Madrid, Spain, 1995, pp. 2171–2174.
-
(1995)
Proc. Eurospeech, Madrid, Spain
, pp. 2171-2174
-
-
Neto, J.1
Almeida, L.2
Hochberg, M.3
Martins, C.4
Nunes, L.5
Renals, S.6
Robinson, T.7
-
24
-
-
79959849500
-
Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems
-
[24] B. Li, K.C. Sim, Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems, in: Proc. INTERSPEECH, 2010, pp. 526–529.
-
(2010)
Proc. INTERSPEECH
, pp. 526-529
-
-
Li, B.1
Sim, K.C.2
-
25
-
-
84921817164
-
Learning representations by back-propagating errors
-
[25] Rumelhart, D.E., Hinton, G.E., Williams, R.J., Learning representations by back-propagating errors. Cogn. Model., 5(3), 1988, 1.
-
(1988)
Cogn. Model.
, vol.5
, Issue.3
, pp. 1
-
-
Rumelhart, D.E.1
Hinton, G.E.2
Williams, R.J.3
-
26
-
-
84959161626
-
Maximum a posteriori adaptation of network parameters in deep models
-
[26] Z. Huang, S.M. Siniscalchi, I.-F. Chen, J. Li, J. Wu, C.-H. Lee, Maximum a posteriori adaptation of network parameters in deep models, in: Proc. Interspeech, 2015.
-
(2015)
Proc. Interspeech
-
-
Huang, Z.1
Siniscalchi, S.M.2
Chen, I.-F.3
Li, J.4
Wu, J.5
Lee, C.-H.6
-
27
-
-
84959169347
-
Rapid adaptation for deep neural networks through multi-task learning
-
[27] Z. Huang, J. Li, S.M. Siniscalchi, I.-F. Chen, J. Wu, C.-H. Lee, Rapid adaptation for deep neural networks through multi-task learning, in: Proc. Interspeech, 2015.
-
(2015)
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
, vol.2015-January
, pp. 3625-3629
-
-
Huang, Z.1
Li, J.2
Siniscalchi, S.M.3
Chen, I.-F.4
Wu, J.5
Lee, C.-H.6
-
28
-
-
0003413187
-
Neural Networks: A Comprehensive Foundation
-
Macmillan
-
[28] Haykin, S., Neural Networks: A Comprehensive Foundation. 1994, Macmillan.
-
(1994)
-
-
Haykin, S.1
-
29
-
-
80053446822
-
Optimal distributed online prediction
-
[29] O. Dekel, R. Gilad-Bachrach, O. Shamir, L. Xiao, Optimal distributed online prediction, in: Proc. ICML, 2011, pp. 713–720.
-
(2011)
Proc. ICML
, pp. 713-720
-
-
Dekel, O.1
Gilad-Bachrach, R.2
Shamir, O.3
Xiao, L.4
-
30
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
[30] Hinton, G.E., Osindero, S., Teh, Y., A fast learning algorithm for deep belief nets. Neural Comput. 18:7 (2006), 1527–1554.
-
(2006)
Neural Comput.
, vol.18
, Issue.7
, pp. 1527-1554
-
-
Hinton, G.E.1
Osindero, S.2
Teh, Y.3
-
31
-
-
84905239342
-
Improving deep neural network acoustic models using generalized maxout networks
-
[31] X. Zhang, J. Trmal, D. Povey, S. Khudanpur, Improving deep neural network acoustic models using generalized maxout networks, in: Proc. ICASSP, 2014, pp. 215–219.
-
(2014)
Proc. ICASSP
, pp. 215-219
-
-
Zhang, X.1
Trmal, J.2
Povey, D.3
Khudanpur, S.4
-
32
-
-
84865801985
-
Conversational speech transcription using context-dependent deep neural networks
-
[32] F. Seide, G. Li, D. Yu, Conversational speech transcription using context-dependent deep neural networks, in: Proc. Interspeech, Florence, Italy, 2011, pp. 437–440.
-
(2011)
Proc. Interspeech, Florence, Italy
, pp. 437-440
-
-
Seide, F.1
Li, G.2
Yu, D.3
-
33
-
-
84921731072
-
Fast adaptation of deep neural network based on discriminant codes for speech recognition
-
[33] Xue, S., Abdel-Hamid, O., Jiang, H., Dai, L., Liu, Q., Fast adaptation of deep neural network based on discriminant codes for speech recognition. IEEE/ACM Trans. Audio Speech Lang. Proc. 22:12 (2014), 1713–1725.
-
(2014)
IEEE/ACM Trans. Audio Speech Lang. Proc.
, vol.22
, Issue.12
, pp. 1713-1725
-
-
Xue, S.1
Abdel-Hamid, O.2
Jiang, H.3
Dai, L.4
Liu, Q.5
-
34
-
-
84912109599
-
Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition
-
[34] S. Xue, H. Jiang, L. Dai, Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition, in: Proc. ISCSLP, 2014.
-
(2014)
Proc. ISCSLP
-
-
Xue, S.1
Jiang, H.2
Dai, L.3
-
35
-
-
80051654263
-
Deep belief networks using discriminative features for phone recognition
-
Proc. ICASSP, 2011, pp. 5060–5063.
-
[35] A. Mohamed, T. Sainath, G. Dahl, B. Ramabhadran, G. Hinton, M. Picheny, Deep belief networks using discriminative features for phone recognition, in: Proc. ICASSP, 2011, pp. 5060–5063.
-
-
-
Mohamed, A.1
Sainath, T.2
Dahl, G.3
Ramabhadran, B.4
Hinton, G.5
Picheny, M.6
-
36
-
-
85008520364
-
Transcribing meetings with the AMIDA systems
-
[36] Hain, T., Burget, L., Dines, J., Garner, P., Grézl, F., Hannani, A.E., Karafiát, M., Lincoln, M., Wan, V., Transcribing meetings with the AMIDA systems. IEEE Trans. Audio Speech Lang. Process. 20:2 (2012), 486–498.
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, Issue.2
, pp. 486-498
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garner, P.4
Grézl, F.5
Hannani, A.E.6
Karafiát, M.7
Lincoln, M.8
Wan, V.9
-
37
-
-
84893691530
-
Speaker adaptation of neural network acoustic models using i-vectors
-
[37] G. Saon, H. Soltau, D. Nahamoo, M. Picheny, Speaker adaptation of neural network acoustic models using i-vectors, in: Proc. ASRU, 2013, pp. 55–59.
-
(2013)
Proc. ASRU
, pp. 55-59
-
-
Saon, G.1
Soltau, H.2
Nahamoo, D.3
Picheny, M.4
-
38
-
-
0000159105
-
On adaptive decision rules and decision parameter adaptation for automatic speech recognition
-
[38] Lee, C.-H., Huo, Q., On adaptive decision rules and decision parameter adaptation for automatic speech recognition. Proc. IEEE 88:8 (2000), 1241–1269.
-
(2000)
Proc. IEEE
, vol.88
, Issue.8
, pp. 1241-1269
-
-
Lee, C.-H.1
Huo, Q.2
-
39
-
-
0004119259
-
The Sound Pattern of English
-
Harper & Row
-
[39] Chomsky, N., Halle, M., The Sound Pattern of English. 1968, Harper & Row.
-
(1968)
-
-
Chomsky, N.1
Halle, M.2
-
40
-
-
84910035297
-
Learning small-size DNN with output-distribution-based criteria
-
[40] J. Li, R. Zhao, J.-T. Huang, Y. Gong, Learning small-size DNN with output-distribution-based criteria, in: Proc. Interspeech, 2014.
-
(2014)
Proc. Interspeech
-
-
Li, J.1
Zhao, R.2
Huang, J.-T.3
Gong, Y.4
-
41
-
-
85008035419
-
Equivalence of generative and log-linear models
-
[41] Heigold, G., Ney, H., Lehnen, P., Gass, T., Schlüter, R., Equivalence of generative and log-linear models. IEEE Trans. Audio Speech Lang. Process. 19:5 (2011), 1138–1148.
-
(2011)
IEEE Trans. Audio Speech Lang. Process.
, vol.19
, Issue.5
, pp. 1138-1148
-
-
Heigold, G.1
Ney, H.2
Lehnen, P.3
Gass, T.4
Schlüter, R.5
-
42
-
-
0035279111
-
A structural Bayes approach to speaker adaptation
-
[42] Shinoda, K., Lee, C.-H., A structural Bayes approach to speaker adaptation. IEEE Trans. Speech Audio Process. 9:3 (2001), 276–287.
-
(2001)
IEEE Trans. Speech Audio Process.
, vol.9
, Issue.3
, pp. 276-287
-
-
Shinoda, K.1
Lee, C.-H.2
-
43
-
-
0025629882
-
Tied mixture continuous parameter modeling for speech recognition
-
[43] Bellegarda, J.R., Nahamoo, D., Tied mixture continuous parameter modeling for speech recognition. IEEE Trans. Acoust., Speech Signal Process. 38:12 (1990), 2033–2045.
-
(1990)
IEEE Trans. Acoust., Speech Signal Process.
, vol.38
, Issue.12
, pp. 2033-2045
-
-
Bellegarda, J.R.1
Nahamoo, D.2
-
44
-
-
0000250399
-
Semi-continuous hidden Markov models for speech signal
-
[44] Huang, X., Jack, M.A., Semi-continuous hidden Markov models for speech signal. Comput. Speech Lang. 3:3 (1989), 239–251.
-
(1989)
Comput. Speech Lang.
, vol.3
, Issue.3
, pp. 239-251
-
-
Huang, X.1
Jack, M.A.2
-
45
-
-
84912122097
-
Decision tree based state tying for speech recognition using DNN derived embeddings
-
[45] X. Li, X. Wu, Decision tree based state tying for speech recognition using DNN derived embeddings, in: Proc. ISCSLP, 2014, pp. 123–127.
-
(2014)
Proc. ISCSLP
, pp. 123-127
-
-
Li, X.1
Wu, X.2
-
46
-
-
84976220626
-
Discriminative transfer learning with tree-based priors
-
[46] N. Srivastava, R. Salakhutdinov, Discriminative transfer learning with tree-based priors, in: Proc. NIPS, 2013.
-
(2013)
Proc. NIPS
-
-
Srivastava, N.1
Salakhutdinov, R.2
-
47
-
-
64849090489
-
Conditional random fields for integrating local discriminative classifiers
-
[47] Morris, J., Fosler-Lussier, E., Conditional random fields for integrating local discriminative classifiers. IEEE Trans. Audio Speech Lang. Process. 16:3 (2008), 617–628.
-
(2008)
IEEE Trans. Audio Speech Lang. Process.
, vol.16
, Issue.3
, pp. 617-628
-
-
Morris, J.1
Fosler-Lussier, E.2
-
48
-
-
84858953642
-
The Kaldi speech recognition toolkit
-
[48] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, K. Veselý, The Kaldi speech recognition toolkit, in: Proc. ASRU, 2011.
-
(2011)
Proc. ASRU
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
Silovský, J.11
Stemmer, G.12
Veselý, K.13
-
49
-
-
0002144369
-
Tree-based state tying for high accuracy acoustic modeling
-
[49] S. Young, J. Odell, P. Woodland, Tree-based state tying for high accuracy acoustic modeling, in: Proc. ARPA Human Language Technology Workshop, Plainsboro, NJ, USA, 1994, pp. 307–312.
-
(1994)
Proc. ARPA Human Language Technology Workshop, Plainsboro, NJ, USA
, pp. 307-312
-
-
Young, S.1
Odell, J.2
Woodland, P.3
-
50
-
-
0001596920
-
Large vocabulary continuous speech recognition: advances and applications
-
[50] Gauvain, J.-L., Lamel, L., Large vocabulary continuous speech recognition: advances and applications. Proc. IEEE 88:8 (2000), 1181–1200.
-
(2000)
Proc. IEEE
, vol.88
, Issue.8
, pp. 1181-1200
-
-
Gauvain, J.-L.1
Lamel, L.2
-
51
-
-
84890454527
-
Low-rank matrix factorization for deep neural network training with high-dimensional output targets
-
[51] T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy, B. Ramabhadran, Low-rank matrix factorization for deep neural network training with high-dimensional output targets, in: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, IEEE, 2013, pp. 6655–6659.
-
(2013)
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, IEEE
, pp. 6655-6659
-
-
Sainath, T.N.1
Kingsbury, B.2
Sindhwani, V.3
Arisoy, E.4
Ramabhadran, B.5
-
52
-
-
84906227589
-
Restructuring of deep neural network acoustic models with singular value decomposition
-
[52] J. Xue, J. Li, Y. Gong, Restructuring of deep neural network acoustic models with singular value decomposition, in: Proc. Interspeech, 2013, pp. 2365–2369.
-
(2013)
Proc. Interspeech
, pp. 2365-2369
-
-
Xue, J.1
Li, J.2
Gong, Y.3
-
53
-
-
0029375590
-
Speaker adaptation using constrained estimation of Gaussian mixtures
-
[53] Digalakis, V.V., Rtischev, D., Neumeyer, L.G., Speaker adaptation using constrained estimation of Gaussian mixtures. IEEE Trans. Speech Audio Process. 3:4 (1995), 357–366.
-
(1995)
IEEE Trans. Speech Audio Process.
, vol.3
, Issue.4
, pp. 357-366
-
-
Digalakis, V.V.1
Rtischev, D.2
Neumeyer, L.G.3
-
54
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-based speech recognition
-
[54] Gales, M.J.F., Maximum likelihood linear transformations for HMM-based speech recognition. Comput. Speech Lang. 12 (1998), 75–98.
-
(1998)
Comput. Speech Lang.
, vol.12
, pp. 75-98
-
-
Gales, M.J.F.1
-
56
-
-
85043823916
-
-
Switchboard-1 release 2, Linguistic Data Consortium, Philadelphia.
-
[56] J. J. Godfrey, E. Holliman, Switchboard-1 release 2, Linguistic Data Consortium, Philadelphia.
-
-
-
Godfrey, J.J.1
Holliman, E.2
-
57
-
-
84858953642
-
The Kaldi speech recognition toolkit
-
[57] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, K. Veselý, The Kaldi speech recognition toolkit, in: Proc. ASRU, 2011.
-
(2011)
Proc. ASRU
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
Silovský, J.11
Stemmer, G.12
Veselý, K.13
-
58
-
-
84910084579
-
2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results
-
[58] J. Fiscus, W.M. Fisher, A.F. Martin, M.A. Przybocki, D.S. Pallett, 2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results, in: Proc. Speech Transcription Workshop, 2000.
-
(2000)
Proc. Speech Transcription Workshop
-
-
Fiscus, J.1
Fisher, W.M.2
Martin, A.F.3
Przybocki, M.A.4
Pallett, D.S.5
-
59
-
-
84890483489
-
Initialization schemes for multilayer perceptron training and their impact on ASR performance using multilingual data
-
[59] N.T. Vu, W. Breiter, F. Metze, T. Schultz, Initialization schemes for multilayer perceptron training and their impact on ASR performance using multilingual data, in: Proc. Interspeech, Portland, OR, USA, 2012.
-
(2012)
Proc. Interspeech, Portland, OR, USA
-
-
Vu, N.T.1
Breiter, W.2
Metze, F.3
Schultz, T.4