SCOPUS Information Retrieval Platform

Volume 218, 2016, Pages 448-459

A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition

(3) Huang, Zhen (a); Siniscalchi, Sabato Marco (b); Lee, Chin Hui (a)

a GEORGIA INSTITUTE OF TECHNOLOGY (United States)

b KORE UNIVERSITY OF ENNA (Italy)

Author keywords

Deep neural network; Multi task learning; Speaker adaptation; Transfer learning

Indexed keywords

DEEP LEARNING; DEEP NEURAL NETWORKS; LEARNING SYSTEMS; METADATA; MULTI-TASK LEARNING; NEURAL NETWORKS; SPEECH; TRANSFER LEARNING;

ADAPTATION ALGORITHMS; AUTOMATIC SPEECH RECOGNITION; AUTOMATIC SPEECH RECOGNITION SYSTEM; HIERARCHICAL STRUCTURES; MULTI-CONDITION TRAININGS; PERFORMANCE DEGRADATION; SPEAKER ADAPTATION; WORD ERROR RATE REDUCTIONS;

SPEECH RECOGNITION;

ARTICLE; ARTIFICIAL NEURAL NETWORK; AUTOMATIC SPEECH RECOGNITION; CONTROLLED STUDY; DEEP NEURAL NETWORK; INFORMATION PROCESSING; KERNEL METHOD; MEASUREMENT ACCURACY; MULTI CONDITION TRAINING; MULTI TASK HETEROGENEOUS TRANSFER LEARNING; PRIORITY JOURNAL; PROBABILITY; TASK PERFORMANCE; WALL STREET JOURNAL;

EID: 84994096935 PISSN: 09252312 EISSN: 18728286 Source Type: Journal
DOI: 10.1016/j.neucom.2016.09.018 Document Type: Article

Times cited: 63

References (59)

1
- 84897943848
- An overview of noise-robust automatic speech recognition
- [1] Li, J., Deng, L., Gong, Y., Haeb-Umbach, R., An overview of noise-robust automatic speech recognition. IEEE Trans. Audio, Speech, Lang. Process. 22:4 (2014), 745–777.
- (2014) IEEE Trans. Audio, Speech, Lang. Process. , vol.22 , Issue.4 , pp. 745-777
- Li, J.¹ Deng, L.² Gong, Y.³ Haeb-Umbach, R.⁴

2
- 0032140546
- On stochastic feature and model compensation approaches to robust speech recognition
- [2] Lee, C.-H., On stochastic feature and model compensation approaches to robust speech recognition. Speech Commun. 25:1–3 (1998), 29–47.
- (1998) Speech Commun. , vol.25 , Issue.1-3 , pp. 29-47
- Lee, C.-H.¹

3
- 0035426931
- Language independent and language adaptive acoustic modeling for speech recognition
- [3] Schultz, T., Waibel, A., Language independent and language adaptive acoustic modeling for speech recognition. Speech Commun. 35 (2001), 31–51.
- (2001) Speech Commun. , vol.35 , pp. 31-51
- Schultz, T.¹ Waibel, A.²

4
- 84862931515
- Experiments on cross-language attribute detection and phone recognition with minimal target specific training data
- [4] Siniscalchi, S.M., Lyu, D.-C., Svendsen, T., Lee, C.-H., Experiments on cross-language attribute detection and phone recognition with minimal target specific training data. IEEE Trans. Audio Speech Lang. Process. 20:3 (2012), 875–887.
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.3 , pp. 875-887
- Siniscalchi, S.M.¹ Lyu, D.-C.² Svendsen, T.³ Lee, C.-H.⁴

5
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups
- [5] Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T., Kingsbury, B., Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29:6 (2012), 82–97.
- (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

6
- 0027683813
- Shared-distribution hidden markov models for speech recognition
- [6] Hwang, M.-Y., Huang, X., Shared-distribution hidden markov models for speech recognition. IEEE Trans. Speech Audio Process. 1:4 (1993), 414–420.
- (1993) IEEE Trans. Speech Audio Process. , vol.1 , Issue.4 , pp. 414-420
- Hwang, M.-Y.¹ Huang, X.²

7
- 84858972572
- Making deep belief networks effective for large vocabulary continuous speech recognition
- [7] T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, A. Mohamed, Making deep belief networks effective for large vocabulary continuous speech recognition, in: Proc. ASRU, 2011, pp. 30–35.
- (2011) Proc. ASRU , pp. 30-35
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.⁶

8
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- [8] Dahl, G.E., Yu, D., Deng, L., Acero, A., Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio, Speech Lang. Proc. 20:1 (2012), 30–42.
- (2012) IEEE Trans. Audio, Speech Lang. Proc. , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

9
- 84906274730
- Sequence-discriminative training of deep neural networks
- [9] K. Veselý, A. Ghoshal, L. Burget, D. Povey, Sequence-discriminative training of deep neural networks, in: Proc. Interspeech, 2013, pp. 2345–2349.
- (2013) Proc. Interspeech , pp. 2345-2349
- Veselý, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

10
- 0032923221
- Catastrophic forgetting in connectionist networks: causes, consequences and solutions
- [10] R.M. French, Catastrophic forgetting in connectionist networks: causes, consequences and solutions, Trends in Cognitive Sciences, vol. 3 (4).
- Trends in Cognitive Sciences , vol.3 , Issue.4
- French, R.M.¹

11
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- [11] D. Yu, K. Yao, H. Su, G. Li, F. Seide, KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition, in: Proc. ICASSP, 2013, pp. 7893–7897.
- (2013) Proc. ICASSP , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

12
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- [12] Gemello, R., Mana, F., Scanzio, S., Laface, P., De Mori, R., Linear hidden transformations for adaptation of hybrid ANN/HMM models. Speech Commun. 49:10–11 (2007), 827–835.
- (2007) Speech Commun. , vol.49 , Issue.10-11 , pp. 827-835
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ De Mori, R.⁵

13
- 84876672166
- Machine learning paradigms for speech recognition: an overview
- [13] Deng, L., Li, X., Machine learning paradigms for speech recognition: an overview. IEEE Trans. Audio, Speech Lang. Process. 21 (2013), 1060–1089.
- (2013) IEEE Trans. Audio, Speech Lang. Process. , vol.21 , pp. 1060-1089
- Deng, L.¹ Li, X.²

14
- 77956296425
- Noise adaptive training for robust automatic speech recognition
- [14] Kalinli, O., Seltzer, M.L., Droppo, J., Acero, A., Noise adaptive training for robust automatic speech recognition. IEEE Trans. Audio Speech Lang. Process. 18:8 (2010), 1889–1901.
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.8 , pp. 1889-1901
- Kalinli, O.¹ Seltzer, M.L.² Droppo, J.³ Acero, A.⁴

15
- 77956031473
- A survey on transfer learning
- [15] Pan, S.J., Yang, Q., A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22 (2010), 1345–1359.
- (2010) IEEE Trans. Knowl. Data Eng. , vol.22 , pp. 1345-1359
- Pan, S.J.¹ Yang, Q.²

16
- 0028419019
- Maximum a posteriori estimation for multivariate gaussian mixture observations of Markov chains
- [16] Gauvain, J., Lee, C., Maximum a posteriori estimation for multivariate gaussian mixture observations of Markov chains. IEEE Trans. Speech Audio Process. 2:2 (1994), 291–298.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.¹ Lee, C.²

17
- 85121045899
- Multitask learning: A knowledge-based source of inductive bias
- [17] R. Caruana, Multitask learning: A knowledge-based source of inductive bias, in: Proc. ICML, 1993, pp. 41–48.
- (1993) Proc. ICML , pp. 41-48
- Caruana, R.¹

18
- 85009167968
- Multitask learning in connectionist robust ASR using recurrent neural networks
- [18] S. Parveen, P. Green, Multitask learning in connectionist robust ASR using recurrent neural networks, in: Proc. INTERSPEECH, 2003.
- (2003) Proc. INTERSPEECH
- Parveen, S.¹ Green, P.²

19
- 84890458846
- Multitask learning in connectionist speech recognition
- [19] Y. Lu, F. Lu, S. Sehgal, S. Gupta, J. Du, C. Tham, P. Green, V. Wan, Multitask learning in connectionist speech recognition, in: Proc. Australian International Conference on Speech Science and Technology, 2004.
- (2004) Proc. Australian International Conference on Speech Science and Technology
- Lu, Y.¹ Lu, F.² Sehgal, S.³ Gupta, S.⁴ Du, J.⁵ Tham, C.⁶ Green, P.⁷ Wan, V.⁸

20
- 84890545600
- Multi-task learning in deep neural networks for improved phoneme recognition
- [20] M. Seltzer, J. Droppo, Multi-task learning in deep neural networks for improved phoneme recognition, in: Proc. ICASSP, 2013, pp. 6965–6969.
- (2013) Proc. ICASSP , pp. 6965-6969
- Seltzer, M.¹ Droppo, J.²

21
- 85043800698
- Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement
- [21] Y. Xu, J. Du, Z. Huang, L.-R. Dai, C.-H. Lee, Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement, 2015, submitted to INTERSPEECH.
- (2015) submitted to INTERSPEECH
- Xu, Y.¹ Du, J.² Huang, Z.³ Dai, L.-R.⁴ Lee, C.-H.⁵

22
- 85079095310
- The design for the wall street journal-based CSR corpus
- [22] D.B. Paul, J.M. Baker, The design for the wall street journal-based CSR corpus, in: Proc. Workshop on Speech and Natural Language, Banff, Canada, 1992, pp. 899–902.
- (1992) Proc. Workshop on Speech and Natural Language, Banff, Canada , pp. 899-902
- Paul, D.B.¹ Baker, J.M.²

23
- 84937854847
- Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system
- [23] J. Neto, L. Almeida, M. Hochberg, C. Martins, L. Nunes, S. Renals, T. Robinson, Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system, in: Proc. Eurospeech, Madrid, Spain, 1995, pp. 2171–2174.
- (1995) Proc. Eurospeech, Madrid, Spain , pp. 2171-2174
- Neto, J.¹ Almeida, L.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

24
- 79959849500
- Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems
- [24] B. Li, K.C. Sim, Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems, in: Proc. INTERSPEECH, 2010, pp. 526–529.
- (2010) Proc. INTERSPEECH , pp. 526-529
- Li, B.¹ Sim, K.C.²

25
- 84921817164
- Learning representations by back-propagating errors
- [25] Rumelhart, D.E., Hinton, G.E., Williams, R.J., Learning representations by back-propagating errors. Cogn. Model., 5(3), 1988, 1.
- (1988) Cogn. Model. , vol.5 , Issue.3 , pp. 1
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

26
- 84959161626
- Maximum a posteriori adaptation of network parameters in deep models
- [26] Z. Huang, S.M. Siniscalchi, I.-F. Chen, J. Li, J. Wu, C.-H. Lee, Maximum a posteriori adaptation of network parameters in deep models, in: Proc. Interspeech, 2015.
- (2015) Proc. Interspeech
- Huang, Z.¹ Siniscalchi, S.M.² Chen, I.-F.³ Li, J.⁴ Wu, J.⁵ Lee, C.-H.⁶

27
- 84959169347
- Rapid adaptation for deep neural networks through multi-task learning
- [27] Z. Huang, J. Li, S.M. Siniscalchi, I.-F. Chen, J. Wu, C.-H. Lee, Rapid adaptation for deep neural networks through multi-task learning, in: Proc. Interspeech, 2015.
- (2015) Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH , vol.2015-January , pp. 3625-3629
- Huang, Z.¹ Li, J.² Siniscalchi, S.M.³ Chen, I.-F.⁴ Wu, J.⁵ Lee, C.-H.⁶

28
- 0003413187
- Neural Networks: A Comprehensive Foundation
- Macmillan
- [28] Haykin, S., Neural Networks: A Comprehensive Foundation. 1994, Macmillan.
- (1994)
- Haykin, S.¹

29
- 80053446822
- Optimal distributed online prediction
- [29] O. Dekel, R. Gilad-Bachrach, O. Shamir, L. Xiao, Optimal distributed online prediction, in: Proc. ICML, 2011, pp. 713–720.
- (2011) Proc. ICML , pp. 713-720
- Dekel, O.¹ Gilad-Bachrach, R.² Shamir, O.³ Xiao, L.⁴

30
- 33745805403
- A fast learning algorithm for deep belief nets
- [30] Hinton, G.E., Osindero, S., Teh, Y., A fast learning algorithm for deep belief nets. Neural Comput. 18:7 (2006), 1527–1554.
- (2006) Neural Comput. , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.³

31
- 84905239342
- Improving deep neural network acoustic models using generalized maxout networks
- [31] X. Zhang, J. Trmal, D. Povey, S. Khudanpur, Improving deep neural network acoustic models using generalized maxout networks, in: Proc. ICASSP, 2014, pp. 215–219.
- (2014) Proc. ICASSP , pp. 215-219
- Zhang, X.¹ Trmal, J.² Povey, D.³ Khudanpur, S.⁴

32
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- [32] F. Seide, G. Li, D. Yu, Conversational speech transcription using context-dependent deep neural networks, in: Proc. Interspeech, Florence, Italy, 2011, pp. 437–440.
- (2011) Proc. Interspeech, Florence, Italy , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

33
- 84921731072
- Fast adaptation of deep neural network based on discriminant codes for speech recognition
- [33] Xue, S., Abdel-Hamid, O., Jiang, H., Dai, L., Liu, Q., Fast adaptation of deep neural network based on discriminant codes for speech recognition. IEEE/ACM Trans. Audio Speech Lang. Proc. 22:12 (2014), 1713–1725.
- (2014) IEEE/ACM Trans. Audio Speech Lang. Proc. , vol.22 , Issue.12 , pp. 1713-1725
- Xue, S.¹ Abdel-Hamid, O.² Jiang, H.³ Dai, L.⁴ Liu, Q.⁵

34
- 84912109599
- Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition
- [34] S. Xue, H. Jiang, L. Dai, Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition, in: Proc. ISCSLP, 2014.
- (2014) Proc. ISCSLP
- Xue, S.¹ Jiang, H.² Dai, L.³

35
- 80051654263
- Deep belief networks using discriminative features for phone recognition
- [35] A. Mohamed, T. Sainath, G. Dahl, B. Ramabhadran, G. Hinton, M. Picheny, Deep belief networks using discriminative features for phone recognition, in: Proc. ICASSP, 2011, pp. 5060–5063.
- (2011) Proc. ICASSP , pp. 5060-5063
- Mohamed, A.¹ Sainath, T.² Dahl, G.³ Ramabhadran, B.⁴ Hinton, G.⁵ Picheny, M.⁶

36
- 85008520364
- Transcribing meetings with the amida systems
- [36] Hain, T., Burget, L., Dines, J., Garner, P., Grézl, F., Hannani, A.E., Karafiát, M., Lincoln, M., Wan, V., Transcribing meetings with the amida systems. IEEE Trans. Audio Speech Lang. Process. 20:2 (2012), 486–498.
- (2012) IEEE Trans. Audio Speech Lang. Process. , vol.20 , Issue.2 , pp. 486-498
- Hain, T.¹ Burget, L.² Dines, J.³ Garner, P.⁴ Grézl, F.⁵ Hannani, A.E.⁶ Karafiát, M.⁷ Lincoln, M.⁸ Wan, V.⁹

37
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- [37] G. Saon, H. Soltau, D. Nahamoo, M. Picheny, Speaker adaptation of neural network acoustic models using i-vectors, in: Proc. ASRU, 2013, pp. 55–59.
- (2013) Proc. ASRU , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

38
- 0000159105
- On adaptive decision rules and decision parameter adaptation for automatic speech recognition
- [38] Lee, C.-H., Huo, Q., On adaptive decision rules and decision parameter adaptation for automatic speech recognition. Proc. IEEE 88:8 (2000), 1241–1269.
- (2000) Proc. IEEE , vol.88 , Issue.8 , pp. 1241-1269
- Lee, C.-H.¹ Huo, Q.²

39
- 0004119259
- The Sound Pattern of English
- Harper & Row
- [39] Chomsky, N., Halle, M., The Sound Pattern of English. 1968, Harper & Row.
- (1968)
- Chomsky, N.¹ Halle, M.²

40
- 84910035297
- Learning small-size dnn with output-distribution-based criteria
- [40] J. Li, R. Zhao, J.-T. Huang, Y. Gong, Learning small-size dnn with output-distribution-based criteria, in: Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Li, J.¹ Zhao, R.² Huang, J.-T.³ Gong, Y.⁴

41
- 85008035419
- Equivalence of generative and log-linear models
- [41] Heigold, G., Ney, H., Lehnen, P., Gass, T., Schluter, R., Equivalence of generative and log-linear models. IEEE Trans. Audio Speech Lang. Process. 19:5 (2011), 1138–1148.
- (2011) IEEE Trans. Audio Speech Lang. Process. , vol.19 , Issue.5 , pp. 1138-1148
- Heigold, G.¹ Ney, H.² Lehnen, P.³ Gass, T.⁴ Schluter, R.⁵

42
- 0035279111
- A structural Bayes approach to speaker adaptation
- [42] Shinoda, K., Lee, C.-H., A structural Bayes approach to speaker adaptation. IEEE Trans. Speech Audio Process. 9:3 (2001), 276–287.
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.3 , pp. 276-287
- Shinoda, K.¹ Lee, C.-H.²

43
- 0025629882
- Tied mixture continuous parameter modeling for speech recognition
- [43] Bellegarda, J.R., Nahamoo, D., Tied mixture continuous parameter modeling for speech recognition. IEEE Trans. Acoust., Speech Signal Process. 38:12 (1990), 2033–2045.
- (1990) IEEE Trans. Acoust., Speech Signal Process. , vol.38 , Issue.12 , pp. 2033-2045
- Bellegarda, J.R.¹ Nahamoo, D.²

44
- 0000250399
- Semi-continuous hidden markov models for speech signal
- [44] Huang, X., Jack, M.A., Semi-continuous hidden markov models for speech signal. Comput. Speech Lang. 3:3 (1989), 239–251.
- (1989) Comput. Speech Lang. , vol.3 , Issue.3 , pp. 239-251
- Huang, X.¹ Jack, M.A.²

45
- 84912122097
- Decision tree based state tying for speech recognition using DNN derived embeddings
- [45] X. Li, X. Wu, Decision tree based state tying for speech recognition using DNN derived embeddings, in: Proc. ISCSLP, 2014, pp. 123–127.
- (2014) Proc. ISCSLP , pp. 123-127
- Li, X.¹ Wu, X.²

46
- 84976220626
- Discriminative transfer learning with tree-based priors
- [46] N. Srivastava, R. Salakhutdinov, Discriminative transfer learning with tree-based priors, in: Proc. NIPS, 2013.
- (2013) Proc. NIPS
- Srivastava, N.¹ Salakhutdinov, R.²

47
- 64849090489
- Conditional random fields for integrating local discriminative classifiers
- [47] Morris, J., Fosler-Lussier, E., Conditional random fields for integrating local discriminative classifiers. IEEE Trans. Audio Speech Lang. Process. 16:3 (2008), 617–628.
- (2008) IEEE Trans. Audio Speech Lang. Process. , vol.16 , Issue.3 , pp. 617-628
- Morris, J.¹ Fosler-Lussier, E.²

48
- 84858953642
- The Kaldi speech recognition toolkit
- [48] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, K. Veselý, The Kaldi speech recognition toolkit, in: Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovský, J.¹¹ Stemmer, G.¹² Veselý, K.¹³

49
- 0002144369
- Tree-based state tying for high accuracy acoustic modeling
- [49] S. Young, J. Odell, P. Woodland, Tree-based state tying for high accuracy acoustic modeling, in: Proc. ARPA Human Language Technology Workshop, Plainsboro, NJ, USA, 1994, pp. 307–312.
- (1994) Proc. ARPA Human Language Technology Workshop, Plainsboro, NJ, USA , pp. 307-312
- Young, S.¹ Odell, J.² Woodland, P.³

50
- 0001596920
- Large vocabulary continuous speech recognition: advances and applications
- [50] Gauvain, J.-L., Lamel, L., Large vocabulary continuous speech recognition: advances and applications. Proc. IEEE 88:8 (2000), 1181–1200.
- (2000) Proc. IEEE , vol.88 , Issue.8 , pp. 1181-1200
- Gauvain, J.-L.¹ Lamel, L.²

51
- 84890454527
- Low-rank matrix factorization for deep neural network training with high-dimensional output targets
- [51] T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy, B. Ramabhadran, Low-rank matrix factorization for deep neural network training with high-dimensional output targets, in: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, IEEE, 2013, pp. 6655–6659.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, IEEE , pp. 6655-6659
- Sainath, T.N.¹ Kingsbury, B.² Sindhwani, V.³ Arisoy, E.⁴ Ramabhadran, B.⁵

52
- 84906227589
- Restructuring of deep neural network acoustic models with singular value decomposition
- [52] J. Xue, J. Li, Y. Gong, Restructuring of deep neural network acoustic models with singular value decomposition, in: Proc. Interspeech, 2013, pp. 2365–2369.
- (2013) Proc. Interspeech , pp. 2365-2369
- Xue, J.¹ Li, J.² Gong, Y.³

53
- 0029375590
- Speaker adaptation using constrained estimation of gaussian mixtures
- [53] Digalakis, V.V., Rtischev, D., Neumeyer, L.G., Speaker adaptation using constrained estimation of gaussian mixtures. IEEE Trans. Speech Audio Process. 3:4 (1995), 357–366.
- (1995) IEEE Trans. Speech Audio Process. , vol.3 , Issue.4 , pp. 357-366
- Digalakis, V.V.¹ Rtischev, D.² Neumeyer, L.G.³

54
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- [54] Gales, M.J.F., Maximum likelihood linear transformations for HMM-based speech recognition. Comput. Speech Lang. 12 (1998), 75–98.
- (1998) Comput. Speech Lang. , vol.12 , pp. 75-98
- Gales, M.J.F.¹

55
- 34547973978
- To transfer or not to transfer
- [55] M. Rosenstein, Z. Marx, L. Kaelbling, To transfer or not to transfer, in: Neural Information Processing Systems (NIPS '05) Workshop on Inductive Transfer: 10 Years Later, 2005.
- (2005) Neural Information Processing Systems (NIPS '05) Workshop on Inductive Transfer: 10 Years Later
- Rosenstein, M.¹ Marx, Z.² Kaelbling, L.³

56
- 85043823916
- Switchboard-1 release 2
- Linguistic Data Consortium, Philadelphia
- [56] J. J. Godfrey, E. Holliman, Switchboard-1 release 2, Linguistic Data Consortium, Philadelphia.
- Godfrey, J.J.¹ Holliman, E.²

57
- 84858953642
- The Kaldi speech recognition toolkit
- [57] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, K. Veselý, The Kaldi speech recognition toolkit, in: Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovský, J.¹¹ Stemmer, G.¹² Veselý, K.¹³

58
- 84910084579
- 2000 NIST evaluation of conversational speech recognition over the telephone: English and mandarin performance results
- [58] J. Fiscus, W.M. Fisher, A.F. Martin, M.A. Przybocki, D.S. Pallett, 2000 NIST evaluation of conversational speech recognition over the telephone: English and mandarin performance results, in: Proc. Speech Transcription Workshop, 2000.
- (2000) Proc. Speech Transcription Workshop
- Fiscus, J.¹ Fisher, W.M.² Martin, A.F.³ Przybocki, M.A.⁴ Pallett, D.S.⁵

59
- 84890483489
- Initialization schemes for multilayer perceptron training and their impact on ASR performance using multilingual data
- [59] N.T. Vu, W. Breiter, F. Metze, T. Schultz, Initialization schemes for multilayer perceptron training and their impact on ASR performance using multilingual data, in: Proc. Interspeech, Portland, OR, USA, 2012.
- (2012) Proc. Interspeech, Portland, OR, USA
- Vu, N.T.¹ Breiter, W.² Metze, F.³ Schultz, T.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.