SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume 2015-January, 2015, Pages 3625-3629

Rapid adaptation for deep neural networks through multi-task learning

(6) Huang, Zhen a Li, Jinyu b Siniscalchi, Sabato Marco a,c Chen, I Fan a Wu, Ji a,d Lee, Chin Hui a

a GEORGIA INSTITUTE OF TECHNOLOGY (United States)

b MICROSOFT (United States)

c KORE UNIVERSITY OF ENNA (Italy)

d TSINGHUA UNIVERSITY (China)

Author keywords

CD DNN HMM; Deep neural networks; Multitask learning; Speaker adaptation

Indexed keywords

LEARNING SYSTEMS; LINEARIZATION; SPEECH COMMUNICATION; TELEPHONE SETS;

ADAPTATION FRAMEWORK; AUTOMATIC SPEECH RECOGNITION; CD-DNN-HMM; CLASSIFICATION TASKS; DEEP NEURAL NETWORKS; MULTITASK LEARNING; PARAMETER ADAPTATION; SPEAKER ADAPTATION;

SPEECH RECOGNITION;

EID: 84959169347 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (84)

References (39)

1
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
- (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.¹

2
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

3
- 33947635130
- Regularized adaptation of discriminative classifiers
- X. Li and J. Bilmes, "Regularized adaptation of discriminative classifiers," in Proc. ICASSP, vol. 1, 2006, pp. I-I.
- (2006) Proc. ICASSP , vol.1 , pp. I-I
- Li, X.¹ Bilmes, J.²

4
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in Proc. ICASSP, 2013, pp. 7893-7897.
- (2013) Proc. ICASSP , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

5
- 84890447334
- Factorized deep neural networks for adaptive speech recognition
- D. Yu, X. Chen, and L. Deng, "Factorized deep neural networks for adaptive speech recognition," in Proc. Int. Workshop on Statistical Machine Learning for Speech Processing, 2012.
- (2012) Proc. Int. Workshop on Statistical Machine Learning for Speech Processing
- Yu, D.¹ Chen, X.² Deng, L.³

6
- 84871387302
- The deep tensor neural network with applications to large vocabulary speech recognition
- D. Yu, L. Deng, and S. Seide, "The deep tensor neural network with applications to large vocabulary speech recognition," IEEE Trans. Audio, Speech, and Language Processing, vol. 21, no. 2, pp. 388-396, 2013.
- (2013) IEEE Trans. Audio, Speech, and Language Processing , vol.21 , Issue.2 , pp. 388-396
- Yu, D.¹ Deng, L.² Seide, S.³

7
- 84937854847
- Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system
- J. Neto, L. Almeida, M. Hochberg, C. Martins, L. Nunes, S. Renals, and T. Robinson, "Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system," in Proc. Eurospeech, 1995.
- (1995) Proc. Eurospeech
- Neto, J.¹ Almeida, L.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

8
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. D. Mori, "Linear hidden transformations for adaptation of hybrid ANN/HMM models," Speech Communication, vol. 49, no. 10, pp. 827-835, 2007.
- (2007) Speech Communication , vol.49 , Issue.10 , pp. 827-835
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ Mori, R.D.⁵

9
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. ASRU, 2011, pp. 24-29.
- (2011) Proc. ASRU , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

10
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in Proc. Spoken Language Technology Workshop, 2012, pp. 366-369.
- (2012) Proc. Spoken Language Technology Workshop , pp. 366-369
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

11
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in Proc. ASRU, 2013, pp. 55-59.
- (2013) Proc. ASRU , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

12
- 84881054791
- Hermitian polynomial for speaker adaptation of connectionist speech recognition systems
- S. M. Siniscalchi, J. Li, and C.-H. Lee, "Hermitian polynomial for speaker adaptation of connectionist speech recognition systems," IEEE Trans. Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2152-2161, 2013.
- (2013) IEEE Trans. Audio, Speech, and Language Processing , vol.21 , Issue.10 , pp. 2152-2161
- Siniscalchi, S.M.¹ Li, J.² Lee, C.-H.³

13
- 84906225505
- Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
- O. Abdel-Hamid and H. Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition," in Proc. INTERSPEECH, 2013, pp. 1248-1252.
- (2013) Proc. INTERSPEECH , pp. 1248-1252
- Abdel-Hamid, O.¹ Jiang, H.²

14
- 84983119674
- Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
- P. Swietojanski and S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models," in Proc. IEEE STL, 2014.
- (2014) Proc. IEEE STL
- Swietojanski, P.¹ Renals, S.²

15
- 84905262902
- Factorized adaptation for deep neural network
- J. Li, J.-T. Huang, and Y. Gong, "Factorized adaptation for deep neural network," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Li, J.¹ Huang, J.-T.² Gong, Y.³

16
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in Proc. ICASSP, 2013, pp. 7942-7946.
- (2013) Proc. ICASSP , pp. 7942-7946
- Abdel-Hamid, O.¹ Jiang, H.²

17
- 84905284226
- Direct adaptation of hybrid DNN/HMM model for fast speaker adaptation in LVCSR based on speaker code
- S. Xue, O. Abdel-Hamid, H. Jiang, and L. Dai, "Direct adaptation of hybrid DNN/HMM model for fast speaker adaptation in LVCSR based on speaker code," in Proc. ICASSP, 2014, pp. 6339-6343.
- (2014) Proc. ICASSP , pp. 6339-6343
- Xue, S.¹ Abdel-Hamid, O.² Jiang, H.³ Dai, L.⁴

18
- 84921731072
- Fast adaptation of deep neural network based on discriminant codes for speech recognition
- S. Xue, O. Abdel-Hamid, H. Jiang, L. Dai, and Q. Liu, "Fast adaptation of deep neural network based on discriminant codes for speech recognition," IEEE/ACM Trans. on Audio, Speech and Lang. Proc., vol. 22, no. 12, pp. 1713-1725, 2014.
- (2014) IEEE/ACM Trans. on Audio, Speech and Lang. Proc , vol.22 , Issue.12 , pp. 1713-1725
- Xue, S.¹ Abdel-Hamid, O.² Jiang, H.³ Dai, L.⁴ Liu, Q.⁵

19
- 0027683813
- Shared-distribution hidden Markov models for speech recognition
- M.-Y. Hwang and X. Huang, "Shared-distribution hidden Markov models for speech recognition," IEEE Trans. Speech and Audio Processing, vol. 1, no. 4, pp. 414-420, 1993.
- (1993) IEEE Trans. Speech and Audio Processing , vol.1 , Issue.4 , pp. 414-420
- Hwang, M.-Y.¹ Huang, X.²

20
- 84938690750
- Speaker adaptation of deep neural networks using a hierarchy of output layers
- R. Price, I. Kenichi, and K. Shinoda, "Speaker adaptation of deep neural networks using a hierarchy of output layers," in Proc. SLT, 2014.
- (2014) Proc. SLT
- Price, R.¹ Kenichi, I.² Shinoda, K.³

21
- 84959161626
- Maximum a posteriori adaptation of network parameters in deep models
- Z. Huang, S. M. Siniscalchi, I.-F. Chen, J. Li, J. Wu, and C.-H. Lee, "Maximum a posteriori adaptation of network parameters in deep models," 2015, submitted to INTERSPEECH.
- (2015) INTERSPEECH
- Huang, Z.¹ Siniscalchi, S.M.² Chen, I.-F.³ Li, J.⁴ Wu, J.⁵ Lee, C.-H.⁶

22
- 85121045899
- Multitask learning: A knowledge-based source of inductive bias
- R. Caruna, "Multitask learning: A knowledge-based source of inductive bias," in Proc. ICML, 1993, pp. 41-48.
- (1993) Proc. ICML , pp. 41-48
- Caruna, R.¹

23
- 85009167968
- Multitask learning in connectionist robust ASR using recurrent neural networks
- S. Parveen and P. Green, "Multitask learning in connectionist robust ASR using recurrent neural networks," in Proc. INTERSPEECH, 2003.
- (2003) Proc. INTERSPEECH
- Parveen, S.¹ Green, P.²

24
- 84890458846
- Multitask learning in connectionist speech recognition
- Y. Lu, F. Lu, S. Sehgal, S. Gupta, J. Du, C. Tham, P. Green, and V. Wan, "Multitask learning in connectionist speech recognition," in Proc. Australian International Conference on Speech Science and Technology, 2004.
- (2004) Proc. Australian International Conference on Speech Science and Technology
- Lu, Y.¹ Lu, F.² Sehgal, S.³ Gupta, S.⁴ Du, J.⁵ Tham, C.⁶ Green, P.⁷ Wan, V.⁸

25
- 84890545600
- Multi-task learning in deep neural networks for improved phoneme recognition
- M. Seltzer and J. Droppo, "Multi-task learning in deep neural networks for improved phoneme recognition," in Proc. ICASSP, 2013, pp. 6965-6969.
- (2013) Proc. ICASSP , pp. 6965-6969
- Seltzer, M.¹ Droppo, J.²

26
- 84976230656
- Learning auxiliary categorization for neural network based speech synthesis
- Z. Wen, K. Li, Z. Huang, J. Tao, and C.-H. Lee, "Learning auxiliary categorization for neural network based speech synthesis," 2015, submitted to INTERSPEECH.
- (2015) INTERSPEECH
- Wen, Z.¹ Li, K.² Huang, Z.³ Tao, J.⁴ Lee, C.-H.⁵

27
- 84959100788
- Multiobjective learning and mask-based post-processing for deep neural network based speech enhancement
- Y. Xu, J. Du, Z. Huang, L.-R. Dai, and C.-H. Lee, "Multiobjective learning and mask-based post-processing for deep neural network based speech enhancement," 2015, submitted to INTERSPEECH.
- (2015) INTERSPEECH
- Xu, Y.¹ Du, J.² Huang, Z.³ Dai, L.-R.⁴ Lee, C.-H.⁵

28
- 84866054643
- Learning representations by back-propagating errors
- D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors. MIT Press, Cambridge, MA, USA, 1988.
- (1988) Learning Representations by Back-propagating Errors
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

29
- 80053446822
- Optimal distributed online prediction
- O. Dekel, R. Gilad-Bachrach, O. Shamir, and L. Xiao, "Optimal distributed online prediction," in Proc. ICML, 2011, pp. 713-720.
- (2011) Proc. ICML , pp. 713-720
- Dekel, O.¹ Gilad-Bachrach, R.² Shamir, O.³ Xiao, L.⁴

30
- 33746600649
- Reducing the dimensionality of data with neural networks
- G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
- (2006) Science , vol.313 , Issue.5786 , pp. 504-507
- Hinton, G.E.¹ Salakhutdinov, R.R.²

31
- 85008035419
- Equivalence of generative and log-linear models
- G. Heigold, H. Ney, P. Lehnen, T. Gass, and R. Schluter, "Equivalence of generative and log-linear models," IEEE Trans. Audio, Speech & Language Processing, vol. 19, no. 5, pp. 1138-1148, 2011.
- (2011) IEEE Trans. Audio, Speech & Language Processing , vol.19 , Issue.5 , pp. 1138-1148
- Heigold, G.¹ Ney, H.² Lehnen, P.³ Gass, T.⁴ Schluter, R.⁵

32
- 84910035297
- Learning small-size DNN with output-distribution-based criteria
- J. Li, R. Zhao, J.-T. Huang, and Y. Gong, "Learning small-size DNN with output-distribution-based criteria," in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Li, J.¹ Zhao, R.² Huang, J.-T.³ Gong, Y.⁴

33
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech & Language, vol. 9, no. 2, pp. 171-185, 1995.
- (1995) Computer Speech & Language , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

34
- 85079095310
- The design for the Wall Street Journal-based CSR corpus
- D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. Workshop on Speech and Natural Language, 1992, pp. 899-902.
- (1992) Proc. Workshop on Speech and Natural Language , pp. 899-902
- Paul, D.B.¹ Baker, J.M.²

35
- 84858953642
- The Kaldi speech recognition toolkit
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, and K. Vesely, "The Kaldi speech recognition toolkit," in Proc. ASRU, 2011.
- (2011) Proc. ASRU
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovsky, J.¹¹ Stemmer, G.¹² Vesely, K.¹³

36
- 84890454527
- Low-rank matrix factorization for deep neural network training with high-dimensional output targets
- T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy, and B. Ramabhadran, "Low-rank matrix factorization for deep neural network training with high-dimensional output targets," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 6655-6659.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. IEEE , pp. 6655-6659
- Sainath, T.N.¹ Kingsbury, B.² Sindhwani, V.³ Arisoy, E.⁴ Ramabhadran, B.⁵

37
- 84906227589
- Restructuring of deep neural network acoustic models with singular value decomposition
- J. Xue, J. Li, and Y. Gong, "Restructuring of deep neural network acoustic models with singular value decomposition," in INTERSPEECH, 2013, pp. 2365-2369.
- (2013) INTERSPEECH , pp. 2365-2369
- Xue, J.¹ Li, J.² Gong, Y.³

38
- 84905229915
- Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network
- J. Xue, J. Li, D. Yu, M. Seltzer, and Y. Gong, "Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network," in Proc. ICASSP, 2014.
- (2014) Proc. ICASSP
- Xue, J.¹ Li, J.² Yu, D.³ Seltzer, M.⁴ Gong, Y.⁵

39
- 84912109599
- Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition
- S. Xue, H. Jiang, and L. Dai, "Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition," in Proc. ISCSLP, 2014.
- (2014) Proc. ISCSLP
- Xue, S.¹ Jiang, H.² Dai, L.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.