SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume , Issue , 2013, Pages 7893-7897

KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition

(5) Yu, Dong a; Yao, Kaisheng b; Su, Hang c,d; Li, Gang c; Seide, Frank c

a MICROSOFT RESEARCH (United States)

b MICROSOFT (United States)

c MICROSOFT RESEARCH ASIA (China)

d TSINGHUA UNIVERSITY (China)

Author keywords

CD DNN HMM; deep neural network; Kullback Leibler divergence regularization; speaker adaptation

Indexed keywords

ADAPTATION TECHNIQUES; CD-DNN-HMM; DEEP NEURAL NETWORKS; KULLBACK LEIBLER DIVERGENCE; LARGE VOCABULARY SPEECH RECOGNITION; SPEAKER ADAPTATION; SPEECH TRANSCRIPTIONS; UNSUPERVISED ADAPTATION;

HIDDEN MARKOV MODELS; SIGNAL PROCESSING; SPEECH RECOGNITION; SPEECH TRANSMISSION;

NEURAL NETWORKS;

EID: 84890542079 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2013.6639201 Document Type: Conference Paper

Times cited : (464)

References (30)

1
- 84055222005
- Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

2
- 84055163920
- Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition
- D. Yu, L. Deng, and G. Dahl, "Roles of pretraining and finetuning in context-dependent DNN-HMMs for real-world speech recognition," in Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Dec. 2010.
- (2010) Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning
- Yu, D.¹ Deng, L.² Dahl, G.³

3
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. Interspeech'11, pp. 437-440, 2011.
- (2011) Proc. Interspeech'11 , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

4
- 44049108531
- Automated directory assistance system-From theory to practice
- D. Yu, Y.-C. Ju, Y.-Y. Wang, G. Zweig, and A. Acero, "Automated Directory Assistance System-from Theory to Practice", in Proc. Interspeech'07, pp. 2709-2712, 2007.
- (2007) Proc. Interspeech'07 , pp. 2709-2712
- Yu, D.¹ Ju, Y.-C.² Wang, Y.-Y.³ Zweig, G.⁴ Acero, A.⁵

5
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

6
- 84878539964
- Application of pretrained deep neural networks to large vocabulary speech recognition
- N. Jaitly, P. Nguyen, and V. Vanhoucke, "Application of pretrained deep neural networks to large vocabulary speech recognition", in Proc. Interspeech'12, 2012.
- (2012) Proc. Interspeech'12
- Jaitly, N.¹ Nguyen, P.² Vanhoucke, V.³

7
- 84858972572
- Making deep belief networks effective for large vocabulary continuous speech recognition
- T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A.-r. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition", in Proc. ASRU'11, pp. 30-35, 2011.
- (2011) Proc. ASRU'11 , pp. 30-35
- Sainath, T.N.¹ Kingsbury, B.² Ramabhadran, B.³ Fousek, P.⁴ Novak, P.⁵ Mohamed, A.-R.⁶

8
- 84878379108
- Scalable minimum bayes risk training of deep neural network acoustic models using distributed hessian-free optimization
- B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum bayes risk training of deep neural network acoustic models using distributed hessian-free optimization," in Proc. Interspeech'12, 2012.
- (2012) Proc. Interspeech'12
- Kingsbury, B.¹ Sainath, T.N.² Soltau, H.³

9
- 84890543852
- Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription
- Hang Su, Gang Li, Dong Yu, Frank Seide, "Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription", in Proc. ICASSP 2013.
- (2013) Proc. ICASSP
- Su, H.¹ Li, G.² Yu, D.³ Seide, F.⁴

10
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- Michael Seltzer, Dong Yu, Yongqiang Wang, "An investigation of deep neural networks for noise robust speech recognition", in Proc. ICASSP 2013.
- (2013) Proc. ICASSP
- Seltzer, M.¹ Yu, D.² Wang, Y.³

11
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, D. Yu, G. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, 2012.
- (2012) IEEE Signal Processing Magazine
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

12
- 84937880519
- Connectionist speaker normalization and adaptation
- V. Abrash, H. Franco, A. Sankar, and M. Cohen, "Connectionist speaker normalization and adaptation," in Proc. EUROSPEECH'95, pp. 2183-2186, 1995.
- (1995) Proc. EUROSPEECH'95 , pp. 2183-2186
- Abrash, V.¹ Franco, H.² Sankar, A.³ Cohen, M.⁴

13
- 84937854847
- Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system
- J. Neto, L. Almeida, M. Hochberg, C. Martins, L. Nunes, S. Renals, and T. Robinson, "Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system," in Proc. EUROSPEECH'95, pp. 2171-2174, 1995.
- (1995) Proc. EUROSPEECH'95 , pp. 2171-2174
- Neto, J.¹ Almeida, L.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

14
- 0343476363
- Hybrid HMM-NN modeling of stationary-transitional units for continuous speech recognition
- D. Albesano, R. Gemello, and F. Mana, "Hybrid HMM-NN modeling of stationary-transitional units for continuous speech recognition", in Proc. NIPS'97, pp. 1112-1115, 1997.
- (1997) Proc. NIPS'97 , pp. 1112-1115
- Albesano, D.¹ Gemello, R.² Mana, F.³

15
- 79959849500
- Comparison of discriminative input and output transformations for speaker adaptation in the Hybrid NN/HMM systems
- B. Li and K. C. Sim, "Comparison of discriminative input and output transformations for speaker adaptation in the Hybrid NN/HMM systems", in Proc. Interspeech'10, pp. 526-529, 2010.
- (2010) Proc. Interspeech'10 , pp. 526-529
- Li, B.¹ Sim, K.C.²

16
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. De Mori, "Linear hidden transformations for adaptation of hybrid ANN/HMM models", Speech Communication 49, no. 10, pp. 827-83, 2007.
- (2007) Speech Communication , vol.49 , Issue.10 , pp. 827-883
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ De Mori, R.⁵

17
- 84865740155
- Improving LVCSR system combination using neural network language model cross adaptation
- X. Liu, M. J. F. Gales, and P. C. Woodland, "Improving LVCSR system combination using neural network language model cross adaptation," in Proc. Interspeech'11, pp. 2857-2860, 2011.
- (2011) Proc. Interspeech'11 , pp. 2857-2860
- Liu, X.¹ Gales, M.J.F.² Woodland, P.C.³

18
- 84878535870
- A initial attempt on task-specific adaptation for deep neural network-based large vocabulary continuous speech recognition
- Y. Xiao, Z. Zhang, S. Cai, J. Pan, and Y. Yan, "A initial attempt on task-specific adaptation for deep neural network-based large vocabulary continuous speech recognition", in Proc. Interspeech'12, 2012.
- (2012) Proc. Interspeech'12
- Xiao, Y.¹ Zhang, Z.² Cai, S.³ Pan, J.⁴ Yan, Y.⁵

19
- 78049310851
- Adaptation of a feedforward artificial neural network using a linear transform
- J. Trmal, J. Zelinka, and L. Müller. "Adaptation of a feedforward artificial neural network using a linear transform," Text, Speech and Dialogue. Springer Berlin/Heidelberg, pp. 423-430, 2010.
- (2010) Text, Speech and Dialogue. Springer Berlin/Heidelberg , pp. 423-430
- Trmal, J.¹ Zelinka, J.² Müller, L.³

20
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition", in Proc. SLT'12, 2012.
- (2012) Proc. SLT'12
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

21
- 33646794050
- Two-stage speaker adaptation of hybrid tied-posterior acoustic models
- J. Stadermann and G. Rigoll, "Two-stage speaker adaptation of hybrid tied-posterior acoustic models," in Proc. ICASSP'05, vol. I, pp. 997-1000, 2005.
- (2005) Proc. ICASSP'05 , vol.1 , pp. 997-1000
- Stadermann, J.¹ Rigoll, G.²

22
- 40649088651
- Adaptation of artificial neural networks avoiding catastrophic forgetting
- D. Albesano, R. Gemello, P. Laface, F. Mana, and S. Scanzio, "Adaptation of artificial neural networks avoiding catastrophic forgetting," in Proc. Int. Jnt. Conference on Neural Networks 2006, pp. 2863-2870, 2006.
- (2006) Proc. Int. Jnt. Conference on Neural Networks 2006 , pp. 2863-2870
- Albesano, D.¹ Gemello, R.² Laface, P.³ Mana, F.⁴ Scanzio, S.⁵

23
- 33947635130
- Regularized adaptation of discriminative classifiers
- X. Li and J. Bilmes, "Regularized adaptation of discriminative classifiers," in Proc. ICASSP'06, 2006.
- (2006) Proc. ICASSP'06
- Li, X.¹ Bilmes, J.²

24
- 0033677005
- Fast speaker adaptation of artificial neural networks for automatic speech recognition
- S. Dupont and L. Cheboub, "Fast speaker adaptation of artificial neural networks for automatic speech recognition", in Proc. ICASSP'00, vol.3, pp. 1795-1798, 2000.
- (2000) Proc. ICASSP'00 , vol.3 , pp. 1795-1798
- Dupont, S.¹ Cheboub, L.²

25
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. ASRU'11, 2011.
- (2011) Proc. ASRU'11
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

26
- 84890447334
- Factorized deep neural networks for adaptive speech recognition
- D. Yu, X. Chen, and L. Deng, "Factorized deep neural networks for adaptive speech recognition", International workshop on statistical machine learning for speech processing, 2012.
- (2012) International Workshop on Statistical Machine Learning for Speech Processing
- Yu, D.¹ Chen, X.² Deng, L.³

27
- 84871387302
- The deep tensor neural network with applications to large vocabulary speech recognition
- D. Yu, L. Deng, and F. Seide, "The deep tensor neural network with applications to large vocabulary speech recognition", IEEE Trans. on Audio, Speech, and Language Processing, 2013.
- (2013) IEEE Trans. on Audio, Speech, and Language Processing
- Yu, D.¹ Deng, L.² Seide, F.³

28
- 84885579558
- Practical Bayesian optimization of machine learning algorithms
- J. Snoek, H. Larochelle, and R. P. Adams, "Practical Bayesian optimization of machine learning algorithms," arXiv:0912.4896, 2012.
- (2012) arXiv:0912.4896
- Snoek, J.¹ Larochelle, H.² Adams, R.P.³

29
- 68549140008
- A novel framework and training algorithm for variable-parameter hidden markov models
- D. Yu, L. Deng, Y. Gong, and A. Acero, "A novel framework and training algorithm for variable-parameter hidden markov models", IEEE Trans. on Audio, Speech, and Language Processing, vol 17, no. 7, pp. 1348-1360, 2009.
- (2009) IEEE Trans. on Audio, Speech, and Language Processing , vol.17 , Issue.7 , pp. 1348-1360
- Yu, D.¹ Deng, L.² Gong, Y.³ Acero, A.⁴

30
- 79959853780
- On speaker adaptive training of artificial neural networks
- J. Trmal, J. Zelinka, and L. Müller, "On speaker adaptive training of artificial neural networks", in Proc. Interspeech'10, pp. 554-557, 2010.
- (2010) Proc. Interspeech'10 , pp. 554-557
- Trmal, J.¹ Zelinka, J.² Müller, L.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.