SCOPUS Information Search Platform

2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings

2012, Pages 366-369

Adaptation of context-dependent deep neural networks for automatic speech recognition

(6) Yao, Kaisheng a; Yu, Dong b; Seide, Frank c; Su, Hang c,d; Deng, Li b; Gong, Yifan a

a MICROSOFT (United States)

b MICROSOFT RESEARCH (United States)

c MICROSOFT RESEARCH ASIA (China)

d TSINGHUA UNIVERSITY (China)

Author keywords

Context Dependent Deep Neural Networks; HMM; speaker adaptation; speech recognition

Indexed keywords

ADAPTATION METHODS; AFFINE TRANSFORMATIONS; AUTOMATIC SPEECH RECOGNITION; BATCH UPDATE; CONTEXT DEPENDENT; DEEP NEURAL NETWORKS; HIDDEN LAYERS; HMM; INPUT LAYERS; LARGE VOCABULARY SPEECH RECOGNITION; LAYER ADAPTATION; SPEAKER ADAPTATION; STOCHASTIC GRADIENT ASCENTS; WORD ERROR RATE;

HIDDEN MARKOV MODELS; NEURAL NETWORKS; OPTICS;

SPEECH RECOGNITION;

EID: 84874226579
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/SLT.2012.6424251
Document Type: Conference Paper

Times cited : (200)

References (17)

1
- 84055222005
- Context-dependent pre-trained deep neural networks for large vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large vocabulary speech recognition," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

2
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. Hinton, L. Deng, D. Yu, G. Dahl, A.-R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Processing Magazine, 2012.
- (2012) IEEE Signal Processing Magazine
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

3
- 0004056285
- Prentice Hall
- X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall, 2001.
- (2001) Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
- Huang, X.¹ Acero, A.² Hon, H.-W.³

4
- 79959849500
- Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems
- B. Li and K. C. Sim, "Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems," in INTERSPEECH, 2010, pp. 526-529.
- (2010) Interspeech , pp. 526-529
- Li, B.¹ Sim, K.C.²

5
- 33947703156
- Adaptation of hybrid ANN/HMM models using linear hidden transformations and conservative training
- R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. D. Mori, "Adaptation of hybrid ANN/HMM models using linear hidden transformations and conservative training," in ICASSP, 2006, pp. 1189-1192.
- (2006) ICASSP , pp. 1189-1192
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ Mori, R.D.⁵

6
- 84878606732
- Hermitian based hidden activation functions for adaptation of hybrid HMM/ANN models
- S. M. Siniscalchi, J. Li, and C.-H. Lee, "Hermitian based hidden activation functions for adaptation of hybrid HMM/ANN models," in INTERSPEECH, 2012.
- (2012) Interspeech
- Siniscalchi, S.M.¹ Li, J.² Lee, C.-H.³

7
- 84937854847
- Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system
- J. Neto et al., "Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system," in EUROSPEECH, 1995.
- (1995) EUROSPEECH
- Neto, J.¹ et al.

8
- 0003507689
- Blaisdell Publishing Company or Xerox College Publishing
- Arthur Earl Bryson and Yu-Chi Ho, Applied Optimal Control: Optimization, Estimation, and Control, p. 481, Blaisdell Publishing Company or Xerox College Publishing, 1969.
- (1969) Applied Optimal Control: Optimization, Estimation, and Control , pp. 481
- Bryson, A.E.¹ Ho, Y.-C.²

9
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in ASRU, 2011.
- (2011) ASRU
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

10
- 84858953286
- Vocal tract length normalization for LVCSR
- Carnegie Mellon University
- P. Zhan et al., "Vocal tract length normalization for LVCSR," in Tech. Rep. CMU-LTI-97-150. Carnegie Mellon University, 1997.
- (1997) Tech. Rep. CMU-LTI-97-150
- Zhan, P.¹

11
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, vol. 12, pp. 75-98, 1998.
- (1998) Computer Speech and Language , vol.12 , pp. 75-98
- Gales, M.J.F.¹

12
- 84937880519
- Connectionist speaker normalization and adaptation
- V. Abrash, H. Franco, A. Sankar, and M. Cohen, "Connectionist speaker normalization and adaptation," in EUROSPEECH, 1995.
- (1995) EUROSPEECH
- Abrash, V.¹ Franco, H.² Sankar, A.³ Cohen, M.⁴

13
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.² Hinton, G.³

14
- 0013344078
- Training products of experts by minimizing contrastive divergence
- G. E. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Computation, vol. 14, pp. 1771-1800, 2002.
- (2002) Neural Computation , vol.14 , pp. 1771-1800
- Hinton, G.E.¹

15
- 85008035419
- Equivalence of generative and log-linear models
- G. Heigold, H. Ney, P. Lehnen, T. Gass, and R. Schluter, "Equivalence of generative and log-linear models," IEEE Trans. on Audio, Speech, and Language Processing, vol. 19, no. 5, pp. 1138-1148, 2011.
- (2011) IEEE Trans. on Audio, Speech, and Language Processing , vol.19 , Issue.5 , pp. 1138-1148
- Heigold, G.¹ Ney, H.² Lehnen, P.³ Gass, T.⁴ Schluter, R.⁵

16
- 33947635130
- Regularized adaptation of discriminative classifiers
- X. Li and J. Bilmes, "Regularized adaptation of discriminative classifiers," in ICASSP, 2006.
- (2006) ICASSP
- Li, X.¹ Bilmes, J.²

17
- 33646777278
- A generalization of linear discriminant analysis in maximum likelihood framework
- Johns Hopkins University, Aug
- N. Kumar and A. G. Andreou, "A generalization of linear discriminant analysis in maximum likelihood framework," in Tech. Rep. JHU-CLSP Technical Report. Johns Hopkins University, Aug 1996, vol. 16.
- (1996) Tech. Rep. JHU-CLSP Technical Report , vol.16
- Kumar, N.¹ Andreou, A.G.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.