SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

2014, Pages 6339-6343

Direct adaptation of hybrid DNN/HMM model for fast speaker adaptation in LVCSR based on speaker code

Authors (4): Xue, Shaofei (a); Abdel-Hamid, Ossama (b); Jiang, Hui (b); Dai, Lirong (a)

a University of Science and Technology of China (China)

b York University (Canada)

Author keywords

Deep Neural Network (DNN); Fast Speaker Adaptation; Hybrid DNN HMM; Speaker Code

Indexed keywords

BACKPROPAGATION ALGORITHMS; CODES (SYMBOLS); ELECTRIC SWITCHBOARDS; SIGNAL PROCESSING;

ADAPTATION METHODS; CONNECTION WEIGHTS; DEEP NEURAL NETWORKS; FAST SPEAKER ADAPTATION; HYBRID DNN-HMM; RELATIVE REDUCTION; SPEAKER CODE; TRAINING PROCESS;

SPEECH RECOGNITION;

EID: 84905284226 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2014.6854824 Document Type: Conference Paper

Times cited: 67

References (25)

1
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- Ossama Abdel-Hamid and Hui Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2013.
- (2013) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Abdel-Hamid, O.¹ Jiang, H.²

2
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- J. L. Gauvain and Chin-Hui Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.
- (1994) IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298
- Gauvain, J.L.¹ Lee, C.-H.²

3
- 0031177213
- Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models
- S. M. Ahadi and P. C. Woodland, "Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models," Computer Speech & Language, vol. 11, no. 3, pp. 187-206, 1997.
- (1997) Computer Speech & Language, vol. 11, no. 3, pp. 187-206
- Ahadi, S.M.¹ Woodland, P.C.²

4
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- Christopher Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech & Language, vol. 9, no. 2, pp. 171-185, 1995.
- (1995) Computer Speech & Language, vol. 9, no. 2, pp. 171-185
- Leggetter, C.¹ Woodland, P.C.²

5
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- Mark J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech & Language, vol. 12, no. 2, pp. 75-98, 1998.
- (1998) Computer Speech & Language, vol. 12, no. 2, pp. 75-98
- Gales, M.J.F.¹

6
- 0029375590
- Speaker adaptation using constrained estimation of Gaussian mixtures
- Vassilios V Digalakis, Dimitry Rtischev, and Leonardo G Neumeyer, "Speaker adaptation using constrained estimation of Gaussian mixtures," IEEE Transactions on Speech and Audio Processing, vol. 3, no. 5, pp. 357-366, 1995.
- (1995) IEEE Transactions on Speech and Audio Processing, vol. 3, no. 5, pp. 357-366
- Digalakis, V.V.¹ Rtischev, D.² Neumeyer, L.G.³

7
- 84937854847
- Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system
- João Neto, Luís Almeida, Mike Hochberg, Ciro Martins, Luís Nunes, Steve Renals, and Tony Robinson, "Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system," in EUROSPEECH, 1995.
- (1995) EUROSPEECH
- Neto, J.¹ Almeida, L.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

8
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- Roberto Gemello, Franco Mana, Stefano Scanzio, Pietro Laface, and Renato De Mori, "Linear hidden transformations for adaptation of hybrid ANN/HMM models," Speech Communication, vol. 49, no. 10, pp. 827-835, 2007.
- (2007) Speech Communication, vol. 49, no. 10, pp. 827-835
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ De Mori, R.⁵

9
- 33646794050
- Two-stage speaker adaptation of hybrid tied-posterior acoustic models
- Jan Stadermann and Gerhard Rigoll, "Two-stage speaker adaptation of hybrid tied-posterior acoustic models," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2005.
- (2005) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Stadermann, J.¹ Rigoll, G.²

10
- 84878606732
- Hermitian based hidden activation functions for adaptation of hybrid HMM/ANN models
- Sabato Marco Siniscalchi, Jinyu Li, and Chin-Hui Lee, "Hermitian based hidden activation functions for adaptation of hybrid HMM/ANN models," in INTERSPEECH, 2012.
- (2012) INTERSPEECH
- Siniscalchi, S.M.¹ Li, J.² Lee, C.-H.³

11
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- Dong Yu, Kaisheng Yao, Hang Su, Gang Li, and Frank Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2013.
- (2013) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

12
- 84890521103
- Speaker adaptation of context dependent deep neural networks
- Hank Liao, "Speaker adaptation of context dependent deep neural networks," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2013.
- (2013) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Liao, H.¹

13
- 84906225505
- Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
- Ossama Abdel-Hamid and Hui Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition," in INTERSPEECH, 2013.
- (2013) INTERSPEECH
- Abdel-Hamid, O.¹ Jiang, H.²

14
- 84910030053
- RecNorm: Simultaneous normalization and classification applied to speech recognition
- J. S. Bridle and S. J. Cox, "RecNorm: simultaneous normalization and classification applied to speech recognition," Advances in Neural Information Processing Systems, vol. 3, 1991.
- (1991) Advances in Neural Information Processing Systems, vol. 3
- Bridle, J.S.¹ Cox, S.J.²

15
- 0030352922
- Speaker adaptation by modeling the speaker variation in a continuous speech recognition system
- Nikko Strom, "Speaker adaptation by modeling the speaker variation in a continuous speech recognition system," in Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96), IEEE, 1996, vol. 2, pp. 989-992.
- (1996) Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96), IEEE, vol. 2, pp. 989-992
- Strom, N.¹

16
- 84905268324
- Speaker adaptation of deep neural network based on discriminant codes
- Shaofei Xue, Ossama Abdel-Hamid, Hui Jiang, and Lirong Dai, "Speaker adaptation of deep neural network based on discriminant codes," submitted to IEEE Transactions on Acoustics, Speech and Signal Processing, Feb 2014.
- (2014) Submitted to IEEE Transactions on Acoustics, Speech and Signal Processing
- Xue, S.¹ Abdel-Hamid, O.² Jiang, H.³ Dai, L.⁴

17
- 0024768209
- Speaker-independent phone recognition using hidden Markov models
- K-F Lee and H-W Hon, "Speaker-independent phone recognition using hidden Markov models," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 11, pp. 1641-1648, 1989.
- (1989) IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 11, pp. 1641-1648
- Lee, K.-F.¹ Hon, H.-W.²

18
- 84055211743
- Acoustic modeling using deep belief networks
- Abdel-rahman Mohamed, George E Dahl, and Geoffrey Hinton, "Acoustic modeling using deep belief networks," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22
- Mohamed, A.-R.¹ Dahl, G.E.² Hinton, G.³

19
- 84874485803
- Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling
- Jia Pan, Cong Liu, Zhiguo Wang, Yu Hu, and Hui Jiang, "Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling," in 8th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2012, pp. 301-305.
- (2012) 8th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 301-305
- Pan, J.¹ Liu, C.² Wang, Z.³ Hu, Y.⁴ Jiang, H.⁵

20
- 84876477729
- Investigation on dimensionality reduction of concatenated features with deep neural network for LVCSR systems
- Yebo Bao, Hui Jiang, Cong Liu, Yu Hu, and Lirong Dai, "Investigation on dimensionality reduction of concatenated features with deep neural network for LVCSR systems," in IEEE 11th International Conference on Signal Processing (ICSP), 2012, vol. 1, pp. 562-566.
- (2012) IEEE 11th International Conference on Signal Processing (ICSP), vol. 1, pp. 562-566
- Bao, Y.¹ Jiang, H.² Liu, C.³ Hu, Y.⁴ Dai, L.⁵

21
- 84890445451
- Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition
- Yebo Bao, Hui Jiang, Lirong Dai, and Cong Liu, "Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2013.
- (2013) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Bao, Y.¹ Jiang, H.² Dai, L.³ Liu, C.⁴

22
- 84905252086
- Improving deep neural networks for LVCSR using dropout and shrinking structure
- Shiliang Zhang, Yebo Bao, Pan Zhou, Hui Jiang, and Lirong Dai, "Improving deep neural networks for LVCSR using dropout and shrinking structure," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2014.
- (2014) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Zhang, S.¹ Bao, Y.² Zhou, P.³ Jiang, H.⁴ Dai, L.⁵

23
- 84890543852
- Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription
- Hang Su, Gang Li, Dong Yu, and Frank Seide, "Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2013.
- (2013) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Su, H.¹ Li, G.² Yu, D.³ Seide, F.⁴

24
- 84905286338
- A state-clustering based multiple deep neural networks modelling approach for speech recognition
- Pan Zhou, Lirong Dai, Hui Jiang, Yu Hu, and Qingfeng Liu, "A state-clustering based multiple deep neural networks modelling approach for speech recognition," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2013.
- (2013) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Zhou, P.¹ Dai, L.² Jiang, H.³ Hu, Y.⁴ Liu, Q.⁵

25
- 84905240378
- Sequence training of multiple deep neural networks for better performance and faster training speed
- Pan Zhou, Lirong Dai, and Hui Jiang, "Sequence training of multiple deep neural networks for better performance and faster training speed," in IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP), 2014.
- (2014) IEEE International Conference of Acoustics, Speech and Signal Processing (ICASSP)
- Zhou, P.¹ Dai, L.² Jiang, H.³

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.