SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 23, Issue 11, 2015, Pages 1938-1949

Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors

(3) Miao, Yajie a; Zhang, Hao a; Metze, Florian a

a Carnegie Mellon University (United States)

Author keywords

Acoustic modeling; deep neural networks (DNNs); speaker adaptive training (SAT)

Indexed keywords

OBJECT RECOGNITION;

ACOUSTIC MODEL; ADAPTATION METHODS; ADAPTIVE MODELS; DEEP NEURAL NETWORKS; GAUSSIAN MIXTURE MODEL (GMMS); NETWORK STRUCTURES; SPEAKER ADAPTIVE TRAININGS; WORD ERROR RATE;

SPEECH RECOGNITION;

EID: 84938688160 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASLP.2015.2457612 Document Type: Article

Times cited : (131)

References (56)

1
- 84055222005
- Context-dependent pretrained deep neural networks for large-vocabulary speech recognition
- Jan.
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pretrained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 30-42, Jan. 2012.
- (2012) IEEE Trans. Audio, Speech, Lang. Process. , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

2
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU), 2011, pp. 24-29.
- (2011) Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU) , pp. 24-29
- Seide, F.¹ Li, G.² Chen, X.³ Yu, D.⁴

3
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Nov.
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, and T. N. Sainath et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, Nov. 2012.
- (2012) IEEE Signal Process. Mag. , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.-R.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰

4
- 84878539964
- Application of pretrained deep neural networks to large vocabulary speech recognition
- N. Jaitly, P. Nguyen, A. W. Senior, and V. Vanhoucke, "Application of pretrained deep neural networks to large vocabulary speech recognition," in Proc. 13th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2012.
- (2012) Proc. 13th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Jaitly, N.¹ Nguyen, P.² Senior, A.W.³ Vanhoucke, V.⁴

5
- 85083953021
- Feature learning in deep neural networks-studies on speech recognition tasks
- D. Yu, M. L. Seltzer, J. Li, J.-T. Huang, and F. Seide, "Feature learning in deep neural networks-studies on speech recognition tasks," arXiv preprint arXiv:1301.3605, 2013.
- (2013) ArXiv Preprint arXiv:1301.3605
- Yu, D.¹ Seltzer, M.L.² Li, J.³ Huang, J.-T.⁴ Seide, F.⁵

6
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, no. 2, pp. 171-185, 1995.
- (1995) Comput. Speech Lang. , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

7
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- M. J. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998.
- (1998) Comput. Speech Lang. , vol.12 , Issue.2 , pp. 75-98
- Gales, M.J.¹

8
- 84893703162
- Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription
- H. Liao, E. McDermott, and A. Senior, "Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription," in Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU), 2013, pp. 368-373.
- (2013) Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU) , pp. 368-373
- Liao, H.¹ McDermott, E.² Senior, A.³

9
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2013, pp. 7893-7897.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

10
- 79959849500
- Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems
- B. Li and K. C. Sim, "Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems," in Proc. 11th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2010.
- (2010) Proc. 11th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Li, B.¹ Sim, K.C.²

11
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in Proc. IEEE Spoken Lang. Technol. Workshop (SLT), 2012, pp. 366-369.
- (2012) Proc. IEEE Spoken Lang. Technol. Workshop (SLT) , pp. 366-369
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

12
- 84906241049
- Improved feature processing for deep neural networks
- ISCA
- S. P. Rath, D. Povey, K. Veselý, and J. Černocký, "Improved feature processing for deep neural networks," in Proc. 14th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2013, pp. 109-113, ISCA.
- (2013) Proc. 14th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH) , pp. 109-113
- Rath, S.P.¹ Povey, D.² Veselý, K.³ Černocký, J.⁴

13
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU), 2013, pp. 55-59.
- (2013) Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU) , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

14
- 0030362995
- A compact model for speaker-adaptive training
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. 4th Int. Conf. Spoken Lang. (ICSLP), 1996, vol. 2, pp. 1137-1140.
- (1996) Proc. 4th Int. Conf. Spoken Lang. (ICSLP) , vol.2 , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

15
- 0030677475
- Speaker adaptive training: A maximum likelihood approach to speaker normalization
- T. Anastasakos, J. McDonough, and J. Makhoul, "Speaker adaptive training: A maximum likelihood approach to speaker normalization," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1997, pp. 1043-1046.
- (1997) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 1043-1046
- Anastasakos, T.¹ McDonough, J.² Makhoul, J.³

16
- 79959853780
- On speaker adaptive training of artificial neural networks
- J. Trmal, J. Zelinka, and L. Müller, "On speaker adaptive training of artificial neural networks," in Proc. 11th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2010.
- (2010) Proc. 11th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Trmal, J.¹ Zelinka, J.² Müller, L.³

17
- 84905259138
- Improving DNN speaker independence with i-vector inputs
- A. Senior and I. Lopez-Moreno, "Improving DNN speaker independence with i-vector inputs," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2014, pp. 225-229.
- (2014) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 225-229
- Senior, A.¹ Lopez-Moreno, I.²

18
- 84921731072
- Fast adaptation of deep neural network based on discriminant codes for speech recognition
- Dec.
- S. Xue, O. Abdel-Hamid, H. Jiang, L. Dai, and Q. Liu, "Fast adaptation of deep neural network based on discriminant codes for speech recognition," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 12, pp. 1713-1725, Dec. 2014.
- (2014) IEEE/ACM Trans. Audio, Speech, Lang. Process. , vol.22 , Issue.12 , pp. 1713-1725
- Xue, S.¹ Abdel-Hamid, O.² Jiang, H.³ Dai, L.⁴ Liu, Q.⁵

19
- 84905216195
- Speaker adaptive training using deep neural networks
- T. Ochiai, S. Matsuda, X. Lu, C. Hori, and S. Katagiri, "Speaker adaptive training using deep neural networks," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2014, pp. 6349-6353.
- (2014) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 6349-6353
- Ochiai, T.¹ Matsuda, S.² Lu, X.³ Hori, C.⁴ Katagiri, S.⁵

20
- 70450180849
- Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification
- N. Dehak, R. Dehak, P. Kenny, N. Brümmer, P. Ouellet, and P. Dumouchel, "Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification," in Proc. 10th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2009, pp. 1559-1562.
- (2009) Proc. 10th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH) , pp. 1559-1562
- Dehak, N.¹ Dehak, R.² Kenny, P.³ Brümmer, N.⁴ Ouellet, P.⁵ Dumouchel, P.⁶

21
- 79951609039
- Front-end factor analysis for speaker verification
- May
- N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 4, pp. 788-798, May 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.4 , pp. 788-798
- Dehak, N.¹ Kenny, P.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

22
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. N. Sainath, A.-r. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2013, pp. 8614-8618.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 8614-8618
- Sainath, T.N.¹ Mohamed, A.-R.² Kingsbury, B.³ Ramabhadran, B.⁴

23
- 84905265980
- Joint training of convolutional and non-convolutional neural networks
- H. Soltau, G. Saon, and T. N. Sainath, "Joint training of convolutional and non-convolutional neural networks," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2014, pp. 5572-5576.
- (2014) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 5572-5576
- Soltau, H.¹ Saon, G.² Sainath, T.N.³

24
- 84911473441
- Convolutional neural networks for speech recognition
- Oct.
- O. Abdel-Hamid, A.-R. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu, "Convolutional neural networks for speech recognition," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 10, pp. 1533-1545, Oct. 2014.
- (2014) IEEE/ACM Trans. Audio, Speech, Lang. Process. , vol.22 , Issue.10 , pp. 1533-1545
- Abdel-Hamid, O.¹ Mohamed, A.-R.² Jiang, H.³ Deng, L.⁴ Penn, G.⁵ Yu, D.⁶

25
- 84922343800
- Deep convolutional neural networks for large-scale speech tasks
- T. N. Sainath, B. Kingsbury, G. Saon, H. Soltau, A.-r. Mohamed, G. Dahl, and B. Ramabhadran, "Deep convolutional neural networks for large-scale speech tasks," Neural Networks, vol. 64, pp. 34-48, 2015.
- (2015) Neural Networks , vol.64 , pp. 34-48
- Sainath, T.N.¹ Kingsbury, B.² Saon, G.³ Soltau, H.⁴ Mohamed, A.-R.⁵ Dahl, G.⁶ Ramabhadran, B.⁷

26
- 84878409063
- Recurrent neural networks for noise reduction in robust ASR
- A. L. Maas, Q. V. Le, T. M. O'Neil, O. Vinyals, P. Nguyen, and A. Y. Ng, "Recurrent neural networks for noise reduction in robust ASR," in Proc. 13th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2012.
- (2012) Proc. 13th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Maas, A.L.¹ Le, Q.V.² O'Neil, T.M.³ Vinyals, O.⁴ Nguyen, P.⁵ Ng, A.Y.⁶

27
- 84890543083
- Speech recognition with deep recurrent neural networks
- A. Graves, A.-R. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2013, pp. 6645-6649.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 6645-6649
- Graves, A.¹ Mohamed, A.-R.² Hinton, G.³

28
- 84910046405
- Long short-term memory recurrent neural network architectures for large scale acoustic modeling
- H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2014.
- (2014) Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Sak, H.¹ Senior, A.² Beaufays, F.³

29
- 84910072094
- Sequence discriminative distributed training of long short-term memory recurrent neural networks
- H. Sak, O. Vinyals, G. Heigold, A. Senior, E. McDermott, R. Monga, and M. Mao, "Sequence discriminative distributed training of long short-term memory recurrent neural networks," in Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2014.
- (2014) Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Sak, H.¹ Vinyals, O.² Heigold, G.³ Senior, A.⁴ McDermott, E.⁵ Monga, R.⁶ Mao, M.⁷

30
- 84938725974
- On speaker adaptation of long short-term memory recurrent neural networks
- Y. Miao and F. Metze, "On speaker adaptation of long short-term memory recurrent neural networks," in Proc. 16th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2015.
- (2015) Proc. 16th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Miao, Y.¹ Metze, F.²

31
- 84910031119
- Towards speaker adaptive training of deep neural network acoustic models
- Y. Miao, H. Zhang, and F. Metze, "Towards speaker adaptive training of deep neural network acoustic models," in Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2014.
- (2014) Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Miao, Y.¹ Zhang, H.² Metze, F.³

32
- 84946685505
- Improvements to speaker adaptive training of deep neural networks
- Y. Miao, L. Jiang, H. Zhang, and F. Metze, "Improvements to speaker adaptive training of deep neural networks," in Proc. IEEE Spoken Lang. Technol. Workshop (SLT), 2014, pp. 6349-6353.
- (2014) Proc. IEEE Spoken Lang. Technol. Workshop (SLT) , pp. 6349-6353
- Miao, Y.¹ Jiang, L.² Zhang, H.³ Metze, F.⁴

33
- 84983119674
- Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
- P. Swietojanski and S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models," in Proc. IEEE Spoken Lang. Technol. Workshop (SLT), 2014.
- (2014) Proc. IEEE Spoken Lang. Technol. Workshop (SLT)
- Swietojanski, P.¹ Renals, S.²

34
- 84862294866
- Deep sparse rectifier networks
- X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier networks," in Proc. 14th Int. Conf. Artif. Intell. Statist., JMLR W&CP, 2011, vol. 15, pp. 315-323.
- (2011) Proc. 14th Int. Conf. Artif. Intell. Statist. JMLR W&CP , vol.15 , pp. 315-323
- Glorot, X.¹ Bordes, A.² Bengio, Y.³

35
- 84893701756
- Deep maxout networks for low-resource speech recognition
- Y. Miao, F. Metze, and S. Rawat, "Deep maxout networks for low-resource speech recognition," in Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU), 2013, pp. 398-403.
- (2013) Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU) , pp. 398-403
- Miao, Y.¹ Metze, F.² Rawat, S.³

36
- 84910028405
- Improving language-universal feature extraction with deep maxout and convolutional neural networks
- Y. Miao and F. Metze, "Improving language-universal feature extraction with deep maxout and convolutional neural networks," in Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2014.
- (2014) Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Miao, Y.¹ Metze, F.²

37
- 84905229915
- Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network
- J. Xue, J. Li, D. Yu, M. Seltzer, and Y. Gong, "Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2014, pp. 6359-6363.
- (2014) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 6359-6363
- Xue, J.¹ Li, J.² Yu, D.³ Seltzer, M.⁴ Gong, Y.⁵

38
- 84905259145
- I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription
- V. Gupta, P. Kenny, P. Ouellet, and T. Stafylakis, "I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2014, pp. 6334-6338.
- (2014) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 6334-6338
- Gupta, V.¹ Kenny, P.² Ouellet, P.³ Stafylakis, T.⁴

39
- 84910068089
- Adaptation of deep neural network acoustic models using factorised i-vectors
- P. Karanasou, Y. Wang, M. J. Gales, and P. C. Woodland, "Adaptation of deep neural network acoustic models using factorised i-vectors," in Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2014.
- (2014) Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Karanasou, P.¹ Wang, Y.² Gales, M.J.³ Woodland, P.C.⁴

40
- 84905262902
- Factorized adaptation for deep neural network
- J. Li, J.-T. Huang, and Y. Gong, "Factorized adaptation for deep neural network," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2014, pp. 5537-5541.
- (2014) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 5537-5541
- Li, J.¹ Huang, J.-T.² Gong, Y.³

41
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2013, pp. 7942-7946.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 7942-7946
- Abdel-Hamid, O.¹ Jiang, H.²

42
- 84906225505
- Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
- O. Abdel-Hamid and H. Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition," in Proc. 14th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2013, pp. 1248-1252.
- (2013) Proc. 14th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH) , pp. 1248-1252
- Abdel-Hamid, O.¹ Jiang, H.²

43
- 84938690750
- Speaker adaptation of deep neural networks using a hierarchy of output layers
- R. Price, I. Kenichi, and K. Shinoda, "Speaker adaptation of deep neural networks using a hierarchy of output layers," in Proc. IEEE Spoken Lang. Technol. Workshop (SLT), 2014, pp. 153-158.
- (2014) Proc. IEEE Spoken Lang. Technol. Workshop (SLT) , pp. 153-158
- Price, R.¹ Kenichi, I.² Shinoda, K.³

44
- 84910092490
- Feature space maximum a posteriori linear regression for adaptation of deep neural networks
- Z. Huang, J. Li, S. M. Siniscalchi, I.-F. Chen, C. Weng, and C.-H. Lee, "Feature space maximum a posteriori linear regression for adaptation of deep neural networks," in Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2014.
- (2014) Proc. 15th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Huang, Z.¹ Li, J.² Siniscalchi, S.M.³ Chen, I.-F.⁴ Weng, C.⁵ Lee, C.-H.⁶

45
- 84881054791
- Hermitian polynomial for speaker adaptation of connectionist speech recognition systems
- Oct.
- S. M. Siniscalchi, J. Li, and C.-H. Lee, "Hermitian polynomial for speaker adaptation of connectionist speech recognition systems," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 10, pp. 2152-2161, Oct. 2013.
- (2013) IEEE Trans. Audio, Speech, Lang. Process. , vol.21 , Issue.10 , pp. 2152-2161
- Siniscalchi, S.M.¹ Li, J.² Lee, C.-H.³

46
- 58349106697
- A study of interspeaker variability in speaker verification
- Jul.
- P. Kenny, P. Ouellet, N. Dehak, V. Gupta, and P. Dumouchel, "A study of interspeaker variability in speaker verification," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 5, pp. 980-988, Jul. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.5 , pp. 980-988
- Kenny, P.¹ Ouellet, P.² Dehak, N.³ Gupta, V.⁴ Dumouchel, P.⁵

47
- 84858984756
- iVector-based discriminative adaptation for automatic speech recognition
- M. Karafiát, L. Burget, P. Matejka, O. Glembek, and J. Cernocky, "iVector-based discriminative adaptation for automatic speech recognition," in Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU), 2011, pp. 152-157.
- (2011) Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU) , pp. 152-157
- Karafiát, M.¹ Burget, L.² Matejka, P.³ Glembek, O.⁴ Cernocky, J.⁵

48
- 84946076428
- TED-LIUM: An automatic speech recognition dedicated corpus
- A. Rousseau, P. Deléglise, and Y. Estève, "TED-LIUM: An automatic speech recognition dedicated corpus," in Proc. LREC, 2012, pp. 125-129.
- (2012) Proc. LREC , pp. 125-129
- Rousseau, A.¹ Deléglise, P.² Estève, Y.³

49
- 84858953642
- The Kaldi speech recognition toolkit
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlíček, Y. Qian, P. Schwarz, J. Silovský, G. Stemmer, and K. Veselý, "The Kaldi speech recognition toolkit," in Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU), 2011, pp. 1-4.
- (2011) Proc. IEEE Workshop Autom. Speech Recogn. Understand. (ASRU) , pp. 1-4
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlíček, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovský, J.¹¹ Stemmer, G.¹² Veselý, K.¹³

50
- 4544265717
- Discriminative training for large vocabulary speech recognition
- Ph.D. dissertation, Univ. of Cambridge, Cambridge, U.K.
- D. Povey, "Discriminative training for large vocabulary speech recognition," Ph.D. dissertation, Univ. of Cambridge, Cambridge, U.K., 2005.
- (2005) Discriminative Training for Large Vocabulary Speech Recognition
- Povey, D.¹

51
- 84938725977
- Kaldi+PDNN: Building DNN-based ASR systems with Kaldi and PDNN
- Y. Miao, "Kaldi+PDNN: Building DNN-based ASR systems with Kaldi and PDNN," arXiv preprint arXiv:1401.6984, 2014.
- (2014) ArXiv Preprint arXiv:1401.6984
- Miao, Y.¹

52
- 79551480483
- Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
- P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," J. Mach. Learn. Res., vol. 11, pp. 3371-3408, 2010.
- (2010) J. Mach. Learn. Res. , vol.11 , pp. 3371-3408
- Vincent, P.¹ Larochelle, H.² Lajoie, I.³ Bengio, Y.⁴ Manzagol, P.-A.⁵

53
- 84872506495
- A practical guide to training restricted Boltzmann machines
- New York, NY, USA: Springer
- G. E. Hinton, "A practical guide to training restricted Boltzmann machines," in Neural Networks: Tricks of the Trade. New York, NY, USA: Springer, 2012, pp. 599-619.
- (2012) Neural Networks: Tricks of the Trade , pp. 599-619
- Hinton, G.E.¹

54
- 84905252132
- A novel scheme for speaker recognition using a phonetically-aware deep neural network
- Y. Lei, N. Scheffer, L. Ferrer, and M. McLaren, "A novel scheme for speaker recognition using a phonetically-aware deep neural network," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2014, pp. 1695-1699.
- (2014) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 1695-1699
- Lei, Y.¹ Scheffer, N.² Ferrer, L.³ McLaren, M.⁴

55
- 84890482429
- Extracting deep bottleneck features using stacked auto-encoders
- J. Gehring, Y. Miao, F. Metze, and A. Waibel, "Extracting deep bottleneck features using stacked auto-encoders," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2013, pp. 3377-3381.
- (2013) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 3377-3381
- Gehring, J.¹ Miao, Y.² Metze, F.³ Waibel, A.⁴

56
- 84906273176
- Modular combination of deep neural networks for acoustic modeling
- J. Gehring, W. Lee, K. Kilgour, I. Lane, Y. Miao, and A. Waibel, "Modular combination of deep neural networks for acoustic modeling," in Proc. 14th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH), 2013.
- (2013) Proc. 14th Annu. Conf. Int. Speech Commun. Assoc. (INTERSPEECH)
- Gehring, J.¹ Lee, W.² Kilgour, K.³ Lane, I.⁴ Miao, Y.⁵ Waibel, A.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.