SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2016-May, 2016, Pages 5010-5014

SAT-LHUC: Speaker adaptive training for learning hidden unit contributions

(2) Swietojanski, Pawel a Renals, Steve a

a UNIVERSITY OF EDINBURGH (United Kingdom)

Author keywords

Deep Neural Networks; LHUC; SAT

Indexed keywords

EID: 84973299594 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2016.7472631 Document Type: Conference Paper

Times cited: (18)

References (44)

1
- 0030362995
- A compact model for speaker-adaptive training
- T Anastasakos, J McDonough, R Schwartz, and J Makhoul, "A compact model for speaker-adaptive training, " in Proc ICSLP, 1996, pp. 1137-1140.
- (1996) Proc ICSLP , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

2
- 0034227757
- Cluster adaptive training of hidden markov models
- MJF Gales, "Cluster adaptive training of hidden markov models, " Speech and Audio Processing, IEEE Transactions on, vol. 8, no. 4, pp. 417-428, 2000.
- (2000) Speech and Audio Processing, IEEE Transactions on , vol.8 , Issue.4 , pp. 417-428
- Gales, M.J.F.¹

3
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- CJ Leggetter and PC Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, " Computer Speech & Language, vol. 9, pp. 171-185, 1995.
- (1995) Computer Speech & Language , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

4
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- April
- MJF Gales, "Maximum likelihood linear transformations for HMM-based speech recognition, " Computer Speech and Language, vol. 12, pp. 75-98, April 1998.
- (1998) Computer Speech and Language , vol.12 , pp. 75-98
- Gales, M.J.F.¹

5
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Nov
- G Hinton, L Deng, D Yu, GE Dahl, A Mohamed, N Jaitly, A Senior, V Vanhoucke, P Nguyen, TN Sainath, and B Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, " Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, Nov 2012.
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ Kingsbury, B.¹¹

6
- 80051654263
- Deep belief networks using discriminative features for phone recognition
- May
- A Mohamed, TN Sainath, G Dahl, B Ramabhadran, GE Hinton, and MA Picheny, "Deep belief networks using discriminative features for phone recognition, " in Proc. ICASSP, May 2011, pp. 5060-5063.
- (2011) Proc. ICASSP , pp. 5060-5063
- Mohamed, A.¹ Sainath, T.N.² Dahl, G.³ Ramabhadran, B.⁴ Hinton, G.E.⁵ Picheny, M.A.⁶

7
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F Seide, X Chen, and D Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription, " in Proc IEEE ASRU, 2011.
- (2011) Proc IEEE ASRU
- Seide, F.¹ Chen, X.² Yu, D.³

8
- 84890492591
- Revisiting hybrid and GMM-HMM system combination techniques
- P Swietojanski, A Ghoshal, and S Renals, "Revisiting hybrid and GMM-HMM system combination techniques, " in Proc IEEE ICASSP, 2013.
- (2013) Proc IEEE ICASSP
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

9
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- G Saon, H Soltau, D Nahamoo, and M Picheny, "Speaker adaptation of neural network acoustic models using i-vectors., " in Proc IEEE ASRU, 2013, pp. 55-59.
- (2013) Proc IEEE ASRU , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

10
- 84937854847
- Speaker adaptation for hybrid HMM-ANN continuous speech recognition system
- J Neto, L Almeida, M Hochberg, C Martins, L Nunes, S Renals, and T Robinson, "Speaker adaptation for hybrid HMM-ANN continuous speech recognition system, " in Proc Eurospeech, 1995, pp. 2171-2174.
- (1995) Proc Eurospeech , pp. 2171-2174
- Neto, J.¹ Almeida, L.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

11
- 84937880519
- Connectionist speaker normalization and adaptation
- V Abrash, H Franco, A Sankar, and M Cohen, "Connectionist speaker normalization and adaptation, " in Proc Eurospeech, 1995, pp. 2183-2186.
- (1995) Proc Eurospeech , pp. 2183-2186
- Abrash, V.¹ Franco, H.² Sankar, A.³ Cohen, M.⁴

12
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K Yao, D Yu, F Seide, H Su, L Deng, and Y Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition., " in Proc IEEE SLT, 2012.
- (2012) Proc IEEE SLT
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

13
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- D Yu, K Yao, H Su, G Li, and F Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition., " in Proc IEEE ICASSP, 2013, pp. 7893-7897.
- (2013) Proc IEEE ICASSP , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

14
- 84890521103
- Speaker adaptation of context dependent deep neural networks
- IEEE
- H Liao, "Speaker adaptation of context dependent deep neural networks., " in Proc. ICASSP. 2013, pp. 7947-7951, IEEE.
- (2013) Proc. ICASSP. , pp. 7947-7951
- Liao, H.¹

15
- 84906225505
- Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
- ISCA
- O Abdel-Hamid and H Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition., " in Proc. Interspeech. pp. 1248-1252, ISCA.
- Proc. Interspeech. , pp. 1248-1252
- Abdel-Hamid, O.¹ Jiang, H.²

16
- 84983119674
- Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
- P Swietojanski and S Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models, " in Proc. IEEE SLT, 2014.
- (2014) Proc. IEEE SLT
- Swietojanski, P.¹ Renals, S.²

17
- 84946032695
- Differentiable pooling for unsupervised speaker adaptation
- P Swietojanski and S Renals, "Differentiable pooling for unsupervised speaker adaptation, " in Proc. IEEE ICASSP, 2015.
- (2015) Proc. IEEE ICASSP
- Swietojanski, P.¹ Renals, S.²

18
- 79951609039
- Front end factor analysis for speaker verification
- N Dehak, PJ Kenny, R Dehak, P Dumouchel, and P Ouellet, "Front end factor analysis for speaker verification, " IEEE Trans Audio, Speech and Language Processing, vol. 19, pp. 788-798, 2010.
- (2010) IEEE Trans Audio, Speech and Language Processing , vol.19 , pp. 788-798
- Dehak, N.¹ Kenny, P.J.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

19
- 84858984756
- iVector-based discriminative adaptation for automatic speech recognition
- M Karafiat, L Burget, P Matejka, O Glembek, and J Cernozky, "iVector-based discriminative adaptation for automatic speech recognition, " in Proc IEEE ASRU, 2011.
- (2011) Proc IEEE ASRU
- Karafiat, M.¹ Burget, L.² Matejka, P.³ Glembek, O.⁴ Cernozky, J.⁵

20
- 84910031119
- Towards speaker adaptive training of deep neural network acoustic models
- Y Miao, H Zhang, and F Metze, "Towards speaker adaptive training of deep neural network acoustic models, " in Proc. Interspeech, 2014.
- (2014) Proc. Interspeech
- Miao, Y.¹ Zhang, H.² Metze, F.³

21
- 84890537527
- Multi-level adaptive networks in tandem and hybrid ASR systems
- P Bell, P Swietojanski, and S Renals, "Multi-level adaptive networks in tandem and hybrid ASR systems, " in Proc IEEE ICASSP, 2013.
- (2013) Proc IEEE ICASSP
- Bell, P.¹ Swietojanski, P.² Renals, S.³

22
- 84946036535
- An investigation into speaker informed DNN front-end for LVCSR
- Y Liu, P Karanasou, and T Hain, "An investigation into speaker informed DNN front-end for LVCSR, " in Proc IEEE ICASSP, 2015.
- (2015) Proc IEEE ICASSP
- Liu, Y.¹ Karanasou, P.² Hain, T.³

23
- 84910030053
- Recnorm: Simultaneous normalisation and classification applied to speech recognition
- JS Bridle and S Cox, "Recnorm: Simultaneous normalisation and classification applied to speech recognition, " in Advances in Neural Information Processing Systems 3, 1990, pp. 234-240.
- (1990) Advances in Neural Information Processing Systems 3 , pp. 234-240
- Bridle, J.S.¹ Cox, S.²

24
- 84890521637
- On speaker adaptive training of artificial neural networks
- J Trmal, J Zelinka, and L Müller, "On speaker adaptive training of artificial neural networks, " in Proc. Interspeech, 2010.
- (2010) Proc. Interspeech
- Trmal, J.¹ Zelinka, J.² Müller, L.³

25
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- O Abdel-Hamid and H Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code, " in Proc IEEE ICASSP, 2013, pp. 4277-4280.
- (2013) Proc IEEE ICASSP , pp. 4277-4280
- Abdel-Hamid, O.¹ Jiang, H.²

26
- 84946054484
- Multi-basis adaptive neural network for rapid adaptation in speech recognition
- IEEE
- C Wu and M Gales, "Multi-basis adaptive neural network for rapid adaptation in speech recognition, " in Proc. ICASSP. 2015, IEEE.
- (2015) Proc. ICASSP.
- Wu, C.¹ Gales, M.²

27
- 84946083667
- Cluster adaptive training for deep neural network
- IEEE
- T Tan, Y Qian, M Yin, Y Zhuang, and K Yu, "Cluster adaptive training for deep neural network, " in Proc. ICASSP. 2015, IEEE.
- (2015) Proc. ICASSP.
- Tan, T.¹ Qian, Y.² Yin, M.³ Zhuang, Y.⁴ Yu, K.⁵

28
- 84946036209
- Context adaptive deep neural networks for fast acoustic model adaptation
- IEEE
- M Delcroix, K Kinoshita, T Hori, and T Nakatani, "Context adaptive deep neural networks for fast acoustic model adaptation, " in Proc. ICASSP. 2015, IEEE.
- (2015) Proc. ICASSP.
- Delcroix, M.¹ Kinoshita, K.² Hori, T.³ Nakatani, T.⁴

29
- 85001124710
- Wit3: Web inventory of transcribed and translated talks
- M Cettolo, C Girardi, and M Federico, "Wit3: Web inventory of transcribed and translated talks, " in Proc EAMT, 2012, pp. 261-268.
- (2012) Proc EAMT , pp. 261-268
- Cettolo, M.¹ Girardi, C.² Federico, M.³

30
- 85016587886
- SWITCHBOARD: Telephone speech corpus for research and development
- John J Godfrey, Edward C Holliman, and Jane McDaniel, "SWITCHBOARD: Telephone speech corpus for research and development, " in Proc. ICASSP. IEEE, 1992, pp. 517-520.
- (1992) Proc. ICASSP. IEEE , pp. 517-520
- Godfrey, J.J.¹ Holliman, E.C.² McDaniel, J.³

31
- 35948981862
- Unleashing the killer corpus: Experiences in creating the multi-everything AMI meeting corpus
- J Carletta, "Unleashing the killer corpus: Experiences in creating the multi-everything AMI meeting corpus., " Language Resources and Evaluation, vol. 41, no. 2, pp. 181-190, 2007.
- (2007) Language Resources and Evaluation , vol.41 , Issue.2 , pp. 181-190
- Carletta, J.¹

32
- 44849090969
- Recognition and understanding of meetings: The AMI and AMIDA projects
- Kyoto, 12 IDIAP-RR 07-46
- S Renals, T Hain, and H Bourlard, "Recognition and understanding of meetings: The AMI and AMIDA projects, " in Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU'07, Kyoto, 12 2007, IDIAP-RR 07-46.
- (2007) Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU'07
- Renals, S.¹ Hain, T.² Bourlard, H.³

33
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- GE Dahl, D Yu, L Deng, and A Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition, " Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

34
- 84890492591
- Revisiting hybrid and GMM-HMM system combination techniques
- P Swietojanski, A Ghoshal, and S Renals, "Revisiting hybrid and GMM-HMM system combination techniques, " in Proc. ICASSP, 2013.
- (2013) Proc. ICASSP
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

35
- 84976431564
- The UEDIN ASR systems for the IWSLT 2014 evaluation
- P Bell, P Swietojanski, J Driesen, M Sinclair, F McInnes, and S Renals, "The UEDIN ASR Systems for the IWSLT 2014 Evaluation, " in Proc. IWSLT, 2014.
- (2014) Proc. IWSLT
- Bell, P.¹ Swietojanski, P.² Driesen, J.³ Sinclair, M.⁴ McInnes, F.⁵ Renals, S.⁶

36
- 84906274730
- Sequence-discriminative training of deep neural networks
- Lyon, France, August
- K Vesely, A Ghoshal, L Burget, and D Povey, "Sequence-discriminative training of deep neural networks, " in Proc. Interspeech, Lyon, France, August 2013.
- (2013) Proc. Interspeech
- Vesely, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

37
- 84858953642
- The Kaldi speech recognition toolkit
- D Povey, A Ghoshal, G Boulianne, L Burget, O Glembek, N Goel, M Hannemann, P Motlícek, Y Qian, P Schwarz, J Silovský, G Stemmer, and K Veselý, "The Kaldi speech recognition toolkit, " in Proc. IEEE ASRU, December 2011.
- (2011) Proc. IEEE ASRU, December
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlícek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovský, J.¹¹ Stemmer, G.¹² Veselý, K.¹³

38
- 84893704659
- Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
- December
- P Swietojanski, A Ghoshal, and S Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition, " in Proc. IEEE ASRU, December 2013.
- (2013) Proc. IEEE ASRU
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

39
- 84959174678
- Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling
- C Zhang and PC Woodland, "Parameterised Sigmoid and ReLU Hidden Activation Functions for DNN Acoustic Modelling, " in Proc. Interspeech, 2015.
- (2015) Proc. Interspeech
- Zhang, C.¹ Woodland, P.C.²

40
- 84959177524
- Human vs machine spoofing detection on wideband and narrowband data
- September
- M Wester, Z Wu, and J Yamagishi, "Human vs machine spoofing detection on wideband and narrowband data, " in Proc. of Interspeech, September 2015.
- (2015) Proc. of Interspeech
- Wester, M.¹ Wu, Z.² Yamagishi, J.³

41
- 84910084579
- 2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results
- Citeseer
- J Fiscus, W M Fisher, A F Martin, M A Przybocki, and D S Pallett, "2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results, " in Proc. Speech Transcription Workshop. Citeseer, 2000.
- (2000) Proc. Speech Transcription Workshop
- Fiscus, J.¹ Fisher, W.M.² Martin, A.F.³ Przybocki, M.A.⁴ Pallett, D.S.⁵

42
- 84973318162
- arXiv:1601.02828
- P Swietojanski, J Li, and S Renals, "Learning hidden unit contributions for unsupervised acoustic model adaptation, " arXiv:1601.02828, 2016.
- (2016) Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation
- Swietojanski, P.¹ Li, J.² Renals, S.³

43
- 4544265717
- Ph.D. Thesis, University of Cambridge
- D Povey, Discriminative training for large vocabulary speech recognition, Ph.D. Thesis, University of Cambridge, 2003.
- (2003) Discriminative Training for Large Vocabulary Speech Recognition
- Povey, D.¹

44
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- B Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling, " in Proc. IEEE ICASSP, 2009, pp. 3761-3764.
- (2009) Proc. IEEE ICASSP , pp. 3761-3764
- Kingsbury, B.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.