SCOPUS Information Retrieval Platform

IEEE/ACM Transactions on Audio Speech and Language Processing

Volume 24, Issue 8, 2016, Pages 1450-1463

Learning hidden unit contributions for unsupervised acoustic model adaptation

(3) Swietojanski, Pawel a Li, Jinyu b Renals, Steve a

a UNIVERSITY OF EDINBURGH (United Kingdom)

b MICROSOFT (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTIC MODEL ADAPTATION; ADAPTATION TECHNIQUES; FEATURE EXTRACTOR; SPEAKER ADAPTIVE TRAININGS; SPEAKER DEPENDENTS; SPEAKER INDEPENDENTS; UNSUPERVISED ADAPTATION; WORD ERROR RATE REDUCTIONS;

SPEECH RECOGNITION;

EID: 84976435936 PISSN: 23299290 EISSN: None Source Type: Journal
DOI: 10.1109/TASLP.2016.2560534 Document Type: Article

Times cited : (159)

References (84)

1
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Nov
- G. Hinton et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, " IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, Nov. 2012
- (2012) IEEE Signal Process. Mag , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹

2
- 0003573244
- Norwell MA USA: Kluwer
- H. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach. Norwell, MA, USA: Kluwer, 1994
- (1994) Connectionist Speech Recognition: A Hybrid Approach
- Bourlard, H.¹ Morgan, N.²

3
- 0028194709
- Connectionist probability estimators in HMM speech recognition
- Jan
- S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco, "Connectionist probability estimators in HMM speech recognition, " IEEE Trans Speech Audio Process, vol. 2, no. 1, pp. 161-174, Jan. 1994
- (1994) IEEE Trans Speech Audio Process , vol.2 , Issue.1 , pp. 161-174
- Renals, S.¹ Morgan, N.² Bourlard, H.³ Cohen, M.⁴ Franco, H.⁵

4
- 84858976070
- Feature engineering in context-dependent deep neural networks for conversational speech transcription
- F. Seide, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription, " in Proc. 12th IEEE Workshop Automatic Speech Recog. Understanding, 2011, pp. 24-29
- (2011) Proc. 12th IEEE Workshop Automatic Speech Recog Understanding , pp. 24-29
- Seide, F.¹ Chen, X.² Yu, D.³

5
- 84055222005
- Context-dependent pretrained deep neural networks for large-vocabulary speech recognition
- Jan
- G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pretrained deep neural networks for large-vocabulary speech recognition, " IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp. 30-42, Jan. 2012
- (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.¹ Yu, D.² Deng, L.³ Acero, A.⁴

6
- 0033709098
- Tandem connectionist feature extraction for conventional HMM systems
- H. Hermansky, D. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2000, pp. 1635-1638
- (2000) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 1635-1638
- Hermansky, H.¹ Ellis, D.² Sharma, S.³

7
- 34547548235
- Probabilistic and bottleneck features for LVCSR of meetings
- F. Grezl, M. Karafiat, S. Kontar, and J. Cernocky, "Probabilistic and bottleneck features for LVCSR of meetings, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2007, pp. IV-757-IV-760
- (2007) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. IV757-IV760
- Grezl, F.¹ Karafiat, M.² Kontar, S.³ Cernocky, J.⁴

8
- 84890525984
- Deep convolutional neural networks for LVCSR
- T. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2013, pp. 8614-8618
- (2013) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 8614-8618
- Sainath, T.¹ Mohamed, A.² Kingsbury, B.³ Ramabhadran, B.⁴

9
- 84890492591
- Revisiting hybrid and GMM-HMM system combination techniques
- P. Swietojanski, A. Ghoshal, and S. Renals, "Revisiting hybrid and GMM-HMM system combination techniques, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2013, pp. 6744-6748
- (2013) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 6744-6748
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

10
- 84964475976
- Cambridge University transcription systems for the multi-genre broadcast challenge
- P. Woodland et al., "Cambridge University transcription systems for the multi-genre broadcast challenge, " in Proc. IEEE Workshop Automatic Speech Recog. Understanding, 2015, pp. 639-646
- (2015) Proc IEEE Workshop Automatic Speech Recog. Understanding , pp. 639-646
- Woodland, P.¹

11
- 0346528936
- Speaker adaptation for continuous density HMMs: A review
- P. Woodland, "Speaker adaptation for continuous density HMMs: A review, " in Proc. ISCA Workshop Adaptation Methods Speech Recog., 2001, pp. 11-19
- (2001) Proc. ISCA Workshop Adaptation Methods Speech Recog , pp. 11-19
- Woodland, P.¹

12
- 84906225505
- Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
- O. Abdel-Hamid and H. Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition, " in Proc. 14th Annu. Conf. Int. Speech Commun. Assoc., 2013, pp. 1248-1252
- (2013) Proc. 14th Annu. Conf. Int. Speech Commun. Assoc , pp. 1248-1252
- Abdel-Hamid, O.¹ Jiang, H.²

13
- 84983119674
- Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
- P. Swietojanski and S. Renals, "Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models, " in Proc. IEEE Spoken Lang. Technol. Workshop, 2014, pp. 171-176
- (2014) Proc IEEE Spoken Lang. Technol. Workshop , pp. 171-176
- Swietojanski, P.¹ Renals, S.²

14
- 84973299594
- SAT-LHUC: Speaker adaptive training for learning hidden unit contributions
- P. Swietojanski and S. Renals, "SAT-LHUC: Speaker adaptive training for learning hidden unit contributions, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2016, pp. 5010-5014
- (2016) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 5010-5014
- Swietojanski, P.¹ Renals, S.²

15
- 85001124710
- Wit3 : Web inventory of transcribed and translated talks
- M. Cettolo, C. Girardi, and M. Federico, "Wit3 : Web inventory of transcribed and translated talks, " in Proc. 16th Conf. Eur. Assoc. Mach. Translation, 2012, pp. 261-268
- (2012) Proc. 16th Conf. Eur. Assoc. Mach. Translation , pp. 261-268
- Cettolo, M.¹ Girardi, C.² Federico, M.³

16
- 35948981862
- Unleashing the killer corpus: Experiences in creating the multi-everything AMI meeting corpus
- J. Carletta, "Unleashing the killer corpus: Experiences in creating the multi-everything AMI meeting corpus, " Language Resources Eval., vol. 41, no. 2, pp. 181-190, 2007
- (2007) Language Resources Eval , vol.41 , Issue.2 , pp. 181-190
- Carletta, J.¹

17
- 85016587886
- SWITCHBOARD: Telephone speech corpus for research and development
- J. Godfrey, E. Holliman, and J. McDaniel, "SWITCHBOARD: Telephone speech corpus for research and development, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 1992, pp. 517-520
- (1992) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 517-520
- Godfrey, J.¹ Holliman, E.² McDaniel, J.³

18
- 84979950321
- Performance analysis of the Aurora large vocabulary baseline system
- N. Parihar, J. Picone, D. Pearce, and H. Hirsch, "Performance analysis of the Aurora large vocabulary baseline system, " in Proc. 12th Eur. Signal Process. Conf., 2004, pp. 553-556
- (2004) Proc. 12th Eur. Signal Process. Conf , pp. 553-556
- Parihar, N.¹ Picone, J.² Pearce, D.³ Hirsch, H.⁴

19
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- Apr
- M. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition, " Comput. Speech Lang., vol. 12, pp. 75-98, Apr. 1998
- (1998) Comput. Speech Lang , vol.12 , pp. 75-98
- Gales, M.¹

20
- 80051654263
- Deep belief networks using discriminative features for phone recognition
- A. Mohamed, T. Sainath, G. Dahl, B. Ramabhadran, G. Hinton, and M. Picheny, "Deep belief networks using discriminative features for phone recognition, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2011, pp. 5060-5063
- (2011) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 5060-5063
- Mohamed, A.¹ Sainath, T.² Dahl, G.³ Ramabhadran, B.⁴ Hinton, G.⁵ Picheny, M.⁶

21
- 85008520364
- Transcribing meetings with the AMIDA systems
- Feb
- T. Hain, et al., "Transcribing meetings with the AMIDA systems, " IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 2, pp. 486-498, Feb. 2012
- (2012) IEEE Trans. Audio, Speech, Lang. Process , vol.20 , Issue.2 , pp. 486-498
- Hain, T.¹

22
- 84867593213
- Auto-encoder bottleneck features using deep belief networks
- T. Sainath, B. Kingsbury, and B. Ramabhadran, "Auto-encoder bottleneck features using deep belief networks, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2012, pp. 4153-4156
- (2012) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4153-4156
- Sainath, T.¹ Kingsbury, B.² Ramabhadran, B.³

23
- 84890537527
- Multi-level adaptive networks in tandem and hybrid ASR systems
- P. Bell, P. Swietojanski, and S. Renals, "Multi-level adaptive networks in tandem and hybrid ASR systems, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2013, pp. 6975-6979
- (2013) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 6975-6979
- Bell, P.¹ Swietojanski, P.² Renals, S.³

24
- 84937854847
- Speaker adaptation for hybrid HMM-ANN continuous speech recognition system
- J. Neto, L. Almeida, M. Hochberg, C. Martins, L. Nunes, S. Renals, and T. Robinson, "Speaker adaptation for hybrid HMM-ANN continuous speech recognition system, " in Proc. 4th Eur. Conf. Speech Commun. Technol., 1995, pp. 2171-2174
- (1995) Proc. 4th Eur. Conf. Speech Commun. Technol , pp. 2171-2174
- Neto, J.¹ Almeida, L.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

25
- 84937880519
- Connectionist speaker normalization and adaptation
- V. Abrash, H. Franco, A. Sankar, and M. Cohen, "Connectionist speaker normalization and adaptation, " in Proc. 4th Eur. Conf. Speech Commun. Technol., 1995, pp. 2183-2186
- (1995) Proc. 4th Eur. Conf. Speech Commun. Technol , pp. 2183-2186
- Abrash, V.¹ Franco, H.² Sankar, A.³ Cohen, M.⁴

26
- 79959849500
- Comparison of discriminative input and output transformations for speaker adaptation in the hybrid nn/hmm systems
- B. Li and K. Sim, "Comparison of discriminative input and output transformations for speaker adaptation in the hybrid nn/hmm systems, " in Proc. 11th Annu. Conf. Int. Speech Commun. Assoc., 2010
- (2010) Proc. 11th Annu. Conf. Int. Speech Commun. Assoc
- Li, B.¹ Sim, K.²

27
- 79951609039
- Front end factor analysis for speaker verification
- May
- N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front end factor analysis for speaker verification, " IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 4, pp. 788-798, May 2010
- (2010) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.4 , pp. 788-798
- Dehak, N.¹ Kenny, P.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

28
- 84858984756
- iVector-based discriminative adaptation for automatic speech recognition
- M. Karafiat, L. Burget, P. Matejka, O. Glembek, and J. Cernocky, "iVector-based discriminative adaptation for automatic speech recognition, " in Proc. IEEE Workshop Automatic Speech Recog. Understanding, 2011, pp. 152-157
- (2011) Proc IEEE Workshop Automatic Speech Recog. Understanding , pp. 152-157
- Karafiat, M.¹ Burget, L.² Matejka, P.³ Glembek, O.⁴ Cernocky, J.⁵

29
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- [Online]. Available:
- G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors, " in Proc. IEEE Automatic Speech Recog. Understanding, 2013, pp. 55-59. [Online]. Available: http://dblp.uni-Trier.de/db/conf/asru/asru2013.html#SaonSNP13
- (2013) Proc IEEE Automatic Speech Recog. Understanding , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

30
- 84905259145
- I-vector based speaker adaptation of deep neural networks for French broadcast audio transcription
- V. Gupta, P. Kenny, P. Ouellet, and T. Stafylakis, "I-vector based speaker adaptation of deep neural networks for French broadcast audio transcription, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2014, pp. 6334-6338
- (2014) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 6334-6338
- Gupta, V.¹ Kenny, P.² Ouellet, P.³ Stafylakis, T.⁴

31
- 84910068089
- Adaptation of deep neural network acoustic models using factorised i-vectors
- P. Karanasou, Y. Wang, M. Gales, and P. Woodland, "Adaptation of deep neural network acoustic models using factorised i-vectors, " in Proc. Annu. Conf. Int. Speech Commun. Assoc., 2014, pp. 2180-2184
- (2014) Proc. Annu. Conf. Int. Speech Commun. Assoc , pp. 2180-2184
- Karanasou, P.¹ Wang, Y.² Gales, M.³ Woodland, P.⁴

32
- 84938688160
- Speaker adaptive training of deep neural network acoustic models using i-vectors
- Nov
- Y. Miao, H. Zhang, and F. Metze, "Speaker adaptive training of deep neural network acoustic models using i-vectors, " IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 23, no. 11, pp. 1938-1949, Nov. 2015
- (2015) IEEE/ACM Trans. Audio, Speech, Lang. Process , vol.23 , Issue.11 , pp. 1938-1949
- Miao, Y.¹ Zhang, H.² Metze, F.³

33
- 84905269643
- Using neural network front-ends on far field multiple microphones based speech recognition
- Y. Liu, P. Zhang, and T. Hain, "Using neural network front-ends on far field multiple microphones based speech recognition, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2014, pp. 5542-5546
- (2014) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 5542-5546
- Liu, Y.¹ Zhang, P.² Hain, T.³

34
- 84910030053
- Recnorm: Simultaneous normalisation and classification applied to speech recognition
- Online]. Available:
- J. Bridle and S. Cox, "Recnorm: Simultaneous normalisation and classification applied to speech recognition, " in Proc. Adv. Neural Inf. Process Sys 3, 1990, pp. 234-240. [Online]. Available: http://papers.nips.cc/paper/328-recnorm-simultaneous-normalisationand-classification-applied-To-speech-recognition.pdf
- (1990) Proc. Adv. Neural Inf. Process Sys , vol.3 , pp. 234-240
- Bridle, J.¹ Cox, S.²

35
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2013, pp. 4277-4280
- (2013) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4277-4280
- Abdel-Hamid, O.¹ Jiang, H.²

36
- 84921731072
- Fast adaptation of deep neural network based on discriminant codes for speech recognition
- Dec
- S. Xue, O. Abdel-Hamid, J. Hui, L. Dai, and Q. Liu, "Fast adaptation of deep neural network based on discriminant codes for speech recognition, " IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 12, pp. 1713-1725, Dec. 2014
- (2014) IEEE/ACM Trans. Audio, Speech, Lang. Process , vol.22 , Issue.12 , pp. 1713-1725
- Xue, S.¹ Abdel-Hamid, O.² Hui, J.³ Dai, L.⁴ Liu, Q.⁵

37
- 84890521103
- Speaker adaptation of context dependent deep neural networks
- H. Liao, "Speaker adaptation of context dependent deep neural networks, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2013, pp. 7947-7951
- (2013) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 7947-7951
- Liao, H.¹

38
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2013, pp. 7893-7897
- (2013) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

39
- 84959166782
- Regularized sequence-level deep neural network model adaptation
- Y. Huang and Y. Gong, "Regularized sequence-level deep neural network model adaptation, " in Proc. Annu. Conf. Int. Speech Commun. Assoc., 2015, pp. 1081-1085
- (2015) Proc. Annu. Conf. Int. Speech Commun. Assoc , pp. 1081-1085
- Huang, Y.¹ Gong, Y.²

40
- 84905229915
- Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network
- J. Xue, J. Li, D. Yu, M. Seltzer, and Y. Gong, "Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2014, pp. 6359-6363
- (2014) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 6359-6363
- Xue, J.¹ Li, J.² Yu, D.³ Seltzer, M.⁴ Gong, Y.⁵

41
- 84964489805
- Learning factorized feature transforms for speaker normalization
- L. Samarakoon and K. C. Sim, "Learning factorized feature transforms for speaker normalization, " in Proc. IEEE Workshop Automatic Speech Recog. Understanding, 2015, pp. 145-152
- (2015) Proc IEEE Workshop Automatic Speech Recog. Understanding , pp. 145-152
- Samarakoon, L.¹ Sim, K.C.²

42
- 84905216195
- Speaker adaptive training using deep neural networks
- T. Ochiai, S. Matsuda, X. Lu, C. Hori, and S. Katagiri, "Speaker adaptive training using deep neural networks, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2014, pp. 6349-6353
- (2014) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 6349-6353
- Ochiai, T.¹ Matsuda, S.² Lu, X.³ Hori, C.⁴ Katagiri, S.⁵

43
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition, " in Proc. IEEE Spoken Language Technol. Workshop, 2012, pp. 366-369
- (2012) Proc IEEE Spoken Language Technol. Workshop , pp. 366-369
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

44
- 84946061232
- Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data
- Y. Zhao, J. Li, J. Xue, and Y. Gong, "Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2015, pp. 4310-4314
- (2015) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4310-4314
- Zhao, Y.¹ Li, J.² Xue, J.³ Gong, Y.⁴

45
- 84946032695
- Differentiable pooling for unsupervised speaker adaptation
- P. Swietojanski and S. Renals, "Differentiable pooling for unsupervised speaker adaptation, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2015, pp. 4305-4309
- (2015) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4305-4309
- Swietojanski, P.¹ Renals, S.²

46
- 84881054791
- Hermitian polynomial for speaker adaptation of connectionist speech recognition systems
- Oct
- S. Siniscalchi, J. Li, and C. Lee, "Hermitian polynomial for speaker adaptation of connectionist speech recognition systems, " IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 10, pp. 2152-2161, Oct. 2013
- (2013) IEEE Trans. Audio, Speech, Lang. Process , vol.21 , Issue.10 , pp. 2152-2161
- Siniscalchi, S.¹ Li, J.² Lee, C.³

47
- 84959161626
- Maximum a posteriori adaptation of network parameters in deep models
- Z. Huang, S. M. Siniscalchi, I.-F. Chen, J. Wu, and C.-H. Lee, "Maximum a posteriori adaptation of network parameters in deep models, " in Proc. 16th Annu. Conf. Int. Speech Commun. Assoc., 2015, pp. 1076-1080
- (2015) Proc. 16th Annu. Conf. Int. Speech Commun. Assoc , pp. 1076-1080
- Huang, Z.¹ Siniscalchi, S.M.² Chen, I.-F.³ Wu, J.⁴ Lee, C.-H.⁵

48
- 84959169347
- Rapid adaptation for deep neural networks through multi-task learning
- Z. Huang, J. Li, S. M. Siniscalchi, I.-F. Chen, J. Wu, and C.-H. Lee, "Rapid adaptation for deep neural networks through multi-task learning, " in Proc. 16th Annu. Conf. Int. Speech Commun. Assoc., 2015, pp. 3625-3629
- (2015) Proc. 16th Annu. Conf. Int. Speech Commun. Assoc , pp. 3625-3629
- Huang, Z.¹ Li, J.² Siniscalchi, S.M.³ Chen, I.-F.⁴ Wu, J.⁵ Lee, C.-H.⁶

49
- 84959095902
- Structured output layer with auxiliary targets for context-dependent acoustic modelling
- P. Swietojanski, P. Bell, and S. Renals, "Structured output layer with auxiliary targets for context-dependent acoustic modelling, " in Proc. Annu. Conf. Int. Speech Commun. Assoc., 2015, pp. 3605-3609
- (2015) Proc. Annu. Conf. Int. Speech Commun. Assoc , pp. 3605-3609
- Swietojanski, P.¹ Bell, P.² Renals, S.³

50
- 84938690750
- Speaker adaptation of deep neural networks using a hierarchy of output layers
- R. Price, K. Iso, and K. Shinoda, "Speaker adaptation of deep neural networks using a hierarchy of output layers, " in Proc. IEEE Spoken Language Technol. Workshop, 2014, pp. 153-158
- (2014) Proc IEEE Spoken Language Technol. Workshop , pp. 153-158
- Price, R.¹ Iso, K.² Shinoda, K.³

51
- 0024880831
- Multilayer feedforward networks are universal approximators
- K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators, " Neural Netw., vol. 2, no. 5, pp. 359-366, 1989
- (1989) Neural Netw , vol.2 , Issue.5 , pp. 359-366
- Hornik, K.¹ Stinchcombe, M.² White, H.³

52
- 0025751820
- Approximation capabilities of multilayer feedforward networks
- K. Hornik, "Approximation capabilities of multilayer feedforward networks, " Neural Netw., vol. 4, no. 2, pp. 251-257, 1991
- (1991) Neural Netw , vol.4 , Issue.2 , pp. 251-257
- Hornik, K.¹

53
- 0027599793
- Universal approximation bounds for superpositions of a sigmoidal function
- May
- A. Barron, "Universal approximation bounds for superpositions of a sigmoidal function, " IEEE Trans. Inf. Theory, vol. 39, no. 3, pp. 930-945, May 1993
- (1993) IEEE Trans. Inf. Theory , vol.39 , Issue.3 , pp. 930-945
- Barron, A.¹

54
- 84959174678
- Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling
- C. Zhang and P. Woodland, "Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling, " in Proc. 16th Annu. Conf. Int. Speech Commun. Assoc., 2015, pp. 3224-3228
- (2015) Proc. 16th Annu. Conf. Int. Speech Commun. Assoc , pp. 3224-3228
- Zhang, C.¹ Woodland, P.²

55
- 0035024581
- Networks with trainable amplitude of activation functions
- E. Trentin, "Networks with trainable amplitude of activation functions, " Neural Netw., vol. 14, pp. 471-493, 2001
- (2001) Neural Netw , vol.14 , pp. 471-493
- Trentin, E.¹

56
- 84946054484
- Multi-basis adaptive neural network for rapid adaptation in speech recognition
- C. Wu and M. Gales, "Multi-basis adaptive neural network for rapid adaptation in speech recognition, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2015, pp. 4315-4319
- (2015) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4315-4319
- Wu, C.¹ Gales, M.²

57
- 84946083667
- Cluster adaptive training for deep neural network
- T. Tan, Y. Qian, M. Yin, Y. Zhuang, and K. Yu, "Cluster adaptive training for deep neural network, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2015, pp. 4325-4329
- (2015) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4325-4329
- Tan, T.¹ Qian, Y.² Yin, M.³ Zhuang, Y.⁴ Yu, K.⁵

58
- 84946036209
- Context adaptive deep neural networks for fast acoustic model adaptation
- M. Delcroix, K. Kinoshita, T. Hori, and T. Nakatani, "Context adaptive deep neural networks for fast acoustic model adaptation, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2015, pp. 4535-4539
- (2015) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4535-4539
- Delcroix, M.¹ Kinoshita, K.² Hori, T.³ Nakatani, T.⁴

59
- 0030362995
- A compact model for speaker-adaptive training
- T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training, " in Proc. 4th Int. Conf. Spoken Language, 1996, pp. 1137-1140
- (1996) Proc. 4th Int. Conf. Spoken Language , pp. 1137-1140
- Anastasakos, T.¹ McDonough, J.² Schwartz, R.³ Makhoul, J.⁴

60
- 44849090969
- Recognition and understanding of meetings: The AMI and AMIDA projects
- Dec
- S. Renals, T. Hain, and H. Bourlard, "Recognition and understanding of meetings: The AMI and AMIDA projects, " in Proc. IEEE Workshop Aut. Speech Recog. Understanding, Dec. 2007, pp. 238-247
- (2007) Proc IEEE Workshop Aut. Speech Recog. Understanding , pp. 238-247
- Renals, S.¹ Hain, T.² Bourlard, H.³

61
- 85045373614
- Overview of the IWSLT 2012 evaluation campaign
- M. Federico, M. Cettolo, L. Bentivogli, M. Paul, and S. Stüker, "Overview of the IWSLT 2012 evaluation campaign, " in Proc. 9th Int. Workshop Spoken Language Translation, 2012, pp. 12-33
- (2012) Proc. 9th Int. Workshop Spoken Language Translation , pp. 12-33
- Federico, M.¹ Cettolo, M.² Bentivogli, L.³ Paul, M.⁴ Stüker, S.⁵

62
- 84976431564
- The UEDIN ASR systems for the IWSLT 2014 evaluation
- P. Bell, P. Swietojanski, J. Driesen, M. Sinclair, F. McInnes, and S. Renals, "The UEDIN ASR systems for the IWSLT 2014 evaluation, " in Proc. Int. Workshop Spoken Language Translation, 2014, pp. 26-33
- (2014) Proc. Int. Workshop Spoken Language Translation , pp. 26-33
- Bell, P.¹ Swietojanski, P.² Driesen, J.³ Sinclair, M.⁴ McInnes, F.⁵ Renals, S.⁶

63
- 84893704659
- Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
- P. Swietojanski, A. Ghoshal, and S. Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition, " in Proc. IEEE Workshop Aut. Speech Recog. Understanding, 2013, pp. 285-290
- (2013) Proc IEEE Workshop Aut. Speech Recog. Understanding , pp. 285-290
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

64
- 84893654379
- Improvements to deep convolutional neural networks for LVCSR
- T. Sainath et al., "Improvements to deep convolutional neural networks for LVCSR, " in Proc. IEEE Workshop Aut. Speech Recog. Understanding, 2013, pp. 315-320
- (2013) Proc. IEEE Workshop Aut. Speech Recog. Understanding , pp. 315-320
- Sainath, T.¹

65
- 0001857994
- Efficient backprop
- New York, NY, USA: Springer ch. 2
- Y. LeCun, L. Bottou, G. Orr, and K. Müller, "Efficient backprop, " in Neural Networks: Tricks of the Trade. New York, NY, USA: Springer, 1998, ch. 2
- (1998) Neural Networks: Tricks of the Trade
- LeCun, Y.¹ Bottou, L.² Orr, G.³ Müller, K.⁴

66
- 84867605836
- Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
- O. Abdel-Hamid, A.-R. Mohamed, J. Hui, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2012, pp. 4277-4280
- (2012) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 4277-4280
- Abdel-Hamid, O.¹ Mohamed, A.-R.² Hui, J.³ Penn, G.⁴

67
- 84901999583
- Convolutional neural networks for distant speech recognition
- Sep
- P. Swietojanski, A. Ghoshal, and S. Renals, "Convolutional neural networks for distant speech recognition, " IEEE Signal Process. Lett., vol. 21, no. 9, pp. 1120-1124, Sep. 2014
- (2014) IEEE Signal Process. Lett , vol.21 , Issue.9 , pp. 1120-1124
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

68
- 84906274730
- Sequence-discriminative training of deep neural networks
- Aug
- K. Vesely, A. Ghoshal, L. Burget, and D. Povey, "Sequence-discriminative training of deep neural networks, " in Proc. Annu. Conf. Int. Speech Commun. Assoc., Lyon, France, Aug. 2013, pp. 2345-2349
- (2013) Proc. Annu. Conf. Int. Speech Commun. Assoc., Lyon, France , pp. 2345-2349
- Vesely, K.¹ Ghoshal, A.² Burget, L.³ Povey, D.⁴

69
- 84858953642
- The Kaldi speech recognition toolkit
- Dec
- D. Povey, et al., "The Kaldi speech recognition toolkit, " in Proc. IEEE Workshop Aut. Speech Recog. Understanding, Dec. 2011
- (2011) Proc IEEE Workshop Aut. Speech Recog. Understanding
- Povey, D.¹

70
- 85009890950
- Connectionist probability estimation in the DECIPHER speech recognition system
- S. Renals, N. Morgan, M. Cohen, and H. Franco, "Connectionist probability estimation in the DECIPHER speech recognition system, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 1992, pp. 601-604
- (1992) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 601-604
- Renals, S.¹ Morgan, N.² Cohen, M.³ Franco, H.⁴

71
- 0032923221
- Catastrophic forgetting in connectionist networks: Causes, consequences and solutions
- Online]. Available:
- R. French, "Catastrophic forgetting in connectionist networks: Causes, consequences and solutions, " Trends Cognitive Sci., vol. 3, pp. 128-135, 1999. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?.doi=10.1.1.36.3676
- (1999) Trends Cognitive Sci , vol.3 , pp. 128-135
- French, R.¹

72
- 84871614543
- A novel loss function for the overall risk criterion based discriminative training of HMM models
- J. Kaiser, B. Horvat, and Z. Kacic, "A novel loss function for the overall risk criterion based discriminative training of HMM models, " in Proc. 6th Int. Conf. Spoken Language Process., 2000, pp. 887-890
- (2000) Proc. 6th Int. Conf. Spoken Language Process , pp. 887-890
- Kaiser, J.¹ Horvat, B.² Kacic, Z.³

73
- 70349213445
- Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
- B. Kingsbury, "Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2009, pp. 3761-3764
- (2009) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 3761-3764
- Kingsbury, B.¹

74
- 84905238208
- The NICT ASR system for IWSLT 2013
- C.-L. Huang, P. Dixon, S. Matsuda, Y. Wu, X. Lu, M. Saiko, and C. Hori, "The NICT ASR system for IWSLT 2013, " in Proc. Int. Workshop Spoken Language Translation, 2013
- (2013) Proc. Int. Workshop Spoken Language Translation
- Huang, C.-L.¹ Dixon, P.² Matsuda, S.³ Wu, Y.⁴ Lu, X.⁵ Saiko, M.⁶ Hori, C.⁷

75
- 84964497075
- Towards utterance-based neural network adaptation in acoustic modeling
- I. Himawan, P. Motlicek, M. Ferras, and S. Madikeri, "Towards utterance-based neural network adaptation in acoustic modeling, " in Proc. IEEE Workshop Aut. Speech Recog. Understanding, 2015, pp. 289-295
- (2015) Proc IEEE Workshop Aut. Speech Recog. Understanding , pp. 289-295
- Himawan, I.¹ Motlicek, P.² Ferras, M.³ Madikeri, S.⁴

76
- 84973352080
- On combining i-vectors and discriminative adaptation methods for unsupervised speaker normalization in dnn acoustic models
- L. Samarakoon and K. C. Sim, "On combining i-vectors and discriminative adaptation methods for unsupervised speaker normalization in dnn acoustic models, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2016, pp. 5275-5279
- (2016) Proc IEEE Int. Conf. Acoust., Speech Signal Process. , pp. 5275-5279
- Samarakoon, L.¹ Sim, K.C.²

77
- 84959177524
- Human vs machine spoofing detection on wideband and narrowband data
- M. Wester, Z. Wu, and J. Yamagishi, "Human vs machine spoofing detection on wideband and narrowband data, " in Proc. Annu. Conf. Int. Speech Commun. Assoc., 2015, pp. 2047-2051
- (2015) Proc. Annu. Conf. Int. Speech Commun. Assoc , pp. 2047-2051
- Wester, M.¹ Wu, Z.² Yamagishi, J.³

78
- 84910084579
- 2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results
- J. Fiscus, W. Fisher, A. Martin, M. Przybocki, and D. Pallett, "2000 NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results, " in Proc. Speech Transcription Workshop, 2000
- (2000) Proc. Speech Transcription Workshop
- Fiscus, J.¹ Fisher, W.² Martin, A.³ Przybocki, M.⁴ Pallett, D.⁵

79
- 84905262902
- Factorized adaptation for deep neural network
- J. Li, J.-T. Huang, and Y. Gong, "Factorized adaptation for deep neural network, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2014, pp. 5537-5541
- (2014) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 5537-5541
- Li, J.¹ Huang, J.-T.² Gong, Y.³

80
- 84890492030
- An investigation of deep neural networks for noise robust speech recognition
- M. Seltzer, D. Yu, and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition, " in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2013, pp. 7398-7402
- (2013) Proc IEEE Int. Conf. Acoust., Speech Signal Process , pp. 7398-7402
- Seltzer, M.¹ Yu, D.² Wang, Y.³

81
- 84946693226
- Annealed dropout training of deep networks
- S. Rennie, V. Goel, and S. Thomas, "Annealed dropout training of deep networks, " in Proc. IEEE Spoken Language Technology Workshop, 2014, pp. 159-164
- (2014) Proc. IEEE Spoken Language Technology Workshop , pp. 159-164
- Rennie, S.¹ Goel, V.² Thomas, S.³

82
- 84897543523
- Maxout networks
- I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, "Maxout networks, " Proc. Int. Conf. Mach. Learn., JMLR, 2013, pp. 1319-1327
- (2013) Proc. Int. Conf. Mach. Learn., JMLR , pp. 1319-1327
- Goodfellow, I.J.¹ Warde-Farley, D.² Mirza, M.³ Courville, A.⁴ Bengio, Y.⁵

83
- 84904163933
- Dropout: A simple way to prevent neural networks from overfitting
- [Online]. Available:
- N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting, " J. Mach. Learn. Res., vol. 15, pp. 1929-1958, 2014. [Online]. Available: http://jmlr.org/papers/v15/srivastava14a.html
- (2014) J. Mach. Learn. Res , vol.15 , pp. 1929-1958
- Srivastava, N.¹ Hinton, G.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

84
- 57249084011
- Visualizing high-dimensional data using t-sne
- Nov
- L. Van der Maaten and G. Hinton, "Visualizing high-dimensional data using t-sne, " J. Mach. Learn. Res., vol. 9, pp. 2579-2605, Nov. 2008
- (2008) J. Mach. Learn. Res , vol.9 , pp. 2579-2605
- Van der Maaten, L.¹ Hinton, G.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.