SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 4300-4304

An investigation into speaker informed DNN front-end for LVCSR

(3) Liu, Yulan a; Karanasou, Penny b; Hain, Thomas a

a UNIVERSITY OF SHEFFIELD (United Kingdom)

b UNIVERSITY OF CAMBRIDGE (United Kingdom)

Author keywords

bias adaptation; deep neural network; speaker adaptation; speaker informed training; speech recognition

Indexed keywords

AUDIO SIGNAL PROCESSING; CODES (SYMBOLS); HYBRID SYSTEMS; SPEECH COMMUNICATION; SPEECH PROCESSING; SPEECH RECOGNITION;

BIAS ADAPTATION; BOTTLENECK FEATURES; INFLUENTIAL FACTORS; MATHEMATICAL EQUIVALENCES; MEETING RECOGNITION; SPEAKER ADAPTATION; SPEAKER CLASSIFICATION; SPEAKER DEPENDENTS;

DEEP NEURAL NETWORKS;

EID: 84946036535 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178782 Document Type: Conference Paper

Times cited: (18)

References (25)

1
- 34547548235
- Probabilistic and bottle-neck features for LVCSR of meetings
- April
- F. Grezl, M. Karafiat, S. Kontar, and J. Cernocky, Probabilistic and bottle-neck features for LVCSR of meetings, in Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on, April 2007, vol. 4, pp. IV-757-IV-760
- (2007) Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on , vol.4 , pp. IV-757-IV-760
- Grezl, F.¹ Karafiat, M.² Kontar, S.³ Cernocky, J.⁴

2
- 84865785753
- Improved bottleneck features using pretrained deep neural networks
- August, International Speech Communication Association
- D. Yu and M. Seltzer, Improved bottleneck features using pretrained deep neural networks, in Interspeech. August 2011, International Speech Communication Association
- (2011) Interspeech
- Yu, D.¹ Seltzer, M.²

3
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, Conversational speech transcription using context-dependent deep neural networks, in INTERSPEECH, 2011, pp. 437-440
- (2011) INTERSPEECH , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

4
- 84893704659
- Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
- Dec
- P. Swietojanski, A. Ghoshal, and S. Renals, Hybrid acoustic models for distant and multichannel large vocabulary speech recognition, in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on, Dec 2013, pp. 285-290
- (2013) Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on , pp. 285-290
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

5
- 84937854847
- Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system
- ISCA
- J. P. Neto, L. B. Almeida, M. Hochberg, C. Martins, L. Nunes, S. Renals, and T. Robinson, Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system., in EUROSPEECH. 1995, ISCA
- (1995) EUROSPEECH
- Neto, J.P.¹ Almeida, L.B.² Hochberg, M.³ Martins, C.⁴ Nunes, L.⁵ Renals, S.⁶ Robinson, T.⁷

6
- 79959849500
- Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems
- B. Li and K. C. Sim, Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems., in INTERSPEECH, 2010, pp. 526-529
- (2010) INTERSPEECH , pp. 526-529
- Li, B.¹ Sim, K.C.²

7
- 34548012893
- Linear hidden transformations for adaptation of hybrid ANN/HMM models
- Intrinsic Speech Variations
- R. Gemello, F. Mana, S. Scanzio, P. Laface, and R. D. Mori, Linear hidden transformations for adaptation of hybrid ANN/HMM models, Speech Communication, vol. 49, no. 10-11, pp. 827-835, 2007, Intrinsic Speech Variations
- (2007) Speech Communication , vol.49 , Issue.10-11 , pp. 827-835
- Gemello, R.¹ Mana, F.² Scanzio, S.³ Laface, P.⁴ Mori, R.D.⁵

8
- 0033677005
- Fast speaker adaptation of artificial neural networks for automatic speech recognition
- S. Dupont and L. Cheboub, Fast speaker adaptation of artificial neural networks for automatic speech recognition, in Acoustics, Speech, and Signal Processing, 2000. ICASSP'00. Proceedings. 2000 IEEE International Conference on. IEEE, 2000, vol. 3, pp. 1795-1798
- (2000) Acoustics, Speech, and Signal Processing, 2000. ICASSP'00. Proceedings. 2000 IEEE International Conference On. IEEE , vol.3 , pp. 1795-1798
- Dupont, S.¹ Cheboub, L.²

9
- 84890452886
- Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code
- May
- O. Abdel-Hamid and H. Jiang, Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code, in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, May 2013, pp. 7942-7946
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on , pp. 7942-7946
- Abdel-Hamid, O.¹ Jiang, H.²

10
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- Dec
- G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, Speaker adaptation of neural network acoustic models using i-vectors, in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on, Dec 2013, pp. 55-59
- (2013) Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

11
- 84905269643
- Using neural network frontends on far field multiple microphones based speech recognition
- Florence, Italy, May
- Y. Liu, P. Zhang, and T. Hain, Using neural network frontends on far field multiple microphones based speech recognition, in ICASSP2014-Speech and Language Processing (ICASSP2014-SLTC), Florence, Italy, May 2014
- (2014) ICASSP2014-Speech and Language Processing (ICASSP2014-SLTC)
- Liu, Y.¹ Zhang, P.² Hain, T.³

12
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- December
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, Adaptation of context-dependent deep neural networks for automatic speech recognition, in SLT 2012, December 2012
- (2012) SLT 2012
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

13
- 84910028538
- Speaker dependent bottleneck layer training for speaker adaptation in automatic speech recognition
- R. Doddipatla, M. Hasan, and T. Hain, Speaker dependent bottleneck layer training for speaker adaptation in automatic speech recognition, in Interspeech 2014, 2014
- (2014) Interspeech 2014
- Doddipatla, R.¹ Hasan, M.² Hain, T.³

14
- 84983119674
- Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models
- South Lake Tahoe, USA, December
- P. Swietojanski and S. Renals, Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models, in Proc. IEEE Workshop on Spoken Language Technology, South Lake Tahoe, USA, December 2014
- (2014) Proc. IEEE Workshop on Spoken Language Technology
- Swietojanski, P.¹ Renals, S.²

15
- 84910097389
- Analysis of i-vector framework for speaker identification in TV-shows
- C. Fredouille and D. Charlet, Analysis of i-vector framework for speaker identification in TV-shows, in Proceedings of Interspeech' 14, 2014
- (2014) Proceedings of Interspeech' 14
- Fredouille, C.¹ Charlet, D.²

16
- 50249170027
- Joint factor analysis versus eigenchannels in speaker recognition
- May
- P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, Joint factor analysis versus eigenchannels in speaker recognition, Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 4, pp. 1435-1447, May 2007
- (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.4 , pp. 1435-1447
- Kenny, P.¹ Boulianne, G.² Ouellet, P.³ Dumouchel, P.⁴

17
- 79951609039
- Front-end factor analysis for speaker verification
- May
- N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, Front-end factor analysis for speaker verification, Audio, Speech, and Language Processing, IEEE Transactions on, vol. 19, no. 4, pp. 788-798, May 2011
- (2011) Audio, Speech, and Language Processing, IEEE Transactions on , vol.19 , Issue.4 , pp. 788-798
- Dehak, N.¹ Kenny, P.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

18
- 84858959884
- Maximum kurtosis beamforming with a subspace filter for distant speech recognition
- K. Kumatani, J. W. McDonough, and B. Raj, Maximum kurtosis beamforming with a subspace filter for distant speech recognition, in ASRU'11, 2011, pp. 179-184
- (2011) ASRU'11 , pp. 179-184
- Kumatani, K.¹ McDonough, J.W.² Raj, B.³

19
- 84890484784
- Rapid adaptation for mobile speech applications
- May
- M. Bacchiani, Rapid adaptation for mobile speech applications, in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, May 2013, pp. 7903-7907
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on , pp. 7903-7907
- Bacchiani, M.¹

20
- 84910068089
- Adaptation of deep neural network acoustic models using factorised i-vectors
- P. Karanasou, Y. Wang, M. Gales, and P. Woodland, Adaptation of deep neural network acoustic models using factorised i-vectors, in Proceedings of Interspeech'14, 2014
- (2014) Proceedings of Interspeech'14
- Karanasou, P.¹ Wang, Y.² Gales, M.³ Woodland, P.⁴

21
- 57249084011
- Visualizing high-dimensional data using t-SNE
- L. van der Maaten and G. E. Hinton, Visualizing high-dimensional data using t-SNE, Journal of Machine Learning Research, vol. 9, pp. 2579-2605, 2008
- (2008) Journal of Machine Learning Research , vol.9 , pp. 2579-2605
- van der Maaten, L.¹ Hinton, G.E.²

22
- 84946012387
- Barnes-Hut-SNE
- abs/1301.3342
- L. van der Maaten, Barnes-Hut-SNE, Proceedings of the International Conference on Learning Representations, vol. abs/1301.3342, 2013
- (2013) Proceedings of the International Conference on Learning Representations
- van der Maaten, L.¹

23
- 33745530242
- J. Carletta, S. Ashby, S. Bourban, M. Flynn, M. Guillemot, T. Hain, J. Kadlec, V. Karaiskos, W. Kraaij, M. Kronenthal, G. Lathoud, M. Lincoln, A. Lisowska, I. McCowan, W. Post, D. Reidsma, and P. Wellner, The AMI meeting corpus: A preannouncement, vol. 3869, pp. 28-39, 2006
- (2006) The AMI Meeting Corpus: A Preannouncement , vol.3869 , pp. 28-39
- Carletta, J.¹ Ashby, S.² Bourban, S.³ Flynn, M.⁴ Guillemot, M.⁵ Hain, T.⁶ Kadlec, J.⁷ Karaiskos, V.⁸ Kraaij, W.⁹ Kronenthal, M.¹⁰ Lathoud, G.¹¹ Lincoln, M.¹² Lisowska, A.¹³ McCowan, I.¹⁴ Post, W.¹⁵ Reidsma, D.¹⁶ Wellner, P.¹⁷

24
- 84874249176
- Transcribing meetings with the AMIDA systems
- Aug
- T. Hain, L. Burget, J. Dines, P. N. Garner, F. Grezl, A. El Hannani, M. Huijbregts, M. Karafiat, M. Lincoln, and V. Wan, Transcribing meetings with the AMIDA systems, IEEE Transactions on Audio, Speech and Language Processing, Aug. 2011
- (2011) IEEE Transactions on Audio, Speech and Language Processing
- Hain, T.¹ Burget, L.² Dines, J.³ Garner, P.N.⁴ Grezl, F.⁵ El Hannani, A.⁶ Huijbregts, M.⁷ Karafiat, M.⁸ Lincoln, M.⁹ Wan, V.¹⁰

25
- 0034227757
- Cluster adaptive training of hidden Markov models
- M. J. F. Gales, Cluster adaptive training of hidden Markov models, IEEE Transactions on Speech and Audio Processing, vol. 8, pp. 417-428, 1999
- (1999) IEEE Transactions on Speech and Audio Processing , vol.8 , pp. 417-428
- Gales, M.J.F.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.