SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 3007-3011

Speaker adaptation of DNN-based ASR with i-vectors: Does it actually adapt models to speakers?

(2) Rouvier, Mickael (a); Favre, Benoit (a)

(a) AIX MARSEILLE UNIV (France)

Author keywords

[No Author keywords available]

Indexed keywords

SPEECH COMMUNICATION;

ACOUSTIC CONDITIONS; ACOUSTIC FEATURES; ACOUSTIC MODEL; DEEP NEURAL NETWORKS; SPEAKER ADAPTATION; SPEAKER CLUSTERING; SPECIFIC INFORMATION; UNSUPERVISED SPEAKER ADAPTATION;

VECTORS;

EID: 84910073132
PISSN: 2308457X
EISSN: 19909772
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (26)

References (23)

1
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- J.-L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," Speech and Audio Processing, IEEE Transactions on, vol. 2, no. 2, pp. 291-298, 1994.
- (1994) Speech and Audio Processing, IEEE Transactions on , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.-L.¹ Lee, C.-H.²

2
- 0030263447
- Mean and variance adaptation within the MLLR framework
- M. J. Gales and P. Woodland, "Mean and variance adaptation within the MLLR framework," Computer Speech & Language, vol. 10, no. 4, pp. 249-264, 1996.
- (1996) Computer Speech & Language , vol.10 , Issue.4 , pp. 249-264
- Gales, M.J.¹ Woodland, P.²

3
- 80051654263
- Deep belief networks using discriminative features for phone recognition
- IEEE
- A.-R. Mohamed, T. N. Sainath, G. Dahl, B. Ramabhadran, G. E. Hinton, and M. A. Picheny, "Deep belief networks using discriminative features for phone recognition," in Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on. IEEE, 2011, pp. 5060-5063.
- (2011) Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on , pp. 5060-5063
- Mohamed, A.-R.¹ Sainath, T.N.² Dahl, G.³ Ramabhadran, B.⁴ Hinton, G.E.⁵ Picheny, M.A.⁶

4
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, no. 1, pp. 30-42, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

5
- 84865801985
- Conversational speech transcription using context-dependent deep neural networks
- F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks." in Interspeech, 2011, pp. 437-440.
- (2011) Interspeech , pp. 437-440
- Seide, F.¹ Li, G.² Yu, D.³

6
- 84874226579
- Adaptation of context-dependent deep neural networks for automatic speech recognition
- IEEE
- K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition," in Spoken Language Technology Workshop (SLT), 2012 IEEE. IEEE, 2012, pp. 366-369.
- (2012) Spoken Language Technology Workshop (SLT), 2012 IEEE , pp. 366-369
- Yao, K.¹ Yu, D.² Seide, F.³ Su, H.⁴ Deng, L.⁵ Gong, Y.⁶

7
- 84890542079
- KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
- IEEE
- D. Yu, K. Yao, H. Su, G. Li, and F. Seide, "KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 7893-7897.
- (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on , pp. 7893-7897
- Yu, D.¹ Yao, K.² Su, H.³ Li, G.⁴ Seide, F.⁵

8
- 84893691530
- Speaker adaptation of neural network acoustic models using i-vectors
- IEEE
- G. Saon, H. Soltau, D. Nahamoo, and M. Picheny, "Speaker adaptation of neural network acoustic models using i-vectors," in Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on. IEEE, 2013, pp. 55-59.
- (2013) Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on , pp. 55-59
- Saon, G.¹ Soltau, H.² Nahamoo, D.³ Picheny, M.⁴

9
- 50249170027
- Joint factor analysis versus eigenchannels in speaker recognition
- P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, "Joint factor analysis versus eigenchannels in speaker recognition," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 15, no. 4, pp. 1435-1447, 2007.
- (2007) Audio, Speech, and Language Processing, IEEE Transactions on , vol.15 , Issue.4 , pp. 1435-1447
- Kenny, P.¹ Boulianne, G.² Ouellet, P.³ Dumouchel, P.⁴

10
- 70450180849
- Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification
- N. Dehak, R. Dehak, P. Kenny, N. Brümmer, P. Ouellet, and P. Dumouchel, "Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification." in INTERSPEECH, vol. 9, 2009, pp. 1559-1562.
- (2009) INTERSPEECH , vol.9 , pp. 1559-1562
- Dehak, N.¹ Dehak, R.² Kenny, P.³ Brümmer, N.⁴ Ouellet, P.⁵ Dumouchel, P.⁶

11
- 84865753339
- Intersession compensation and scoring methods in the i-vectors space for speaker recognition
- P.-M. Bousquet, D. Matrouf, and J.-F. Bonastre, "Intersession compensation and scoring methods in the i-vectors space for speaker recognition." in InterSpeech, 2011, pp. 485-488.
- (2011) InterSpeech , pp. 485-488
- Bousquet, P.-M.¹ Matrouf, D.² Bonastre, J.-F.³

12
- 84865733857
- Analysis of i-vector length normalization in speaker recognition systems
- D. Garcia-Romero and C. Y. Espy-Wilson, "Analysis of i-vector length normalization in speaker recognition systems." in Interspeech, 2011, pp. 249-252.
- (2011) Interspeech , pp. 249-252
- Garcia-Romero, D.¹ Espy-Wilson, C.Y.²

13
- 84910061411
- The first official REPERE evaluation
- O. Galibert and J. Kahn, "The first official REPERE evaluation," in Speech, Language and Audio for Multimedia (SLAM 2013), 2013.
- (2013) Speech, Language and Audio for Multimedia (SLAM 2013)
- Galibert, O.¹ Kahn, J.²

14
- 84906274473
- An open-source state-of-the-art toolbox for broadcast news diarization
- M. Rouvier, G. Dupuy, P. Gay, E. Khoury, T. Merlin, and S. Meignier, "An open-source state-of-the-art toolbox for broadcast news diarization." in InterSpeech, 2013.
- (2013) InterSpeech
- Rouvier, M.¹ Dupuy, G.² Gay, P.³ Khoury, E.⁴ Merlin, T.⁵ Meignier, S.⁶

15
- 84858953642
- The Kaldi speech recognition toolkit
- D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz et al., "The Kaldi speech recognition toolkit," in Proc. ASRU, 2011, pp. 1-4.
- (2011) Proc. ASRU , pp. 1-4
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰

16
- 70450180496
- The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts
- S. Galliano, G. Gravier, and L. Chaubard, "The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts." in Interspeech, vol. 9, 2009, pp. 2583-2586.
- (2009) Interspeech , vol.9 , pp. 2583-2586
- Galliano, S.¹ Gravier, G.² Chaubard, L.³

17
- 85016241152
- The EPAC corpus: Manual and automatic annotations of conversational speech in French broadcast news
- Y. Esteve, T. Bazillon, J.-Y. Antoine, F. Béchet, and J. Farinas, "The EPAC corpus: Manual and automatic annotations of conversational speech in French broadcast news." in LREC, 2010.
- (2010) LREC
- Esteve, Y.¹ Bazillon, T.² Antoine, J.-Y.³ Béchet, F.⁴ Farinas, J.⁵

18
- 84884963802
- A. Mendonça, D. Graff, and D. Di Persio, "French Gigaword," 2009.
- (2009) French Gigaword
- Mendonça, A.¹ Graff, D.² Di Persio, D.³

19
- 84907937611
- SRILM - an extensible language modeling toolkit
- A. Stolcke et al., "SRILM - an extensible language modeling toolkit." in InterSpeech, 2002.
- (2002) InterSpeech
- Stolcke, A.¹

20
- 85073229756
- Variance-spectra based normalization for i-vector standard and probabilistic linear discriminant analysis
- P.-M. Bousquet, A. Larcher, D. Matrouf, J.-F. Bonastre, and O. Plchot, "Variance-spectra based normalization for i-vector standard and probabilistic linear discriminant analysis," in Speaker and Language Recognition Workshop (IEEE Odyssey), 2012.
- (2012) Speaker and Language Recognition Workshop (IEEE Odyssey)
- Bousquet, P.-M.¹ Larcher, A.² Matrouf, D.³ Bonastre, J.-F.⁴ Plchot, O.⁵

21
- 84858973723
- Bayesian speaker verification with heavy tailed priors
- P. Kenny, "Bayesian speaker verification with heavy tailed priors," in Speaker and Language Recognition Workshop (IEEE ), 2010.
- (2010) Speaker and Language Recognition Workshop (IEEE )
- Kenny, P.¹

22
- 84865783736
- Mixture of PLDA models in i-vector space for gender-independent speaker recognition
- M. Senoussaoui, P. Kenny, N. Brümmer, E. De Villiers, and P. Dumouchel, "Mixture of PLDA models in i-vector space for gender-independent speaker recognition." in InterSpeech, 2011, pp. 25-28.
- (2011) InterSpeech , pp. 25-28
- Senoussaoui, M.¹ Kenny, P.² Brümmer, N.³ De Villiers, E.⁴ Dumouchel, P.⁵

23
- 84910034609
- JFA-based front ends for speaker recognition
- P. Kenny, T. Stafylakis, P. Ouellet, and M. J. Alam, "JFA-based front ends for speaker recognition."
- Kenny, P.¹ Stafylakis, T.² Ouellet, P.³ Alam, M.J.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.