SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 22, Issue 7, 2014, Pages 1117-1129

Non-negative factor analysis of Gaussian mixture model weight adaptation for language and dialect recognition

(6) Bahari, Mohamad Hasan (a); Dehak, Najim (b); Van Hamme, Hugo (a); Burget, Lukas (c); Ali, Ahmed M (d); Glass, Jim (b)

a UNIVERSITY OF LEUVEN (Belgium)

b MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

c BRNO UNIVERSITY OF TECHNOLOGY (Czech Republic)

d QATAR COMPUTING RESEARCH INSTITUTE (Qatar)

Author keywords

Dialect recognition; Gaussian mixture model weight; Language recognition; Model adaptation; Non negative factor analysis

Indexed keywords

FACE RECOGNITION; FACTORIZATION; GAUSSIAN DISTRIBUTION; MATRIX ALGEBRA; MAXIMUM LIKELIHOOD; MAXIMUM LIKELIHOOD ESTIMATION; MULTIVARIANT ANALYSIS; SPEECH RECOGNITION; VECTORS;

DIALECT RECOGNITION; GAUSSIAN MIXTURE MODEL; LANGUAGE RECOGNITION; MODEL ADAPTATION; NON NEGATIVES;

FACTOR ANALYSIS;

EID: 84904156635 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASLP.2014.2319159 Document Type: Article

Times cited: (26)

References (39)

1
- 84858969967
- New York, NY, USA: Columbia Univ.
- F. Biadsy, Automatic dialect and accent recognition and its application to speech recognition. New York, NY, USA: Columbia Univ., 2011.
- (2011) Automatic Dialect and Accent Recognition and Its Application to Speech Recognition
- Biadsy, F.¹

2
- 84890499114
- Birmingham, U.K.: Univ. of Birmingham, Jul.
- A. Hanani, Human and computer recognition of regional accents and ethnic groups from British English speech. Birmingham, U.K.: Univ. of Birmingham, Jul. 2012.
- (2012) Human and Computer Recognition of Regional Accents and Ethnic Groups from British English Speech
- Hanani, A.¹

3
- 0028516964
- Reviewing automatic language identification
- Oct.
- Y. Muthusamy, E. Barnard, and R. Cole, "Reviewing automatic language identification," IEEE Signal Process. Mag., vol. 11, no. 4, pp. 33-41, Oct. 1994.
- (1994) IEEE Signal Process. Mag. , vol.11 , Issue.4 , pp. 33-41
- Muthusamy, Y.¹ Barnard, E.² Cole, R.³

4
- 0035427178
- Automatic language identification
- M. A. Zissman and K. M. Berkling, "Automatic language identification," Speech Commun., vol. 35, no. 1, pp. 115-124, 2001.
- (2001) Speech Commun. , vol.35 , Issue.1 , pp. 115-124
- Zissman, M.A.¹ Berkling, K.M.²

5
- 33744639316
- RADC/Texas Instruments, Inc., Dallas, TX, Tech. Rep. RADC-TR-74-2007TI-347650
- R. G. Leonard and G. R. Doddington, "Automatic language identification," RADC/Texas Instruments, Inc., Dallas, TX, Tech. Rep. RADC-TR-74-2007TI-347650, 1974.
- (1974) Automatic Language Identification
- Leonard, R.G.¹ Doddington, G.R.²

6
- 0001152481
- Toward automatic identification of the language of an utterance. I. Preliminary methodological considerations
- A. S. House and E. P. Neuburg, "Toward automatic identification of the language of an utterance. I. preliminary methodological considerations," J. Acoust. Soc. Amer., vol. 62, p. 708, 1977.
- (1977) J. Acoust. Soc. Amer. , vol.62 , pp. 708
- House, A.S.¹ Neuburg, E.P.²

7
- 84867334135
- Human and computer recognition of regional accents and ethnic groups from British English speech
- A. Hanani, M. Russell, and M. Carey, "Human and computer recognition of regional accents and ethnic groups from British English speech," Comput. Speech Lang., vol. 27, no. 1, pp. 59-74, 2013.
- (2013) Comput. Speech Lang. , vol.27 , Issue.1 , pp. 59-74
- Hanani, A.¹ Russell, M.² Carey, M.³

8
- 0029733178
- Comparison of four approaches to automatic language identification of telephone speech
- Jan.
- M. Zissman, "Comparison of four approaches to automatic language identification of telephone speech," IEEE Trans. Speech Audio Process., vol. 4, no. 1, pp. 31-44, Jan. 1996.
- (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.1 , pp. 31-44
- Zissman, M.¹

9
- 34547544522
- Language recognition with word lattices and support vector machines
- W. M. Campbell, F. Richardson, and D. Reynolds, "Language recognition with word lattices and support vector machines," in Proc. ICASSP, 2007, pp. 989-992.
- Proc. ICASSP, 2007 , pp. 989-992
- Campbell, W.M.¹ Richardson, F.² Reynolds, D.³

10
- 33645887246
- Support vector machines using GMM supervectors for speaker verification
- May
- W. Campbell, D. Sturim, and D. Reynolds, "Support vector machines using GMM supervectors for speaker verification," IEEE Signal Process. Lett., vol. 13, no. 5, pp. 308-311, May 2006.
- (2006) IEEE Signal Process. Lett. , vol.13 , Issue.5 , pp. 308-311
- Campbell, W.¹ Sturim, D.² Reynolds, D.³

11
- 79951609039
- Front-end factor analysis for speaker verification
- May
- N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 4, pp. 788-798, May 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process. , vol.19 , Issue.4 , pp. 788-798
- Dehak, N.¹ Kenny, P.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

12
- 84865750857
- Language recognition via ivectors and dimensionality reduction
- N. Dehak, P. A. Torres-Carrasquillo, D. Reynolds, and R. Dehak, "Language recognition via ivectors and dimensionality reduction," in Proc. Interspeech, 2011, pp. 857-860.
- Proc. Interspeech, 2011 , pp. 857-860
- Dehak, N.¹ Torres-Carrasquillo, P.A.² Reynolds, D.³ Dehak, R.⁴

13
- 84878398325
- Age estimation from telephone speech using i-vectors
- M. H. Bahari, M. McLaren, H. Van hamme, and D. van Leeuwen, "Age estimation from telephone speech using i-vectors," in Proc. Interspeech, 2012, pp. 506-509.
- Proc. Interspeech, 2012 , pp. 506-509
- Bahari, M.H.¹ McLaren, M.² Van Hamme, H.³ Van Leeuwen, D.⁴

14
- 84890540214
- Accent recognition using i-vector, Gaussian mean supervector and Gaussian posterior probability supervector for spontaneous telephone speech
- M. H. Bahari, R. Saeidi, H. Van hamme, and D. van Leeuwen, "Accent recognition using i-vector, Gaussian mean supervector and Gaussian posterior probability supervector for spontaneous telephone speech," in Proc. ICASSP, 2013, pp. 7344-7348.
- Proc. ICASSP, 2013 , pp. 7344-7348
- Bahari, M.H.¹ Saeidi, R.² Van Hamme, H.³ Van Leeuwen, D.⁴

15
- 84867336595
- Automatic speaker age and gender recognition using acoustic and prosodic level information fusion
- M. Li, K. J. Han, and S. Narayanan, "Automatic speaker age and gender recognition using acoustic and prosodic level information fusion," Comput. Speech Lang., vol. 27, no. 1, pp. 151-167, 2013.
- (2013) Comput. Speech Lang. , vol.27 , Issue.1 , pp. 151-167
- Li, M.¹ Han, K.J.² Narayanan, S.³

16
- 84879322629
- Rapid speaker adaptation in latent speaker space with non-negative matrix factorization
- X. Zhang, K. Demuynck, and H. Van hamme, "Rapid speaker adaptation in latent speaker space with non-negative matrix factorization," Speech Commun., vol. 55, no. 9, pp. 893-908, 2013.
- (2013) Speech Commun. , vol.55 , Issue.9 , pp. 893-908
- Zhang, X.¹ Demuynck, K.² Van Hamme, H.³

17
- 0033884858
- Speaker verification using adapted Gaussian mixture models
- D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process., vol. 10, no. 1, pp. 19-41, 2000.
- (2000) Digital Signal Process. , vol.10 , Issue.1 , pp. 19-41
- Reynolds, D.A.¹ Quatieri, T.F.² Dunn, R.B.³

18
- 79959846570
- Prosodic speaker verification using subspace multinomial models with intersession compensation
- M. Kockmann, L. Burget, O. Glembek, L. Ferrer, and J. Cernocky, "Prosodic speaker verification using subspace multinomial models with intersession compensation," in Proc. 11th Annu. Conf. Int. Speech Commun. Assoc., 2010.
- Proc. 11th Annu. Conf. Int. Speech Commun. Assoc., 2010
- Kockmann, M.¹ Burget, L.² Glembek, O.³ Ferrer, L.⁴ Cernocky, J.⁵

19
- 84867193756
- Advances in phonotactic language recognition
- O. Glembek, P. Matejka, L. Burget, and T. Mikolov, "Advances in phonotactic language recognition," Interspeech'08, pp. 743-746, 2008.
- (2008) Interspeech'08 , pp. 743-746
- Glembek, O.¹ Matejka, P.² Burget, L.³ Mikolov, T.⁴

20
- 84865703431
- ivector approach to phonotactic language recognition
- M. Soufifar, M. Kockmann, L. Burget, O. Plchot, O. Glembek, and T. Svendsen, "ivector approach to phonotactic language recognition," in Proc. Interspeech, 2011, pp. 2913-2916.
- Proc. Interspeech, 2011 , pp. 2913-2916
- Soufifar, M.¹ Kockmann, M.² Burget, L.³ Plchot, O.⁴ Glembek, O.⁵ Svendsen, T.⁶

21
- 84867602712
- Discriminative classifiers for phonotactic language recognition with ivectors
- M. Soufifar, S. Cumani, L. Burget, and J. Cernocky et al., "Discriminative classifiers for phonotactic language recognition with ivectors," in IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2012, pp. 4853-4856.
- IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2012 , pp. 4853-4856
- Soufifar, M.¹ Cumani, S.² Burget, L.³ Cernocky, J.⁴

22
- 58349106697
- A study of interspeaker variability in speaker verification
- Jul.
- P. Kenny, P. Ouellet, N. Dehak, V. Gupta, and P. Dumouchel, "A study of interspeaker variability in speaker verification," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 5, pp. 980-988, Jul. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.5 , pp. 980-988
- Kenny, P.¹ Ouellet, P.² Dehak, N.³ Gupta, V.⁴ Dumouchel, P.⁵

23
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin et al., "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc.-Ser. B, vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc.-Ser. B , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

24
- 0033592606
- Learning the parts of objects by non-negative matrix factorization
- D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, no. 6755, pp. 788-791, 1999.
- (1999) Nature , vol.401 , Issue.6755 , pp. 788-791
- Lee, D.D.¹ Seung, H.S.²

25
- 32844466040
- New York, NY, USA: Springer Science+Business Media
- J. A. Snyman, Practical mathematical optimization: an introduction to basic optimization theory and classical and new gradient-based algorithms. New York, NY, USA: Springer Science+Business Media, 2005, vol. 97.
- (2005) Practical Mathematical Optimization: An Introduction to Basic Optimization Theory and Classical and New Gradient-based Algorithms , vol.97
- Snyman, J.A.¹

26
- 84906234390
- Regularized subspace n-gram model for phonotactic ivector extraction
- M. M. Soufifar, L. Burget, O. Plchot, S. Cumani, and J. Cernocky, "Regularized subspace n-gram model for phonotactic ivector extraction," in Proc. Interspeech, 2013, pp. 74-78.
- Proc. Interspeech, 2013 , pp. 74-78
- Soufifar, M.M.¹ Burget, L.² Plchot, O.³ Cumani, S.⁴ Cernocky, J.⁵

27
- 85073202259
- The MITLL NIST LRE 2011 language recognition system
- E. Singer, P. Torres-Carrasquillo, D. Reynolds, A. McCree, F. Richardson, N. Dehak, and D. Sturim, "The MITLL NIST LRE 2011 language recognition system," in Proc. Speaker Odyssey, 2012, pp. 209-215.
- Proc. Speaker Odyssey, 2012 , pp. 209-215
- Singer, E.¹ Torres-Carrasquillo, P.² Reynolds, D.³ McCree, A.⁴ Richardson, F.⁵ Dehak, N.⁶ Sturim, D.⁷

28
- 0028517164
- RASTA processing of speech
- Oct.
- H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 578-589, Oct. 1994.
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , Issue.4 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

29
- 42749106196
- Channel factors compensation in model and feature domain for speaker recognition
- C. Vair, D. Colibro, F. Castaldo, E. Dalmasso, and P. Laface, "Channel factors compensation in model and feature domain for speaker recognition," in Proc. IEEE Odyssey Speaker Lang. Recogn. Workshop, 2006, pp. 1-6.
- Proc. IEEE Odyssey Speaker Lang. Recogn. Workshop, 2006 , pp. 1-6
- Vair, C.¹ Colibro, D.² Castaldo, F.³ Dalmasso, E.⁴ Laface, P.⁵

30
- 0003472470
- R. O. Duda, P. E. Hart, and D. G. Stork, "Pattern classification and scene analysis 2nd ed," 1995.
- (1995) Pattern Classification and Scene Analysis 2nd Ed
- Duda, R.O.¹ Hart, P.E.² Stork, D.G.³

31
- 44949114401
- Within-class covariance normalization for SVM-based speaker recognition
- no. 2.2
- A. Hatch, S. Kajarekar, and A. Stolcke, "Within-class covariance normalization for SVM-based speaker recognition," in Proc. Interspeech, 2006, vol. 4, no. 2.2.
- Proc. Interspeech, 2006 , vol.4
- Hatch, A.¹ Kajarekar, S.² Stolcke, A.³

32
- 78049369280
- Focal multi-class: Toolkit for evaluation, fusion and calibration of multi-class recognition scores
- N. Brummer, "Focal multi-class: Toolkit for evaluation, fusion and calibration of multi-class recognition scores," Tutorial and User Manual. Spescom DataVoice, 2007.
- (2007) Tutorial and User Manual. Spescom DataVoice
- Brummer, N.¹

33
- 42749108057
- On calibration of language recognition scores
- N. Brummer and D. A. van Leeuwen, "On calibration of language recognition scores," in Proc. IEEE Odyssey Speaker Lang. Recogn. Workshop, 2006, pp. 1-8.
- Proc. IEEE Odyssey Speaker Lang. Recogn. Workshop, 2006 , pp. 1-8
- Brummer, N.¹ Van Leeuwen, D.A.²

34
- 29044447398
- Application-independent evaluation of speaker detection
- N. Brummer, "Application-independent evaluation of speaker detection," in Proc. ODYSSEY04- Speaker Lang. Recogn. Workshop, 2004.
- Proc. ODYSSEY04- Speaker Lang. Recogn. Workshop, 2004
- Brummer, N.¹

35
- 84885337537
- L. J. Rodriguez-Fuentes, N. Brummer, M. Penagarikano, A. Varona, M. Diez, and G. Bordel, The Albayzin 2012 language Recognition Evaluation Plan (Albayzin 2012 LRE), 2012.
- (2012) The Albayzin 2012 Language Recognition Evaluation Plan (Albayzin 2012 LRE)
- Rodriguez-Fuentes, L.J.¹ Brummer, N.² Penagarikano, M.³ Varona, A.⁴ Diez, M.⁵ Bordel, G.⁶

36
- 84911380460
- Citeseerx
- E. D. Bolker and M. Mast, Common Sense Mathematics. Citeseerx, 2005.
- (2005) Common Sense Mathematics
- Bolker, E.D.¹ Mast, M.²

37
- 84878535284
- Developing a speech activity detection system for the DARPA RATS program
- T. Ng, B. Zhang, L. Nguyen, S. Matsoukas, X. Zhou, N. Mesgarani, K. Vesely, and P. Matejka, "Developing a speech activity detection system for the DARPA RATS program," in Proc. Interspeech, 2012.
- Proc. Interspeech, 2012
- Ng, T.¹ Zhang, B.² Nguyen, L.³ Matsoukas, S.⁴ Zhou, X.⁵ Mesgarani, N.⁶ Vesely, K.⁷ Matejka, P.⁸

38
- 33745805403
- A fast learning algorithm for deep belief nets
- G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., vol. 18, no. 7, pp. 1527-1554, 2006.
- (2006) Neural Comput. , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.-W.³

39
- 84890479094
- The RATS radio traffic collection system
- K. Walker and S. Strassel, "The RATS radio traffic collection system," in Proc. Odyssey, 2012.
- Proc. Odyssey, 2012
- Walker, K.¹ Strassel, S.²

* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.