SCOPUS Information Retrieval Platform

Volume 27, Issue 1, 2013, Pages 151-167

Automatic speaker age and gender recognition using acoustic and prosodic level information fusion

(3) Li, Ming (a); Han, Kyu J. (a); Narayanan, Shrikanth (a)

(a) University of Southern California (United States)

Author keywords

Age recognition; Formant; Gender recognition; GMM; Harmonic structure; Maximum likelihood linear regression; Pitch; Polynomial expansion; Prosodic features; Score level fusion; Sparse representation; SVM; UBM weight posterior probability supervectors

Indexed keywords

AGE RECOGNITION; FORMANT; GENDER RECOGNITION; GMM; HARMONIC STRUCTURES; MAXIMUM LIKELIHOOD LINEAR REGRESSION; PITCH; POLYNOMIAL EXPANSION; POSTERIOR PROBABILITY; PROSODIC FEATURES; SCORE-LEVEL FUSION; SPARSE REPRESENTATION; SVM;

PROBABILITY; SOCIAL SCIENCES; SPEECH ANALYSIS; SPEECH RECOGNITION; TIME DOMAIN ANALYSIS;

SUPPORT VECTOR MACHINES;

EID: 84867336595; PISSN: 08852308; EISSN: 10958363; Source Type: Journal
DOI: 10.1016/j.csl.2012.01.008; Document Type: Article

Times cited: 158

References (47)

1
- 84867328971
- Age and gender classification using modulation cepstrum
- Ajmera J.; and Burkhardt F. Age and gender classification using modulation cepstrum Proc. Odyssey 2008 025
- (2008) Proc. Odyssey , pp. 025
- Ajmera, J.¹ Burkhardt, F.²

2
- 79959859677
- Automatic classification of married couples' behavior using audio features
- Black M.; Katsamanis A.; Lee C.; Lammert A.; Baucom B.; Christensen A.; Georgiou P.; and Narayanan S. Automatic classification of married couples' behavior using audio features Proc. INTERSPEECH 2010 2030 2033
- (2010) Proc. INTERSPEECH , pp. 2030-2033
- Black, M.¹ Katsamanis, A.² Lee, C.³ Lammert, A.⁴ Baucom, B.⁵ Christensen, A.⁶ Georgiou, P.⁷ Narayanan, S.⁸

3
- 51449091542
- Age and gender recognition for telephone applications based on GMM supervectors and support vector machines
- Bocklet T.; Maier A.; Bauer J.; Burkhardt F.; and Nöth E. Age and gender recognition for telephone applications based on GMM supervectors and support vector machines Proc. ICASSP 2008 1605 1608
- (2008) Proc. ICASSP , pp. 1605-1608
- Bocklet, T.¹ Maier, A.² Bauer, J.³ Burkhardt, F.⁴ Nöth, E.⁵

4
- 79959826869
- Age and gender recognition based on multiple systems - Early vs. late fusion
- Bocklet T.; Stemmer G.; Zeissler V.; and Nöth E. Age and gender recognition based on multiple systems - early vs. late fusion Proc. INTERSPEECH 2010 2830 2833
- (2010) Proc. INTERSPEECH , pp. 2830-2833
- Bocklet, T.¹ Stemmer, G.² Zeissler, V.³ Nöth, E.⁴

5
- 78049369280
- Software
- Brümmer, N.; 2007. Focal multi-class: toolkit for evaluation, fusion and calibration of multi-class recognition scores - tutorial and user manual. Software available at http://sites.google.com/site/nikobrummer/focalmulticlass.
- (2007) Focal Multi-class: Toolkit for Evaluation, Fusion and Calibration of Multi-class Recognition Scores - Tutorial and User Manual
- Brümmer, N.¹

6
- 85028160834
- A database of age and gender annotated telephone speech
- Burkhardt F.; Eckert M.; Johannsen W.; and Stegmann J. A database of age and gender annotated telephone speech Proc. 7th International Conference on Language Resources and Evaluation (LREC) 2010 1562 1565
- (2010) Proc. 7th International Conference on Language Resources and Evaluation (LREC) , pp. 1562-1565
- Burkhardt, F.¹ Eckert, M.² Johannsen, W.³ Stegmann, J.⁴

7
- 29044444825
- Support vector machines for speaker and language recognition
- Campbell W.; Campbell J.; Reynolds D.; Singer E.; and Torres-Carrasquillo P. Support vector machines for speaker and language recognition Computer Speech & Language 20 2006 210 229
- (2006) Computer Speech & Language , vol.20 , pp. 210-229
- Campbell, W.¹ Campbell, J.² Reynolds, D.³ Singer, E.⁴ Torres-Carrasquillo, P.⁵

8
- 33947696754
- SVM based speaker verification using a GMM supervector kernel and NAP variability compensation
- Campbell W.; Sturim D.; Reynolds D.; and Solomonoff A. SVM based speaker verification using a GMM supervector kernel and NAP variability compensation Proc. ICASSP 2006 97 100
- (2006) Proc. ICASSP , pp. 97-100
- Campbell, W.¹ Sturim, D.² Reynolds, D.³ Solomonoff, A.⁴

9
- 84872900462
- Multiple f0 estimation in polyphonic music
- Cao C.; Li M.; Liu J.; and Yan Y. Multiple f0 estimation in polyphonic music Third Music Information Retrieval Evaluation eXchange (MIREX) 2007
- (2007) Third Music Information Retrieval Evaluation EXchange (MIREX)
- Cao, C.¹ Li, M.² Liu, J.³ Yan, Y.⁴

10
- 0003710380
- Software
- Chang, C.C.; Lin, C.J.; 2001. LIBSVM: A Library for Support Vector Machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
- (2001) LIBSVM: A Library for Support Vector Machines
- Chang, C.C.¹ Lin, C.J.²

11
- 0000913324
- SVMTorch: Support vector machines for large-scale regression problems
- Collobert R.; and Bengio S. SVMTorch: support vector machines for large-scale regression problems The Journal of Machine Learning Research 1 2001 143 160
- (2001) The Journal of Machine Learning Research , vol.1 , pp. 143-160
- Collobert, R.¹ Bengio, S.²

12
- 64249101047
- Modeling prosodic features with joint factor analysis for speaker verification
- Dehak N.; Dumouchel P.; and Kenny P. Modeling prosodic features with joint factor analysis for speaker verification IEEE Transactions on Audio, Speech, and Language Processing 15 2007 2095 2103
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , pp. 2095-2103
- Dehak, N.¹ Dumouchel, P.² Kenny, P.³

13
- 79951609039
- Front-end factor analysis for speaker verification
- Dehak N.; Kenny P.; Dehak R.; Dumouchel P.; and Ouellet P. Front-end factor analysis for speaker verification IEEE Transactions on Audio, Speech, and Language Processing 19 2011 788 798
- (2011) IEEE Transactions on Audio, Speech, and Language Processing , vol.19 , pp. 788-798
- Dehak, N.¹ Kenny, P.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

14
- 51449092448
- Continuous prosodic features and formant modeling with joint factor analysis for speaker verification
- Dehak N.; Kenny P.; and Dumouchel P. Continuous prosodic features and formant modeling with joint factor analysis for speaker verification Proc. INTERSPEECH 2007 1234 1237
- (2007) Proc. INTERSPEECH , pp. 1234-1237
- Dehak, N.¹ Kenny, P.² Dumouchel, P.³

15
- 70450161521
- Dimension reduction approaches for SVM based speaker age estimation
- Dobry G.; Hecht R.; Avigal M.; and Zigel Y. Dimension reduction approaches for SVM based speaker age estimation Proc. INTERSPEECH 2009 2031 2034
- (2009) Proc. INTERSPEECH , pp. 2031-2034
- Dobry, G.¹ Hecht, R.² Avigal, M.³ Zigel, Y.⁴

16
- 77949415384
- OpenEAR - introducing the Munich open-source emotion and affect recognition toolkit
- Eyben F.; Wollmer M.; and Schuller B. OpenEAR - introducing the Munich open-source emotion and affect recognition toolkit Affective Computing and Intelligent Interaction and Workshops, ACII 2009 1 6
- (2009) Affective Computing and Intelligent Interaction and Workshops, ACII , pp. 1-6
- Eyben, F.¹ Wollmer, M.² Schuller, B.³

17
- 79959823933
- Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation
- Gajšek R.; Žibert J.; Justin T.; Štruc V.; Vesnicer B.; and Mihelič F. Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation Proc. INTERSPEECH 2010 2810 2813
- (2010) Proc. INTERSPEECH , pp. 2810-2813
- Gajšek, R.¹ Žibert, J.² Justin, T.³ Štruc, V.⁴ Vesnicer, B.⁵ Mihelič, F.⁶

18
- 78049355199
- Combining regression and classification methods for improving automatic speaker age recognition
- van Heerden C.; Barnard E.; Davel M.; van der Walt C.; van Dyk E.; Feld M.; and Müller C. Combining regression and classification methods for improving automatic speaker age recognition Proc. ICASSP 2010 5174 5177
- (2010) Proc. ICASSP , pp. 5174-5177
- Van Heerden, C.¹ Barnard, E.² Davel, M.³ Van Der Walt, C.⁴ Van Dyk, E.⁵ Feld, M.⁶ Müller, C.⁷

19
- 0023833270
- Measurement of pitch by subharmonic summation
- Hermes D. Measurement of pitch by subharmonic summation Journal of the Acoustical Society of America 83 1988 257 264
- (1988) Journal of the Acoustical Society of America , vol.83 , pp. 257-264
- Hermes, D.¹

20
- 18544365764
- Probability product kernels
- Jebara T.; Kondor R.; and Howard A. Probability product kernels The Journal of Machine Learning Research 5 2004 819 844
- (2004) The Journal of Machine Learning Research , vol.5 , pp. 819-844
- Jebara, T.¹ Kondor, R.² Howard, A.³

21
- 67649543737
- Contour modeling of prosodic and acoustic features for speaker recognition
- Kockmann M.; and Burget L. Contour modeling of prosodic and acoustic features for speaker recognition Proc. Spoken Language Technology Workshop, IEEE 2008 45 48
- (2008) Proc. Spoken Language Technology Workshop, IEEE , pp. 45-48
- Kockmann, M.¹ Burget, L.²

22
- 79959829347
- Brno University of Technology system for INTERSPEECH 2010 paralinguistic challenge
- Kockmann M.; Burget L.; and Černocký J. Brno University of Technology system for INTERSPEECH 2010 paralinguistic challenge Proc. INTERSPEECH 2010 2822 2825
- (2010) Proc. INTERSPEECH , pp. 2822-2825
- Kockmann, M.¹ Burget, L.² Černocký, J.³

23
- 79959830675
- Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couples
- Lee C.; Black M.; Katsamanis A.; Lammert A.; Baucom B.; Christensen A.; Georgiou P.; and Narayanan S. Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couples Proc. INTERSPEECH 2010 793 796
- (2010) Proc. INTERSPEECH , pp. 793-796
- Lee, C.¹ Black, M.² Katsamanis, A.³ Lammert, A.⁴ Baucom, B.⁵ Christensen, A.⁶ Georgiou, P.⁷ Narayanan, S.⁸

24
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- Leggetter C.; and Woodland P. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models Computer Speech and Language 9 1995 171
- (1995) Computer Speech and Language , vol.9 , pp. 171
- Leggetter, C.¹ Woodland, P.²

25
- 84867205560
- Cochannel speech separation using multi-pitch estimation and model based voiced sequential grouping
- Li M.; Cao C.; Di Wang P.; Fu Q.; and Yan Y. Cochannel speech separation using multi-pitch estimation and model based voiced sequential grouping Proc. INTERSPEECH 2008 151 154
- (2008) Proc. INTERSPEECH , pp. 151-154
- Li, M.¹ Cao, C.² Di Wang, P.³ Fu, Q.⁴ Yan, Y.⁵

26
- 79959838775
- Combining five acoustic level methods for automatic speaker age and gender recognition
- Li M.; Jung C.S.; and Han K.J. Combining five acoustic level methods for automatic speaker age and gender recognition Proc. INTERSPEECH 2010 2826 2829
- (2010) Proc. INTERSPEECH , pp. 2826-2829
- Li, M.¹ Jung, C.S.² Han, K.J.³

27
- 80051633581
- Robust talking face video verification using joint factor analysis and sparse representation on GMM mean shifted supervectors
- Li M.; and Narayanan S. Robust talking face video verification using joint factor analysis and sparse representation on GMM mean shifted supervectors Proc. ICASSP 2011 1481 1484
- (2011) Proc. ICASSP , pp. 1481-1484
- Li, M.¹ Narayanan, S.²

28
- 79959831145
- Spoken language identification using score vector modeling and support vector machine
- Li M.; Suo H.; Wu X.; Lu P.; and Yan Y. Spoken language identification using score vector modeling and support vector machine Proc. INTERSPEECH 2007 350 353
- (2007) Proc. INTERSPEECH , pp. 350-353
- Li, M.¹ Suo, H.² Wu, X.³ Lu, P.⁴ Yan, Y.⁵

29
- 84865799827
- Speaker verification using sparse representations on total variability i-vectors
- Li M.; Zhang X.; Yan Y.; and Narayanan S. Speaker verification using sparse representations on total variability i-vectors Proc. INTERSPEECH 2011
- (2011) Proc. INTERSPEECH
- Li, M.¹ Zhang, X.² Yan, Y.³ Narayanan, S.⁴

30
- 33646817778
- Language identification using pitch contour information
- Lin C.; and Wang H. Language identification using pitch contour information Proc. ICASSP 2005 601 604
- (2005) Proc. ICASSP , pp. 601-604
- Lin, C.¹ Wang, H.²

31
- 79959842932
- Age and gender classification from speech using decision level fusion and ensemble based techniques
- Lingenfelser F.; Wagner J.; Vogt T.; Kim J.; and André E. Age and gender classification from speech using decision level fusion and ensemble based techniques Proc. INTERSPEECH 2010 2798 2801
- (2010) Proc. INTERSPEECH , pp. 2798-2801
- Lingenfelser, F.¹ Wagner, J.² Vogt, T.³ Kim, J.⁴ André, E.⁵

32
- 79959823110
- Age and gender classification using fusion of acoustic and prosodic features
- Meinedo H.; and Trancoso I. Age and gender classification using fusion of acoustic and prosodic features Proc. INTERSPEECH 2010 2818 2821
- (2010) Proc. INTERSPEECH , pp. 2818-2821
- Meinedo, H.¹ Trancoso, I.²

33
- 34547542381
- Comparison of four approaches to age and gender recognition for telephone applications
- Metze F.; Ajmera J.; Englert R.; Bub U.; Burkhardt F.; Stegmann J.; Muller C.; Huber R.; Andrassy B.; Bauer J.; and Littel B. Comparison of four approaches to age and gender recognition for telephone applications Proc. ICASSP 2007 1089 1092
- (2007) Proc. ICASSP , pp. 1089-1092
- Metze, F.¹ Ajmera, J.² Englert, R.³ Bub, U.⁴ Burkhardt, F.⁵ Stegmann, J.⁶ Muller, C.⁷ Huber, R.⁸ Andrassy, B.⁹ Bauer, J.¹⁰ Littel, B.¹¹

34
- 56149095112
- Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age
- Müller C.; and Burkhardt F. Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age Proc. INTERSPEECH 2007 2277 2280
- (2007) Proc. INTERSPEECH , pp. 2277-2280
- Müller, C.¹ Burkhardt, F.²

35
- 79959858399
- Fuzzy support vector machines for age and gender classification
- Nguyen P.; Le T.; Tran D.; Huang X.; and Sharma D. Fuzzy support vector machines for age and gender classification Proc. INTERSPEECH 2010 2806 2809
- (2010) Proc. INTERSPEECH , pp. 2806-2809
- Nguyen, P.¹ Le, T.² Tran, D.³ Huang, X.⁴ Sharma, D.⁵

36
- 79959846299
- Age recognition based on speech signals using weights supervector
- Porat R.; Lange D.; and Zigel Y. Age recognition based on speech signals using weights supervector Proc. INTERSPEECH 2010 2814 2817
- (2010) Proc. INTERSPEECH , pp. 2814-2817
- Porat, R.¹ Lange, D.² Zigel, Y.³

37
- 0033884858
- Speaker verification using adapted Gaussian mixture models
- Reynolds D.; Quatieri T.; and Dunn R. Speaker verification using adapted Gaussian mixture models Digital Signal Processing 10 2000 19 41
- (2000) Digital Signal Processing , vol.10 , pp. 19-41
- Reynolds, D.¹ Quatieri, T.² Dunn, R.³

38
- 36248944637
- Acoustic analysis of adult speaker age. Speaker classification I
- Schötz S. Acoustic analysis of adult speaker age. Speaker classification I Lecture Notes in Computer Science 2007 88 107
- (2007) Lecture Notes in Computer Science , pp. 88-107
- Schötz, S.¹

39
- 79954999224
- The INTERSPEECH 2010 paralinguistic challenge
- Schuller B.; Steidl S.; Batliner A.; Burkhardt F.; Devillers L.; Mueller C.; and Narayanan S. The INTERSPEECH 2010 paralinguistic challenge Proc. INTERSPEECH 2010 2794 2797
- (2010) Proc. INTERSPEECH , pp. 2794-2797
- Schuller, B.¹ Steidl, S.² Batliner, A.³ Burkhardt, F.⁴ Devillers, L.⁵ Mueller, C.⁶ Narayanan, S.⁷

40
- 33947620115
- Hierarchical structures of neural networks for phoneme
- Software
- Schwarz, P.; Matejka, P.; Cernocky, J.; 2006. Hierarchical structures of neural networks for phoneme. In: Proc. ICASSP, pp. 325-328. Software available at http://speech.fit.vutbr.cz/software/phoneme-recognizer-based-long-temporal-context.
- (2006) Proc. ICASSP , pp. 325-328
- Schwarz, P.¹ Matejka, P.² Cernocky, J.³

41
- 85009141765
- Wavesurfer - an open source speech tool
- Sjölander K.; and Beskow J. Wavesurfer - an open source speech tool Proc. ICSLP 2000 464 467
- (2000) Proc. ICSLP , pp. 464-467
- Sjölander, K.¹ Beskow, J.²

42
- 70450201151
- Analyzing features for automatic age estimation on cross-sectional data
- Spiegl W.; Stemmer G.; Lasarcyk E.; Kolhatkar V.; Cassidy A.; Potard B.; Shutn S.; Song Y.; Xu P.; Beyerlein P.; Harnsberger J.; and Nöth E. Analyzing features for automatic age estimation on cross-sectional data Proc. INTERSPEECH 2009 2923 2926
- (2009) Proc. INTERSPEECH , pp. 2923-2926
- Spiegl, W.¹ Stemmer, G.² Lasarcyk, E.³ Kolhatkar, V.⁴ Cassidy, A.⁵ Potard, B.⁶ Shutn, S.⁷ Song, Y.⁸ Xu, P.⁹ Beyerlein, P.¹⁰ Harnsberger, J.¹¹ Nöth, E.¹²

43
- 33745216683
- MLLR transforms as features in speaker recognition
- Stolcke A.; Ferrer L.; Kajarekar S.; Shriberg E.; and Venkataraman A. MLLR transforms as features in speaker recognition Proc. INTERSPEECH 2005 2425 2428
- (2005) Proc. INTERSPEECH , pp. 2425-2428
- Stolcke, A.¹ Ferrer, L.² Kajarekar, S.³ Shriberg, E.⁴ Venkataraman, A.⁵

44
- 51449111842
- Speaker recognition with session variability normalization based on MLLR adaptation transforms
- Stolcke A.; Kajarekar S.; Ferrer L.; Shriberg E.; Int S.; and Park M. Speaker recognition with session variability normalization based on MLLR adaptation transforms IEEE Transactions on Audio, Speech, and Language Processing 15 2007 1987 1998
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , pp. 1987-1998
- Stolcke, A.¹ Kajarekar, S.² Ferrer, L.³ Shriberg, E.⁴ Int, S.⁵ Park, M.⁶

45
- 70450180871
- Age recognition for spoken dialogue systems: Do we need it?
- Wolters M.; Vipperla R.; and Renals S. Age recognition for spoken dialogue systems: Do we need it? Proc. INTERSPEECH 2009 1435 1438
- (2009) Proc. INTERSPEECH , pp. 1435-1438
- Wolters, M.¹ Vipperla, R.² Renals, S.³

46
- 61549128441
- Robust face recognition via sparse representation
- Wright J.; Yang A.; Ganesh A.; Sastry S.; and Ma Y. Robust face recognition via sparse representation IEEE Transactions on Pattern Analysis and Machine Intelligence 31 2008 210 227
- (2008) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.31 , pp. 210-227
- Wright, J.¹ Yang, A.² Ganesh, A.³ Sastry, S.⁴ Ma, Y.⁵

47
- 77950366250
- Using a kind of novel phonotactic information for SVM based speaker recognition
- Zhang X.; Suo H.; Zhao Q.; and Yan Y. Using a kind of novel phonotactic information for SVM based speaker recognition IEICE Transactions on Information and Systems 92 2009 746 749
- (2009) IEICE Transactions on Information and Systems , vol.92 , pp. 746-749
- Zhang, X.¹ Suo, H.² Zhao, Q.³ Yan, Y.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.