Volume 21, Issue 7, 2013, Pages 1458-1468

On acoustic emotion recognition: Compensating for covariate shift

Author keywords

Covariate shift; Emotion recognition; Speaker and environment differences; Transfer learning

Indexed keywords

ACOUSTIC EMOTION RECOGNITION; COVARIATE SHIFTS; EMOTION RECOGNITION; MAXIMUM LIKELIHOOD LINEAR REGRESSION; SPEAKER AND ENVIRONMENT DIFFERENCES; SUPPORT VECTOR MACHINE CLASSIFIERS; TRANSFER LEARNING; VOCAL TRACT LENGTH NORMALIZATION

EID: 84876262647     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2013.2255278     Document Type: Article
Times cited: 115

References (39)
  • 1
    • 79960846940
    • Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge
    • B. Schuller, A. Batliner, S. Steidl, and D. Seppi, "Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge," Speech Commun., vol. 53, no. 9-10, 2011.
    • (2011) Speech Commun , vol.53 , Issue.9-10
    • Schuller, B.1    Batliner, A.2    Steidl, S.3    Seppi, D.4
  • 2
    • 78649328053
    • Survey on speech emotion recognition: Features, classification schemes, and databases
    • M. El Ayadi, M. S. Kamel, and F. Karray, "Survey on speech emotion recognition: Features, classification schemes, and databases," Pattern Recogn., vol. 44, no. 3, pp. 572-587, 2011.
    • (2011) Pattern Recogn , vol.44 , Issue.3 , pp. 572-587
    • El Ayadi, M.1    Kamel, M.S.2    Karray, F.3
  • 3
    • 84861093047
    • Classification of emotional speech using 3DEC hierarchical classifier
    • A. Hassan and R. I. Damper, "Classification of emotional speech using 3DEC hierarchical classifier," Speech Commun., vol. 54, no. 7, pp. 903-916, 2012.
    • (2012) Speech Commun , vol.54 , Issue.7 , pp. 903-916
    • Hassan, A.1    Damper, R.I.2
  • 4
    • 33947164164
    • An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech
    • DOI 10.1016/j.specom.2007.01.006, PII S016763930700009X
    • M. Shami and W. Verhelst, "An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech," Speech Commun., vol. 49, no. 3, pp. 201-212, 2007. (Pubitemid 46413361)
    • (2007) Speech Communication , vol.49 , Issue.3 , pp. 201-212
    • Shami, M.1    Verhelst, W.2
  • 7
    • 78049286797
    • Emotion recognition from speech by combining databases and fusion of classifiers
    • Berlin, Heidelberg, Germany Springer-Verlag
    • I. Lefter, L. Rothkrantz, P. Wiggers, and D. van Leeuwen, "Emotion recognition from speech by combining databases and fusion of classifiers," in Proc. 13th Int. Conf. Text, Speech, Dialogue, Berlin, Heidelberg, Germany, 2010, pp. 353-360, Springer-Verlag.
    • (2010) Proc. 13th Int. Conf. Text, Speech, Dialogue , pp. 353-360
    • Lefter, I.1    Rothkrantz, L.2    Wiggers, P.3    Van Leeuwen, D.4
  • 10
    • 0027447292
    • Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion
    • DOI 10.1121/1.405558
    • I. R. Murray and J. L. Arnott, "Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion," J. Acoust. Soc. Amer., vol. 93, no. 2, pp. 1097-1108, 1993. (Pubitemid 23059837)
    • (1993) Journal of the Acoustical Society of America , vol.93 , Issue.2 , pp. 1097-1108
    • Murray, I.R.1    Arnott, J.L.2
  • 11
    • 33750564952
    • Comparing feature sets for acted and spontaneous speech in view of automatic emotion recognition
    • DOI 10.1109/ICME.2005.1521463, 1521463, IEEE International Conference on Multimedia and Expo, ICME 2005
    • T. Vogt and E. André, "Comparing feature sets for acted and spontaneous speech in view of automatic emotion recognition," in Proc. IEEE Int. Conf. Multimedia and Expo, ICME'05, Amsterdam, The Netherlands, 2005, pp. 474-477. (Pubitemid 44668907)
    • (2005) IEEE International Conference on Multimedia and Expo, ICME 2005 , vol.2005 , pp. 474-477
    • Vogt, T.1    Andre, E.2
  • 13
    • 84861094330
    • Real vs. acted emotional speech: Comparing Caucasian and South Asian speakers and observers
    • Campinas, Brazil
    • S. Shahid, E. Krahmer, and M. Swerts, "Real vs. acted emotional speech: Comparing Caucasian and South Asian speakers and observers," in Proc. 4th Int. Conf. Speech Prosody, Campinas, Brazil, 2008, pp. 669-672.
    • (2008) Proc. 4th Int. Conf. Speech Prosody , pp. 669-672
    • Shahid, S.1    Krahmer, E.2    Swerts, M.3
  • 16
    • 79960847182
    • Emotion recognition using a hierarchical binary decision tree approach
    • C. Lee, E. Mower, C. Busso, S. Lee, and S. Narayanan, "Emotion recognition using a hierarchical binary decision tree approach," Speech Commun., vol. 53, no. 9-10, pp. 1162-1171, 2011.
    • (2011) Speech Commun , vol.53 , Issue.9-10 , pp. 1162-1171
    • Lee, C.1    Mower, E.2    Busso, C.3    Lee, S.4    Narayanan, S.5
  • 18
    • 70450161311
    • Combining spectral and prosodic information for emotion recognition in the interspeech 2009 emotion challenge
    • Brighton, U.K
    • I. Luengo, E. Navas, and I. Hernáez, "Combining spectral and prosodic information for emotion recognition in the interspeech 2009 emotion challenge," in Proc. 10th Annu. Conf. Int. Speech Commun. Assoc. (Interspeech' 09), Brighton, U.K., 2009, pp. 332-335.
    • (2009) Proc. 10th Annu. Conf. Int. Speech Commun. Assoc. (Interspeech' 09) , pp. 332-335
    • Luengo, I.1    Navas, E.2    Hernáez, I.3
  • 21
    • 0037290571
    • BabyEars: A recognition system for affective vocalizations
    • DOI 10.1016/S0167-6393(02)00049-3, PII S0167639302000493
    • M. Slaney and G. McRoberts, "BabyEars: A recognition system for affective vocalizations," Speech Commun., vol. 39, no. 3-4, pp. 367-384, 2003. (Pubitemid 35432920)
    • (2003) Speech Communication , vol.39 , Issue.3-4 , pp. 367-384
    • Slaney, M.1    McRoberts, G.2
  • 22
    • 0036171504
    • Recognition of affective communicative intent in robot-directed speech
    • DOI 10.1023/A:1013215010749
    • C. Breazeal and L. Aryananda, "Recognition of affective communicative intent in robot-directed speech," Autonomous Robots, vol. 12, no. 1, pp. 83-104, 2002. (Pubitemid 34156624)
    • (2002) Autonomous Robots , vol.12 , Issue.1 , pp. 83-104
    • Breazeal, C.1    Aryananda, L.2
  • 24
    • 54049132925
    • The Vera am Mittag German audio-visual emotional speech database
    • Hannover, Germany
    • M. Grimm, K. Kroschel, and S. Narayanan, "The Vera am Mittag German audio-visual emotional speech database," in Proc. IEEE Int. Conf. Multimedia and Expo, Hannover, Germany, 2008, pp. 865-868.
    • (2008) Proc. IEEE Int. Conf. Multimedia and Expo , pp. 865-868
    • Grimm, M.1    Kroschel, K.2    Narayanan, S.3
  • 27
    • 85012688561
    • Princeton NJ, USA: Princeton Univ. Press
    • R. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton Univ. Press, 1957.
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 29
    • 68949141755
    • A least-squares approach to direct importance estimation
    • T. Kanamori, S. Hido, and M. Sugiyama, "A least-squares approach to direct importance estimation," J. Mach. Learn. Res., vol. 10, pp. 1391-1445, 2009.
    • (2009) J. Mach. Learn. Res , vol.10 , pp. 1391-1445
    • Kanamori, T.1    Hido, S.2    Sugiyama, M.3
  • 30
    • 52649172710
    • Direct density ratio estimation for large-scale covariate shift adaptation
    • Atlanta, Georgia, USA
    • Y. Tsuboi, H. Kashima, S. Hido, S. Bickel, and M. Sugiyama, "Direct density ratio estimation for large-scale covariate shift adaptation," in Proc. SIAM Int. Conf. Data Mining, Atlanta, Georgia, USA, 2008, pp. 443-454.
    • (2008) Proc. SIAM Int. Conf. Data Mining , pp. 443-454
    • Tsuboi, Y.1    Kashima, H.2    Hido, S.3    Bickel, S.4    Sugiyama, M.5
  • 32
    • 27644522706
    • Vocal tract normalization equals linear transformation in cepstral space
    • DOI 10.1109/TSA.2005.848881
    • M. Pitz and H. Ney, "Vocal tract normalization equals linear transformation in cepstral space," IEEE Trans. Speech Audio Process., vol. 13, no. 5, pp. 930-944, Sep. 2005. (Pubitemid 41558907)
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.5 , pp. 930-944
    • Pitz, M.1    Ney, H.2
  • 34
    • 33645887246
    • Support vector machines using GMM supervectors for speaker verification
    • May
    • W. M. Campbell, D. E. Sturim, and D. A. Reynolds, "Support vector machines using GMM supervectors for speaker verification," IEEE Signal Process. Lett., vol. 13, no. 5, pp. 308-311, May 2006.
    • (2006) IEEE Signal Process. Lett , vol.13 , Issue.5 , pp. 308-311
    • Campbell, W.M.1    Sturim, D.E.2    Reynolds, D.A.3
  • 35
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, no. 2, pp. 171-185, 1995.
    • (1995) Comput. Speech Lang , vol.9 , Issue.2 , pp. 171-185
    • Leggetter, C.J.1    Woodland, P.C.2
  • 36
    • 0030263447
    • Mean and variance adaptation within the MLLR framework
    • DOI 10.1006/csla.1996.0013
    • M. Gales and P. Woodland, "Mean and variance adaptation within the MLLR framework," Comput. Speech Lang., vol. 10, no. 4, pp. 249-264, 1996. (Pubitemid 126374488)
    • (1996) Computer Speech and Language , vol.10 , Issue.4 , pp. 249-264
    • Gales, M.J.F.1    Woodland, P.C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.