Volume 57, 2014, Pages 1-12

Compensating for speaker or lexical variabilities in speech for emotion recognition

Author keywords

Emotion recognition; Factor analysis; Feature normalization; Speaker variability

Indexed keywords

EMOTION RECOGNITION; FACTORIZATION TECHNIQUES; FEATURE NORMALIZATION; HUMAN MACHINE INTERFACE; SPEAKER CHARACTERISTICS; SPEAKER VARIABILITY; SPEECH EMOTION RECOGNITION SYSTEMS; WHITENING TRANSFORMATION

EID: 84884611357     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2013.07.011     Document Type: Article
Times cited: 46

References (52)
  • 1
    • 78349274056
    • Segmenting into adequate units for automatic recognition of emotion-related episodes: A speech-based approach
    • A. Batliner, D. Seppi, S. Steidl, and B. Schuller Segmenting into adequate units for automatic recognition of emotion-related episodes: a speech-based approach Advances in Human-Computer Interaction 2010 2010 1 15
    • (2010) Advances in Human-Computer Interaction , vol.2010 , pp. 1-15
    • Batliner, A.1    Seppi, D.2    Steidl, S.3    Schuller, B.4
  • 3
    • 48149104055
    • Using neutral speech models for emotional speech analysis
    • Antwerp, Belgium
    • Busso, C., Lee, S., Narayanan, S., 2007. Using neutral speech models for emotional speech analysis. In: Interspeech 2007 - Eurospeech, Antwerp, Belgium, pp. 2225-2228.
    • (2007) Interspeech 2007 Eurospeech , pp. 2225-2228
    • Busso, C.1    Lee, S.2    Narayanan, S.3
  • 6
    • 48149084430
    • Interplay between linguistic and affective goals in facial expression during emotional utterances
    • Ubatuba-SP, Brazil
    • Busso, C., Narayanan, S., 2006. Interplay between linguistic and affective goals in facial expression during emotional utterances. In: Seventh International Seminar on Speech Production (ISSP 2006), Ubatuba-SP, Brazil, pp. 549-556.
    • (2006) Seventh International Seminar on Speech Production (ISSP 2006) , pp. 549-556
    • Busso, C.1    Narayanan, S.2
  • 7
    • 48149101094
    • Joint analysis of the emotional fingerprint in the face and speech: A single subject study
    • Chania, Crete, Greece
    • Busso, C., Narayanan, S., 2007. Joint analysis of the emotional fingerprint in the face and speech: a single subject study. In: International Workshop on Multimedia Signal Processing (MMSP 2007), Chania, Crete, Greece, pp. 43-47.
    • (2007) International Workshop on Multimedia Signal Processing (MMSP 2007) , pp. 43-47
    • Busso, C.1    Narayanan, S.2
  • 9
    • 80051984128
    • Text independent emotion recognition using spectral features
    • S. Aluru, S. Bandyopadhyay, U. Catalyurek, D. Dubhashi, P. Jones, M. Parashar, B. Schmidt, Communications in Computer and Information Science Springer-Verlag Berlin, Heidelberg
    • R. Chauhan, J. Yadav, S. Koolagudi, and K. Rao Text independent emotion recognition using spectral features S. Aluru, S. Bandyopadhyay, U. Catalyurek, D. Dubhashi, P. Jones, M. Parashar, B. Schmidt, Contemporary Computing Communications in Computer and Information Science vol. 168 2011 Springer-Verlag Berlin, Heidelberg 359 370
    • (2011) Contemporary Computing , vol.168 , pp. 359-370
    • Chauhan, R.1    Yadav, J.2    Koolagudi, S.3    Rao, K.4
  • 11
    • 70450180849
    • Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification
    • Brighton, UK
    • Dehak, N., Dehak, R., Kenny, P., Brümmer, N., Ouellet, P., Dumouchel, P., 2009. Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification. In: Interspeech 2009, Brighton, UK, pp. 1559-1562.
    • (2009) Interspeech 2009 , pp. 1559-1562
    • Dehak, N.1
  • 12
    • 84865750857
    • Language recognition via i-vectors and dimensionality reduction
    • Florence, Italy
    • Dehak, N., Torres-Carrasquillo, P., Reynolds, D., Dehak, R., 2011. Language recognition via i-vectors and dimensionality reduction. In: Interspeech 2011, Florence, Italy, pp. 857-860.
    • (2011) Interspeech 2011 , pp. 857-860
    • Dehak, N.1    Torres-Carrasquillo, P.2    Reynolds, D.3    Dehak, R.4
  • 13
    • 78650977476
    • OpenSMILE: The Munich versatile and fast open-source audio feature extractor
    • Florence, Italy
    • Eyben, F., Wöllmer, M., Schuller, B., 2010. OpenSMILE: the Munich versatile and fast open-source audio feature extractor. In: ACM International conference on Multimedia (MM 2010), Florence, Italy, pp. 1459-1462.
    • (2010) ACM International Conference on Multimedia (MM 2010) , pp. 1459-1462
    • Eyben, F.1
  • 17
    • 0030784572
    • Stochastic trajectory modeling and sentence searching for continuous speech recognition
    • PII S1063667697007633
    • Y. Gong Stochastic trajectory modeling and sentence searching for continuous speech recognition IEEE Transactions on Speech and Audio Processing 5 1997 33 44 (Pubitemid 127746033)
    • (1997) IEEE Transactions on Speech and Audio Processing , vol.5 , Issue.1 , pp. 33-44
    • Gong, Y.1
  • 19
    • 0030196359
    • Feature Analysis and Neural Network-Based Classification of Speech under Stress
    • PII S1063667696050730
    • J. Hansen, and B. Womack Feature analysis and neural network-based classification of speech under stress IEEE Transactions on Speech and Audio Processing 4 1996 307 313 (Pubitemid 126753019)
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.4 , pp. 307-313
    • Hansen, J.H.L.1    Womack, B.D.2
  • 21
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • DOI 10.1121/1.399423
    • H. Hermansky Perceptual linear predictive (PLP) analysis of speech Journal of the Acoustical Society of America 87 1990 1738 1752 (Pubitemid 20256470)
    • (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 24
    • 33745191649
    • An articulatory study of emotional speech production
    • 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
    • Lee, S., Yildirim, S., Kazemzadeh, A., Narayanan, S., 2005. An articulatory study of emotional speech production. In: Ninth European Conference on Speech Communication and Technology (Interspeech'2005 - Eurospeech), Lisbon, Portugal, pp. 497-500. (Pubitemid 43908108)
    • (2005) 9th European Conference on Speech Communication and Technology , pp. 497-500
    • Lee, S.1    Yildirim, S.2    Kazemzadeh, A.3    Narayanan, S.4
  • 27
    • 84875840053
    • Factorizing speaker, lexical and emotional variabilities observed in facial expressions
    • Orlando, FL, USA
    • Mariooryad, S., Busso, C., 2012. Factorizing speaker, lexical and emotional variabilities observed in facial expressions. In: IEEE International Conference on Image Processing (ICIP 2012), Orlando, FL, USA, pp. 2605-2608.
    • (2012) IEEE International Conference on Image Processing (ICIP 2012) , pp. 2605-2608
    • Mariooryad, S.1    Busso, C.2
  • 28
    • 84880519188
    • Exploring Cross-Modality Affective Reactions for Audiovisual Emotion Recognition
    • Mariooryad, S., Busso, C., 2013a. Exploring Cross-Modality Affective Reactions for Audiovisual Emotion Recognition. IEEE Transactions on Affective Computing 4(2), 183-196.
    • (2013) IEEE Transactions on Affective Computing , vol.4 , Issue.2 , pp. 183-196
    • Mariooryad, S.1    Busso, C.2
  • 34
    • 84874728526
    • Learning the covariance dynamics of a large-scale environment for informative path planning of unmanned aerial vehicle sensors
    • S. Park, H. Choi, N. Roy, and J. How Learning the covariance dynamics of a large-scale environment for informative path planning of unmanned aerial vehicle sensors International Journal of Aeronautical and Space Sciences 11 2010 327 337
    • (2010) International Journal of Aeronautical and Space Sciences , vol.11 , pp. 327-337
    • Park, S.1    Choi, H.2    Roy, N.3    How, J.4
  • 36
    • 0016470107
    • An algorithm for detecting the endpoints of isolated utterances
    • L. Rabiner, and M. Sambur An algorithm for detecting the endpoints of isolated utterances Bell System Technical Journal 54 1975 297 315
    • (1975) Bell System Technical Journal , vol.54 , pp. 297-315
    • Rabiner, L.1    Sambur, M.2
  • 38
    • 79960846940
    • Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge
    • B. Schuller, A. Batliner, S. Steidl, and D. Seppi Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge Speech Communication 53 2011 1062 1087
    • (2011) Speech Communication , vol.53 , pp. 1062-1087
    • Schuller, B.1    Batliner, A.2    Steidl, S.3    Seppi, D.4
  • 44
    • 0034202338
    • Separating style and content with bilinear models
    • J. Tenenbaum, and W. Freeman Separating style and content with bilinear models Journal of Neural Computation 12 2000 1247 1283
    • (2000) Journal of Neural Computation , vol.12 , pp. 1247-1283
    • Tenenbaum, J.1    Freeman, W.2
  • 45
    • 80155168973
    • Vowels formants analysis allows straightforward detection of high arousal emotions
    • Barcelona, Spain
    • Vlasenko, B., Philippou-Hübner, D., Prylipko, D., Böck, R., Siegert, I., Wendemuth, A., 2011. Vowels formants analysis allows straightforward detection of high arousal emotions. In: IEEE International Conference on Multimedia and Expo (ICME 2011), Barcelona, Spain.
    • (2011) IEEE International Conference on Multimedia and Expo (ICME 2011)
    • Vlasenko, B.1
  • 46
    • 84865709430
    • Vowels formants analysis allows straightforward detection of high arousal acted and spontaneous emotions
    • Florence, Italy
    • Vlasenko, B., Prylipko, D., Philippou-Hübner, D., Wendemuth, A., 2011. Vowels formants analysis allows straightforward detection of high arousal acted and spontaneous emotions. In: 12th Annual Conference of the International Speech Communication Association (Interspeech'2011), Florence, Italy, pp. 1577-1580.
    • (2011) 12th Annual Conference of the International Speech Communication Association (Interspeech'2011) , pp. 1577-1580
    • Vlasenko, B.1
  • 47
    • 38049048651
    • Frame vs. Turn-level: Emotion recognition from speech considering static and dynamic processing
    • A. Paiva, R. Prada, R. Picard, Springer Berlin/Heidelberg, Berlin, Germany
    • B. Vlasenko, B. Schuller, A. Wendemuth, and G. Rigoll Frame vs. turn-level: emotion recognition from speech considering static and dynamic processing A. Paiva, R. Prada, R. Picard, Affective Computing and Intelligent Interaction 2007 Springer Berlin/Heidelberg, Berlin, Germany 139 147
    • (2007) Affective Computing and Intelligent Interaction , pp. 139-147
    • Vlasenko, B.1    Schuller, B.2    Wendemuth, A.3    Rigoll, G.4
  • 48
    • 84862156369
    • Abandoning emotion classes - Towards continuous emotion recognition with modelling of long-range dependencies
    • Brisbane, Australia
    • Wöllmer, M., Eyben, F., Reiter, S., Schuller, B., Cox, C., Douglas-Cowie, E., Cowie, R., 2008. Abandoning emotion classes - towards continuous emotion recognition with modelling of long-range dependencies. In: Interspeech 2008 - Eurospeech, Brisbane, Australia, pp. 597-600.
    • (2008) Interspeech 2008 Eurospeech , pp. 597-600
    • Wöllmer, M.1
  • 50
    • 84878539402
    • Using i-vector space model for emotion recognition
    • Portland, Oregon, USA
    • Xia, R., Liu, Y., 2012. Using i-vector space model for emotion recognition. In: Interspeech 2012, Portland, Oregon, USA, pp. 2230-2233.
    • (2012) Interspeech 2012 , pp. 2230-2233
    • Xia, R.1    Liu, Y.2
  • 52
    • 44049099067
    • Audio-visual affective expression recognition through multistream fused HMM
    • DOI 10.1109/TMM.2008.921737, 4523967
    • Z. Zeng, J. Tu, B. Pianfetti, and T. Huang Audiovisual affective expression recognition through multistream fused HMM IEEE Transactions on Multimedia 10 2008 570 577 (Pubitemid 351711233)
    • (2008) IEEE Transactions on Multimedia , vol.10 , Issue.4 , pp. 570-577
    • Zeng, Z.1    Tu, J.2    Pianfetti Jr., B.M.3    Huang, T.S.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.