Volume 1, Issue 2, 2010, Pages 119-131

Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies

Author keywords

Affective computing; cross corpus evaluation; normalization; speech emotion recognition

Indexed keywords

ACOUSTIC EMOTION RECOGNITION; AFFECTIVE COMPUTING; CROSS VALIDATION; CROSS-CORPUS EVALUATION; EMOTION RECOGNITION; EVALUATION EXPERIMENTS; GENERALIZATION ABILITY; NORMALIZATION; PRE-SELECTED; RECOGNITION OF EMOTION; SPEECH EMOTION RECOGNITION; SYSTEM DEVELOPMENT; TEST DATA

EID: 80053925819     PISSN: 19493045     EISSN: None     Source Type: Journal    
DOI: 10.1109/T-AFFC.2010.8     Document Type: Article
Times cited : (371)

References (106)
  • 1
    • 0000797410
    • A study of emotions by speech transcription
    • E. Scripture, "A Study of Emotions by Speech Transcription," Vox, vol. 31, pp. 179-183, 1921.
    • (1921) Vox , vol.31 , pp. 179-183
    • Scripture, E.1
  • 2
    • 0002592382
    • A calibrated recording and analysis of the pitch, force, and quality of vocal tones expressing happiness and sadness
    • E. Skinner, "A Calibrated Recording and Analysis of the Pitch, Force, and Quality of Vocal Tones Expressing Happiness and Sadness," Speech Monographs, vol. 2, pp. 81-137, 1935.
    • (1935) Speech Monographs , vol.2 , pp. 81-137
    • Skinner, E.1
  • 3
    • 0001988169
    • An experimental study of the pitch characteristics of the voice during the expression of emotion
    • G. Fairbanks and W. Pronovost, "An Experimental Study of the Pitch Characteristics of the Voice during the Expression of Emotion," Speech Monographs, vol. 6, pp. 87-104, 1939.
    • (1939) Speech Monographs , vol.6 , pp. 87-104
    • Fairbanks, G.1    Pronovost, W.2
  • 4
    • 0015409613
    • Emotions and speech: Some acoustic correlates
    • C. Williams and K. Stevens, "Emotions and Speech: Some Acoustic Correlates," J. Acoustical Soc. Am., vol. 52, pp. 1238-1250, 1972.
    • (1972) J. Acoustical Soc. Am. , vol.52 , pp. 1238-1250
    • Williams, C.1    Stevens, K.2
  • 5
    • 0022688124
    • Vocal affect expression: A review and a model for future research
    • K.R. Scherer, "Vocal Affect Expression: A Review and a Model for Future Research," Psychological Bull., vol. 99, pp. 143-165, 1986.
    • (1986) Psychological Bull. , vol.99 , pp. 143-165
    • Scherer, K.R.1
  • 6
    • 0003367662
    • The dictionary of affect in language
    • The Measurement of Emotions, R. Plutchik and H. Kellerman, eds. Academic Press
    • C. Whissell, "The Dictionary of Affect in Language," Emotion: Theory, Research and Experience, vol. 4, The Measurement of Emotions, R. Plutchik and H. Kellerman, eds., pp. 113-131, Academic Press, 1989.
    • (1989) Emotion: Theory, Research and Experience , vol.4 , pp. 113-131
    • Whissell, C.1
  • 9
    • 33745224103
    • Spontaneous speech: How people really talk and why engineers should care
    • 9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
    • E. Shriberg, "Spontaneous Speech: How People Really Talk and Why Engineers Should Care," Proc. EUROSPEECH, pp. 1781-1784, 2005. (Pubitemid 43908428)
    • (2005) 9th European Conference on Speech Communication and Technology , pp. 1781-1784
    • Shriberg, E.1
  • 10
    • 14644439843
    • Toward detecting emotions in spoken dialogs
    • DOI 10.1109/TSA.2004.838534
    • C.M. Lee and S.S. Narayanan, "Toward Detecting Emotions in Spoken Dialogs," IEEE Trans. Speech and Audio Processing, vol. 13, no. 2, pp. 293-303, 2005. (Pubitemid 40320247)
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.2 , pp. 293-303
    • Lee, C.M.1    Narayanan, S.S.2
  • 18
    • 84862156369
    • Abandoning emotion classes-towards continuous emotion recognition with modelling of long-range dependencies
    • M. Wöllmer, F. Eyben, S. Reiter, B. Schuller, C. Cox, E. Douglas-Cowie, and R. Cowie, "Abandoning Emotion Classes-Towards Continuous Emotion Recognition with Modelling of Long-Range Dependencies," Proc. INTERSPEECH, pp. 597-600, 2008.
    • (2008) Proc. INTERSPEECH , pp. 597-600
    • Wöllmer, M.1    Eyben, F.2    Reiter, S.3    Schuller, B.4    Cox, C.5    Douglas-Cowie, E.6    Cowie, R.7
  • 21
    • 21544459345
    • Challenges in real-life emotion annotation and machine learning based detection
    • DOI 10.1016/j.neunet.2005.03.007, PII S0893608005000407, Emotion and Brain
    • L. Devillers, L. Vidrascu, and L. Lamel, "Challenges in Real-Life Emotion Annotation and Machine Learning Based Detection," Neural Networks, vol. 18, no. 4, pp. 407-422, 2005. (Pubitemid 40922648)
    • (2005) Neural Networks , vol.18 , Issue.4 , pp. 407-422
    • Devillers, L.1    Vidrascu, L.2    Lamel, L.3
  • 22
  • 25
    • 33646781551
    • Acoustic training from heterogeneous data sources: Experiments in Mandarin conversational telephone speech transcription
    • S. Tsakalidis and W. Byrne, "Acoustic Training from Heterogeneous Data Sources: Experiments in Mandarin Conversational Telephone Speech Transcription," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, 2005.
    • (2005) Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing
    • Tsakalidis, S.1    Byrne, W.2
  • 28
    • 0003012849
    • Combining multiple learning strategies for effective cross validation
    • Y. Yang, T. Ault, and T. Pierce, "Combining Multiple Learning Strategies for Effective Cross Validation," Proc. 17th Int'l Conf. Machine Learning, pp. 1167-1174, 2000.
    • (2000) Proc. 17th Int'l Conf. Machine Learning , pp. 1167-1174
    • Yang, Y.1    Ault, T.2    Pierce, T.3
  • 30
    • 85009115694
    • Consonant discrimination in elicited and spontaneous speech: A case for signal-adaptive front ends in ASR
    • K. Soenmez, M. Plauche, E. Shriberg, and H. Franco, "Consonant Discrimination in Elicited and Spontaneous Speech: A Case for Signal-Adaptive Front Ends in ASR," Proc. Int'l Conf. Spoken Language Processing, pp. 548-551, 2000.
    • (2000) Proc. Int'l Conf. Spoken Language Processing , pp. 548-551
    • Soenmez, K.1    Plauche, M.2    Shriberg, E.3    Franco, H.4
  • 32
    • 38149027682
    • Automatic classification of expressiveness in speech: A multi-corpus study
    • C. Müller, ed.
    • M. Shami and W. Verhelst, "Automatic Classification of Expressiveness in Speech: A Multi-Corpus Study," Speaker Classification II, C. Müller, ed., pp. 43-56, 2007.
    • (2007) Speaker Classification II , pp. 43-56
    • Shami, M.1    Verhelst, W.2
  • 33
    • 0037380084
    • Emotional speech: Towards a new generation of databases
    • E. Douglas-Cowie, N. Campbell, R. Cowie, and P. Roach, "Emotional Speech: Towards a New Generation of Databases," Speech Comm., vol. 40, nos. 1-2, pp. 33-60, 2003.
    • (2003) Speech Comm. , vol.40 , Issue.1-2 , pp. 33-60
    • Douglas-Cowie, E.1    Campbell, N.2    Cowie, R.3    Roach, P.4
  • 37
    • 85089273681
    • Getting started with SUSAS: A speech under simulated and actual stress database
    • J. Hansen and S. Bou-Ghazale, "Getting Started with SUSAS: A Speech under Simulated and Actual Stress Database," Proc. EUROSPEECH, vol. 4, pp. 1743-1746, 1997.
    • (1997) Proc. EUROSPEECH , vol.4 , pp. 1743-1746
    • Hansen, J.1    Bou-Ghazale, S.2
  • 39
    • 84862624179
    • Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collection
    • D. Ververidis and C. Kotropoulos, "Fast Sequential Floating Forward Selection Applied to Emotional Speech Features Estimated on DES and SUSAS Data Collection," Proc. European Signal Processing Conf., 2006.
    • (2006) Proc. European Signal Processing Conf.
    • Ververidis, D.1    Kotropoulos, C.2
  • 40
    • 85016351492
    • The recognition of emotions from speech using gentleboost classifier. A comparison approach
    • D. Datcu and L.J. Rothkrantz, "The Recognition of Emotions from Speech Using Gentleboost Classifier. A Comparison Approach," Proc. Int'l Conf. Computer Systems and Technologies, vol. 1, pp. 1-6, 2006.
    • (2006) Proc. Int'l Conf. Computer Systems and Technologies , vol.1 , pp. 1-6
    • Datcu, D.1    Rothkrantz, L.J.2
  • 43
    • 80052565871
    • A cognitive science reasoning in recognition of emotions in audio-visual speech
    • V. Slavova, W. Verhelst, and H. Sahli, "A Cognitive Science Reasoning in Recognition of Emotions in Audio-Visual Speech," Int'l J. Information Technologies and Knowledge, vol. 2, pp. 324-334, 2008.
    • (2008) Int'l J. Information Technologies and Knowledge , vol.2 , pp. 324-334
    • Slavova, V.1    Verhelst, W.2    Sahli, H.3
  • 45
    • 80052598269
    • Semantic audio-visual data fusion for automatic emotion recognition
    • D. Datcu and L.J.M. Rothkrantz, "Semantic Audio-Visual Data Fusion for Automatic Emotion Recognition," Proc. Euromedia, 2008.
    • (2008) Proc. Euromedia
    • Datcu, D.1    Rothkrantz, L.J.M.2
  • 48
    • 0028630509
    • Nonlinear analysis and classification of speech under stressed conditions
    • DOI 10.1121/1.410601
    • D. Cairns and J.H.L. Hansen, "Nonlinear Analysis and Classification of Speech under Stressed Conditions," J. Acoustical Soc. Am., vol. 96, no. 6, pp. 3392-3400, Dec. 1994. (Pubitemid 24376418)
    • (1994) Journal of the Acoustical Society of America , vol.96 , Issue.6 , pp. 3392-3400
    • Cairns, D.A.1    Hansen, J.H.L.2
  • 49
    • 0012643871
    • Emotions: What is possible in the ASR framework?
    • L. Bosch, "Emotions: What Is Possible in the ASR Framework?" Proc. ISCA Workshop Speech and Emotion, pp. 189-194, 2000.
    • (2000) Proc. ISCA Workshop Speech and Emotion , pp. 189-194
    • Bosch, L.1
  • 50
    • 0035278948
    • Nonlinear feature based classification of speech under stress
    • DOI 10.1109/89.905995, PII S1063667601013232
    • G. Zhou, J.H.L. Hansen, and J.F. Kaiser, "Nonlinear Feature Based Classification of Speech under Stress," IEEE Trans. Speech and Audio Processing, vol. 9, no. 3, pp. 201-216, Mar. 2001. (Pubitemid 32286594)
    • (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.3 , pp. 201-216
    • Zhou, G.1    Hansen, J.H.L.2    Kaiser, J.F.3
  • 51
    • 0037513425
    • Perception of stress and speaking style for selected elements of the SUSAS database
    • R.S. Bolia and R.E. Slyh, "Perception of Stress and Speaking Style for Selected Elements of the SUSAS Database," Speech Comm., vol. 40, no. 4, pp. 493-501, 2003.
    • (2003) Speech Comm. , vol.40 , Issue.4 , pp. 493-501
    • Bolia, R.S.1    Slyh, R.E.2
  • 52
    • 84867198846
    • Detection of security related affect and behaviour in passenger transport
    • B. Schuller, M. Wimmer, D. Arsic, T. Moosmayr, and G. Rigoll, "Detection of Security Related Affect and Behaviour in Passenger Transport," Proc. INTERSPEECH, pp. 265-268, 2008.
    • (2008) Proc. INTERSPEECH , pp. 265-268
    • Schuller, B.1    Wimmer, M.2    Arsic, D.3    Moosmayr, T.4    Rigoll, G.5
  • 55
    • 84867201207
    • Balancing spoken content adaptation and unit length in the recognition of emotion and interest
    • B. Vlasenko, B. Schuller, K. Tadesse Mengistu, and G. Rigoll, "Balancing Spoken Content Adaptation and Unit Length in the Recognition of Emotion and Interest," Proc. INTERSPEECH, pp. 805-808, 2008.
    • (2008) Proc. INTERSPEECH , pp. 805-808
    • Vlasenko, B.1    Schuller, B.2    Tadesse Mengistu, K.3    Rigoll, G.4
  • 56
    • 0242589101
    • SmartKom: Symmetric multimodality in an adaptive and reusable dialogue shell
    • W. Wahlster, "SmartKom: Symmetric Multimodality in an Adaptive and Reusable Dialogue Shell," Proc. Human Computer Interaction Status Conf., pp. 47-62, 2003.
    • (2003) Proc. Human Computer Interaction Status Conf. , pp. 47-62
    • Wahlster, W.1
  • 60
    • 0030093965
    • Acoustic Profiles in Vocal Emotion Expression
    • R. Banse and K.R. Scherer, "Acoustic Profiles in Vocal Emotion Expression," J. Personality and Social Psychology, vol. 70, no. 3, pp. 614-636, 1996. (Pubitemid 126420699)
    • (1996) Journal of Personality and Social Psychology , vol.70 , Issue.3 , pp. 614-636
    • Banse, R.1    Scherer, K.R.2
  • 61
    • 85128358819
    • Recognizing emotions in speech using short-term and long-term features
    • Y. Li and Y. Zhao, "Recognizing Emotions in Speech Using Short-Term and Long-Term Features," Proc. Int'l Conf. Spoken Language Processing, p. 379, 1998.
    • (1998) Proc. Int'l Conf. Spoken Language Processing , p. 379
    • Li, Y.1    Zhao, Y.2
  • 66
    • 84976221270
    • Tuning hidden Markov model for speech emotion recognition
    • Mar.
    • B. Vlasenko and A. Wendemuth, "Tuning Hidden Markov Model for Speech Emotion Recognition," Proc. DAGA, Mar. 2007.
    • (2007) Proc. DAGA
    • Vlasenko, B.1    Wendemuth, A.2
  • 67
    • 28444487806
    • Automatic speech classification to five emotional states based on gender information
    • D. Ververidis and C. Kotropoulos, "Automatic Speech Classification to Five Emotional States Based on Gender Information," Proc. EUSIPCO, pp. 341-344, 2004.
    • (2004) Proc. EUSIPCO , pp. 341-344
    • Ververidis, D.1    Kotropoulos, C.2
  • 71
    • 70450136545
    • An incremental analysis of different feature groups in speaker independent emotion recognition
    • Aug.
    • M. Lugger and B. Yang, "An Incremental Analysis of Different Feature Groups in Speaker Independent Emotion Recognition," Proc. Int'l Congress Phonetic Sciences, pp. 2149-2152, Aug. 2007.
    • (2007) Proc. Int'l Congress Phonetic Sciences , pp. 2149-2152
    • Lugger, M.1    Yang, B.2
  • 75
    • 4544316885
    • Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture
    • B. Schuller, G. Rigoll, and M. Lang, "Speech Emotion Recognition Combining Acoustic Features and Linguistic Information in a Hybrid Support Vector Machine-Belief Network Architecture," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, vol. 1, 2004.
    • (2004) Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing , vol.1
    • Schuller, B.1    Rigoll, G.2    Lang, M.3
  • 77
    • 21544466181
    • ASR for emotional speech: Clarifying the issues and enhancing performance
    • DOI 10.1016/j.neunet.2005.03.008, PII S0893608005000419, Emotion and Brain
    • T. Athanaselis, S. Bakamidis, I. Dologlou, R. Cowie, E. Douglas-Cowie, and C. Cox, "ASR for Emotional Speech: Clarifying the Issues and Enhancing Performance," Neural Networks, vol. 18, no. 4, pp. 437-444, 2005. (Pubitemid 40922650)
    • (2005) Neural Networks , vol.18 , Issue.4 , pp. 437-444
    • Athanaselis, T.1    Bakamidis, S.2    Dologlou, I.3    Cowie, R.4    Douglas-Cowie, E.5    Cox, C.6
  • 84
    • 56149115138
    • Combining frame and turn-level information for robust recognition of emotions within speech
    • B. Vlasenko, B. Schuller, A. Wendemuth, and G. Rigoll, "Combining Frame and Turn-Level Information for Robust Recognition of Emotions within Speech," Proc. INTERSPEECH, pp. 2249-2252, 2007.
    • (2007) Proc. INTERSPEECH , pp. 2249-2252
    • Vlasenko, B.1    Schuller, B.2    Wendemuth, A.3    Rigoll, G.4
  • 87
    • 33746410556
    • Emotional speech recognition: Resources, features, and methods
    • DOI 10.1016/j.specom.2006.04.003, PII S0167639306000422
    • D. Ververidis and C. Kotropoulos, "Emotional Speech Recognition: Resources, Features, and Methods," Speech Comm., vol. 48, no. 9, pp. 1162-1181, Sept. 2006. (Pubitemid 44128615)
    • (2006) Speech Communication , vol.48 , Issue.9 , pp. 1162-1181
    • Ververidis, D.1    Kotropoulos, C.2
  • 88
    • 0037382608
    • Modeling drivers' speech under stress
    • R. Fernandez and R.W. Picard, "Modeling Drivers' Speech under Stress," Speech Comm., vol. 40, nos. 1-2, pp. 145-159, 2003.
    • (2003) Speech Comm. , vol.40 , Issue.1-2 , pp. 145-159
    • Fernandez, R.1    Picard, R.W.2
  • 89
    • 70450185596
    • Modeling mutual influence of interlocutor emotion states in dyadic spoken interactions
    • C. Lee, C. Busso, S. Lee, and S. Narayanan, "Modeling Mutual Influence of Interlocutor Emotion States in Dyadic Spoken Interactions," Proc. INTERSPEECH, pp. 1983-1986, 2009.
    • (2009) Proc. INTERSPEECH , pp. 1983-1986
    • Lee, C.1    Busso, C.2    Lee, S.3    Narayanan, S.4
  • 92
    • 70450179812
    • Emotion recognition using a hierarchical binary decision tree approach
    • C. Lee, E. Mower, C. Busso, S. Lee, and S. Narayanan, "Emotion Recognition Using a Hierarchical Binary Decision Tree Approach," Proc. INTERSPEECH, pp. 320-323, 2009.
    • (2009) Proc. INTERSPEECH , pp. 320-323
    • Lee, C.1    Mower, E.2    Busso, C.3    Lee, S.4    Narayanan, S.5
  • 97
    • 78649332623
    • Robust acoustic speech emotion recognition by ensembles of classifiers
    • B. Schuller, M. Lang, and G. Rigoll, "Robust Acoustic Speech Emotion Recognition by Ensembles of Classifiers," Proc. DAGA, vol. I, pp. 329-330, 2005.
    • (2005) Proc. DAGA , vol.1 , pp. 329-330
    • Schuller, B.1    Lang, M.2    Rigoll, G.3
  • 98
    • 33846952503
    • Ensemble methods for spoken emotion recognition in call-centres
    • D. Morrison, R. Wang, and L.C.D. Silva, "Ensemble Methods for Spoken Emotion Recognition in Call-Centres," Speech Comm., vol. 49, no. 2, pp. 98-112, 2007.
    • (2007) Speech Comm. , vol.49 , Issue.2 , pp. 98-112
    • Morrison, D.1    Wang, R.2    Silva, L.C.D.3
  • 99
    • 70450177653
    • Brno University of Technology system for Interspeech 2009 Emotion Challenge
    • M. Kockmann, L. Burget, and J. Cernocky, "Brno University of Technology System for Interspeech 2009 Emotion Challenge," Proc. INTERSPEECH, 2009.
    • (2009) Proc. INTERSPEECH
    • Kockmann, M.1    Burget, L.2    Cernocky, J.3
  • 101
    • 45749090903
    • Wearable assistance for the ballroom-dance hobbyist-holistic rhythm analysis and dance-style classification
    • F. Eyben, B. Schuller, and G. Rigoll, "Wearable Assistance for the Ballroom-Dance Hobbyist-Holistic Rhythm Analysis and Dance-Style Classification," Proc. Int'l Conf. Multimedia & Expo, 2007.
    • (2007) Proc. Int'l Conf. Multimedia & Expo
    • Eyben, F.1    Schuller, B.2    Rigoll, G.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.