



Volume 4, Issue MAY, 2013, Pages

On the acoustics of emotion in audio: What speech, music, and sound have in common

Author keywords

Audio signal processing; Emotion recognition; Feature selection; Music perception; Sound perception; Speech perception; Transfer learning


EID: 84878925980     PISSN: None     EISSN: 16641078     Source Type: Journal    
DOI: 10.3389/fpsyg.2013.00292     Document Type: Article
Times cited : (240)

References (38)
  • 1
    • 84874471178
    • Introducing the Geneva multimodal expression corpus for experimental research on emotion perception
    • doi:10.1037/a0025827
    • Bänziger, T., Mortillaro, M., and Scherer, K. R. (2012). Introducing the Geneva multimodal expression corpus for experimental research on emotion perception. Emotion 12, 1161-1179. doi:10.1037/a0025827
    • (2012) Emotion , vol.12 , pp. 1161-1179
    • Bänziger, T.1    Mortillaro, M.2    Scherer, K.R.3
  • 2
    • 84878390748
    • A robust unsupervised arousal rating framework using prosody with cross-corpora evaluation
    • in Proceeding of the Interspeech (Portland, OR: ISCA)
    • Bone, D., Lee, C.-C., and Narayanan, S. (2012). "A robust unsupervised arousal rating framework using prosody with cross-corpora evaluation," in Proceeding of the Interspeech (Portland, OR: ISCA).
    • (2012)
    • Bone, D.1    Lee, C.C.2    Narayanan, S.3
  • 3
    • 33745202280
    • A database of German emotional speech
    • in Proceeding of the Interspeech (Lisbon: ISCA)
    • Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W., and Weiss, B. (2005). "A database of German emotional speech," in Proceeding of the Interspeech (Lisbon: ISCA), 1517-1520.
    • (2005)
    • Burkhardt, F.1    Paeschke, A.2    Rolfes, M.3    Sendlmeier, W.4    Weiss, B.5
  • 4
    • 84873607349
    • A system for evaluating singing enthusiasm for karaoke
    • in Proceeding of the ISMIR (Miami, FL: International Society for Music Information Retrieval)
    • Daido, R., Hahm, S., Ito, M., Makino, S., and Ito, A. (2011). "A system for evaluating singing enthusiasm for karaoke," in Proceeding of the ISMIR (Miami, FL: International Society for Music Information Retrieval), 31-36.
    • (2011) , pp. 31-36
    • Daido, R.1    Hahm, S.2    Ito, M.3    Makino, S.4    Ito, A.5
  • 5
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • doi:10.1109/TASSP.1980.1163420
    • Davis, S. B., and Mermelstein, P. (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. 28, 357-366. doi:10.1109/TASSP.1980.1163420
    • (1980) IEEE Trans. Acoust. , vol.28 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 6
    • 84867858889
    • Affective acoustic ecology: towards emotionally enhanced sound events
    • in Proceedings of the 7th Audio Mostly Conference: A Conference on Interaction with Sound (New York, NY: ACM)
    • Drossos, K., Floros, A., and Kanellopoulos, N.-G. (2012). "Affective acoustic ecology: towards emotionally enhanced sound events," in Proceedings of the 7th Audio Mostly Conference: A Conference on Interaction with Sound (New York, NY: ACM), 109-116.
    • (2012) , pp. 109-116
    • Drossos, K.1    Floros, A.2    Kanellopoulos, N.G.3
  • 7
    • 36348934700
    • The world of emotion is not two-dimensional
    • doi:10.1111/j.1467-9280.2007.02024.x
    • Fontaine, J., Scherer, K. R., Roesch, E., and Ellsworth, P. (2007). The world of emotion is not two-dimensional. Psychol. Sci. 18, 1050-1057. doi:10.1111/j.1467-9280.2007.02024.x
    • (2007) Psychol. Sci. , vol.18 , pp. 1050-1057
    • Fontaine, J.1    Scherer, K.R.2    Roesch, E.3    Ellsworth, P.4
  • 8
    • 33846223296
    • Evaluation of natural emotions using self assessment manikins
    • in Proceeding of the ASRU (Cancún: IEEE)
    • Grimm, M., and Kroschel, K. (2005). "Evaluation of natural emotions using self assessment manikins," in Proceeding of the ASRU (Cancún: IEEE), 381-385.
    • (2005) , pp. 381-385
    • Grimm, M.1    Kroschel, K.2
  • 9
    • 34547940048
    • Primitives-based evaluation and estimation of emotions in speech
    • doi:10.1016/j.specom.2007.01.010
    • Grimm, M., Kroschel, K., Mower, E., and Narayanan, S. (2007a). Primitives-based evaluation and estimation of emotions in speech. Speech Commun. 49, 787-800. doi:10.1016/j.specom.2007.01.010
    • (2007) Speech Commun , vol.49 , pp. 787-800
    • Grimm, M.1    Kroschel, K.2    Mower, E.3    Narayanan, S.4
  • 10
    • 34547518166
    • Support vector regression for automatic recognition of spontaneous emotions in speech
    • in Proceeding of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol. IV (Honolulu, HI: IEEE)
    • Grimm, M., Kroschel, K., and Narayanan, S. (2007b). "Support vector regression for automatic recognition of spontaneous emotions in speech," in Proceeding of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol. IV (Honolulu, HI: IEEE), 1085-1088.
    • (2007) , pp. 1085-1088
    • Grimm, M.1    Kroschel, K.2    Narayanan, S.3
  • 11
    • 54049132925
    • The Vera am Mittag German audio-visual emotional speech database
    • in Proceeding of the IEEE International Conference on Multimedia and Expo (ICME) (Hannover: IEEE)
    • Grimm, M., Kroschel, K., and Narayanan, S. (2008). "The Vera am Mittag German audio-visual emotional speech database," in Proceeding of the IEEE International Conference on Multimedia and Expo (ICME) (Hannover: IEEE), 865-868.
    • (2008) , pp. 865-868
    • Grimm, M.1    Kroschel, K.2    Narayanan, S.3
  • 13
    • 84873433681
    • The 2007 MIREX audio mood classification task: lessons learned
    • in Proceeding of the ISMIR (Philadelphia: International Society for Music Information Retrieval)
    • Hu, X., Downie, J. S., Laurier, C., Bay, M., and Ehmann, A. F. (2008). "The 2007 MIREX audio mood classification task: lessons learned," in Proceeding of the ISMIR (Philadelphia: International Society for Music Information Retrieval), 462-467.
    • (2008) , pp. 462-467
    • Hu, X.1    Downie, J.S.2    Laurier, C.3    Bay, M.4    Ehmann, A.F.5
  • 14
    • 33847747806
    • A comparison of acoustic cues in music and speech for three dimensions of affect
    • doi:10.1525/mp.2006.23.4.319
    • Ilie, G., and Thompson, W. F. (2006). A comparison of acoustic cues in music and speech for three dimensions of affect. Music Percept. 23, 319-329. doi:10.1525/mp.2006.23.4.319
    • (2006) Music Percept , vol.23 , pp. 319-329
    • Ilie, G.1    Thompson, W.F.2
  • 15
    • 0141764789
    • Communication of emotions in vocal expression and music performance: different channels, same code?
    • doi:10.1037/0033-2909.129.5.770
    • Juslin, P. N., and Laukka, P. (2003). Communication of emotions in vocal expression and music performance: different channels, same code? Psychol. Bull. 129, 770-814. doi:10.1037/0033-2909.129.5.770
    • (2003) Psychol. Bull. , vol.129 , pp. 770-814
    • Juslin, P.N.1    Laukka, P.2
  • 16
    • 84859899698
    • The SEMAINE database: annotated multimodal records of emotionally colored conversations between a person and a limited agent
    • doi:10.1109/T-AFFC.2011.20
    • McKeown, G., Valstar, M., Cowie, R., Pantic, M., and Schröder, M. (2012). The SEMAINE database: annotated multimodal records of emotionally colored conversations between a person and a limited agent. IEEE Trans. Affect. Comput. 3, 5-17. doi:10.1109/T-AFFC.2011.20
    • (2012) IEEE Trans. Affect. Comput. , vol.3 , pp. 5-17
    • McKeown, G.1    Valstar, M.2    Cowie, R.3    Pantic, M.4    Schröder, M.5
  • 17
    • 84878304110
    • Advocating a componential appraisal model to guide emotion recognition
    • doi:10.4018/jse.2012010102
    • Mortillaro, M., Meuleman, B., and Scherer, K. R. (2012). Advocating a componential appraisal model to guide emotion recognition. Int. J. Synth. Emot. 3, 18-32. doi:10.4018/jse.2012010102
    • (2012) Int. J. Synth. Emot. , vol.3 , pp. 18-32
    • Mortillaro, M.1    Meuleman, B.2    Scherer, K.R.3
  • 18
    • 84968172277
    • Differences in ability of musicians and nonmusicians to judge emotional state from the fundamental frequency of voice samples
    • doi:10.2307/40285316
    • Nilsonne, A., and Sundberg, J. (1985). Differences in ability of musicians and nonmusicians to judge emotional state from the fundamental frequency of voice samples. Music Percept. 2, 507-516. doi:10.2307/40285316
    • (1985) Music Percept , vol.2 , pp. 507-516
    • Nilsonne, A.1    Sundberg, J.2
  • 19
    • 33644626634
    • A Large Set of Audio Features for Sound Description
    • Technical Report. Paris: IRCAM.
    • Peeters, G. (2004). A Large Set of Audio Features for Sound Description. Technical Report. Paris: IRCAM.
    • (2004)
    • Peeters, G.1
  • 20
    • 0003773721
    • Fast training of support vector machines using sequential minimal optimization
    • in Advances in Kernel Methods: Support Vector Learning (Cambridge, MA: MIT Press)
    • Platt, J. C. (1999). "Fast training of support vector machines using sequential minimal optimization," in Advances in Kernel Methods: Support Vector Learning (Cambridge, MA: MIT Press), 185-208.
    • (1999) , pp. 185-208
    • Platt, J.C.1
  • 21
    • 84866376204
    • A web search engine for sound effects
    • in Proceeding of the 119th Conference of the Audio Engineering Society (AES) (New York: Audio Engineering Society)
    • Rice, S. V., and Bailey, S. M. (2005). "A web search engine for sound effects," in Proceeding of the 119th Conference of the Audio Engineering Society (AES) (New York: Audio Engineering Society).
    • (2005)
    • Rice, S.V.1    Bailey, S.M.2
  • 22
    • 0002138882
    • Emotion expression in speech and music
    • in Music, Language, Speech, and Brain, eds J. Sundberg, L. Nord, and R. Carlson (London: Macmillan)
    • Scherer, K. R. (1991). "Emotion expression in speech and music," in Music, Language, Speech, and Brain, eds J. Sundberg, L. Nord, and R. Carlson (London: Macmillan), 146-156.
    • (1991) , pp. 146-156
    • Scherer, K.R.1
  • 23
    • 84893465620
    • Emotion in action, interaction, music, and speech
    • in Language, Music, and the Brain: A Mysterious Relationship, ed. M. Arbib (Cambridge, MA: MIT Press)
    • Scherer, K. R. (2013). "Emotion in action, interaction, music, and speech," in Language, Music, and the Brain: A Mysterious Relationship, ed. M. Arbib (Cambridge, MA: MIT Press), 107-139.
    • (2013) , pp. 107-139
    • Scherer, K.R.1
  • 24
    • 0347613216
    • Vocal expression of emotion
    • in Handbook of Affective Sciences, eds R. J. Davidson, K. R. Scherer, and H. H. Goldsmith (Oxford, NY: Oxford University Press)
    • Scherer, K. R., Johnstone, T., and Klasmeyer, G. (2003). "Vocal expression of emotion," in Handbook of Affective Sciences, eds R. J. Davidson, K. R. Scherer, and H. H. Goldsmith (Oxford, NY: Oxford University Press), 433-456.
    • (2003) , pp. 433-456
    • Scherer, K.R.1    Johnstone, T.2    Klasmeyer, G.3
  • 25
    • 79960846940
    • Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge
    • doi:10.1016/j.specom.2011.01.011
    • Schuller, B., Batliner, A., Steidl, S., and Seppi, D. (2011a). Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge. Speech Commun. 53, 1062-1087. doi:10.1016/j.specom.2011.01.011
    • (2011) Speech Commun , vol.53 , pp. 1062-1087
    • Schuller, B.1    Batliner, A.2    Steidl, S.3    Seppi, D.4
  • 26
    • 84871398919
    • Multi-modal non-prototypical music mood analysis in continuous space: reliability and performances
    • in Proceedings 12th International Society for Music Information Retrieval Conference, ISMIR 2011 (Miami, FL: ISMIR)
    • Schuller, B., Weninger, F., and Dorfner, J. (2011b). "Multi-modal non-prototypical music mood analysis in continuous space: reliability and performances," in Proceedings 12th International Society for Music Information Retrieval Conference, ISMIR 2011 (Miami, FL: ISMIR), 759-764.
    • (2011) , pp. 759-764
    • Schuller, B.1    Weninger, F.2    Dorfner, J.3
  • 27
    • 77952482069
    • Determination of nonprototypical valence and arousal in popular music: features and performances
    • 2010:735854. doi:10.1186/1687-4722-2010-735854
    • Schuller, B., Dorfner, J., and Rigoll, G. (2010). Determination of nonprototypical valence and arousal in popular music: features and performances. EURASIP J. Audio Speech Music Process. 2010:735854. doi:10.1186/1687-4722-2010-735854
    • (2010) EURASIP J. Audio Speech Music Process.
    • Schuller, B.1    Dorfner, J.2    Rigoll, G.3
  • 28
    • 84867612519
    • Automatic recognition of emotion evoked by general sound events
    • in Proceedings 37th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 (Kyoto: IEEE)
    • Schuller, B., Hantke, S., Weninger, F., Han, W., Zhang, Z., and Narayanan, S. (2012). "Automatic recognition of emotion evoked by general sound events," in Proceedings 37th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 (Kyoto: IEEE), 341-344.
    • (2012) , pp. 341-344
    • Schuller, B.1    Hantke, S.2    Weninger, F.3    Han, W.4    Zhang, Z.5    Narayanan, S.6
  • 29
    • 70450206416
    • The INTERSPEECH 2009 emotion challenge
    • in Proceeding of the Interspeech (Brighton: ISCA)
    • Schuller, B., Steidl, S., and Batliner, A. (2009). "The INTERSPEECH 2009 emotion challenge," in Proceeding of the Interspeech (Brighton: ISCA), 312-315.
    • (2009) , pp. 312-315
    • Schuller, B.1    Steidl, S.2    Batliner, A.3
  • 30
    • 84906269266
    • The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism
    • in Proceedings INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association (Lyon: ISCA)
    • Schuller, B., Steidl, S., Batliner, A., Vinciarelli, A., Scherer, K., Ringeval, F., et al. (2013). "The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism," in Proceedings INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association (Lyon: ISCA).
    • (2013)
    • Schuller, B.1    Steidl, S.2    Batliner, A.3    Vinciarelli, A.4    Scherer, K.5    Ringeval, F.6
  • 31
    • 4043137356
    • A tutorial on support vector regression
    • doi:10.1023/B:STCO.0000035301.49549.88
    • Smola, A., and Schölkopf, B. (2004). A tutorial on support vector regression. Stat. Comput. 14, 199-222. doi:10.1023/B:STCO.0000035301.49549.88
    • (2004) Stat. Comput. , vol.14 , pp. 199-222
    • Smola, A.1    Schölkopf, B.2
  • 32
    • 70449388050
    • Automatic Classification of Emotion-Related User States in Spontaneous Children's Speech
    • Berlin: Logos Verlag
    • Steidl, S. (2009). Automatic Classification of Emotion-Related User States in Spontaneous Children's Speech. Berlin: Logos Verlag.
    • (2009)
    • Steidl, S.1
  • 33
    • 78349289595
    • Towards evaluation of example-based audio retrieval system using affective dimensions
    • in Proceeding of the ICME (Singapore: IEEE)
    • Sundaram, S., and Schleicher, R. (2010). "Towards evaluation of example-based audio retrieval system using affective dimensions," in Proceeding of the ICME (Singapore: IEEE), 573-577.
    • (2010) , pp. 573-577
    • Sundaram, S.1    Schleicher, R.2
  • 34
    • 0347649626
    • Decoding speech prosody: do music lessons help?
    • doi:10.1037/1528-3542.4.1.46
    • Thompson, W. F., Schellenberg, E. G., and Husain, G. (2004). Decoding speech prosody: do music lessons help? Emotion 4, 46-64. doi:10.1037/1528-3542.4.1.46
    • (2004) Emotion , vol.4 , pp. 46-64
    • Thompson, W.F.1    Schellenberg, E.G.2    Husain, G.3
  • 35
    • 33746410556
    • Emotional speech recognition: resources, features, and methods
    • doi:10.1016/j.specom.2006.04.003
    • Ververidis, D., and Kotropoulos, C. (2006). Emotional speech recognition: resources, features, and methods. Speech Commun. 48, 1162-1181. doi:10.1016/j.specom.2006.04.003
    • (2006) Speech Commun , vol.48 , pp. 1162-1181
    • Ververidis, D.1    Kotropoulos, C.2
  • 36
    • 84864517995
    • Machine recognition of music emotion: a review
    • doi:10.1145/2168752.2168754
    • Yang, Y.-H., and Chen, H.-H. (2012). Machine recognition of music emotion: a review. ACM Trans. Intell. Syst. Technol. 3, 1-30. doi:10.1145/2168752.2168754
    • (2012) ACM Trans. Intell. Syst. Technol. , vol.3 , pp. 1-30
    • Yang, Y.H.1    Chen, H.H.2
  • 37
    • 0003822743
    • The HTK Book
    • Version 3.4.1. Cambridge: Cambridge University Engineering Department
    • Young, S. J., Evermann, G., Gales, M. J. F., Hain, T., Kershaw, D., Liu, X., et al. (2006). The HTK Book, Version 3.4.1. Cambridge: Cambridge University Engineering Department.
    • (2006)
    • Young, S.J.1    Evermann, G.2    Gales, M.J.F.3    Hain, T.4    Kershaw, D.5    Liu, X.6
  • 38
    • 0004236521
    • Psychoacoustics - Facts and Models
    • Heidelberg: Springer
    • Zwicker, E., and Fastl, H. (1999). Psychoacoustics - Facts and Models. Heidelberg: Springer.
    • (1999)
    • Zwicker, E.1    Fastl, H.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.