Volume 13, Issue 2, 2011, Pages 216-234

Audiovisual discrimination between speech and laughter: Why and when visual information might help

Author keywords

Human behavior analysis; laughter versus speech discrimination; neural networks; principal components analysis (PCA)

Indexed keywords

AUDIO AND VIDEO; AUDIO INFORMATION; AUDIO-BASED; CEPSTRAL; CROSS VALIDATION; DATA SETS; FACIAL EXPRESSIONS; HEAD MOVEMENTS; HEAD POSE; HUMAN BEHAVIORS; LAUGHTER-VERSUS-SPEECH DISCRIMINATION; PRINCIPAL COMPONENTS ANALYSIS (PCA); PROSODIC FEATURES; SINGLE-MODAL; TESTING CONDITIONS; TRAINING CONDITIONS; TWO-STREAM; VISUAL CHANNELS; VISUAL INFORMATION;

EID: 79952983096     PISSN: 15209210     EISSN: None     Source Type: Journal    
DOI: 10.1109/TMM.2010.2101586     Document Type: Article
Times cited : (46)

References (68)
  • 1
    • 79952954239
    • [Online]. Available: http://www.doc.ic.ac.uk/~maja/AMI-SAL-annotations.xls.
  • 2
    • 79952925238
    • NIST, Rich Transcription 2004 Spring Meeting Recognition Evaluation, Documentation
    • NIST (2004), Rich Transcription 2004 Spring Meeting Recognition Evaluation, Documentation. [Online]. Available: http://www.nist.gov/speech/tests/rt/rt2004/spring/.
    • (2004)
  • 3
    • 0035346201
    • Not all laughs are alike: Voiced but Not Unvoiced Laughter Readily Elicits Positive Affect
    • J. Bachorowski and M. Owren, "Not all laughs are alike: Voiced but not unvoiced laughter readily elicits positive affect," Psychol. Sci., vol. 12, no. 3, pp. 252-257, 2001. (Pubitemid 33653539)
    • (2001) Psychological Science , vol.12 , Issue.3 , pp. 252-257
    • Bachorowski, J.-A.1    Owren, M.J.2
  • 4
    • 0034855678
    • The acoustic features of human laughter
    • J. A. Bachorowski, M. J. Smoski, and M. J. Owren, "The acoustic features of human laughter," J. Acoust. Soc. Amer., vol. 110, no. 1, pp. 1581-1597, 2001.
    • (2001) J. Acoust. Soc. Amer. , vol.110 , Issue.1 , pp. 1581-1597
    • Bachorowski, J.A.1    Smoski, M.J.2    Owren, M.J.3
  • 5
    • 0001835850
    • Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
    • P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," Proc. Inst. Phonet. Sci., vol. 17, pp. 97-110, 1993.
    • (1993) Proc. Inst. Phonet. Sci. , vol.17 , pp. 97-110
    • Boersma, P.1
  • 7
    • 65249116503
    • Analysis of emotionally salient aspects of fundamental frequency for emotion detection
    • C. Busso, S. Lee, and S. Narayanan, "Analysis of emotionally salient aspects of fundamental frequency for emotion detection," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 582-596, 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 582-596
    • Busso, C.1    Lee, S.2    Narayanan, S.3
  • 12
    • 15744378497
    • The timing of facial motion in posed and spontaneous smiles
    • J. F. Cohn and K. L. Schmidt, "The timing of facial motion in posed and spontaneous smiles," Int. J. Wavelets Multires. Inf. Process., vol. 2, pp. 121-132, 2005.
    • (2005) Int. J. Wavelets Multires. Inf. Process. , vol.2 , pp. 121-132
    • Cohn, J.F.1    Schmidt, K.L.2
  • 15
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, 2000.
    • (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 17
    • 79952915218
    • PLP and RASTA (and MFCC and Inversion) in Matlab
    • D. P. W. Ellis, PLP and RASTA (and MFCC and Inversion) in Matlab, 2005. [Online]. Available: http://www.ee.columbia.edu/dpwe/resources/matlab/rastamat.
    • (2005)
    • Ellis, D.P.W.1
  • 18
    • 0023454438
    • The vocabulary problem in human-system communication
    • G. Furnas, T. Landauer, L. Gomez, and S. Dumais, "The vocabulary problem in human-system communication," Commun. ACM, vol. 30, no. 11, pp. 964-972, 1987.
    • (1987) Commun. ACM , vol.30 , Issue.11 , pp. 964-972
    • Furnas, G.1    Landauer, T.2    Gomez, L.3    Dumais, S.4
  • 19
    • 34548063185
    • Toward pose-invariant 2-D face recognition through point distribution models and facial symmetry
    • DOI 10.1109/TIFS.2007.903543
    • D. Gonzalez-Jimenez and J. L. Alba-Castro, "Toward pose-invariant 2-D face recognition through point distribution models and facial symmetry," IEEE Trans. Inf. Forensics Security, vol. 2, no. 3, pp. 413-429, 2007. (Pubitemid 47290576)
    • (2007) IEEE Transactions on Information Forensics and Security , vol.2 , Issue.3 , pp. 413-429
    • Gonzalez-Jimenez, D.1    Alba-Castro, J.L.2
  • 22
    • 0037279492
    • Content-based audio classification and retrieval by support vector machines
    • G. Guo and S. Li, "Content-based audio classification and retrieval by support vector machines," IEEE Trans. Neural Netw., vol. 14, no. 1, pp. 209-215, 2003.
    • (2003) IEEE Trans. Neural Netw. , vol.14 , Issue.1 , pp. 209-215
    • Guo, G.1    Li, S.2
  • 23
    • 0031573117
    • Long Short-Term Memory
    • S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computat., vol. 9, no. 8, pp. 1735-1780, 1997. (Pubitemid 127462305)
    • (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
    • Hochreiter, S.1    Schmidhuber, J.2
  • 24
    • 70349591045
    • Laughter differs in children with autism: An acoustic analysis of laughs produced by children with and without the disorder
    • W. Hudenko, W. Stone, and J. Bachorowski, "Laughter differs in children with autism: An acoustic analysis of laughs produced by children with and without the disorder," J. Autism Develop. Disorders, vol. 39, no. 10, pp. 1392-1400, 2009.
    • (2009) J. Autism Develop. Disorders , vol.39 , Issue.10 , pp. 1392-1400
    • Hudenko, W.1    Stone, W.2    Bachorowski, J.3
  • 25
    • 33745159644
    • Smile and laughter recognition using speech processing and face recognition from conversation video
    • A. Ito, W. Xinyue, M. Suzuki, and S. Makino, "Smile and laughter recognition using speech processing and face recognition from conversation video," in Proc. Int. Conf. Cyberworlds, 2005, pp. 8-15.
    • (2005) Proc. Int. Conf. Cyberworlds , pp. 8-15
    • Ito, A.1    Xinyue, W.2    Suzuki, M.3    Makino, S.4
  • 28
    • 3042649924
    • The role of rhythm and pitch in the evaluation of human laughter
    • DOI 10.1023/A:1027384817134
    • S. Kipper and D. Todt, "The role of rhythm and pitch in the evaluation of human laughter," J. Nonverb. Behav., vol. 27, no. 4, pp. 255-272, 2003. (Pubitemid 39015682)
    • (2003) Journal of Nonverbal Behavior , vol.27 , Issue.4 , pp. 255-272
    • Kipper, S.1    Todt, D.2
  • 29
    • 84867209015
    • Getting the last laugh: Automatic laughter segmentation in meetings
    • M. Knox, N. Morgan, and N. Mirghafori, "Getting the last laugh: Automatic laughter segmentation in meetings," in Proc. INTERSPEECH, 2008, pp. 797-800.
    • (2008) Proc. INTERSPEECH , pp. 797-800
    • Knox, M.1    Morgan, N.2    Mirghafori, N.3
  • 30
    • 57849169432
    • Analysis of the occurrence of laughter in meetings
    • K. Laskowski and S. Burger, "Analysis of the occurrence of laughter in meetings," in Proc. INTERSPEECH, 2007, pp. 1258-1261.
    • (2007) Proc. INTERSPEECH , pp. 1258-1261
    • Laskowski, K.1    Burger, S.2
  • 31
    • 57849164116
    • Detection of laughter-in-Interaction in multichannel close-talk microphone recordings of meetings
    • K. Laskowski and T. Schultz, "Detection of laughter-in-Interaction in multichannel close-talk microphone recordings of meetings," Lecture Notes in Computer Science, vol. 5237, pp. 149-160, 2008.
    • (2008) Lecture Notes in Computer Science , vol.5237 , pp. 149-160
    • Laskowski, K.1    Schultz, T.2
  • 32
    • 33746529930
    • A study in machine learning from imbalanced data for sentence boundary detection in speech
    • DOI 10.1016/j.csl.2005.06.002, PII S0885230805000306
    • Y. Liu, N. Chawla, M. Harper, E. Shriberg, and A. Stolcke, "A study in machine learning from imbalanced data for sentence boundary detection in speech," Comput. Speech Lang., vol. 20, no. 4, pp. 468-494, 2006. (Pubitemid 44142004)
    • (2006) Computer Speech and Language , vol.20 , Issue.4 , pp. 468-494
    • Liu, Y.1    Chawla, N.V.2    Harper, M.P.3    Shriberg, E.4    Stolcke, A.5
  • 38
    • 85032751429
    • Implicit human-centered tagging
    • M. Pantic and A. Vinciarelli, "Implicit human-centered tagging," IEEE Signal Process. Mag., vol. 26, no. 6, pp. 173-180, 2009.
    • (2009) IEEE Signal Process. Mag. , vol.26 , Issue.6 , pp. 173-180
    • Pantic, M.1    Vinciarelli, A.2
  • 40
    • 74049087896
    • Static vs. dynamic modeling of human nonverbal behavior from multiple cues and modalities
    • S. Petridis, H. Gunes, S. Kaltwang, and M. Pantic, "Static vs. dynamic modeling of human nonverbal behavior from multiple cues and modalities," in Proc. ICMI, 2009, pp. 23-30.
    • (2009) Proc. ICMI , pp. 23-30
    • Petridis, S.1    Gunes, H.2    Kaltwang, S.3    Pantic, M.4
  • 44
    • 70350735731
    • Is this joke really funny? Judging the mirth by audiovisual laughter analysis
    • S. Petridis and M. Pantic, "Is this joke really funny? Judging the mirth by audiovisual laughter analysis," in Proc. IEEE Int. Conf. Multimedia & Expo, 2009, pp. 1444-1447.
    • (2009) Proc. IEEE Int. Conf. Multimedia & Expo , pp. 1444-1447
    • Petridis, S.1    Pantic, M.2
  • 45
    • 4544290191
    • Recent advances in the automatic recognition of audiovisual speech
    • G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, "Recent advances in the automatic recognition of audiovisual speech," Proc. IEEE, vol. 91, no. 9, pp. 1306-1326, 2003.
    • (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1306-1326
    • Potamianos, G.1    Neti, C.2    Gravier, G.3    Garg, A.4    Senior, A.W.5
  • 46
    • 84990794587
    • Laughter punctuates speech: Linguistic, social and gender contexts of laughter
    • R. Provine, "Laughter punctuates speech: Linguistic, social and gender contexts of laughter," Ethology, vol. 95, no. 4, pp. 291-298, 1993.
    • (1993) Ethology , vol.95 , Issue.4 , pp. 291-298
    • Provine, R.1
  • 48
    • 84990789911
    • Laughter: A stereotyped human vocalization
    • R. Provine and Y. Yong, "Laughter: A stereotyped human vocalization," Ethology, vol. 89, no. 2, pp. 115-124, 1991.
    • (1991) Ethology , vol.89 , Issue.2 , pp. 115-124
    • Provine, R.1    Yong, Y.2
  • 51
    • 84943274699
    • A direct adaptive method for faster backpropagation learning: The RPROP algorithm
    • M. Riedmiller and H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm," in Proc. IEEE Int. Conf. Neural Networks, 1993, vol. 1, pp. 586-591.
    • (1993) Proc. IEEE Int. Conf. Neural Networks , vol.1 , pp. 586-591
    • Riedmiller, M.1    Braun, H.2
  • 52
    • 0031668260
    • Analysis of laughter and speech sounds in Italian and German students
    • DOI 10.1007/s001140050522
    • H. Rothgänger, G. Hauser, A. Cappellini, and A. Guidotti, "Analysis of laughter and speech sounds in Italian and German students," Naturwissenschaften, vol. 85, no. 8, pp. 394-402, 1998. (Pubitemid 28402160)
    • (1998) Naturwissenschaften , vol.85 , Issue.8 , pp. 394-402
    • Rothganger, H.1    Hauser, G.2    Cappellini, A.C.3    Guidotti, A.4
  • 53
    • 0003809764
    • The expressive pattern of laughter
    • Singapore: World Scientific
    • W. Ruch and P. Ekman, "The expressive pattern of laughter," in Emotions, Qualia and Consciousness. Singapore: World Scientific, 2001, pp. 426-443.
    • (2001) Emotions, Qualia and Consciousness , pp. 426-443
    • Ruch, W.1    Ekman, P.2
  • 54
    • 85121320685
    • Affect bursts
    • S. van Goozen, N. van de Poll, and J. Sergeant, Eds. Hillsdale, NJ: Erlbaum
    • K. Scherer, "Affect bursts," in Emotions: Essays on Emotion Theory, S. van Goozen, N. van de Poll, and J. Sergeant, Eds. Hillsdale, NJ: Erlbaum, 1994, pp. 161-193.
    • (1994) Emotions: Essays on Emotion Theory , pp. 161-193
    • Scherer, K.1
  • 55
    • 63449087780
    • Perception of non-verbal emotional listener feedback
    • Dresden, Germany
    • M. Schroeder, D. Heylen, and I. Poggi, "Perception of non-verbal emotional listener feedback," in Proc. Speech Prosody, Dresden, Germany, 2006, pp. 1-4.
    • (2006) Proc. Speech Prosody , pp. 1-4
    • Schroeder, M.1    Heylen, D.2    Poggi, I.3
  • 57
    • 48249106592
    • Static and dynamic modelling for the recognition of non-verbal vocalisations in conversational speech
    • B. Schueller, F. Eyben, and G. Rigoll, "Static and dynamic modelling for the recognition of non-verbal vocalisations in conversational speech," Lecture Notes in Computer Science, vol. 5078, pp. 99-110, 2008.
    • (2008) Lecture Notes in Computer Science , vol.5078 , pp. 99-110
    • Schueller, B.1    Eyben, F.2    Rigoll, G.3
  • 60
    • 33846907749
    • Automatic discrimination between laughter and speech
    • DOI 10.1016/j.specom.2007.01.001, PII S0167639307000027
    • K. P. Truong and D. A. van Leeuwen, "Automatic discrimination between laughter and speech," Speech Commun., vol. 49, no. 2, pp. 144-158, 2007. (Pubitemid 46241514)
    • (2007) Speech Communication , vol.49 , Issue.2 , pp. 144-158
    • Truong, K.P.1    Van Leeuwen, D.A.2
  • 61
    • 57849100685
    • Evaluating laughter segmentation in meetings with acoustic and acoustic-phonetic features
    • K. P. Truong and D. A. van Leeuwen, "Evaluating laughter segmentation in meetings with acoustic and acoustic-phonetic features," in Proc. Workshop Phonetics of Laughter, 2007.
    • (2007) Proc. Workshop Phonetics of Laughter
    • Truong, K.P.1    Van Leeuwen, D.A.2
  • 63
    • 3543005991
    • Laughter in conversation: Features of occurrence and acoustic structure
    • DOI 10.1023/B:JONB.0000023654.73558.72
    • J. Vettin and D. Todt, "Laughter in conversation: Features of occurrence and acoustic structure," J. Nonverb. Behav., vol. 28, no. 2, pp. 93-115, 2004. (Pubitemid 39015675)
    • (2004) Journal of Nonverbal Behavior , vol.28 , Issue.2 , pp. 93-115
    • Vettin, J.1    Todt, D.2
  • 64
    • 61549132763
    • Social signal processing: Survey of an emerging domain
    • A. Vinciarelli, M. Pantic, and H. Bourlard, "Social signal processing: Survey of an emerging domain," Image Vis. Comput., vol. 27, no. 12, pp. 1743-1759, 2009.
    • (2009) Image Vis. Comput. , vol.27 , Issue.12 , pp. 1743-1759
    • Vinciarelli, A.1    Pantic, M.2    Bourlard, H.3
  • 65
    • 38049048651
    • Frame vs. turn-level: Emotion recognition from speech considering static and dynamic processing
    • B. Vlasenko, B. Schueller, A. Wendemuth, and G. Rigoll, "Frame vs. turn-level: Emotion recognition from speech considering static and dynamic processing," in Proc. ACII, 2007, pp. 139-147.
    • (2007) Proc. ACII , pp. 139-147
    • Vlasenko, B.1    Schueller, B.2    Wendemuth, A.3    Rigoll, G.4
  • 66
    • 0036656895
    • Linking facial animation, head motion and speech acoustics
    • H. Yehia, T. Kuratate, and E. Vatikiotis-Bateson, "Linking facial animation, head motion and speech acoustics," J. Phonet., vol. 30, no. 3, pp. 555-568, 2002.
    • (2002) J. Phonet. , vol.30 , Issue.3 , pp. 555-568
    • Yehia, H.1    Kuratate, T.2    Vatikiotis-Bateson, E.3
  • 67
    • 34147129605
    • Combining cepstral and prosodic features in language identification
    • DOI 10.1109/ICPR.2006.381, 1699828, Proceedings - 18th International Conference on Pattern Recognition, ICPR 2006
    • B. Yin, E. Ambikairajah, and F. Chen, "Combining cepstral and prosodic features in language identification," in Proc. Int. Conf. Pattern Recognition, 2006, vol. 4, pp. 254-257. (Pubitemid 46553409)
    • (2006) Proceedings - International Conference on Pattern Recognition , vol.4 , pp. 254-257
    • Yin, B.1    Ambikairajah, E.2    Chen, F.3
  • 68
    • 57149144228
    • A survey of affect recognition methods: Audio, visual and spontaneous expressions
    • Z. Zeng, M. Pantic, G. Roisman, and T. Huang, "A survey of affect recognition methods: Audio, visual and spontaneous expressions," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 1, pp. 39-58, 2009.
    • (2009) IEEE Trans. Pattern Anal. Mach. Intell. , vol.31 , Issue.1 , pp. 39-58
    • Zeng, Z.1    Pantic, M.2    Roisman, G.3    Huang, T.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.