Volume 20, Issue 8, 2012, Pages 2329-2340

Generating human-like behaviors using joint, speech-driven models for conversational agents

Author keywords

Conversational agent (CA); dynamic Bayesian network (DBN); facial animation; visual prosody

Indexed keywords

AUDIO-VISUAL INFORMATION; CONVERSATIONAL AGENTS; DYNAMIC BAYESIAN NETWORKS; FACIAL ANIMATION; FACIAL EXPRESSIONS; FACIAL GESTURES; HEAD MOTION; HUMAN COMMUNICATIONS; JOINT MODELS; PERCEPTUAL EVALUATION; VISUAL ASPECTS; VISUAL PERCEPTION; VISUAL PROSODY;

EID: 84865398720     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2012.2201476     Document Type: Article
Times cited : (53)

References (47)
  • 1
    • 48149101094
    • Joint analysis of the emotional fingerprint in the face and speech: A single subject study
    • Chania, Crete, Greece, Oct. 2007
    • C. Busso and S. Narayanan, "Joint analysis of the emotional fingerprint in the face and speech: A single subject study," in Int. Workshop Multimedia Signal Process. (MMSP 2007), Chania, Crete, Greece, Oct. 2007, pp. 43-47.
    • Int. Workshop Multimedia Signal Process. (MMSP 2007) , pp. 43-47
    • Busso, C.1    Narayanan, S.2
  • 2
    • 0037384712
    • Vocal communication of emotion: A review of research paradigms
    • Apr
    • K. Scherer, "Vocal communication of emotion: A review of research paradigms," Speech Commun., vol. 40, no. 1-2, pp. 227-256, Apr. 2003.
    • (2003) Speech Commun. , vol.40 , Issue.1-2 , pp. 227-256
    • Scherer, K.1
  • 3
    • 0017199877
    • Hearing lips and seeing voices
    • Dec
    • H. McGurk and J. W. MacDonald, "Hearing lips and seeing voices," Nature, vol. 264, pp. 746-748, Dec. 1976.
    • (1976) Nature , vol.264 , pp. 746-748
    • McGurk, H.1    MacDonald, J.W.2
  • 7
    • 0035277175
    • More than just a pretty face: Conversational protocols and the affordances of embodiment
    • DOI 10.1016/S0950-7051(00)00102-7
    • J. Cassell, T. Bickmore, L. Campbell, H. Vilhjálmsson, and H. Yan, "More than just a pretty face: Conversational protocols and the affordances of embodiment," Knowl.-Based Syst., vol. 14, pp. 55-64, Mar. 2001. (Pubitemid 32264619)
    • (2001) Knowledge-Based Systems , vol.14 , Issue.1-2 , pp. 55-64
    • Cassell, J.1    Bickmore, T.2    Campbell, L.3    Vilhjalmsson, H.4    Yan, H.5
  • 9
    • 0010569888
    • Performative facial expressions in animated faces
    • J. Cassell, J. Sullivan, S. Prevost, and E. Churchill, Eds. Cambridge, MA: MIT Press
    • I. Poggi and C. Pelachaud, "Performative facial expressions in animated faces," in Embodied Conversational Agents, J. Cassell, J. Sullivan, S. Prevost, and E. Churchill, Eds. Cambridge, MA: MIT Press, 2000, pp. 154-188.
    • (2000) Embodied Conversational Agents , pp. 154-188
    • Poggi, I.1    Pelachaud, C.2
  • 10
    • 78049482995
    • Comparing rule-based and data-driven selection of facial displays
    • Prague, Czech Republic Jun
    • M. Foster, "Comparing rule-based and data-driven selection of facial displays," in Proc. Workshop Embodied Lang. Process., Assoc. for Comput. Linguist., Prague, Czech Republic, Jun. 2007, pp. 1-8.
    • (2007) Proc. Workshop Embodied Lang. Process., Assoc. for Comput. Linguist. , pp. 1-8
    • Foster, M.1
  • 13
    • 84877622000
    • Speaking with hands: Creating animated conversational characters from recordings of human performance
    • August
    • M. Stone, D. DeCarlo, I. Oh, C. Rodriguez, A. Stere, A. Lees, and C. Bregler, "Speaking with hands: Creating animated conversational characters from recordings of human performance," ACM Trans. Graphics (TOG), vol. 23, pp. 506-513, August 2004.
    • (2004) ACM Trans. Graphics (TOG) , vol.23 , pp. 506-513
    • Stone, M.1    DeCarlo, D.2    Oh, I.3    Rodriguez, C.4    Stere, A.5    Lees, A.6    Bregler, C.7
  • 14
    • 27144506606
    • Natural head motion synthesis driven by acoustic prosodic features
    • DOI 10.1002/cav.80
    • C. Busso, Z. Deng, U. Neumann, and S. Narayanan, "Natural head motion synthesis driven by acoustic prosodic features," Comput. Animation and Virtual Worlds, vol. 16, no. 3-4, pp. 283-290, Jul. 2005. (Pubitemid 41495224)
    • (2005) Computer Animation and Virtual Worlds , vol.16 , Issue.3-4 , pp. 283-290
    • Busso, C.1    Deng, Z.2    Neumann, U.3    Narayanan, S.4
  • 15
    • 72349087042
    • Learning expressive human-like head motion sequences from speech
    • Z. Deng and U. Neumann, Eds. Surrey, U.K.: Springer-Verlag
    • C. Busso, Z. Deng, U. Neumann, and S. Narayanan, "Learning expressive human-like head motion sequences from speech," in Data-Driven 3D Facial Animations, Z. Deng and U. Neumann, Eds. Surrey, U.K.: Springer-Verlag, 2007, pp. 113-131.
    • (2007) Data-Driven 3D Facial Animations , pp. 113-131
    • Busso, C.1    Deng, Z.2    Neumann, U.3    Narayanan, S.4
  • 16
    • 42949107237
    • Interrelation between speech and facial gestures in emotional utterances: A single subject study
    • Nov
    • C. Busso and S. Narayanan, "Interrelation between speech and facial gestures in emotional utterances: A single subject study," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2331-2347, Nov. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2331-2347
    • Busso, C.1    Narayanan, S.2
  • 18
    • 0032178592
    • Quantitative association of vocal-tract and facial behavior
    • PII S016763939800048X
    • H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative association of vocal-tract and facial behavior," Speech Commun., vol. 26, no. 1-2, pp. 23-43, 1998. (Pubitemid 128381217)
    • (1998) Speech Communication , vol.26 , Issue.1-2 , pp. 23-43
    • Yehia, H.1    Rubin, P.2    Vatikiotis-Bateson, E.3
  • 19
    • 21844443583
    • Audiovisual representation of prosody in expressive speech communication
    • DOI 10.1016/j.specom.2005.02.017, PII S0167639305001032, Quantitative Prosody Modelling for Natural Speech Description and Generation
    • B. Granström and D. House, "Audiovisual representation of prosody in expressive speech communication," Speech Commun., vol. 46, no. 3-4, pp. 473-484, Jul. 2005. (Pubitemid 40952529)
    • (2005) Speech Communication , vol.46 , Issue.3-4 , pp. 473-484
    • Granstrom, B.1    House, D.2
  • 21
    • 77951108686
    • Eyebrow raises in dialogue and their relation to discourse structure, utterance function and pitch accents in English
    • Jun
    • M. L. Flecha-García, "Eyebrow raises in dialogue and their relation to discourse structure, utterance function and pitch accents in English," Speech Commun., vol. 52, pp. 542-554, Jun. 2010.
    • (2010) Speech Commun. , vol.52 , pp. 542-554
    • Flecha-García, M.L.1
  • 23
    • 1642405348
    • Visual prosody and speech intelligibility: Head movement improves auditory speech perception
    • K. G. Munhall, J. A. Jones, D. E. Callan, T. Kuratate, and E. Vatikiotis-Bateson, "Visual prosody and speech intelligibility: Head movement improves auditory speech perception," Psychol. Sci., vol. 15, no. 2, pp. 133-137, February 2004. (Pubitemid 38361306)
    • (2004) Psychological Science , vol.15 , Issue.2 , pp. 133-137
    • Munhall, K.G.1    Jones, J.A.2    Callan, D.E.3    Kuratate, T.4    Vatikiotis-Bateson, E.5
  • 25
    • 0002473893
    • Generating facial expressions for speech
    • DOI 10.1016/S0364-0213(99)80001-9
    • C. Pelachaud, N. Badler, and M. Steedman, "Generating facial expressions for speech," Cognitive Sci., vol. 20, no. 1, pp. 1-46, January 1996. (Pubitemid 126159862)
    • (1996) Cognitive Science , vol.20 , Issue.1 , pp. 1-46
    • Pelachaud, C.1    Badler, N.I.2    Steedman, M.3
  • 27
    • 77954605110
    • Making discourse visible: Coding and animating conversational facial displays
    • Geneva, Switzerland, Jun. 2002
    • D. DeCarlo, C. Revilla, M. Stone, and J. Venditti, "Making discourse visible: Coding and animating conversational facial displays," in Proc. Comput. Animat. (CA 2002), Geneva, Switzerland, Jun. 2002, pp. 11-16.
    • Proc. Comput. Animat. (CA 2002) , pp. 11-16
    • DeCarlo, D.1    Revilla, C.2    Stone, M.3    Venditti, J.4
  • 29
    • 0026156861
    • A media conversion from speech to facial image for intelligent man-machine interface
    • DOI 10.1109/49.81953
    • S. Morishima and H. Harashima, "A media conversion from speech to facial image for intelligent man-machine interface," IEEE J. Sel. Areas Commun., vol. 9, no. 4, pp. 594-600, May 1991. (Pubitemid 21645615)
    • (1991) IEEE Journal on Selected Areas in Communications , vol.9 , Issue.4 , pp. 594-600
    • Morishima Shigeo1    Harashima Hiroshi2
  • 30
    • 0031997085
    • Audio-to-visual conversion for multimedia communication
    • PII S0278004698004158
    • R. Rao, T. Chen, and R. Mersereau, "Audio-to-visual conversion for multimedia communication," IEEE Trans. Industrial Electron., vol. 45, no. 1, pp. 15-22, Feb. 1998. (Pubitemid 128739734)
    • (1998) IEEE Transactions on Industrial Electronics , vol.45 , Issue.1 , pp. 15-22
    • Rao, R.R.1    Chen, T.2    Mersereau, R.M.3
  • 33
    • 0036874999
    • Dynamic Bayesian networks for audio-visual speech recognition
    • Jan
    • A. V. Nefian, L. Liang, X. Pi, X. Liu, and K. Murphy, "Dynamic Bayesian networks for audio-visual speech recognition," EURASIP J. Appl. Signal Process., vol. 2002, pp. 1274-1288, Jan. 2002.
    • (2002) EURASIP J. Appl. Signal Process. , vol.2002 , pp. 1274-1288
    • Nefian, A.V.1    Liang, L.2    Pi, X.3    Liu, X.4    Murphy, K.5
  • 35
    • 33645777234
    • Expressive speech-driven facial animation
    • Oct
    • Y. Cao, W. Tien, P. Faloutsos, and F. Pighin, "Expressive speech-driven facial animation," ACM Trans. Graphics, vol. 24, pp. 1283-1302, Oct. 2005.
    • (2005) ACM Trans. Graphics , vol.24 , pp. 1283-1302
    • Cao, Y.1    Tien, W.2    Faloutsos, P.3    Pighin, F.4
  • 40
    • 0004035636
    • Praat, a system for doing phonetics by computer
    • Univ. of Amsterdam, Amsterdam, The Netherlands, Tech. Rep. [Online]. Available:
    • P. Boersma and D. Weenink, "Praat, a system for doing phonetics by computer," Inst. of Phonetic Sci., Univ. of Amsterdam, Amsterdam, The Netherlands, Tech. Rep. 132, 1996 [Online]. Available: http://www.praat.org
    • (1996) Inst. of Phonetic Sci. , vol.132
    • Boersma, P.1    Weenink, D.2
  • 42
    • 0031268341
    • Factorial hidden Markov models
    • Z. Ghahramani and M. I. Jordan, "Factorial hidden Markov models," Mach. Learn., vol. 29, pp. 245-273, Nov. 1997. (Pubitemid 127510040)
    • (1997) Machine Learning , vol.29 , Issue.2-3 , pp. 245-273
    • Ghahramani, Z.1    Jordan, M.I.2
  • 45
    • 84876513525
    • Xface: MPEG-4 based open source toolkit for 3D facial animation
    • Gallipoli, Italy, May 2004
    • K. Balci, "Xface: MPEG-4 based open source toolkit for 3D facial animation," in Proc. Conf. Adv. Vis. Interfaces (AVI 2004), Gallipoli, Italy, May 2004, pp. 399-402.
    • Proc. Conf. Adv. Vis. Interfaces (AVI 2004) , pp. 399-402
    • Balci, K.1
  • 46
    • 17644380476
    • Robust methods for canonical correlation analysis
    • Berlin, Germany: Springer-Verlag
    • C. Dehon, P. Filzmoser, and C. Croux, "Robust methods for canonical correlation analysis," in Data Analysis, Classification, Related Methods. Berlin, Germany: Springer-Verlag, 2000, pp. 321-326.
    • (2000) Data Analysis, Classification, Related Methods , pp. 321-326
    • Dehon, C.1    Filzmoser, P.2    Croux, C.3
  • 47
    • 65249116503
    • Analysis of emotionally salient aspects of fundamental frequency for emotion detection
    • May
    • C. Busso, S. Lee, and S. Narayanan, "Analysis of emotionally salient aspects of fundamental frequency for emotion detection," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 4, pp. 582-596, May 2009.
    • (2009) IEEE Trans. Audio, Speech, Lang. Process. , vol.17 , Issue.4 , pp. 582-596
    • Busso, C.1    Lee, S.2    Narayanan, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.