Volume 17, Issue 3, 2009, Pages 459-468

Emphatic visual speech synthesis

Author keywords

Audiovisual speech synthesis; Emphatic visual speech; Talking head

Indexed keywords

AUDIOVISUAL SPEECH SYNTHESIS; BEHAVIORAL SYNTHESIS; COMMUNICATION STYLES; EMPHATIC VISUAL-SPEECH; FACIAL SYNTHESIS; HUMAN BEING; HUMAN SUBJECTS; PERCEPTUAL TEST; REAL SAMPLES; RESEARCH AREAS; STATISTICAL INTERACTION; TALKING HEAD; TALKING HEADS; UNIT SELECTION; VISUAL SPEECH SYNTHESIS; VOCAL-TRACTS;

EID: 70350442426     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2008.2010213     Document Type: Article
Times cited: 13

References (60)
  • 2
    • 0026147914
    • Conspec and conlern: A two-process theory of infant face recognition
    • J. Morton and M. Johnson, "Conspec and conlern: A two-process theory of infant face recognition, " Psychol. Rev., vol. 98, no. 2, pp. 164-181, 1991.
    • (1991) Psychol. Rev. , vol.98 , Issue.2 , pp. 164-181
    • Morton, J.1    Johnson, M.2
  • 5
    • 85009080413
    • Auditory visual speech processing
    • Aalborg, Denmark
    • D. Massaro, "Auditory visual speech processing, " in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 1153-1156.
    • (2001) Proc. Eurospeech , pp. 1153-1156
    • Massaro, D.1
  • 6
    • 85032752352
    • Audiovisual speech processing: Lip reading and lip synchronization
    • Jan
    • T. Chen, "Audiovisual speech processing: Lip reading and lip synchronization, " IEEE Signal Process. Mag., vol. 18, no. 1, pp. 9-31, Jan. 2001.
    • (2001) IEEE Signal Process. Mag. , vol.18 , Issue.1 , pp. 9-31
    • Chen, T.1
  • 7
    • 85133709259
    • Picture my voice: Audio to visual synthesis using artificial neural networks
    • Santa Cruz, CA
    • D. Massaro, J. Beskow, M. Cohen, C. Fry, and T. Rodriguez, "Picture my voice: Audio to visual synthesis using artificial neural networks, " in Proc. AVSP, Santa Cruz, CA, 1999, pp. 133-138.
    • (1999) Proc. AVSP , pp. 133-138
    • Massaro, D.1    Beskow, J.2    Cohen, M.3    Fry, C.4    Rodriguez, T.5
  • 8
    • 84961920244
    • Audiovisual speech synthesis
    • [Online]. Available
    • G. Bailly, "Audiovisual speech synthesis, " in Proc. ETRW Speech Synth., 2001 [Online]. Available: citeseer.ist.psu.edu/bailly03audiovisual.html.
    • (2001) Proc. ETRW Speech Synth.
    • Bailly, G.1
  • 10
    • 85018094829
    • Computer generated animation of faces
    • New York
    • F. Parke, "Computer generated animation of faces, " in Proc. ACM'72: ACM Annual Conf., New York, 1972, pp. 451-457.
    • (1972) Proc. ACM'72: ACM Annual Conf. , pp. 451-457
    • Parke, F.1
  • 11
    • 35349018832
    • Transferring of speech movements from video to 3D face space
    • Jan
    • Y. Pei and H. Zha, "Transferring of speech movements from video to 3D face space, " IEEE Trans. Vis. Comput. Graphics, vol. 13, no. 1, pp. 58-69, Jan. 2007.
    • (2007) IEEE Trans. Vis. Comput. Graphics , vol.13 , Issue.1 , pp. 58-69
    • Pei, Y.1    Zha, H.2
  • 13
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • Atlanta, GA
    • A. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database, " in Proc. ICASSP, Atlanta, GA, 1996, vol. 1, pp. 373-376.
    • (1996) Proc. ICASSP , vol.1 , pp. 373-376
    • Hunt, A.1    Black, A.2
  • 16
    • 0141637236
    • Robust parameterized component analysis: Theory and applications to 2D facial appearance models
    • F. de la Torre and M. Black, "Robust parameterized component analysis: Theory and applications to 2D facial appearance models, " Comput. Vis. Image Underst., vol. 91, no. 1-2, pp. 53-71, 2003.
    • (2003) Comput. Vis. Image Underst. , vol.91 , Issue.1-2 , pp. 53-71
    • De la Torre, F.1    Black, M.2
  • 17
    • 0036903144
    • Adding and subtracting eigenspaces with eigenvalue decomposition and singular value decomposition
    • Dec
    • P. M. Hall, D. R. Marshall, and R. Martin, "Adding and subtracting eigenspaces with eigenvalue decomposition and singular value decomposition, " IVC, vol. 20, no. 13-14, pp. 1009-1016, Dec. 2002.
    • (2002) IVC , vol.20 , Issue.13-14 , pp. 1009-1016
    • Hall, P.M.1    Marshall, D.R.2    Martin, R.3
  • 18
    • 21844467038
    • Ph.D. dissertation, Swiss Federal Inst. of Technol. Lausanne, Switzerland
    • E. Cosatto, "Sample-based talking-head synthesis, " Ph.D. dissertation, Swiss Federal Inst. of Technol., Lausanne, Switzerland, 2002.
    • (2002) Sample-based Talking-head Synthesis
    • Cosatto, E.1
  • 19
    • 0036989560
    • Trainable videorealistic speech animation
    • T. Ezzat, G. Geiger, and T. Poggio, "Trainable videorealistic speech animation, " ACM Trans. Graph., vol. 21, no. 3, pp. 388-398, 2002.
    • (2002) ACM Trans. Graph. , vol.21 , Issue.3 , pp. 388-398
    • Ezzat, T.1    Geiger, G.2    Poggio, T.3
  • 20
    • 0017199877
    • Hearing lips and seeing voices
    • H. McGurk and J. MacDonald, "Hearing lips and seeing voices, " Nature, vol. 264, pp. 746-748, 1976.
    • (1976) Nature , vol.264 , pp. 746-748
    • McGurk, H.1    MacDonald, J.2
  • 21
    • 0030677313
    • Video rewrite: Driving visual speech with audio
    • C. Bregler, M. Covell, and M. Slaney, "Video rewrite: Driving visual speech with audio, " Proc. SIGGRAPH, pp. 353-360, 1997.
    • (1997) Proc. SIGGRAPH , pp. 353-360
    • Bregler, C.1    Covell, M.2    Slaney, M.3
  • 25
    • 29544436226
    • Range- and domain-specific exaggeration of facial speech
    • H. Hill, N. Troje, and A. Johnston, "Range- and domain-specific exaggeration of facial speech, " J. Vision, vol. 5, no. 10, pp. 793-807, 2005.
    • (2005) J. Vision , vol.5 , Issue.10 , pp. 793-807
    • Hill, H.1    Troje, N.2    Johnston, A.3
  • 26
    • 38149065718
    • Visual correlates to prominence in several expressive modes
    • Pittsburgh, PA
    • J. Beskow, B. Granström, and D. House, "Visual correlates to prominence in several expressive modes, " in Proc. Interspeech, Pittsburgh, PA, 2006, pp. 1272-1275.
    • (2006) Proc. Interspeech , pp. 1272-1275
    • Beskow, J.1    Granström, B.2    House, D.3
  • 27
    • 70350473963
    • Development of a female voice for a concatenative text-to-speech synthesis system
    • A. Syrdal, "Development of a female voice for a concatenative text-to-speech synthesis system, " Current Topics Acoust. Res., pp. 169-181, 1994.
    • (1994) Current Topics Acoust. Res. , pp. 169-181
    • Syrdal, A.1
  • 28
    • 84966350572
    • Perfect synthesis for all of the people all of the time
    • A. Black, "Perfect synthesis for all of the people all of the time, " in Proc. 2002 IEEE Workshop Speech Synth., 2002, pp. 167-170.
    • (2002) Proc. 2002 IEEE Workshop Speech Synth. , pp. 167-170
    • Black, A.1
  • 29
    • 0023381391
    • Principles of animation as applied to 3D character animation
    • J. Lasseter, "Principles of animation as applied to 3D character animation, " Comput. Graphics, vol. 21, pp. 35-44, 1987.
    • (1987) Comput. Graphics , vol.21 , pp. 35-44
    • Lasseter, J.1
  • 32
    • 0000665734
    • Explaining phonetic variation: A sketch of the H and H theory
    • B. Lindblom, "Explaining phonetic variation: A sketch of the H and H theory, " Speech Production and Speech Modelling, vol. 55, pp. 403-439, 1990.
    • (1990) Speech Production and Speech Modelling , vol.55 , pp. 403-439
    • Lindblom, B.1
  • 35
    • 20444397148
    • Congruent and incongruent audiovisual cues to prominence
    • Nara, Japan
    • M. Swerts and E. Krahmer, "Congruent and incongruent audiovisual cues to prominence, " in Proc. Speech Prosody Conf., Nara, Japan, 2004, pp. 69-72.
    • (2004) Proc. Speech Prosody Conf. , pp. 69-72
    • Swerts, M.1    Krahmer, E.2
  • 37
    • 70350498365
    • Testing the effect of audiovisual cues to prominence via a reaction-time experiment
    • Pittsburgh, PA, 2006, paper 1288-Mon3A3O.4
    • E. Krahmer and M. Swerts, "Testing the effect of audiovisual cues to prominence via a reaction-time experiment, " in Int. Conf. Spoken Lang. Process., Pittsburgh, PA, 2006, paper 1288-Mon3A3O.4.
    • (2006) Int. Conf. Spoken Lang. Process.
    • Krahmer, E.1    Swerts, M.2
  • 38
    • 0036594804
    • Facial expressions modulate the time course of long latency auditory brain potentials
    • G. Pourtois, D. Debatisse, P. Despland, and B. de Gelder, "Facial expressions modulate the time course of long latency auditory brain potentials, " Cognitive Brain Res., vol. 14, no. 1, pp. 99-105, 2002.
    • (2002) Cognitive Brain Res. , vol.14 , Issue.1 , pp. 99-105
    • Pourtois, G.1    Debatisse, D.2    Despland, P.3    De Gelder, B.4
  • 39
    • 33745200056
    • Synthesising hyperarticulation in unit selection TTS
    • M. Aylett, "Synthesising hyperarticulation in unit selection TTS, " Proc. Interspeech, pp. 2521-2524, 2005.
    • (2005) Proc. Interspeech , pp. 2521-2524
    • Aylett, M.1
  • 40
    • 84947606374
    • Diphone based unit selection for Catalan text-to-speech synthesis
    • Brno, Czech Republic
    • R. Guaus and I. Iriondo, "Diphone based unit selection for Catalan text-to-speech synthesis, " in Workshop Text, Speech, Dialogue, Brno, Czech Republic, 2000, pp. 277-282.
    • (2000) Workshop Text, Speech, Dialogue , pp. 277-282
    • Guaus, R.1    Iriondo, I.2
  • 42
    • 0014366349
    • Confusions among visually perceived consonants
    • C. Fisher, "Confusions among visually perceived consonants, " J. Speech Hearing Res., vol. 11, pp. 796-804, 1968.
    • (1968) J. Speech Hearing Res. , vol.11 , pp. 796-804
    • Fisher, C.1
  • 43
    • 0003616059
    • Hillsdale, NJ: Lawrence Erlbaum Associates, ch. Some preliminaries to a comprehensive account of audio-visual speech perception
    • A. Summerfield, Hearing by Eye: The Psychology of Lip-Reading. Hillsdale, NJ: Lawrence Erlbaum Associates, 1987, ch. Some preliminaries to a comprehensive account of audio-visual speech perception, pp. 3-51.
    • (1987) Hearing by Eye: The Psychology of Lip-Reading , pp. 3-51
    • Summerfield, A.1
  • 45
    • 0032318785
    • Mixtures of eigenfeatures for real-time structure from texture
    • Bombay, India
    • T. Jebara, K. Russell, and A. Pentland, "Mixtures of eigenfeatures for real-time structure from texture, " in Proc. Int. Conf. Computer Vision, Bombay, India, 1998, pp. 128-138.
    • (1998) Proc. Int. Conf. Computer Vision , pp. 128-138
    • Jebara, T.1    Russell, K.2    Pentland, A.3
  • 46
    • 0004236492
    • Baltimore, MD: The Johns Hopkins Univ. Press
    • G. Golub and C. Van Loan, Matrix Computations. Baltimore, MD: The Johns Hopkins Univ. Press, 1996.
    • (1996) Matrix Computations
    • Golub, G.1    Van Loan, C.2
  • 47
    • 34147120474
    • A note on two problems in connexion with graphs
    • E. Dijkstra, "A note on two problems in connexion with graphs, " Numerische Mathematik, vol. 1, pp. 269-271, 1959.
    • (1959) Numerische Mathematik , vol.1 , pp. 269-271
    • Dijkstra, E.1
  • 48
    • 0030151578
    • Blind image deconvolution
    • May
    • D. Kundur and D. Hatzinakos, "Blind image deconvolution, " IEEE Signal Process. Mag., vol. 13, no. 3, pp. 43-64, May 1996.
    • (1996) IEEE Signal Process. Mag. , vol.13 , Issue.3 , pp. 43-64
    • Kundur, D.1    Hatzinakos, D.2
  • 49
    • 84935113569
    • Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
    • Apr
    • A. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, " IEEE Trans. Inf. Theory, vol. IT-13, pp. 260-269, Apr. 1967.
    • (1967) IEEE Trans. Inf. Theory , vol.IT-13 , pp. 260-269
    • Viterbi, A.1
  • 50
    • 0019647180
    • An iterative image registration technique with an application to stereo vision
    • B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision, " in Proc. Int. Joint Conf. Artif. Intell., 1981, pp. 674-679.
    • (1981) Proc. Int. Joint Conf. Artif. Intell. , pp. 674-679
    • Lucas, B.1    Kanade, T.2
  • 51
    • 0031651559
    • Eigentracking: Robust matching and tracking of articulated objects using a view-based representation
    • M. Black and A. Jepson, "Eigentracking: Robust matching and tracking of articulated objects using a view-based representation, " Int. J. Comput. Vis., vol. 26, no. 1, pp. 63-84, 1998.
    • (1998) Int. J. Comput. Vis. , vol.26 , Issue.1 , pp. 63-84
    • Black, M.1    Jepson, A.2
  • 52
    • 35048898108
    • On-the-fly training
    • Palma de Mallorca, Spain
    • J. Melenchón, L. Meler, and I. Iriondo, "On-the-fly training, " in Proc. AMDO, Palma de Mallorca, Spain, 2004, pp. 146-153.
    • (2004) Proc. AMDO , pp. 146-153
    • Melenchón, J.1    Meler, L.2    Iriondo, I.3
  • 55
    • 0012838619
    • Estudio estadístico de la ortografía castellana: La frecuencia silábica
    • M. de Vega, C. Álvarez, and M. Carreiras, "Estudio estadístico de la ortografía castellana: La frecuencia silábica, " Cognitiva, vol. 4, no. 1, pp. 75-114, 1992.
    • (1992) Cognitiva , vol.4 , Issue.1 , pp. 75-114
    • De Vega, M.1    Álvarez, C.2    Carreiras, M.3
  • 57
    • 1542333732
    • Image coding quality assessment using fuzzy integrals with a three-component image model
    • Feb
    • J. Li, G. Chen, Z. Chi, and C. Lu, "Image coding quality assessment using fuzzy integrals with a three-component image model, " IEEE Trans. Fuzzy Syst., vol. 12, no. 1, pp. 99-106, Feb. 2004.
    • (2004) IEEE Trans. Fuzzy Syst. , vol.12 , Issue.1 , pp. 99-106
    • Li, J.1    Chen, G.2    Chi, Z.3    Lu, C.4
  • 58
    • 70350470797
    • Perceptual evaluation of videorealistic speech
    • G. Geiger, T. Ezzat, and T. Poggio, "Perceptual evaluation of videorealistic speech, " MIT AIM, 2003, 2003-003.
    • (2003) MIT AIM , 2003-003
    • Geiger, G.1    Ezzat, T.2    Poggio, T.3
  • 59
    • 33845919877
    • Visual contribution to speech perception: Measuring the intelligibility of animated talking heads
    • Article ID 47891
    • S. Ouni, M. Cohen, H. Ishak, and D. Massaro, "Visual contribution to speech perception: Measuring the intelligibility of animated talking heads, " EURASIP J. Audio, Speech, Music Process., vol. 2007, Article ID 47891.
    • EURASIP J. Audio, Speech, Music Process. , vol.2007
    • Ouni, S.1    Cohen, M.2    Ishak, H.3    Massaro, D.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.