Volume 20, Issue 8, 2012, Pages 2378-2387

Relating objective and subjective performance measures for AAM-based visual speech synthesis

Author keywords

Active appearance models (AAMs); canonical correlation analysis; visual speech evaluation; visual speech synthesis

Indexed keywords

ACOUSTIC FEATURES; ACTIVE APPEARANCE MODELS; CANONICAL CORRELATION ANALYSIS; DYNAMIC TIME; OBJECTIVE MEASURE; PHONETIC TRANSCRIPTIONS; SMALL REGION; SUBJECTIVE PERFORMANCE; SUBJECTIVE QUALITY; VISUAL SPEECH; VISUAL SPEECH SYNTHESIS;

EID: 84865358428     PISSN: 1558-7916     EISSN: None     Source Type: Journal
DOI: 10.1109/TASL.2012.2202651     Document Type: Article
Times cited: 17

References (82)
  • 2. J. Beskow, I. Karlsson, J. Kewley, and G. Salvi, "SYNFACE - A talking head telephone for the hearing-impaired," in Computers Helping People with Special Needs: 9th Int. Conf., ICCHP 2004, Paris, France, Jul. 2004, Lecture Notes in Computer Science, no. 3118, pp. 1178-1186.
  • 4. F. Parke, "Parametric models for facial animation," Comput. Graphics Applicat., vol. 2, no. 9, pp. 61-68, 1982.
  • 8. K. Kähler, J. Haber, and H. Seidel, "Geometry-based muscle modelling for facial animation," Graphics Interface, pp. 27-36, 2001.
  • 9. Y. Lee, D. Terzopoulos, and K. Waters, "Realistic modeling for facial animation," in Proc. SIGGRAPH, 1995, pp. 55-62.
  • 10. L. Nedel and D. Thalmann, "Real time muscle deformations using mass-spring systems," Comput. Graphics Int., pp. 156-165, 1998.
  • 11. M. Brand, "Voice puppetry," in Proc. SIGGRAPH, Los Angeles, CA, 1999, pp. 21-28.
  • 12. Y. Du and X. Lin, "Realistic mouth synthesis based on shape appearance dependence mapping," Pattern Recogn. Lett., vol. 23, no. 14, pp. 1875-1885, 2002.
  • 15. G. Feldhoffer, A. Tihanyi, and O. Balázs, "A comparative study of direct and ASR-based modular audio to visual speech systems," The Phonetician, vol. 97/98, pp. 15-24, 2008.
  • 16. P. Hong, Z. Wen, and T. Huang, "Real-time speech-driven expressive synthetic talking faces using neural networks," IEEE Trans. Neural Netw., vol. 13, no. 4, pp. 916-927, Jul. 2002.
  • 17. C. Hsieh and Y. Chen, "Partial linear regression for speech-driven talking head application," Signal Process.: Image Commun., vol. 21, no. 1, pp. 1-12, 2006, doi: 10.1016/j.image.2005.04.002.
  • 20. L. Wang, W. Han, X. Qian, and F. Soong, "Synthesizing photo-real talking head via trajectory-guided sample selection," in Proc. Interspeech, 2010.
  • 21. Z. Wen, P. Hong, and T. Huang, "Real time speech driven facial animation using formant analysis," in Proc. Int. Conf. Multimedia Expo, 2001, pp. 817-820.
  • 22. C. Bregler, M. Covell, and M. Slaney, "Video rewrite: Driving visual speech with audio," in Proc. SIGGRAPH, 1997, pp. 353-360.
  • 23. S. Deena, S. Hou, and A. Galata, "Visual speech synthesis by modelling coarticulation dynamics using a non-parametric switching state-space model," in Proc. Int. Conf. Multimodal Interfaces, 2010, pp. 1-8.
  • 25. T. Ezzat and T. Poggio, "MikeTalk: A talking facial display based on morphing visemes," in Proc. Comput. Animat. Conf., 1998, pp. 96-103.
  • 26. T. Ezzat, G. Geiger, and T. Poggio, "Trainable videorealistic speech animation," in Proc. SIGGRAPH, 2002, pp. 388-398.
  • 27. O. Govokhina, G. Bailly, G. Breton, and P. Bagshaw, "TDA: A new trainable trajectory formation system for facial animation," in Proc. Interspeech, 2006, pp. 2474-2477.
  • 29. W. Mattheyses, L. Latacz, and W. Verhelst, "Active appearance models for photorealistic visual speech synthesis," in Proc. Interspeech, 2010.
  • 30. B. Theobald, J. Bangham, I. Matthews, and G. Cawley, "Near-videorealistic synthetic talking faces: Implementation and evaluation," Speech Commun., vol. 44, pp. 127-140, 2004.
  • 32. E. Cosatto and H. Graf, "Photo-realistic talking-heads from image samples," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 152-163, Jun. 2000.
  • 37. G. Bailly, N. Campbell, and M. Möbius, "ISCA special session: Hot topics in speech synthesis," in Proc. Eurospeech, 2003, pp. 37-40.
  • 40. H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative association of vocal-tract and facial behavior," Speech Commun., vol. 26, no. 1-2, pp. 23-43, 1998.
  • 44. O. Engwall, "Evaluation of a system for concatenative articulatory visual speech synthesis," in Proc. Int. Conf. Spoken Lang. Process., 2002, pp. 665-668.
  • 45. K. Liu and J. Ostermann, "Realistic facial animation system for interactive services," in Proc. Interspeech, 2008, pp. 2330-2333.
  • 46. J. Ma, R. Cole, B. Pellom, W. Ward, and B. Wise, "Accurate visible speech synthesis based on concatenating variable length motion capture data," IEEE Trans. Vis. Comput. Graphics, vol. 12, no. 2, pp. 266-276, Mar.-Apr. 2006.
  • 47. E. Bevacqua and C. Pelachaud, "Expressive audio-visual speech," Comput. Animation and Virtual Worlds, vol. 15, no. 3-4, pp. 297-304, 2004, doi: 10.1002/cav.32.
  • 50. B. Theobald and N. Wilkinson, "A probabilistic trajectory synthesis system for synthesising visual speech," in Proc. Interspeech, 2008, pp. 2310-2313.
  • 51. J. Tao, L. Xin, and Y. Panrong, "Realistic visual speech synthesis based on hybrid concatenation method," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 3, pp. 469-477, 2009.
  • 54. J. Beskow, "Trainable articulatory control models for visual speech synthesis," Int. J. Speech Technol., vol. 7, no. 4, pp. 335-349, 2004.
  • 55. R. Carlson and B. Granström, "Data-driven multimodal synthesis," Speech Commun., vol. 47, no. 1-2, pp. 182-193, 2005, doi: 10.1016/j.specom.2005.02.015.
  • 56. X. Zhuang, L. Wang, F. Soong, and M. Hasegawa-Johnson, "A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion," in Proc. Interspeech, 2010.
  • 57. D. Cosker, D. Marshall, P. Rosin, and Y. Hicks, "Speech driven facial animation using a hidden Markov coarticulation model," in Proc. Int. Conf. Pattern Recogn., 2004, pp. 128-131.
  • 58. L. Arslan and D. Talkin, "3D face point trajectory synthesis using an automatically derived visual phoneme similarity matrix," in Proc. Int. Conf. Auditory-Vis. Speech Process., 1998, pp. 175-180.
  • 59. G. Bailly, G. Gibert, and M. Odisio, "Evaluation of movement generation systems using the point-light technique," in Proc. IEEE Workshop Speech Synth., 2002, pp. 27-30.
  • 60. T. Kuratate, K. Munhall, P. Rubin, E. Vatikiotis-Bateson, and H. Yehia, "Audio-visual synthesis of talking faces from speech production correlates," in Proc. Eurospeech, 1999, vol. 3, pp. 1279-1282.
  • 63. I. Pandzic, J. Ostermann, and D. Millen, "User evaluation: Synthetic talking faces for interactive services," Vis. Comput., vol. 15, no. 7, pp. 330-340, 1999, doi: 10.1007/s003710050182.
  • 65. C. Benoît and B. Le Goff, "Audio-visual speech synthesis from French text: Eight years of models, designs and evaluation at the ICP," Speech Commun., vol. 26, no. 1-2, pp. 117-129, 1998.
  • 66. M. Železný, K. Zdeněk, P. Císař, and M. Jindřich, "Design, implementation and evaluation of the Czech realistic audio-visual speech synthesis," Signal Process., vol. 86, pp. 3657-3673, 2006.
  • 67. C. Benoît, M. Grice, and V. Hazan, "The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences," Speech Commun., vol. 18, no. 4, pp. 381-392, 1996, doi: 10.1016/0167-6393(96)00026-X.
  • 68. H. McGurk and J. MacDonald, "Hearing lips and seeing voices," Nature, vol. 264, pp. 746-748, 1976.
  • 69. D. Cosker, D. Marshall, P. Rosin, S. Paddock, and S. Rushton, "Towards perceptually realistic talking heads: Models, metrics and McGurk," ACM Trans. Appl. Percept., vol. 2, no. 3, pp. 270-285, 2005.
  • 71. S. Fagel and C. Clemens, "An articulation model for audiovisual speech synthesis - Determination, adjustment, evaluation," Speech Commun., vol. 44, pp. 141-154, 2004.
  • 72. S. Fagel, "MASSY speaks English: Adaptation and evaluation of a talking head," in Proc. Interspeech, 2008.
  • 75. K. Choi and J. Hwang, "Automatic creation of a talking head from a video sequence," IEEE Trans. Multimedia, vol. 7, no. 4, pp. 628-637, Aug. 2005, doi: 10.1109/TMM.2005.850964.
  • 76. W. Mattheyses, L. Latacz, and W. Verhelst, "On the importance of audiovisual coherence for the perceived quality of synthesized visual speech," EURASIP J. Audio, Speech, Music Process., vol. 2009, pp. 1-12, 2009.
  • 77. G. Takacs, "Direct, modular and hybrid audio to visual speech conversion methods - A comparative study," in Proc. Interspeech, 2009.
  • 79. H. Bredin and G. Chollet, "Audio-visual speech synchrony measure for talking-face identity verification," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Apr. 2007, vol. 2, pp. 233-236.
  • 80. C. de Boor, "Calculation of the smoothing spline with weighted roughness measure," Math. Models Methods Appl. Sci., vol. 11, no. 1, pp. 33-41, 2001, doi: 10.1142/S0218202501000726.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.