Volume 9, Issue 3, 2007, Pages 500-510

Realistic mouth-synching for speech-driven talking face using articulatory modelling

Author keywords

Articulatory model; Baum Welch DBN inversion (DBNI); Dynamic Bayesian networks (DBNs); Facial animation; Mouth synching; Talking face

Indexed keywords

ARTICULATORY MODELS; BAUM-WELCH DBN INVERSION (DBNI); DYNAMIC BAYESIAN NETWORKS (DBNS); FACIAL ANIMATION; TALKING FACES;

EID: 33947583073     PISSN: 15209210     EISSN: None     Source Type: Journal    
DOI: 10.1109/TMM.2006.888009     Document Type: Article
Times cited: 90

References (34)
  • 1
    • 10044221981
    • Talking faces-technologies and applications
    • Aug
    • J. Ostermann and A. Weissenfeld, "Talking faces-technologies and applications," in Proc. of ICPR'04, Aug. 2004, vol. 3, pp. 826-833.
    • (2004) Proc. of ICPR'04 , vol.3 , pp. 826-833
    • Ostermann, J.1    Weissenfeld, A.2
  • 2
    • 10044281988
    • Lifelike talking faces for interactive services
    • Sep
    • E. Cosatto, J. Ostermann, H. P. Graf, and J. Schroeter, "Lifelike talking faces for interactive services," Proc. IEEE, vol. 91, no. 9, pp. 1406-1428, Sep. 2003.
    • (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1406-1428
    • Cosatto, E.1    Ostermann, J.2    Graf, H.P.3    Schroeter, J.4
  • 4
    • 0036949796
    • Head shop: Generating animated head models with anatomical structure
    • K. Kaehler, J. Haber, H. Yamauchi, and H. P. Seidel, "Head shop: Generating animated head models with anatomical structure," in Proc. ACM SIGGRAPH'02, 2002, pp. 55-63.
    • (2002) Proc. ACM SIGGRAPH'02 , pp. 55-63
    • Kaehler, K.1    Haber, J.2    Yamauchi, H.3    Seidel, H.P.4
  • 6
    • 77953828868
    • Trainable videorealistic speech animation
    • T. Ezzat, G. Geiger, and T. Poggio, "Trainable videorealistic speech animation," in Proc. ACM SIGGRAPH, 2002, pp. 388-397.
    • (2002) Proc. ACM SIGGRAPH , pp. 388-397
    • Ezzat, T.1    Geiger, G.2    Poggio, T.3
  • 7
    • 84872004031
    • Sample-based synthesis of photo-realistic talking heads
    • E. Cosatto and H. Graf, "Sample-based synthesis of photo-realistic talking heads," in Proc. IEEE Computer Animation, 1998, pp. 103-110.
    • (1998) Proc. IEEE Computer Animation , pp. 103-110
    • Cosatto, E.1    Graf, H.2
  • 8
    • 0034271782
    • Photo-realistic talking heads from image samples
    • E. Cosatto and H. Graf, "Photo-realistic talking heads from image samples," IEEE Trans. Multimedia, vol. 2, pp. 152-163, 2000.
    • (2000) IEEE Trans. Multimedia , vol.2 , pp. 152-163
    • Cosatto, E.1    Graf, H.2
  • 10
    • 0001514782
    • Modeling coarticulation in synthetic visual speech
    • M. Magnenat-Thalmann and D. Thalmann, Eds. Tokyo, Japan: Springer-Verlag
    • M. M. Cohen and D. W. Massaro, "Modeling coarticulation in synthetic visual speech," in Models and Techniques in Computer Animation, M. Magnenat-Thalmann and D. Thalmann, Eds. Tokyo, Japan: Springer-Verlag, 1993, pp. 139-156.
    • (1993) Models and Techniques in Computer Animation , pp. 139-156
    • Cohen, M.M.1    Massaro, D.W.2
  • 11
    • 0036650837
    • Real-time speech-driven face animation with expressions using neural networks
    • P. Hong, Z. Wen, and T. S. Huang, "Real-time speech-driven face animation with expressions using neural networks," IEEE Trans. Neural Networks, vol. 13, no. 4, pp. 916-927, 2002.
    • (2002) IEEE Trans. Neural Networks , vol.13 , Issue.4 , pp. 916-927
    • Hong, P.1    Wen, Z.2    Huang, T.S.3
  • 12
    • 85133709259
    • Picture my voice: Audio to visual speech synthesis using artificial neural networks
    • D. W. Massaro, J. Beskow, M. M. Cohen, C. L. Fry, and T. Rodriguez, "Picture my voice: Audio to visual speech synthesis using artificial neural networks," in Proc. AVSP'99, 1999, pp. 133-138.
    • (1999) Proc. AVSP'99 , pp. 133-138
    • Massaro, D.W.1    Beskow, J.2    Cohen, M.M.3    Fry, C.L.4    Rodriguez, T.5
  • 14
    • 0032179320
    • Lip movement synthesis from speech based on hidden Markov models
    • E. Yamamoto, S. Nakamura, and K. Shikano, "Lip movement synthesis from speech based on hidden Markov models," Speech Commun., vol. 26, no. 1-2, pp. 105-115, 1998.
    • (1998) Speech Commun , vol.26 , Issue.1-2 , pp. 105-115
    • Yamamoto, E.1    Nakamura, S.2    Shikano, K.3
  • 16
    • 85032752352
    • Audiovisual speech processing: Lip reading and lip synchronization
    • T. Chen, "Audiovisual speech processing: Lip reading and lip synchronization," IEEE Signal Process. Mag., vol. 18, no. 1, pp. 9-21, 2001.
    • (2001) IEEE Signal Process. Mag , vol.18 , Issue.1 , pp. 9-21
    • Chen, T.1
  • 18
    • 0000497160
    • Baum-Welch hidden Markov model inversion for reliable audio-to-visual conversion
    • K. Choi and J. N. Hwang, "Baum-Welch hidden Markov model inversion for reliable audio-to-visual conversion," in Proc. IEEE 3rd Workshop Multimedia Signal Processing, 1999, pp. 175-180.
    • (1999) Proc. IEEE 3rd Workshop Multimedia Signal Processing , pp. 175-180
    • Choi, K.1    Hwang, J.N.2
  • 19
    • 0028996864
    • Noisy speech recognition using robust inversion of hidden Markov models
    • S. Y. Moon and J. N. Hwang, "Noisy speech recognition using robust inversion of hidden Markov models," in Proc. ICASSP'95, 1995, pp. 145-148.
    • (1995) Proc. ICASSP'95 , pp. 145-148
    • Moon, S.Y.1    Hwang, J.N.2
  • 20
    • 0035426641
    • Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system
    • K. Choi, Y. Luo, and J. Hwang, "Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system," J. VLSI Signal Process., no. 29, pp. 51-61, 2001.
    • (2001) J. VLSI Signal Process , Issue.29 , pp. 51-61
    • Choi, K.1    Luo, Y.2    Hwang, J.3
  • 22
  • 23
    • 28444470028
    • Research on Key Issues of Audio Visual Speech Recognition
    • Ph.D. dissertation, Northwestern Polytechnical Univ, Xian, China
    • L. Xie, "Research on Key Issues of Audio Visual Speech Recognition," Ph.D. dissertation, Northwestern Polytechnical Univ., Xian, China, 2004.
    • (2004)
    • Xie, L.1
  • 24
    • 85128370668
    • Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments
    • K. Kirchhoff, "Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments," in Proc. ICSLP'98, 1998, pp. 891-894.
    • (1998) Proc. ICSLP'98 , pp. 891-894
    • Kirchhoff, K.1
  • 26
    • 0038784279
    • Bayesian network structures and inference techniques for automatic speech recognition
    • G. G. Zweig, "Bayesian network structures and inference techniques for automatic speech recognition," Comput. Speech Lang., vol. 17, pp. 173-193, 2003.
    • (2003) Comput. Speech Lang , vol.17 , pp. 173-193
    • Zweig, G.G.1
  • 27
    • 0037697284
    • Hidden-articulator Markov models for speech recognition
    • M. Richardson, J. Bilmes, and C. Diorio, "Hidden-articulator Markov models for speech recognition," Speech Commun., vol. 41, pp. 511-529, 2003.
    • (2003) Speech Commun , vol.41 , pp. 511-529
    • Richardson, M.1    Bilmes, J.2    Diorio, C.3
  • 29
    • 84972571328
    • Growth functions for transformations on manifolds
    • L. E. Baum and G. R. Sell, "Growth functions for transformations on manifolds," Pacific J. Math., vol. 27, no. 2, pp. 211-227, 1968.
    • (1968) Pacific J. Math , vol.27 , Issue.2 , pp. 211-227
    • Baum, L.E.1    Sell, G.R.2
  • 30
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. Dempster, A. N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc. B, vol. 39, pp. 89-111, 1977.
    • (1977) J. R. Statist. Soc. B , vol.39 , pp. 89-111
    • Dempster, A.1    Laird, A.N.2    Rubin, D.3
  • 31
    • 0013288412
    • Dynamic Bayesian Networks: Representation, Inference and Learning
    • Ph.D. dissertation, Univ. California, Berkeley
    • K. Murphy, "Dynamic Bayesian Networks: Representation, Inference and Learning," Ph.D. dissertation, Univ. California, Berkeley, 2002.
    • (2002)
    • Murphy, K.1
  • 32
    • 33947603303
    • L. Xie and Z. Ye, The JEWEL Audio-Visual Dataset for Facial Animation 2005, Tech. Rep. RCMT 05-11.
  • 33
    • 33947610805
    • S. Young, G. Evermann, D. Kershaw, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book Eng. Dept., Cambridge Univ., Cambridge, U.K., 2002 [Online]. Available: http://htk.eng.cam.ac.uk/, 3.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.