Volume , Issue , 2010, Pages 217-222

Photo-Real Lips Synthesis with Trajectory-Guided Sample Selection

Author keywords

photo real; talking head; trajectory guided; visual speech synthesis

Indexed keywords

IMAGE PROCESSING; MAXIMUM LIKELIHOOD; SPEECH COMMUNICATION; SPEECH SYNTHESIS; TRAJECTORIES;

EID: 84996687897     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (8)

References (31)
  • 1
    • 0034271782
    • Photo-realistic talking heads from image samples
    • E. Cosatto and H.P. Graf, “Photo-realistic talking heads from image samples”, IEEE Trans. Multimedia, 2000, vol. 2, no. 3, pp. 152-163.
    • (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 152-163
    • Cosatto, E.1    Graf, H.P.2
  • 2
    • 0030677313
    • Video Rewrite: Driving Visual Speech with Audio
    • Los Angeles, CA
    • C. Bregler, M. Covell, M. Slaney, “Video Rewrite: Driving Visual Speech with Audio,” In Proc. ACM SIGGRAPH 97, Los Angeles, CA, 1997, pp. 353-360.
    • (1997) Proc. ACM SIGGRAPH 97 , pp. 353-360
    • Bregler, C.1    Covell, M.2    Slaney, M.3
  • 3
    • 0036289950
    • Triphone based unit selection for concatenative visual speech synthesis
    • F. Huang, E. Cosatto, H.P. Graf, “Triphone based unit selection for concatenative visual speech synthesis,” Proc. ICASSP 2002. Vol. 2, 2002 pp.2037-2040.
    • (2002) Proc. ICASSP 2002 , vol.2 , pp. 2037-2040
    • Huang, F.1    Cosatto, E.2    Graf, H.P.3
  • 4
    • 0036989560
    • Trainable video realistic speech animation
    • San Antonio, Texas
    • T. Ezzat, G. Geiger, and T. Poggio, “Trainable video realistic speech animation,” Proc. ACM SIGGRAPH2002, San Antonio, Texas, 2002, pp. 388-398.
    • (2002) Proc. ACM SIGGRAPH2002 , pp. 388-398
    • Ezzat, T.1    Geiger, G.2    Poggio, T.3
  • 5
    • 57949116211
    • Multimodal Unit Selection for 2D Audiovisual Text-to-Speech Synthesis
    • The Netherlands
    • W. Mattheyses, L. Latacz, W. Verhelst, H. Sahli, “Multimodal Unit Selection for 2D Audiovisual Text-to-Speech Synthesis,” Proc. MLMI 2008, The Netherlands, 2008, pp. 125-136.
    • (2008) Proc. MLMI 2008 , pp. 125-136
    • Mattheyses, W.1    Latacz, L.2    Verhelst, W.3    Sahli, H.4
  • 6
    • 84867227937
    • Realistic Facial Animation System for Interactive Services
    • Brisbane, Australia, Sept
    • K. Liu, J. Ostermann, “Realistic Facial Animation System for Interactive Services,” Proc. Interspeech2008, Brisbane, Australia, Sept. 2008, pp.2330-2333.
    • (2008) Proc. Interspeech2008 , pp. 2330-2333
    • Liu, K.1    Ostermann, J.2
  • 9
    • 34047240820
    • Speech Animation Using Coupled Hidden Markov Models
    • August
    • L. Xie, Z.Q. Liu, “Speech Animation Using Coupled Hidden Markov Models,” Proc. ICPR'06, August 2006, pp. 1128-1131.
    • (2006) Proc. ICPR'06 , pp. 1128-1131
    • Xie, L.1    Liu, Z.Q.2
  • 10
    • 34547503417
    • HMM-based unit selection using frame sized speech segments
    • Sep
    • Z.H. Ling and R.H. Wang, “HMM-based unit selection using frame sized speech segments,” Proc. Interspeech 2006, Sep. 2006, pp. 2034-2037.
    • (2006) Proc. Interspeech 2006 , pp. 2034-2037
    • Ling, Z.H.1    Wang, R.H.2
  • 11
    • 78049399368
    • Rich-Context Unit Selection (RUS) Approach to High Quality TTS
    • March
    • Z.J. Yan, Y. Qian, F. Soong, “Rich-Context Unit Selection (RUS) Approach to High Quality TTS,” Proc. ICASSP 2010, March 2010, pp.4798-4801.
    • (2010) Proc. ICASSP 2010 , pp. 4798-4801
    • Yan, Z.J.1    Qian, Y.2    Soong, F.3
  • 12
    • 84867222285
    • LIPS2008: Visual Speech Synthesis Challenge
    • Brisbane, Australia, Sept
    • B. Theobald, S. Fagel, G. Bailly, and F. Elisei, “LIPS2008: Visual Speech Synthesis Challenge,” Proc. Interspeech2008, Brisbane, Australia, Sept. 2008, pp.2310-2313.
    • (2008) Proc. Interspeech2008 , pp. 2310-2313
    • Theobald, B.1    Fagel, S.2    Bailly, G.3    Elisei, F.4
  • 13
    • 85032752352
    • Audiovisual speech processing
    • Jan
    • T. Chen, “Audiovisual speech processing,” Signal Processing Magazine, IEEE Vol.18, Issue 1, Jan. 2001, pp.9-21.
    • (2001) Signal Processing Magazine, IEEE , vol.18 , Issue.1 , pp. 9-21
    • Chen, T.1
  • 15
    • 84872004031
    • Sample-based synthesis of photo-realistic talking heads
    • E. Cosatto and H.P. Graf, “Sample-based synthesis of photo-realistic talking heads,” Proc. IEEE Computer Animation, pp. 103-110, 1998.
    • (1998) Proc. IEEE Computer Animation , pp. 103-110
    • Cosatto, E.1    Graf, H.P.2
  • 16
    • 85009254391
    • Miketalk: A talking facial display based on morphing visemes
    • June
    • T. Ezzat, T. Poggio, “Miketalk: A talking facial display based on morphing visemes,” Proc. Computer Animation, June 1998, pp. 96-102.
    • (1998) Proc. Computer Animation , pp. 96-102
    • Ezzat, T.1    Poggio, T.2
  • 17
    • 10444256499
    • Near videorealistic synthetic talking faces: implementation and evaluation
    • B.J. Theobald, J.A. Bangham, I.A. Matthews, G.C. Cawley, “Near videorealistic synthetic talking faces: implementation and evaluation,” Speech Communication 2004, Vol. 44, pp.127-140.
    • (2004) Speech Communication , vol.44 , pp. 127-140
    • Theobald, B.J.1    Bangham, J.A.2    Matthews, I.A.3    Cawley, G.C.4
  • 18
    • 33947683441
    • Parameterization of Mouth Images by LLE and PCA for Image-Based Facial Animation
    • May
    • K. Liu, A.Weissenfeld, J. Ostermann, “Parameterization of Mouth Images by LLE and PCA for Image-Based Facial Animation,” Proc. ICASSP 2006, Vol. V, May 2006, pp.461-464.
    • (2006) Proc. ICASSP 2006 , vol.V , pp. 461-464
    • Liu, K.1    Weissenfeld, A.2    Ostermann, J.3
  • 20
    • 0036650148
    • Statistical Multimodal Integration for AudioVisual Speech Processing
    • July
    • S. Nakamura, “Statistical Multimodal Integration for AudioVisual Speech Processing,” IEEE Transactions on Neural Networks, Vol.13, No.4, July 2002, pp.854-866.
    • (2002) IEEE Transactions on Neural Networks , vol.13 , Issue.4 , pp. 854-866
    • Nakamura, S.1
  • 21
    • 20444375102
    • Integration Strategies for Audio-Visual Speech Processing: Applied to Text-Dependent Speaker Recognition
    • June
    • S. Lucey, T. Chen, S. Sridharan, and V. Chandran, “Integration Strategies for Audio-Visual Speech Processing: Applied to Text-Dependent Speaker Recognition,” IEEE Transactions on Multimedia, Vol.7, No.3, June 2005, pp.495-506.
    • (2005) IEEE Transactions on Multimedia , vol.7 , Issue.3 , pp. 495-506
    • Lucey, S.1    Chen, T.2    Sridharan, S.3    Chandran, V.4
  • 22
    • 0029725605
    • Speech synthesis using HMMs with dynamic features
    • K. Tokuda, T. Masuko, T. Kobayashi and S. Imai, “Speech synthesis using HMMs with dynamic features,” Proc. ICASSP 1996, Vol. I, pp. 389-392.
    • (1996) Proc. ICASSP , vol.I , pp. 389-392
    • Tokuda, K.1    Masuko, T.2    Kobayashi, T.3    Imai, S.4
  • 23
    • 0032678076
    • Hidden Markov models based on Multi-space probability distribution for pitch pattern modeling
    • K. Tokuda, T. Masuko, N. Miyazaki, T. Kobayashi, “Hidden Markov models based on Multi-space probability distribution for pitch pattern modeling,” Proc. ICASSP 1999, Vol. I, pp.229-232.
    • (1999) Proc. ICASSP , vol.I , pp. 229-232
    • Tokuda, K.1    Masuko, T.2    Miyazaki, N.3    Kobayashi, T.4
  • 24
    • 33646779506
    • Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter
    • T. Toda, A. Black, K. Tokuda, “Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter,” Proc. ICASSP 2005, Vol. I, pp. 9-12.
    • (2005) Proc. ICASSP , vol.I , pp. 9-12
    • Toda, T.1    Black, A.2    Tokuda, K.3
  • 26
    • 0030683369
    • Recent improvements on Microsoft's trainable text-to-speech system - Whistler
    • X. Huang, A. Acero, H. Hon, Y. Ju, J. Liu, S. Meredith, and M. Plumpe, “Recent improvements on Microsoft's trainable text-to-speech system - Whistler,” Proc. ICASSP 1997, pp. 959-962.
    • (1997) Proc. ICASSP , pp. 959-962
    • Huang, X.1    Acero, A.2    Hon, H.3    Ju, Y.4    Liu, J.5    Meredith, S.6    Plumpe, M.7
  • 27
    • 84944962517
    • The IBM trainable speech synthesis system
    • R.E. Donovan, and E.M. Eide, “The IBM trainable speech synthesis system,” Proc. ICSLP 1998, pp.1703-1706.
    • (1998) Proc. ICSLP , pp. 1703-1706
    • Donovan, R.E.1    Eide, E.M.2
  • 28
    • 85063141494
    • Using 5 ms segments in concatenative speech synthesis
    • Pittsburgh, PA, USA
    • T. Hirai, and S. Tenpaku, “Using 5 ms segments in concatenative speech synthesis,” Proc. of 5th ISCA Speech Synthesis Workshop, Pittsburgh, PA, USA, 2004, pp. 37-42.
    • (2004) Proc. of 5th ISCA Speech Synthesis Workshop , pp. 37-42
    • Hirai, T.1    Tenpaku, S.2
  • 29
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • A. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," Proc. ICASSP 1996, pp. 373-376.
    • (1996) Proc. ICASSP , pp. 373-376
    • Hunt, A.1    Black, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.