Volume 17, Issue 3, 2009, Pages 469-477

Realistic visual speech synthesis based on hybrid concatenation method

Author keywords

Fused hidden Markov model (HMM); Inversion; Speech driven facial animation; Unit concatenation; Visual speech synthesis

Indexed keywords

A-FRAMES; COMPUTING EFFICIENCY; FACIAL ANIMATION; FACIAL EXPRESSIONS; FUSED HIDDEN MARKOV MODEL (HMM); GAUSSIAN MIXTURE MODELS; HIGH QUALITY; HYBRID CONCATENATION; INVERSION; LOOSE SYNCHRONIZATIONS; MAPPING METHOD; REAL-TIME APPLICATION; RUNNING SPEED; SECOND LAYER; TIGHTLY-COUPLED; TWO LAYERS; UNIT CONCATENATION; UNIT SELECTION; VISUAL SPEECH SYNTHESIS; VITERBI SEARCH;

EID: 70350437421     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2008.2011538     Document Type: Article
Times cited : (17)

References (35)
  • 1
    • 48149105519
    • Dynamic audio-visual mapping using fused hidden Markov model inversion method
    • San Antonio, TX, 2007
    • L. Xin, J. H. Tao, and T. N. Tan, "Dynamic audio-visual mapping using fused hidden Markov model inversion method," in Proc. ICIP, San Antonio, TX, 2007, pp. 293-296.
    • Proc. ICIP , pp. 293-296
    • Xin, L.1    Tao, J.H.2    Tan, T.N.3
  • 2
    • 34547732498
    • Speech driven face animation based on dynamic concatenation model
    • J. H. Tao and P. R. Yin, "Speech driven face animation based on dynamic concatenation model," J. Inf. Computat. Sci., vol. 4, no. 1, pp. 271-280, 2007.
    • (2007) J. Inf. Computat. Sci. , vol.4 , Issue.1 , pp. 271-280
    • Tao, J.H.1    Yin, P.R.2
  • 3
    • 38049013313
    • Expressive face animation synthesis based on dynamic mapping method
    • ser. Lecture Notes in Computer Science, A. Paiva, R. Prada, and R. W. Picard, Eds. New York: Springer-Verlag
    • P. R. Yin, L. Y. Zhao, L. X. Huang, and J. H. Tao, "Expressive face animation synthesis based on dynamic mapping method," in Affective Computing and Intelligent Interaction, ser. Lecture Notes in Computer Science, A. Paiva, R. Prada, and R. W. Picard, Eds. New York: Springer-Verlag, 2007, pp. 1-11.
    • (2007) Affective Computing and Intelligent Interaction , pp. 1-11
    • Yin, P.R.1    Zhao, L.Y.2    Huang, L.X.3    Tao, J.H.4
  • 4
    • 1542303714
    • A fused hidden Markov model with application to bimodal speech processing
    • Mar
    • H. Pan, S. Levinson, T. S. Huang, and Z. P. Liang, "A fused hidden Markov model with application to bimodal speech processing," IEEE Trans. Signal Process., vol. 52, no. 3, pp. 573-581, Mar. 2004.
    • (2004) IEEE Trans. Signal Process. , vol.52 , Issue.3 , pp. 573-581
    • Pan, H.1    Levinson, S.2    Huang, T.S.3    Liang, Z.P.4
  • 7
    • 0030677313
    • Video rewrite: Driving visual speech with audio
    • C. Bregler, M. Covell, and M. Slaney, "Video rewrite: Driving visual speech with audio," in Proc. ACM SIGGRAPH, 1997, pp. 353-360.
    • (1997) Proc. ACM SIGGRAPH , pp. 353-360
    • Bregler, C.1    Covell, M.2    Slaney, M.3
  • 8
    • 34047240820
    • Speech animation using coupled hidden Markov models
    • Hong Kong, China
    • L. Xie and Z. Liu, "Speech animation using coupled hidden Markov models," in Proc. 18th Int. Conf. Pattern Recognition (ICPR), Hong Kong, China, 2006, pp. 1128-1131.
    • (2006) Proc. 18th Int. Conf. Pattern Recognition (ICPR) , pp. 1128-1131
    • Xie, L.1    Liu, Z.2
  • 9
    • 85133709259
    • Picture my voice: Audio to visual speech synthesis using artificial neural networks
    • Santa Cruz, CA
    • D. W. Massaro, J. Beskow, M. M. Cohen, C. L. Fry, and T. Rodriguez, "Picture my voice: Audio to visual speech synthesis using artificial neural networks," in Proc. AVSP, Santa Cruz, CA, 1999, pp. 133-138.
    • (1999) Proc. AVSP , pp. 133-138
    • Massaro, D.W.1    Beskow, J.2    Cohen, M.M.3    Fry, C.L.4    Rodriguez, T.5
  • 10
    • 85009254391
    • MikeTalk: A talking facial display based on morphing visemes
    • Philadelphia, PA
    • T. Ezzat and T. Poggio, "MikeTalk: A talking facial display based on morphing visemes," in Proc. Comput. Animation Conf., Philadelphia, PA, 1998, pp. 96-102.
    • (1998) Proc. Comput. Animation Conf. , pp. 96-102
    • Ezzat, T.1    Poggio, T.2
  • 11
    • 0032179320
    • Lip movement synthesis from speech based on hidden Markov models
    • E. Yamamoto, S. Nakamura, and K. Shikano, "Lip movement synthesis from speech based on hidden Markov models," Speech Commun., vol. 26, pp. 105-115, 1998.
    • (1998) Speech Commun. , vol.26 , pp. 105-115
    • Yamamoto, E.1    Nakamura, S.2    Shikano, K.3
  • 12
    • 0036650837
    • Real-time speech-driven face animation with expressions using neural networks
    • P. Y. Hong, Z. Wen, and T. S. Huang, "Real-time speech-driven face animation with expressions using neural networks," IEEE Trans. Neural Netw., vol. 13, no. 4, pp. 916-927, 2002.
    • (2002) IEEE Trans. Neural Netw. , vol.13 , Issue.4 , pp. 916-927
    • Hong, P.Y.1    Wen, Z.2    Huang, T.S.3
  • 13
    • 84937437186
    • Voice puppetry
    • M. Brand, "Voice puppetry," in Proc. SIGGRAPH, 1999, pp. 21-28.
    • (1999) Proc. SIGGRAPH , pp. 21-28
    • Brand, M.1
  • 14
    • 33646785065
    • Dynamic mapping method based speech driven face animation system
    • ser. Lecture Notes in Computer Science, J. Tao, T. Tie, and R. W. Picard, Eds. New York: Springer-Verlag
    • P. R. Yin and J. H. Tao, "Dynamic mapping method based speech driven face animation system," in Affective Computing and Intelligent Interaction, ser. Lecture Notes in Computer Science, J. Tao, T. Tie, and R. W. Picard, Eds. New York: Springer-Verlag, 2005.
    • (2005) Affective Computing and Intelligent Interaction
    • Yin, P.R.1    Tao, J.H.2
  • 20
    • 70350498363
    • Real-time lip synchronization based on hidden Markov models
    • Y. Huang, S. Lin, X. Ding, B. Guo, and H. Shum, "Real-time lip synchronization based on hidden Markov models," in Proc. ACCV, 2002, pp. 176-181.
    • (2002) Proc. ACCV , pp. 176-181
    • Huang, Y.1    Lin, S.2    Ding, X.3    Guo, B.4    Shum, H.5
  • 21
    • 0031997085
    • Audio-to-visual conversion for multimedia communication
    • Feb
    • R. Rao, T. Chen, and R. M. Mersereau, "Audio-to-visual conversion for multimedia communication," IEEE Trans. Ind. Electron., vol. 45, no. 1, pp. 15-22, Feb. 1998.
    • (1998) IEEE Trans. Ind. Electron. , vol.45 , Issue.1 , pp. 15-22
    • Rao, R.1    Chen, T.2    Mersereau, R.M.3
  • 23
    • 33646752807
    • Learning dynamic audio-visual mapping with input-output hidden Markov models
    • Apr
    • Y. Li and H. Y. Shum, "Learning dynamic audio-visual mapping with input-output hidden Markov models," IEEE Trans. Multimedia, vol. 8, no. 3, pp. 542-549, Apr. 2006.
    • (2006) IEEE Trans. Multimedia , vol.8 , Issue.3 , pp. 542-549
    • Li, Y.1    Shum, H.Y.2
  • 24
    • 0030685285
    • Coupled hidden Markov models for complex action recognition
    • M. Brand and N. Oliver, "Coupled hidden Markov models for complex action recognition," in Proc. Comput. Vis. Pattern Recognition, 1997, pp. 201-206.
    • (1997) Proc. Comput. Vis. Pattern Recognition , pp. 201-206
    • Brand, M.1    Oliver, N.2
  • 27
    • 41549121431
    • Exploiting audio-visual correlation in coding of talking head sequences
    • Melbourne, Australia, Mar
    • R. Rao and T. Chen, "Exploiting audio-visual correlation in coding of talking head sequences," in Proc. Picture Coding Symp., Melbourne, Australia, Mar. 1996, pp. 653-658.
    • (1996) Proc. Picture Coding Symp. , pp. 653-658
    • Rao, R.1    Chen, T.2
  • 28
    • 0000286376
    • Using dynamic time warping to find patterns in time series
    • D. Berndt and J. Clifford, "Using dynamic time warping to find patterns in time series," in Proc. KDD Workshop, 1994, pp. 359-370.
    • (1994) Proc. KDD Workshop , pp. 359-370
    • Berndt, D.1    Clifford, J.2
  • 30
    • 84905560807
    • Voice conversion with smoothed GMM and MAP adaptation
    • Y. Chen, M. Chu, E. Chang, J. Liu, and R. Liu, "Voice conversion with smoothed GMM and MAP adaptation," in Proc. Eurospeech, 2003, pp. 2413-2416.
    • (2003) Proc. Eurospeech , pp. 2413-2416
    • Chen, Y.1    Chu, M.2    Chang, E.3    Liu, J.4    Liu, R.5
  • 31
    • 34047263010
    • Prosody conversion from neutral speech to emotional speech
    • Jul
    • J. H. Tao, Y. G. Kang, and A. J. Li, "Prosody conversion from neutral speech to emotional speech," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 4, pp. 1145-1154, Jul. 2006.
    • (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.4 , pp. 1145-1154
    • Tao, J.H.1    Kang, Y.G.2    Li, A.J.3
  • 32
    • 85009159448
    • Emotional space improves emotion recognition
    • Denver, CO
    • R. Tato, R. Santos, R. Kompe, and J. M. Pardo, "Emotional space improves emotion recognition," in Proc. ICSLP, Denver, CO, 2002, pp. 2029-2032.
    • (2002) Proc. ICSLP , pp. 2029-2032
    • Tato, R.1    Santos, R.2    Kompe, R.3    Pardo, J.M.4
  • 33
    • 84983154011
    • Perception of affect in speech: towards an automatic processing of paralinguistic information in spoken conversation
    • Jeju, Korea
    • N. Campbell, "Perception of affect in speech: Towards an automatic processing of paralinguistic information in spoken conversation," in Proc. ICSLP, Jeju, Korea, 2004, pp. 881-884.
    • (2004) Proc. ICSLP , pp. 881-884
    • Campbell, N.1
  • 35
    • 33745906824
    • Automatic 3D face modeling from video
    • L. Xin, Q. Wang, J. H. Tao, X. Tang, T. Tan, and H. Shum, "Automatic 3D face modeling from video," in Proc. ICCV, 2005, vol. 2, pp. 1193-1199.
    • (2005) Proc. ICCV , vol.2 , pp. 1193-1199
    • Xin, L.1    Wang, Q.2    Tao, J.H.3    Tang, X.4    Tan, T.5    Shum, H.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.