



Volume 86, Issue 12, 2006, Pages 3657-3673

Design, implementation and evaluation of the Czech realistic audio-visual speech synthesis

Author keywords

Audio visual speech processing; Facial animation; Talking head

Indexed keywords

ANIMATION CONTROL; AUDIO VISUAL SPEECH PROCESSING; FACIAL ANIMATION; TALKING HEAD;

EID: 33749437734     PISSN: 01651684     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.sigpro.2006.02.039     Document Type: Article
Times cited : (26)

References (42)
  • 1
    • 85009291900
    • Design of an audio-visual speech corpus for the Czech audio-visual speech synthesis
    • Denver, USA
    • Železný M., Císař P., Krňoul Z., and Novák J. Design of an audio-visual speech corpus for the Czech audio-visual speech synthesis. Proceedings of ICSLP 2002. Denver, USA (2002)
    • (2002) Proceedings of ICSLP 2002
    • Železný, M.1    Císař, P.2    Krňoul, Z.3    Novák, J.4
  • 2
    • 0020202671
    • F.I. Parke, Parameterized models for facial animation, IEEE Comput. Graph. Appl. (November 1982) 61-68.
  • 3
    • 33749431482
    • S. Basu, A. Pentland, A three-dimensional model of human lip motions trained from video, M.I.T. Media Laboratory Perceptual Computing Section, Technical Report No. 441, MIT Media Laboratory, Cambridge, USA, 1997.
  • 4
    • 33749440372
    • V. Strnadová, Hádej, co říkám aneb Odezírání je nejisté umění. - Guess What I Am Talking or Lip-Reading is Uncertain Art, Ministerstvo zdravotnictví České republiky, Prague, Czech Republic, 1998.
  • 6
    • 85009071398
    • J. Matoušek, J. Romportl, D. Tihelka, Z. Tychtl, Recent improvements on ARTIC: Czech text-to-speech system, in: Proceedings of ICSLP 2004, vol. 3, Jeju, Korea, 2004, pp. 1933-1936.
  • 7
    • 0032651722
    • A hidden Markov-model-based trainable speech synthesizer
    • Donovan R.E., and Woodland P.C. A hidden Markov-model-based trainable speech synthesizer. Comput. Speech Language 13 (1999) 223-241
    • (1999) Comput. Speech Language , vol.13 , pp. 223-241
    • Donovan, R.E.1    Woodland, P.C.2
  • 8
    • 85009132058
    • Design of speech corpus for text-to-speech synthesis
    • Ålborg, Denmark
    • Matoušek J., Psutka J., and Krůta J. Design of speech corpus for text-to-speech synthesis. Proceedings of EUROSPEECH 2001 vol. 3 (2001), Ålborg, Denmark 2047-2050
    • (2001) Proceedings of EUROSPEECH 2001 , vol.3 , pp. 2047-2050
    • Matoušek, J.1    Psutka, J.2    Krůta, J.3
  • 9
    • 85009152114
    • Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction
    • Geneva, Switzerland
    • Matoušek J., Tihelka D., and Psutka J. Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction. Proceedings of EUROSPEECH 2003. Geneva, Switzerland (2003) 301-304
    • (2003) Proceedings of EUROSPEECH 2003 , pp. 301-304
    • Matoušek, J.1    Tihelka, D.2    Psutka, J.3
  • 10
  • 11
    • 22944437142
    • Advanced prosody modelling
    • Bonn, Heidelberg, Springer, Berlin
    • Romportl J., and Matoušek J. Advanced prosody modelling. Proceedings of TSD 2004. Bonn, Heidelberg (2004), Springer, Berlin 441-447
    • (2004) Proceedings of TSD 2004 , pp. 441-447
    • Romportl, J.1    Matoušek, J.2
  • 12
    • 84936862571
    • The design of Czech language formal listening tests for the evaluation of TTS systems
    • Lisbon, Portugal
    • Tihelka D., and Matoušek J. The design of Czech language formal listening tests for the evaluation of TTS systems. Proceedings of LREC 2004. Lisbon, Portugal (2004) 2099-2102
    • (2004) Proceedings of LREC 2004 , pp. 2099-2102
    • Tihelka, D.1    Matoušek, J.2
  • 13
    • 33745213550
    • Symbolic prosody driven unit selection for highly natural synthetic speech
    • Lisbon, Portugal
    • Tihelka D. Symbolic prosody driven unit selection for highly natural synthetic speech. Proceedings of EUROSPEECH 2005. Lisbon, Portugal (2005) 2525-2528
    • (2005) Proceedings of EUROSPEECH 2005 , pp. 2525-2528
    • Tihelka, D.1
  • 15
    • 85133460248
    • Visual speech synthesis based on parameter generation from HMM: speech-driven and text-and-speech-driven approaches
    • Sydney, Australia
    • Tamura M., Masuko T., Kobayashi T., and Tokuda K. Visual speech synthesis based on parameter generation from HMM: speech-driven and text-and-speech-driven approaches. Proceedings of AVSP 1998. Sydney, Australia (1998)
    • (1998) Proceedings of AVSP 1998
    • Tamura, M.1    Masuko, T.2    Kobayashi, T.3    Tokuda, K.4
  • 16
    • 0006455820
    • Generation of lip-synched synthetic faces from phonetically clustered face movement data
    • Sydney, Australia
    • Galanes F.M., Unverferth J., Arslan L., and Talkin D. Generation of lip-synched synthetic faces from phonetically clustered face movement data. Proceedings of AVSP 1998. Sydney, Australia (1998)
    • (1998) Proceedings of AVSP 1998
    • Galanes, F.M.1    Unverferth, J.2    Arslan, L.3    Talkin, D.4
  • 19
    • 0020068630
    • Anticipatory labial coarticulation: experimental, biological, and linguistic variables
    • Lubker J., and Gay T. Anticipatory labial coarticulation: experimental, biological, and linguistic variables. J. Acoust. Soc. Amer. 71 (1982) 437-448
    • (1982) J. Acoust. Soc. Amer. , vol.71 , pp. 437-448
    • Lubker, J.1    Gay, T.2
  • 20
    • 0025687878
    • Coarticulatory organization for lip rounding in Turkish and English
    • Boyce S.E. Coarticulatory organization for lip rounding in Turkish and English. J. Acoust. Soc. Amer. 88 6 (1990) 2584-2595
    • (1990) J. Acoust. Soc. Amer. , vol.88 , Issue.6 , pp. 2584-2595
    • Boyce, S.E.1
  • 22
    • 0003116759
    • Speech as audible gestures
    • Kluwer Academic Press, Dordrecht, Netherlands
    • Löfqvist A. Speech as audible gestures. Speech Production and Speech Modelling (1990), Kluwer Academic Press, Dordrecht, Netherlands 289-322
    • (1990) Speech Production and Speech Modelling , pp. 289-322
    • Löfqvist, A.1
  • 23
    • 0001514782
    • Modeling coarticulation in synthetic visual speech
    • Springer, Tokyo, Japan
    • Cohen M.M., and Massaro D.W. Modeling coarticulation in synthetic visual speech. Models and Techniques in Computer Animation (1993), Springer, Tokyo, Japan 139-156
    • (1993) Models and Techniques in Computer Animation , pp. 139-156
    • Cohen, M.M.1    Massaro, D.W.2
  • 24
    • 33749452937
    • Neural Network Simulator 1.1, University of Tübingen 〈http://www-ra.informatik.uni-tuebingen.de/SNNS/〉.
  • 26
    • 0000892665
    • Abstract muscle action procedures for human face animation
    • Magnenat-Thalmann N., Primeau E., and Thalmann D. Abstract muscle action procedures for human face animation. Visual Comput. 3 5 (1988) 290-297
    • (1988) Visual Comput. , vol.3 , Issue.5 , pp. 290-297
    • Magnenat-Thalmann, N.1    Primeau, E.2    Thalmann, D.3
  • 28
  • 29
    • 0030702311
    • M. Escher, N. Magnenat Thalmann, Automatic 3D cloning and real-time animation of a human face, Comput. Animation (1997) 58.
  • 30
    • 33749435880
    • Cyberware Scanning Products 〈http://www.cyberware.com/products/index.html〉.
  • 32
    • 0010946384
    • T. Miyasaka, K. Kuroda, M. Hirose, K. Araki, Reconstruction of realistic 3D surface model and 3D animation from range images obtained by real time 3D measurement system, in: IEEE International Conference on Pattern Recognition (ICPR'00), vol. 4, Barcelona, Spain, September 2000, p. 4594.
  • 34
    • 0345180807
    • Automated modelling of real human faces for 3D animation
    • Nagel B., Wingbermuhle J., Weik S., and Liedtke C.E. Automated modelling of real human faces for 3D animation. ICPR 98 (1998) 693-696
    • (1998) ICPR 98 , pp. 693-696
    • Nagel, B.1    Wingbermuhle, J.2    Weik, S.3    Liedtke, C.E.4
  • 36
    • 0001027507
    • Model based face reconstruction for animation
    • World Scientific Press, Singapore
    • Lee W., Kalra P., and Magnenat-Thalmann N. Model based face reconstruction for animation. Proceedings of the MMM'97 (1997), World Scientific Press, Singapore 323-338
    • (1997) Proceedings of the MMM'97 , pp. 323-338
    • Lee, W.1    Kalra, P.2    Magnenat-Thalmann, N.3
  • 37
    • 0030644092
    • L. Moccozet, N. Magnenat Thalmann, Dirichlet free-form deformations and their application to hand simulation, in: Computer Animation '97, Geneva, Switzerland, June 1997.
  • 38
    • 33749453587
    • Using Dirichlet free form deformation to fit deformable models to noisy 3-D data
    • Springer, Berlin
    • Ilic S., and Fua P. Using Dirichlet free form deformation to fit deformable models to noisy 3-D data. ECCV vol. 2351 (2002), Springer, Berlin 704-717
    • (2002) ECCV , vol.2351 , pp. 704-717
    • Ilic, S.1    Fua, P.2
  • 39
    • 33749431045
    • Z. Krňoul, M. Železný, P. Císař, Face model reconstruction for Czech audio-visual speech synthesis, in: Proceedings of SPECOM 2004, Saint Petersburg, Russian Federation, 2004, pp. 47-51, SPIIRAS.
  • 40
    • 22944440070
    • Z. Krňoul, M. Železný, Realistic face animation for a Czech Talking Head, in: Conference on TEXT, SPEECH and DIALOGUE, TSD 2004, Springer, Berlin, 2004, pp. 603-610.
  • 41
    • 33749429528
    • B. Le Goff, T. Guiard-Marigny, M. Cohen, C. Benoit, Real-time analysis-synthesis and intelligibility of talking faces, in: Second International Conference on Speech Synthesis, Newark (NY), September 1994.
  • 42
    • 33749447929
    • D.W. Massaro, J. Beskow, M.M. Cohen, C.L. Fry, T. Rodriguez, Picture my voice: audio to visual speech synthesis using artificial neural networks, in: AVSP'99, Santa Cruz, CA, USA, 1999.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.