Volume 47, Issue 1-2, 2005, Pages 182-193

Data-driven multimodal synthesis

Author keywords

Data driven synthesis; Multimodal synthesis; Speech synthesis

Indexed keywords

DATA ACQUISITION; DATA REDUCTION; KNOWLEDGE BASED SYSTEMS; MATHEMATICAL MODELS;

EID: 24144469759     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2005.02.015     Document Type: Article
Times cited : (5)

References (57)
  • 1
    • 85135264071
    • Formant analysis and synthesis using hidden Markov models
    • Acero, A., 1999. Formant analysis and synthesis using hidden Markov models. In: Proc. Eurospeech'99, pp. 1047-1050.
    • (1999) Proc. Eurospeech'99 , pp. 1047-1050
    • Acero, A.1
  • 3
    • 85009255585
    • Seeing tongue movements from outside
    • Bailly, G., Badin, P., 2002. Seeing tongue movements from outside. In: Proc. ICSLP2002, pp. 1913-1916.
    • (2002) Proc. ICSLP2002 , pp. 1913-1916
    • Bailly, G.1    Badin, P.2
  • 7
    • 4143072802
    • Trainable articulatory control models for visual speech synthesis
    • Beskow, J., 2004. Trainable articulatory control models for visual speech synthesis. J. Speech Technol. 7 (4), 335-349.
    • (2004) J. Speech Technol. , vol.7 , Issue.4 , pp. 335-349
    • Beskow, J.1
  • 8
    • 21844452845
    • Resynthesis of facial and intraoral articulation from simultaneous measurements
    • Barcelona, Spain
    • Beskow, J., Engwall, O., Granström, B., 2003. Resynthesis of facial and intraoral articulation from simultaneous measurements. In: Proc. ICPhS 2003, Barcelona, Spain.
    • (2003) Proc. ICPhS 2003
    • Beskow, J.1    Engwall, O.2    Granström, B.3
  • 9
    • 35048862963
    • SYNFACE-a talking head telephone for the hearing-impaired
    • Miesenberger, K., Klaus, J., Zagler, W., Burger, D., (Eds.)
    • Beskow, J., Karlsson, I., Kewley, J., Salvi, G., 2004. SYNFACE-a talking head telephone for the hearing-impaired. In: Miesenberger, K., Klaus, J., Zagler, W., Burger, D., (Eds.), Computers Helping People with Special Needs, pp. 1178-1186.
    • (2004) Computers Helping People with Special Needs , pp. 1178-1186
    • Beskow, J.1    Karlsson, I.2    Kewley, J.3    Salvi, G.4
  • 10
    • 0038533317
    • Movetrack-a movement tracking system
    • Grenoble, France
    • Branderud, P., 1985. Movetrack-a movement tracking system. In: Proc. French-Swedish Symposium on Speech, Grenoble, France, pp. 113-122.
    • (1985) Proc. French-Swedish Symposium on Speech , pp. 113-122
    • Branderud, P.1
  • 11
    • 0030677313
    • Video rewrite: Driving visual speech with audio
    • Bregler, C., Covell, M., Slaney, M., 1997. Video rewrite: Driving visual speech with audio. In: Proc. ACM SIGGRAPH'97, pp. 353-360.
    • (1997) Proc. ACM SIGGRAPH'97 , pp. 353-360
    • Bregler, C.1    Covell, M.2    Slaney, M.3
  • 13
    • 85067593976
    • A text-to-speech system based entirely on rules
    • Carlson, R., Granström, B., 1976. A text-to-speech system based entirely on rules. In: Proc. ICASSP-76.
    • (1976) Proc. ICASSP-76
    • Carlson, R.1    Granström, B.2
  • 15
    • 0026372714
    • Experiments with voice modelling in speech synthesis
    • Carlson, R., Granström, B., Karlsson, I., 1991. Experiments with voice modelling in speech synthesis. Speech Comm. 10, 481-489.
    • (1991) Speech Comm. , vol.10 , pp. 481-489
    • Carlson, R.1    Granström, B.2    Karlsson, I.3
  • 16
    • 0001395349
    • Experiments with emotive speech-acted utterances and synthesized replicas
    • Banff, Canada
    • Carlson, R., Granström, B., Nord, L., 1992. Experiments with emotive speech-acted utterances and synthesized replicas. In: Internat. Conf. on Spoken Language Processing, Banff, Canada, pp. 671-674.
    • (1992) Internat. Conf. on Spoken Language Processing , pp. 671-674
    • Carlson, R.1    Granström, B.2    Nord, L.3
  • 18
    • 0025543906
    • Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • Charpentier, F., Moulines, E., 1990. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Comm. 9 (5/6), 435-467.
    • (1990) Speech Comm. , vol.9 , Issue.5/6 , pp. 435-467
    • Charpentier, F.1    Moulines, E.2
  • 19
    • 0022896754
    • Diphone synthesis using an overlap-add technique for speech waveforms concatenation
    • Charpentier, F., Stella, M., 1986. Diphone synthesis using an overlap-add technique for speech waveforms concatenation. In: Proc. ICASSP 86, Vol. 3, pp. 2015-2018.
    • (1986) Proc. ICASSP 86 , vol.3 , pp. 2015-2018
    • Charpentier, F.1    Stella, M.2
  • 20
    • 0001514782
    • Modelling coarticulation in synthetic visual speech
    • Magnenat Thalmann, N., Thalmann, D. (Eds.), Springer Verlag, Tokyo
    • Cohen, M.M., Massaro, D.W., 1993. Modelling coarticulation in synthetic visual speech. In: Magnenat Thalmann, N., Thalmann, D. (Eds.), Models and Techniques in Computer Animation. Springer Verlag, Tokyo, pp. 139-156.
    • (1993) Models and Techniques in Computer Animation , pp. 139-156
    • Cohen, M.M.1    Massaro, D.W.2
  • 21
    • 0345093720
    • Terminal analog synthesis of continuous speech using the diphone method of segment assembly
    • Dixon, N.R., Maxey, H.D., 1968. Terminal analog synthesis of continuous speech using the diphone method of segment assembly. IEEE Trans. Audio Electroacoust. AU-16, 40-50.
    • (1968) IEEE Trans. Audio Electroacoust. , vol.AU-16 , pp. 40-50
    • Dixon, N.R.1    Maxey, H.D.2
  • 23
    • 0038194614
    • Evaluation of a system for concatenative articulatory visual speech synthesis
    • Engwall, O., 2002b. Evaluation of a system for concatenative articulatory visual speech synthesis. In: Proc. ICSLP 2002.
    • (2002) Proc. ICSLP 2002
    • Engwall, O.1
  • 24
    • 85009083853
    • From real-time MRI to 3D tongue movements
    • Engwall, O., 2004. From real-time MRI to 3D tongue movements. In: Proc. ICSLP 2004.
    • (2004) Proc. ICSLP 2004
    • Engwall, O.1
  • 25
    • 77953828868
    • Trainable videorealistic speech animation
    • San Antonio, TX
    • Ezzat, T., Geiger, G., Poggio, T., 2002. Trainable videorealistic speech animation. In: Proc. ACM SIGGRAPH 2002, San Antonio, TX, pp. 388-398.
    • (2002) Proc. ACM SIGGRAPH 2002 , pp. 388-398
    • Ezzat, T.1    Geiger, G.2    Poggio, T.3
  • 27
    • 84966440972
    • Integration of rule-based formant synthesis and waveform concatenation: A hybrid approach to text-to-speech synthesis
    • Santa Monica, USA, 11-13 September 2002
    • Hertz, S., 2002. Integration of rule-based formant synthesis and waveform concatenation: A hybrid approach to text-to-speech synthesis. In: Proc. IEEE 2002 Workshop on Speech Synthesis, Santa Monica, USA, 11-13 September 2002.
    • (2002) Proc. IEEE 2002 Workshop on Speech Synthesis
    • Hertz, S.1
  • 29
    • 85009141770
    • On the correlation between facial movements, tongue movements and speech acoustics
    • Jiang, J., Alwan, A., Bernstein, L., Keating, P., Auer, E., 2000. On the correlation between facial movements, tongue movements and speech acoustics. In: Proc. ICSLP2000, Vol. 1, pp. 42-45.
    • (2000) Proc. ICSLP2000 , vol.1 , pp. 42-45
    • Jiang, J.1    Alwan, A.2    Bernstein, L.3    Keating, P.4    Auer, E.5
  • 30
    • 0141588508
    • The Klattalk text-to-speech conversion system
    • Klatt, D., 1982. The Klattalk text-to-speech conversion system. In: Proc. ICASSP 82, pp. 1589-1592.
    • (1982) Proc. ICASSP 82 , pp. 1589-1592
    • Klatt, D.1
  • 31
    • 0023407575
    • Review of text-to-speech conversion for English
    • Klatt, D., 1987. Review of text-to-speech conversion for English. J. Acoust. Soc. Amer. 82 (3), 737-793.
    • (1987) J. Acoust. Soc. Amer. , vol.82 , Issue.3 , pp. 737-793
    • Klatt, D.1
  • 33
    • 33645604824
    • Formant tracking using segmental phonemic information
    • Lee, M., van Santen, J., Möbius, B., Olive, J., 1999. Formant tracking using segmental phonemic information. In: Proc. Eurospeech'99, Vol. 6, pp. 2789-2792.
    • (1999) Proc. Eurospeech'99 , vol.6 , pp. 2789-2792
    • Lee, M.1    Van Santen, J.2    Möbius, B.3    Olive, J.4
  • 34
    • 0003116759
    • Speech as audible gestures
    • Hardcastle, W.J., Marchal, A. (Eds.), Kluwer Academic Publishers, Dordrecht
    • Löfqvist, A., 1990. Speech as audible gestures. In: Hardcastle, W.J., Marchal, A. (Eds.), Speech Production and Speech Modelling. Kluwer Academic Publishers, Dordrecht, pp. 289-322.
    • (1990) Speech Production and Speech Modelling , pp. 289-322
    • Löfqvist, A.1
  • 35
    • 0025325827
    • A procedure for measuring auditory and audio-visual speech-reception thresholds for sentences in noise: Rationale, evaluation, and recommendations for use
    • MacLeod, A., Summerfield, Q., 1990. A procedure for measuring auditory and audio-visual speech-reception thresholds for sentences in noise: Rationale, evaluation, and recommendations for use. Br. J. Audiol. 24, 29-43.
    • (1990) Br. J. Audiol. , vol.24 , pp. 29-43
    • MacLeod, A.1    Summerfield, Q.2
  • 36
    • 4544357742
    • Formant diphone parameter extraction utilising a labeled single speaker database
    • Mannell, R.H., 1998. Formant diphone parameter extraction utilising a labeled single speaker database. In: Proc. ICSLP 98.
    • (1998) Proc. ICSLP 98
    • Mannell, R.H.1
  • 37
    • 33645595381
    • Animated speech: Research progress and applications
    • Vatikiotis-Bateson, E., Bailly, G., Perrier, P. (Eds.), MIT Press
    • Massaro, D.W., Cohen, M.M., Tabain, M., Beskow, J., Clark, R., 2005. Animated speech: Research progress and applications. In: Vatikiotis-Bateson, E., Bailly, G., Perrier, P. (Eds.), Audiovisual Speech Processing. MIT Press.
    • (2005) Audiovisual Speech Processing
    • Massaro, D.W.1    Cohen, M.M.2    Tabain, M.3    Beskow, J.4    Clark, R.5
  • 38
    • 85009156064
    • A data-driven approach to source-formant type text-to-speech system
    • Mori, H., Ohtsuka, T., Kasuya, H., 2002. A data-driven approach to source-formant type text-to-speech system. In: ICSLP-2002, pp. 2365-2368.
    • (2002) ICSLP-2002 , pp. 2365-2368
    • Mori, H.1    Ohtsuka, T.2    Kasuya, H.3
  • 41
    • 33645586478
    • Data-driven formant synthesis
    • Öhlin, D., Carlson, R., 2004. Data-driven formant synthesis. In: Proc. Fonetik, pp. 160-163.
    • (2004) Proc. Fonetik , pp. 160-163
    • Öhlin, D.1    Carlson, R.2
  • 42
    • 84937184260
    • An audio-visual speech database and automatic measurements of visual speech
    • Öhman, T., 1998. An audio-visual speech database and automatic measurements of visual speech. In: KTH TMH QPSR, Vols. 1-2, pp. 61-76.
    • (1998) KTH TMH QPSR , vol.1-2 , pp. 61-76
    • Öhman, T.1
  • 43
    • 0017632039
    • Rule synthesis of speech from diadic units
    • Olive, J.P., 1977. Rule synthesis of speech from diadic units. In: Proc. ICASSP-77, pp. 568-570.
    • (1977) Proc. ICASSP-77 , pp. 568-570
    • Olive, J.P.1
  • 44
    • 0020202671
    • Parameterized models for facial animation
    • Parke, F.I., 1982. Parameterized models for facial animation. IEEE Comput. Graphics 2 (9), 61-68.
    • (1982) IEEE Comput. Graphics , vol.2 , Issue.9 , pp. 61-68
    • Parke, F.I.1
  • 46
  • 48
    • 84870292720
    • Mother: A new generation of talking heads providing a flexible articulatory control for video-realistic speech animation
    • Beijing, China
    • Reveret, L., Bailly, G., Badin, P., 2000. Mother: A new generation of talking heads providing a flexible articulatory control for video-realistic speech animation. In: Proc. 6th Internat. Conf. on Spoken Language Processing (ICSLP'2000), Beijing, China, pp. 755-758.
    • (2000) Proc. 6th Internat. Conf. on Spoken Language Processing (ICSLP'2000) , pp. 755-758
    • Reveret, L.1    Bailly, G.2    Badin, P.3
  • 50
    • 4143153672
    • Evaluation of a multilingual synthetic talking face as a communication aid for the hearing impaired
    • Barcelona, Spain
    • Siciliano, C., Williams, G., Beskow, J., Faulkner A., 2003. Evaluation of a multilingual synthetic talking face as a communication aid for the hearing impaired. To appear in Proc. 15th Internat. Congress of Phonetic Sciences, Barcelona, Spain.
    • (2003) Proc. 15th Internat. Congress of Phonetic Sciences
    • Siciliano, C.1    Williams, G.2    Beskow, J.3    Faulkner, A.4
  • 53
    • 10444263998
    • An HMM-based system for automatic segmentation and alignment of speech
    • Umeå Universitet, Umeå, Sweden
    • Sjölander, K., 2003. An HMM-based system for automatic segmentation and alignment of speech. In: Proc. Fonetik 2003, Umeå Universitet, Umeå, Sweden, pp. 93-96.
    • (2003) Proc. Fonetik 2003 , pp. 93-96
    • Sjölander, K.1
  • 54
    • 84912906590
    • Constraints among parameters simplify control of Klatt formant synthesizer
    • Stevens, K.N., Bickley, C.A., 1991. Constraints among parameters simplify control of Klatt formant synthesizer. J. Phonetics 19, 161-174.
    • (1991) J. Phonetics , vol.19 , pp. 161-174
    • Stevens, K.N.1    Bickley, C.A.2
  • 55
    • 33645585669
    • Looking at speech
    • Talkin, D., 1989. Looking at speech. Speech Technol. 4, 74-77.
    • (1989) Speech Technol. , vol.4 , pp. 74-77
    • Talkin, D.1
  • 57
    • 0032178592
    • Quantitative association of vocal-tract and facial behaviour
    • Yehia, H., Rubin, P., Vatikiotis-Bateson, E., 1998. Quantitative association of vocal-tract and facial behaviour. Speech Comm. 26, 23-43.
    • (1998) Speech Comm. , vol.26 , pp. 23-43
    • Yehia, H.1    Rubin, P.2    Vatikiotis-Bateson, E.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.