SCOPUS Information Retrieval Platform

Volume 26, Issue 1-2, 1998, Pages 117-129

Audio-visual speech synthesis from French text: Eight years of models, designs and evaluation at the ICP

(2) Benoît, Christian a,b Le Goff, Bertrand a

b ADVANCED TELECOMMUNICATIONS RESEARCH INSTITUTE INTERNATIONAL (Japan)

Author keywords

3D lip model; Coarticulation; Face animation; French visemes; Intelligibility; Loudness; Speaking rate; Speechreading; Text to audiovisual speech synthesis

Indexed keywords

SPEECH ANALYSIS; SPEECH COMMUNICATION; SPEECH INTELLIGIBILITY;

AUDIOVISUAL SPEECH SYNTHESIS; COARTICULATION EFFECT; SPEECH READING;

SPEECH SYNTHESIS;

EID: 0032178686 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/S0167-6393(98)00045-4 Document Type: Article

Times cited : (50)

References (42)

1
- 0039820686
- Audibility and stability of articulatory movements: Deciphering two experiments on anticipatory rounding in French
- Aix-en-Provence, France
- Abry, C., Lallouache, M.T., 1991. Audibility and stability of articulatory movements: Deciphering two experiments on anticipatory rounding in French. In: Proceedings of the XIIth International Congress of Phonetic Sciences, Aix-en-Provence, France, Vol. 1, pp. 220-225.
- (1991) Proceedings of the XIIth International Congress of Phonetic Sciences , vol.1 , pp. 220-225
- Abry, C.¹ Lallouache, M.T.²

2
- 0040413176
- Mémoire de DEA SIP, INPG, Grenoble, France
- Adjoudani, A., 1993. Elaboration d'un modèle de lèvres 3D pour animation en temps réel. Mémoire de DEA SIP, INPG, Grenoble, France.
- (1993) Elaboration d'un Modèle de Lèvres 3D pour Animation en Temps Réel
- Adjoudani, A.¹

3
- 0039228790
- PhD Thesis. Institut National Polytechnique, Grenoble, France
- Alissali, M., 1993. Architecture logicielle pour la synthèse multilingue de la parole, PhD Thesis. Institut National Polytechnique, Grenoble, France.
- (1993) Architecture Logicielle pour la Synthèse Multilingue de la Parole
- Alissali, M.¹

4
- 0031198820
- Learning to speak. Sensori-motor control of speech movements
- Bailly, G., 1997. Learning to speak. Sensori-motor control of speech movements. Speech Communication 22, 251-267.
- (1997) Speech Communication , vol.22 , pp. 251-267
- Bailly, G.¹

5
- 0040413187
- Synthesis-by-rule for French
- Aix-en-Provence, France
- Bailly, G., Guerti, M., 1991. Synthesis-by-rule for French. In: Proceedings of the 12th International Congress of Phonetic Sciences, Aix-en-Provence, France, Vol. 2, pp. 506-511.
- (1991) Proceedings of the 12th International Congress of Phonetic Sciences , vol.2 , pp. 506-511
- Bailly, G.¹ Guerti, M.²

6
- 0040413185
- COMPOST: A server for multilingual text-to-speech system
- Bailly, G., Alissali, M., 1992. COMPOST: a server for multilingual text-to-speech system. Traitement du Signal 9 (4), 359-366.
- (1992) Traitement du Signal , vol.9 , Issue.4 , pp. 359-366
- Bailly, G.¹ Alissali, M.²

7
- 0016196060
- Coarticulation of upper lip protrusion in French
- Benguerel, A.P., Cowan, H.A., 1974. Coarticulation of upper lip protrusion in French. Phonetica 30, 41-55.
- (1974) Phonetica , vol.30 , pp. 41-55
- Benguerel, A.P.¹ Cowan, H.A.²

8
- 0002186602
- A set of French visemes for visual speech synthesis
- Bailly, G., Benoît, C. (Eds.), Elsevier, Amsterdam
- Benoît, C., Lallouache, T., Mohamadi, T., Abry, C., 1992. A set of French visemes for visual speech synthesis. In: Bailly, G., Benoît, C. (Eds.), Talking Machines: Theories, Models and Designs. Elsevier, Amsterdam, pp. 485-504.
- (1992) Talking Machines: Theories, Models and Designs , pp. 485-504
- Benoît, C.¹ Lallouache, T.² Mohamadi, T.³ Abry, C.⁴

9
- 0028023732
- Audio-visual intelligibility of French speech in noise
- Benoît, C., Mohamadi, T., Kandel, S., 1994. Audio-visual intelligibility of French speech in noise. Journal of Speech and Hearing Research 37, 1195-1203.
- (1994) Journal of Speech and Hearing Research , vol.37 , pp. 1195-1203
- Benoît, C.¹ Mohamadi, T.² Kandel, S.³

10
- 4243879136
- An investigation of hypo- and hyper-speech in the visual modality
- Autrans, France
- Benoît, C., Fuster-Duran, A., Le Goff, B., 1996a. An investigation of hypo- and hyper-speech in the visual modality. In: Proceedings of ETRW 96, Autrans, France, pp. 237-240.
- (1996) Proceedings of ETRW 96 , pp. 237-240
- Benoît, C.¹ Fuster-Duran, A.² Le Goff, B.³

11
- 0001055701
- Which components of the face do humans and machines best speechread?
- Stork, D., Hennecke, M. (Eds.), NATO-ASI Series 150, Springer, Berlin
- Benoît, C., Guiard-Marigny, T., Le Goff, B., Adjoudani, A., 1996b. Which components of the face do humans and machines best speechread? In: Stork, D., Hennecke, M. (Eds.), Speechreading by Humans and Machines, NATO-ASI Series 150, Springer, Berlin, pp. 315-328.
- (1996) Speechreading by Humans and Machines , pp. 315-328
- Benoît, C.¹ Guiard-Marigny, T.² Le Goff, B.³ Adjoudani, A.⁴

12
- 84925640716
- A multimedia platform for audio-visual speech processing
- ESCA, Rhodes, Greece
- Benoît, C., Adjoudani, A., Guiard-Marigny, T., Le Goff, B., Reveret, L., 1997. A multimedia platform for audio-visual speech processing. In: Proceedings of the 5th Eurospeech Conference, ESCA, Rhodes, Greece, Vol. 3, pp. 1671-1674.
- (1997) Proceedings of the 5th Eurospeech Conference , vol.3 , pp. 1671-1674
- Benoît, C.¹ Adjoudani, A.² Guiard-Marigny, T.³ Le Goff, B.⁴ Reveret, L.⁵

13
- 84883424118
- Rule-based visual speech synthesis
- Madrid, Spain
- Beskow, J., 1995. Rule-based visual speech synthesis. In: Proceedings of Eurospeech'95, Madrid, Spain, Vol. 1, pp. 299-302.
- (1995) Proceedings of Eurospeech'95 , vol.1 , pp. 299-302
- Beskow, J.¹

14
- 84926273209
- Analysis, synthesis and perception of visible articulatory movements
- Brooke, N.M., Summerfield, A.Q., 1983. Analysis, synthesis and perception of visible articulatory movements. Journal of Phonetics 11, 63-76.
- (1983) Journal of Phonetics , vol.11 , pp. 63-76
- Brooke, N.M.¹ Summerfield, A.Q.²

15
- 0009643164
- Pitch-synchronous wave-form processing technique for text-to-speech synthesis using diphones
- ESCA, Paris, France
- Charpentier, F., Moulines, E., 1989. Pitch-synchronous wave-form processing technique for text-to-speech synthesis using diphones. In: Proceedings of the First Eurospeech Conference, ESCA, Paris, France, Vol. 2, pp. 13-19.
- (1989) Proceedings of the First Eurospeech Conference , vol.2 , pp. 13-19
- Charpentier, F.¹ Moulines, E.²

16
- 0000125550
- Synthesis of visible speech
- Cohen, M.M., Massaro, D.W., 1990. Synthesis of visible speech. Behavior Research Methods, Instruments and Computers 22, 260-263.
- (1990) Behavior Research Methods, Instruments and Computers , vol.22 , pp. 260-263
- Cohen, M.M.¹ Massaro, D.W.²

17
- 0001514782
- Modeling coarticulation in synthetic visual speech. Models and techniques
- Thalmann, N.M., Thalmann, D. (Eds.), Springer, Tokyo
- Cohen, M.M., Massaro, D.W., 1993. Modeling coarticulation in synthetic visual speech. Models and techniques. In: Thalmann, N.M., Thalmann, D. (Eds.), Computer Animation. Springer, Tokyo, pp. 139-156.
- (1993) Computer Animation , pp. 139-156
- Cohen, M.M.¹ Massaro, D.W.²

18
- 0002150047
- Cued speech
- Cornett, R.O., 1967. Cued speech. American Annals of the Deaf 112, 3-13.
- (1967) American Annals of the Deaf , vol.112 , pp. 3-13
- Cornett, R.O.¹

19
- 0014529713
- Interaction of audition and vision in the recognition of oral speech stimuli
- Erber, N.P., 1969. Interaction of audition and vision in the recognition of oral speech stimuli. Journal of Speech and Hearing Research 12, 423-425.
- (1969) Journal of Speech and Hearing Research , vol.12 , pp. 423-425
- Erber, N.P.¹

20
- 0016817064
- Auditory-visual perception of speech
- Erber, N.P., 1975. Auditory-visual perception of speech. Journal of Speech and Hearing Research 40, 481-492.
- (1975) Journal of Speech and Hearing Research , vol.40 , pp. 481-492
- Erber, N.P.¹

21
- 0039820685
- Confusion among visually perceived consonants
- Fisher, C.G., 1968. Confusion among visually perceived consonants. Journal of Speech and Hearing Research 15, 474-482.
- (1968) Journal of Speech and Hearing Research , vol.15 , pp. 474-482
- Fisher, C.G.¹

22
- 0020823325
- Converging sources of evidence on spoken and perceived rhythms of speech: Cyclic production of vowels in monosyllabic stress feet
- Fowler, C., 1983. Converging sources of evidence on spoken and perceived rhythms of speech: Cyclic production of vowels in monosyllabic stress feet. Journal of Experimental Psychology: Human Perception and Performance 112, 386-412.
- (1983) Journal of Experimental Psychology: Human Perception and Performance , vol.112 , pp. 386-412
- Fowler, C.¹

23
- 0040413186
- Mémoire de DEA SIP, INPG, Grenoble, France
- Guiard-Marigny, T., 1992. Animation en temps réel d'un modèle paramétrisé de lèvres. Mémoire de DEA SIP, INPG, Grenoble, France, p. 72.
- (1992) Animation en Temps Réel d'un Modèle Paramétrisé de Lèvres , pp. 72
- Guiard-Marigny, T.¹

24
- 0038473771
- 3D models of the lips and jaw for visual speech synthesis
- van Santen, J.P.H., Sproat, R., Olive, J., Hirschberg, J. (Eds.), Springer, New York
- Guiard-Marigny, T., Adjoudani, A., Benoît, C., 1996. 3D models of the lips and jaw for visual speech synthesis. In: van Santen, J.P.H., Sproat, R., Olive, J., Hirschberg, J. (Eds.), Progress in Speech Synthesis. Springer, New York, pp. 247-258.
- (1996) Progress in Speech Synthesis , pp. 247-258
- Guiard-Marigny, T.¹ Adjoudani, A.² Benoît, C.³

25
- 0003009750
- Acoustic phonetics
- Joos, M., 1948. Acoustic phonetics. Language 24, 1-136.
- (1948) Language , vol.24 , pp. 1-136
- Joos, M.¹

26
- 0039820682
- Mémoire de DEA SIP, INPG, Grenoble, France
- Le Goff, B., 1993. Commandes paramétriques d'un modèle de visage 3D pour animation en temps réel. Mémoire de DEA SIP, INPG, Grenoble, France.
- (1993) Commandes Paramétriques d'un Modèle de Visage 3D pour Animation en Temps Réel
- Le Goff, B.¹

27
- 85133504159
- Automatic modeling of coarticulation in text-to-visual speech synthesis
- ESCA, Rhodes, Greece
- Le Goff, B., 1997a. Automatic modeling of coarticulation in text-to-visual speech synthesis. In: Proceedings of the 5th Eurospeech Conference, ESCA, Rhodes, Greece, Vol. 3, pp. 1667-1670.
- (1997) Proceedings of the 5th Eurospeech Conference , vol.3 , pp. 1667-1670
- Le Goff, B.¹

28
- 78349235205
- Mémoire de thèse, INP, Grenoble, France
- Le Goff, B., 1997b. Synthèse à partir du texte de visage 3D parlant français. Mémoire de thèse, INP, Grenoble, France, p. 256.
- (1997) Synthèse à Partir du Texte de Visage 3D Parlant Français , pp. 256
- Le Goff, B.¹

29
- 0030351608
- A text-to-audiovisual-speech synthesizer for French
- Philadelphia, PA, USA
- Le Goff, B., Benoît, C., 1996. A text-to-audiovisual-speech synthesizer for French. In: Proceedings of the 4th International Conference on Spoken Language Processing, Philadelphia, PA, USA, Vol. 4, pp. 2163-2166.
- (1996) Proceedings of the 4th International Conference on Spoken Language Processing , vol.4 , pp. 2163-2166
- Le Goff, B.¹ Benoît, C.²

30
- 84925592440
- A French-speaking synthetic head
- Benoît, C., Campbell, R. (Eds.), Rhodes, Greece
- Le Goff, B., Benoît, C., 1997. A French-speaking synthetic head. In: Benoît, C., Campbell, R. (Eds.), Proceedings of the ESCA workshop on Audio-Visual Speech Processing, Rhodes, Greece, pp. 145-148.
- (1997) Proceedings of the ESCA Workshop on Audio-visual Speech Processing , pp. 145-148
- Le Goff, B.¹ Benoît, C.²

31
- 0003762887
- Analysis-synthesis and intelligibility of a talking face
- Van Santen, J.P.H., Sproat, R.W., Olive, J.P., Hirschberg, J. (Eds.), Springer, New York
- Le Goff, B., Guiard-Marigny, T., Benoît, C., 1996. Analysis-synthesis and intelligibility of a talking face. In: Van Santen, J.P.H., Sproat, R.W., Olive, J.P., Hirschberg, J. (Eds.), Progress in Speech Synthesis. Springer, New York, pp. 235-246.
- (1996) Progress in Speech Synthesis , pp. 235-246
- Le Goff, B.¹ Guiard-Marigny, T.² Benoît, C.³

32
- 0003116759
- Speech as audible gestures
- Hardcastle, W.J., Marchal, A. (Eds.), Kluwer Academic Publishers, Dordrecht
- Löfqvist, A., 1990. Speech as audible gestures. In: Hardcastle, W.J., Marchal, A. (Eds.), Speech Production and Speech Modeling. Kluwer Academic Publishers, Dordrecht, pp. 289-322.
- (1990) Speech Production and Speech Modeling , pp. 289-322
- Löfqvist, A.¹

33
- 0004084456
- MIT Press, Cambridge, MA
- Massaro, D.W., 1997. Perceiving Talking Faces. MIT Press, Cambridge, MA.
- (1997) Perceiving Talking Faces
- Massaro, D.W.¹

34
- 0017199877
- Hearing lips and seeing voices
- McGurk, H., MacDonald, J., 1976. Hearing lips and seeing voices. Nature 264, 746-748.
- (1976) Nature , vol.264 , pp. 746-748
- McGurk, H.¹ MacDonald, J.²

35
- 0039228786
- Mémoire de thèse, INP, Grenoble, 174 pp
- Mohamadi, T., 1993. Synthèse à partir du texte de visages parlants: Réalisation d'un prototype et mesures d'intelligibilité bimodale. Mémoire de thèse, INP, Grenoble, 174 pp.
- (1993) Synthèse à Partir du Texte de Visages Parlants: Réalisation d'un Prototype et Mesures d'Intelligibilité Bimodale
- Mohamadi, T.¹

36
- 0003584841
- Ph.D Dissertation, University of Utah, Department of Computer Sciences
- Parke, F.I., 1974. A parametric model for human faces. Ph.D Dissertation, University of Utah, Department of Computer Sciences.
- (1974) A Parametric Model for Human Faces
- Parke, F.I.¹

37
- 0040413181
- Creation of a synthetic face speaking in real time with a synthetic voice
- Benoît, C., Bailly, G. (Eds.), Autrans, France
- Saintourens, M., Tramus, M.H., Huitric, H., Nahas, M., 1990. Creation of a synthetic face speaking in real time with a synthetic voice. In: Benoît, C., Bailly, G. (Eds.), Proceedings of the 1st ESCA Workshop on Speech Synthesis, Autrans, France, pp. 249-252.
- (1990) Proceedings of the 1st ESCA Workshop on Speech Synthesis , pp. 249-252
- Saintourens, M.¹ Tramus, M.H.² Huitric, H.³ Nahas, M.⁴

38
- 77956779481
- A dynamical approach to gestural patterning in speech production
- Saltzman, E., Munhall, K., 1989. A dynamical approach to gestural patterning in speech production. Ecological Psychology 1, 333-382.
- (1989) Ecological Psychology , vol.1 , pp. 333-382
- Saltzman, E.¹ Munhall, K.²

39
- 0001048664
- Visual contribution to speech intelligibility in noise
- Sumby, W.H., Pollack, I., 1954. Visual contribution to speech intelligibility in noise. Journal of the Acoustical Society of America 26, 212-215.
- (1954) Journal of the Acoustical Society of America , vol.26 , pp. 212-215
- Sumby, W.H.¹ Pollack, I.²

40
- 0002955163
- Lips, teeth, and the benefits of lipreading
- Young, A.W., Ellis, H.D. (Eds.), Elsevier, Amsterdam
- Summerfield, Q., MacLeod, A., McGrath, M., Brooke, M., 1989. Lips, teeth, and the benefits of lipreading. In: Young, A.W., Ellis, H.D. (Eds.), Handbook of Research on Face Processing. Elsevier, Amsterdam, pp. 223-233.
- (1989) Handbook of Research on Face Processing , pp. 223-233
- Summerfield, Q.¹ MacLeod, A.² McGrath, M.³ Brooke, M.⁴

41
- 0039228789
- DEA Dissertation, ENSERG, INP Grenoble
- Woodward, P., 1991. Synthèse de visage parlant. DEA Dissertation, ENSERG, INP Grenoble.
- (1991) Synthèse de Visage Parlant
- Woodward, P.¹

42
- 0039228784
- Synthèse à partir du texte d'un visage parlant français
- GFCP-SFA, Brussels, Belgium
- Woodward, P., Mohamadi, T., Benoît, C., Bailly, G., 1992. Synthèse à partir du texte d'un visage parlant français. Actes des 19èmes Journées d'Etude sur la Parole, GFCP-SFA, Brussels, Belgium, pp. 319-324.
- (1992) Actes des 19èmes Journées d'Etude sur la Parole , pp. 319-324
- Woodward, P.¹ Mohamadi, T.² Benoît, C.³ Bailly, G.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.