SCOPUS information retrieval platform

Journal of the Acoustical Society of America

Volume 125, Issue 2, 2009, Pages 1184-1196

A study of lip movements during spontaneous dialog and its application to voice activity detection

(6) Sodoyer, David ᵃ; Rivet, Bertrand ᵃ; Girin, Laurent ᵃ; Savariaux, Christophe ᵃ; Schwartz, Jean Luc ᵃ; Jutten, Christian ᵃ

ᵃ UNIV GRENOBLE ALPES (France)

Author keywords

[No Author keywords available]

Indexed keywords

AUDIO-VISUAL CORPORA; COMPREHENSIVE ANALYSIS; COMPREHENSIVE STUDIES; LIP MOVEMENTS; MOVING NOISE SOURCES; NON-STATIONARY NOISE; SPEECH ACTIVITIES; SPEECH SIGNALS; VISUAL SPEECH; VOICE ACTIVITY DETECTIONS; VOICE ACTIVITY DETECTORS;

SPEECH RECOGNITION;

ALGORITHMS; CUES; HUMANS; LIP; LIPREADING; MALE; MOVEMENT; PATTERN RECOGNITION, AUTOMATED; PATTERN RECOGNITION, PHYSIOLOGICAL; SIGNAL DETECTION, PSYCHOLOGICAL; SOUND SPECTROGRAPHY; SPEECH PERCEPTION; VIDEO RECORDING; VISUAL PERCEPTION; VOICE;

EID: 59849111743 PISSN: 00014966 EISSN: None Source Type: Journal
DOI: 10.1121/1.3050257 Document Type: Article

Times cited : (34)

References (61)

1
- 84904283822
- in Proceedings of the International Symposium on Signal Processing and Its Applications (ISSPA), Paris, France
- Abrard, F., and Deville, Y. (2003). " Blind separation of dependent sources using the "time-frequency ratio of mixture" approach.," in Proceedings of the International Symposium on Signal Processing and Its Applications (ISSPA), Paris, France, pp. 81-84.
- (2003) Blind Separation of Dependent Sources Using the "time-frequency Ratio of Mixture" Approach , pp. 81-84
- Abrard, F.¹ Deville, Y.²

2
- 0022690063
- Laws for lips
- Abry, C., and Boë, L. J. (1986). " Laws for lips.," Speech Commun. 5, 97-104.
- (1986) Speech Commun. , vol.5 , pp. 97-104
- Abry, C.¹ Boë, L.J.²

3
- 59849121082
- in Proceedings of the European Signal Processing Conference (EUSIPCO), Poznan, Poland
- Aubrey, A., Rivet, B., Hicks, Y., Girin, L., Chambers, J., and Jutten, C. (2007). " Comparison of appearance models and retinal filtering for visual voice activity detection.," in Proceedings of the European Signal Processing Conference (EUSIPCO), Poznan, Poland.
- (2007) Comparison of Appearance Models and Retinal Filtering for Visual Voice Activity Detection
- Aubrey, A.¹ Rivet, B.² Hicks, Y.³ Girin, L.⁴ Chambers, J.⁵ Jutten, C.⁶

4
- 85009255585
- in Proceedings of the International Conference on Spoken Language Processing (ICSLP), Denver, CO
- Bailly, G., and Badin, P. (2002). " Seeing tongue movements from outside.," in Proceedings of the International Conference on Spoken Language Processing (ICSLP), Denver, CO, pp. 1913-1916.
- (2002) Seeing Tongue Movements from Outside , pp. 1913-1916
- Bailly, G.¹ Badin, P.²

5
- 0142216141
- Audiovisual speech synthesis
- Bailly, G., Berard, M., Elisei, F., and Odisio, M. (2003). " Audiovisual speech synthesis.," Speech Technol., 6, 331-346.
- (2003) Speech Technol. , vol.6 , pp. 331-346
- Bailly, G.¹ Berard, M.² Elisei, F.³ Odisio, M.⁴

6
- 56749183149
- in Proceedings of the Conference on Audio-Visual Speech Processing (AVSP), Santa Cruz, CA
- Barker, J. P., and Berthommier, F. (1999). " Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models.," in Proceedings of the Conference on Audio-Visual Speech Processing (AVSP), Santa Cruz, CA, pp. 112-117.
- (1999) Estimation of Speech Acoustics from Visual Speech Features: A Comparison of Linear and Non-linear Models , pp. 112-117
- Barker, J.P.¹ Berthommier, F.²

7
- 0001055701
- Which components of the face humans and machines best speechread?
- NATO Advanced Studies Institute, Series F: Computer and System Sciences, edited by D. G. Stork and M. E. Hennecke (Springer, New York)
- Benoît, C., Guiard-Marigny, T., Le Goff, B., and Adjoudani, A. (1996). " Which components of the face humans and machines best speechread? " in Speechreading by Man and Machine: Models, Systems and Applications, NATO Advanced Studies Institute, Series F: Computer and System Sciences, edited by D. G. Stork and M. E. Hennecke (Springer, New York), pp. 315-328.
- (1996) Speechreading by Man and Machine: Models, Systems and Applications , pp. 315-328
- Benoît, C.¹ Guiard-Marigny, T.² Le Goff, B.³ Adjoudani, A.⁴

8
- 0002186602
- A set of French visemes for visual speech synthesis
- edited by G. Bailly, C. Benoit, and T. R. Sawallis (North-Holland, Amsterdam)
- Benoît, C., Lallouache, T., Mohamadi, T., and Abry, C. (1992). " A set of French visemes for visual speech synthesis.," in Talking Machines: Théories, Models, and Designs, edited by G. Bailly, C. Benoit, and T. R. Sawallis (North-Holland, Amsterdam), pp. 485-504.
- (1992) Talking Machines: Théories, Models, and Designs , pp. 485-504
- Benoît, C.¹ Lallouache, T.² Mohamadi, T.³ Abry, C.⁴

9
- 0028023732
- Effects of phonetic context on audio-visual intelligibility of French
- Benoît, C., Mohamadi, T., and Kandel, S. (1994). " Effects of phonetic context on audio-visual intelligibility of French.," J. Speech Hear. Res. 37, 1195-1293.
- (1994) J. Speech Hear. Res. , vol.37 , pp. 1195-1293
- Benoît, C.¹ Mohamadi, T.² Kandel, S.³

10
- 10444276578
- Auditory speech detection in noise enhanced by lipreading
- Bernstein, L. E., Takayanagi, S., and Auer, E. T., Jr. (2004). " Auditory speech detection in noise enhanced by lipreading.," Speech Commun. 44, 5-18.
- (2004) Speech Commun. , vol.44 , pp. 5-18
- Bernstein, L.E.¹ Takayanagi, S.² Auer Jr., E.T.³

11
- 77956789336
- Ventriloquism: A case of crossmodal perceptual grouping
- edited by G. Aschersleben, T. Bachmann, and J. Müsseler (Elsevier, Amsterdam)
- Bertelson, P. (1999). " Ventriloquism: A case of crossmodal perceptual grouping.," in Cognitive Contributions to the Perception of Spatial and Temporal Events, edited by G. Aschersleben, T. Bachmann, and J. Müsseler (Elsevier, Amsterdam), pp. 347-362.
- (1999) Cognitive Contributions to the Perception of Spatial and Temporal Events , pp. 347-362
- Bertelson, P.¹

12
- 0037240789
- Reading speech from still and moving faces: The neural substrates of visible speech
- Calvert, G. A., and Campbell, R. (2003). " Reading speech from still and moving faces: The neural substrates of visible speech.," J. Cogn Neurosci. 15, 57-70.
- (2003) J. Cogn Neurosci. , vol.15 , pp. 57-70
- Calvert, G.A.¹ Campbell, R.²

13
- 59849102444
- in Proceedings of the International Congress of Phonetic Sciences (ICPhS), Sarrebrücken, Germany
- Campbell, N. (2007). " Approaches to conversational speech rhythm: Speech activity in two-person telephone dialogues.," in Proceedings of the International Congress of Phonetic Sciences (ICPhS), Sarrebrücken, Germany, pp. 343-348.
- (2007) Approaches to Conversational Speech Rhythm: Speech Activity in Two-person Telephone Dialogues , pp. 343-348
- Campbell, N.¹

14
- 85009210528
- in Proceedings of the European Conference on Speech Communication and Technology (EuroSpeech), Geneva, Switzerland
- Cosi, P., Fusaro, A., and Tisato, G. (2003). " LUCIA: A new Italian talking-head based on a modified Cohen-Massaro's labial coarticulation model.," in Proceedings of the European Conference on Speech Communication and Technology (EuroSpeech), Geneva, Switzerland, pp. 2269-2272.
- (2003) LUCIA: A New Italian Talking-head Based on a Modified Cohen-Massaro's Labial Coarticulation Model , pp. 2269-2272
- Cosi, P.¹ Fusaro, A.² Tisato, G.³

15
- 0033708494
- in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Istanbul, Turkey
- De Cueto, P., Neti, C., and Senior, A. W. (2000). " Audio-visual intent-to-speak detection in human-computer interaction.," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Istanbul, Turkey, pp. 2373-2376.
- (2000) Audio-visual Intent-to-speak Detection in Human-computer Interaction , pp. 2373-2376
- De Cueto, P.¹ Neti, C.² Senior, A.W.³

16
- 85009232030
- in Proceedings of the International Conference on Spoken Language Processing (ICSLP), Denver, CO
- Deligne, S., Potamianos, G., and Neti, C. (2002). " Audio-visual speech enhancement with AVCDCN (audiovisual codebook dependent cepstral normalization).," in Proceedings of the International Conference on Spoken Language Processing (ICSLP), Denver, CO, pp. 1449-1452.
- (2002) Audio-visual Speech Enhancement with AVCDCN (Audiovisual Codebook Dependent Cepstral Normalization) , pp. 1449-1452
- Deligne, S.¹ Potamianos, G.² Neti, C.³

17
- 0021645331
- Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator
- Ephraim, Y., and Malah, D. (1984). " Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator.," IEEE Trans. Acoust., Speech, Signal Process. 32, 1109-1121.
- (1984) IEEE Trans. Acoust., Speech, Signal Process. , vol.32 , pp. 1109-1121
- Ephraim, Y.¹ Malah, D.²

18
- 0016817064
- Auditory-visual perception of speech
- Erber, N. P. (1975). " Auditory-visual perception of speech.," J. Speech Hear Disord. 40, 481-492.
- (1975) J. Speech Hear Disord. , vol.40 , pp. 481-492
- Erber, N.P.¹

19
- 23744449511
- Analysis and synthesis of three-dimensional movements of the head, face, and of a speaker using cued speech
- Gibert, G., Bailly, G., Beautemps, D., Elisei, F., and Brun, R. (2005). " Analysis and synthesis of three-dimensional movements of the head, face, and of a speaker using cued speech.," J. Acoust. Soc. Am. 118, 1144-1153.
- (2005) J. Acoust. Soc. Am. , vol.118 , pp. 1144-1153
- Gibert, G.¹ Bailly, G.² Beautemps, D.³ Elisei, F.⁴ Brun, R.⁵

20
- 2442628301
- Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding
- Girin, L. (2004). " Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding.," IEEE Trans. Speech Audio Process. 12, 265-276.
- (2004) IEEE Trans. Speech Audio Process. , vol.12 , pp. 265-276
- Girin, L.¹

21
- 0034974093
- Audio-visual enhancement of speech noise
- Girin, L., Schwartz, J.-L., and Feng, G. (2001). " Audio-visual enhancement of speech noise.," J. Acoust. Soc. Am. 109, 3007-3020.
- (2001) J. Acoust. Soc. Am. , vol.109 , pp. 3007-3020
- Girin, L.¹ Schwartz, J.-L.² Feng, G.³

22
- 33745217559
- in Proceedings of the Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France
- Goecke, R., and Millar, J. B. (2003). " Statistical analysis of relationship between audio and video speech parameters Australian English.," in Proceedings of the Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, pp. 133-138.
- (2003) Statistical Analysis of Relationship between Audio and Video Speech Parameters Australian English , pp. 133-138
- Goecke, R.¹ Millar, J.B.²

23
- 0033822769
- The use of visible speech cues for improving auditory detection of spoken sentences
- Grant, K. W., and Seitz, P. (2000). " The use of visible speech cues for improving auditory detection of spoken sentences.," J. Acoust. Soc. Am. 108, 1197-1208.
- (2000) J. Acoust. Soc. Am. , vol.108 , pp. 1197-1208
- Grant, K.W.¹ Seitz, P.²

24
- 59849083022
- in Proceedings of the Workshop Meeting on Multimedia Signal Processing (MMSP), Copenhagen, Denmark
- Huang, J., Liu, Z., Wang, Y., Chen, Y., and Wong, E. (1999). " Integration of multimodal feature for video scene classification based on HMM.," in Proceedings of the Workshop Meeting on Multimedia Signal Processing (MMSP), Copenhagen, Denmark, pp. 53-58.
- (1999) Integration of Multimodal Feature for Video Scene Classification Based on HMM , pp. 53-58
- Huang, J.¹ Liu, Z.² Wang, Y.³ Chen, Y.⁴ Wong, E.⁵

25
- 14944343138
- in Proceedings of the Workshop at the International Conference on Computer Vision (ICCV) on Recognition, Analysis and Tracking of Face and Gestures in Real Time Systems (RATFG-RTS), Vancouver, Canada
- Iyengar, G., and Neti, C. (2001). " A vision-based microphone switch for speech intent detection.," in Proceedings of the Workshop at the International Conference on Computer Vision (ICCV) on Recognition, Analysis and Tracking of Face and Gestures in Real Time Systems (RATFG-RTS), Vancouver, Canada, pp. 101-105.
- (2001) A Vision-based Microphone Switch for Speech Intent Detection , pp. 101-105
- Iyengar, G.¹ Neti, C.²

26
- 0036874551
- On the relationship between face movements, tongue movements and speech acoustics
- Jiang, J., Alwan, A., Keating, P. A., Auer, E. T., and Bernstein, L. E. (2002). " On the relationship between face movements, tongue movements and speech acoustics.," EURASIP J. Appl. Signal Process. 11, 1174-1188.
- (2002) EURASIP J. Appl. Signal Process. , vol.11 , pp. 1174-1188
- Jiang, J.¹ Alwan, A.² Keating, P.A.³ Auer, E.T.⁴ Bernstein, L.E.⁵

27
- 10444258058
- Investigating the audio-visual speech detection advantage
- Kim, J., and Davis, C. (2004). " Investigating the audio-visual speech detection advantage.," Speech Commun. 44, 19-30.
- (2004) Speech Commun. , vol.44 , pp. 19-30
- Kim, J.¹ Davis, C.²

28
- 59849098592
- in Proceedings of the XVIII Journées d'Étude sur la Parole (JEP), Montréal, Canada (in French).
- Lallouache, T. (1990). " Un poste visage-parole. Acquisition et traitement des contours labiaux (A device for the capture and processing of lip contours).," in Proceedings of the XVIII Journées d'Étude sur la Parole (JEP), Montréal, Canada, pp. 282-286 (in French).
- (1990) Un Poste Visage-parole. Acquisition et Traitement des Contours Labiaux (A Device for the Capture and Processing of Lip Contours) , pp. 282-286
- Lallouache, T.¹

29
- 0001653589
- The Lombard sign and the role of hearing in speech
- Lane, H., and Tranel, B. (1971). " The Lombard sign and the role of hearing in speech.," J. Speech Hear. Res. 14, 677-709.
- (1971) J. Speech Hear. Res. , vol.14 , pp. 677-709
- Lane, H.¹ Tranel, B.²

30
- 0029290274
- Study of a voice activity detector and its influence on a noise reduction system
- Le Bouquin-Jeannès, R., and Faucon, G. (1995). " Study of a voice activity detector and its influence on a noise reduction system.," Speech Commun. 16, 245-254.
- (1995) Speech Commun. , vol.16 , pp. 245-254
- Le Bouquin-Jeannès, R.¹ Faucon, G.²

31
- 4544351504
- in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Montreal, Canada
- Liu, P., and Wang, Z. (2004). " Voice activity detection using visual information.," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Montreal, Canada, pp. 609-612.
- (2004) Voice Activity Detection Using Visual Information , pp. 609-612
- Liu, P.¹ Wang, Z.²

32
- 0000874053
- Annales des maladies de l'oreille et du larynx (in French).
- Lombard, E. (1911). " Le signe de l'élévation de la voix (The sign of voice rise).," Annales des maladies de l'oreille et du larynx 37, pp. 101-119 (in French).
- (1911) Le Signe de l'Élévation de la Voix (The Sign of Voice Rise) , vol.37 , pp. 101-119
- Lombard, E.¹

33
- 33750558676
- in International Conference on Multimedia and Expo (ICME), Amsterdam, The Netherlands
- Macho, D., Padrell, J., Abad, A., Nadeu, C., Hernando, J., McDonough, J., Wölfel, M., Klee, U., Omologo, M., Brutti, A., Svaizer, P., Potamianos, G., and Chu, S. M. (2005). " Automatic speech activity detection, source localization and speech recognition on the CHIL seminar corpus.," in International Conference on Multimedia and Expo (ICME), Amsterdam, The Netherlands, pp. 876-879.
- (2005) Automatic Speech Activity Detection, Source Localization and Speech Recognition on the CHIL Seminar Corpus , pp. 876-879
- Macho, D.¹ Padrell, J.² Abad, A.³ Nadeu, C.⁴ Hernando, J.⁵ McDonough, J.⁶ Wölfel, M.⁷ Klee, U.⁸ Omologo, M.⁹ Brutti, A.¹⁰ Svaizer, P.¹¹ Potamianos, G.¹² Chu, S.M.¹³

34
- 0017199877
- Hearing lips and seeing voices
- McGurk, H., and McDonald, J. (1976). " Hearing lips and seeing voices.," Nature (London) 264, 746-748.
- (1976) Nature (London) , vol.264 , pp. 746-748
- McGurk, H.¹ McDonald, J.²

35
- 0030114670
- Temporal constraints on the McGurk effect
- Munhall, K. G., Gribble, P., Sacco, L., and Ward, M. (1996). " Temporal constraints on the McGurk effect.," Percept. Psychophys. 58, 351-362.
- (1996) Percept. Psychophys. , vol.58 , pp. 351-362
- Munhall, K.G.¹ Gribble, P.² Sacco, L.³ Ward, M.⁴

36
- 0037038098
- Dynamic visual speech perception in a patient with visual form agnosia
- Munhall, K. G., Servos, P., Santi, A., and Goodale, M. (2002). " Dynamic visual speech perception in a patient with visual form agnosia.," NeuroReport 13, 1793-1796.
- (2002) NeuroReport , vol.13 , pp. 1793-1796
- Munhall, K.G.¹ Servos, P.² Santi, A.³ Goodale, M.⁴

37
- 84964546021
- The moving face during speech communication
- edited by R. Campbell, B. Dodd, and D. Burnham (Psychology, London)
- Munhall, K. G., and Vatikiotis-Bateson, E. (1998). " The moving face during speech communication.," in Hearing by Eye II: Advances in the Psychology of Speechreading and Auditory-Visual Speech, edited by R. Campbell, B. Dodd, and D. Burnham (Psychology, London), pp. 123-139.
- (1998) Hearing by Eye II: Advances in the Psychology of Speechreading and Auditory-Visual Speech , pp. 123-139
- Munhall, K.G.¹ Vatikiotis-Bateson, E.²

38
- 0021541159
- in Proceedings of the Global Telecommunications Conference (GLOBCOM), Atlanta, GA
- Petajan, E. D. (1984). " Automatic lipreading to enhance speech recognition.," in Proceedings of the Global Telecommunications Conference (GLOBCOM), Atlanta, GA, pp. 265-272.
- (1984) Automatic Lipreading to Enhance Speech Recognition , pp. 265-272
- Petajan, E.D.¹

39
- 84893587746
- in Proceedings of the Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France
- Potamianos, G., Neti, C., and Deligne, S. (2003a). " Joint audio-visual speech processing for recognition and enhancement.," in Proceedings of the Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, pp. 95-104.
- (2003) Joint Audio-visual Speech Processing for Recognition and Enhancement , pp. 95-104
- Potamianos, G.¹ Neti, C.² Deligne, S.³

40
- 4544290191
- Recent advances in the automatic recognition of visual speech
- Potamianos, G., Neti, C., and Gravier, G. (2003b). " Recent advances in the automatic recognition of visual speech.," Proc. IEEE 91, 1306-1326.
- (2003) Proc. IEEE , vol.91 , pp. 1306-1326
- Potamianos, G.¹ Neti, C.² Gravier, G.³

41
- 1842476689
- Efficient voice activity detection algorithms using long-term speech information
- Ramirez, J., Segura, J. C., Benítez, C., de la Torre, A., and Rubio, A. (2004). " Efficient voice activity detection algorithms using long-term speech information.," Speech Commun. 42, 271-287.
- (2004) Speech Commun. , vol.42 , pp. 271-287
- Ramirez, J.¹ Segura, J.C.² Benítez, C.³ De La Torre, A.⁴ Rubio, A.⁵

42
- 23344452899
- Statistical voice activity detection using a multiple observation likelihood ratio test
- Ramírez, J., Segura, J. C., Benítez, C., García, L., and Rubio, A. (2005). " Statistical voice activity detection using a multiple observation likelihood ratio test.," IEEE Signal Process. Lett. 12, 689-692.
- (2005) IEEE Signal Process. Lett. , vol.12 , pp. 689-692
- Ramírez, J.¹ Segura, J.C.² Benítez, C.³ García, L.⁴ Rubio, A.⁵

43
- 59849095099
- in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Atlanta, GA
- Rao, R., and Chen, T. (1996). " Cross-modal predictive coding for talking head sequences.," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Atlanta, GA, pp. 2058-2061.
- (1996) Cross-modal Predictive Coding for Talking Head Sequences , pp. 2058-2061
- Rao, R.¹ Chen, T.²

44
- 34447100075
- Mixing audiovisual speech processing and blind source separation for the extraction of speech signals from convolutive mixtures
- Rivet, B., Girin, L., and Jutten, C. (2007a). " Mixing audiovisual speech processing and blind source separation for the extraction of speech signals from convolutive mixtures.," IEEE Trans. Audio, Speech, Lang. Process. 15, 96-108.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , pp. 96-108
- Rivet, B.¹ Girin, L.² Jutten, C.³

45
- 34447095008
- Visual voice activity detection as a help for speech source separation from convolutive mixtures
- Rivet, B., Girin, L., and Jutten, C. (2007b). " Visual voice activity detection as a help for speech source separation from convolutive mixtures.," Speech Commun. 49, 667-677.
- (2007) Speech Commun. , vol.49 , pp. 667-677
- Rivet, B.¹ Girin, L.² Jutten, C.³

46
- 0031747741
- Complementary and synergy in bimodal speech: Auditory, visual, and audio-visual identification of French oral vowels in noise
- Robert-Ribes, J., Schwartz, J. L., Lallouache, T., and Escudier, P. (1998). " Complementary and synergy in bimodal speech: Auditory, visual, and audio-visual identification of French oral vowels in noise.," J. Acoust. Soc. Am. 6, 3677-3689.
- (1998) J. Acoust. Soc. Am. , vol.6 , pp. 3677-3689
- Robert-Ribes, J.¹ Schwartz, J.L.² Lallouache, T.³ Escudier, P.⁴

47
- 0029853869
- Visual kinematic information for embellishing speech in noise
- Rosenblum, L. D., Johnson, J. A., and Saldana, H. M. (1996). " Visual kinematic information for embellishing speech in noise.," J. Speech Hear. Res. 39, 1159-1170.
- (1996) J. Speech Hear. Res. , vol.39 , pp. 1159-1170
- Rosenblum, L.D.¹ Johnson, J.A.² Saldana, H.M.³

48
- 0030114603
- An audiovisual test of kinematic primitives for visual speech perception
- Rosenblum, L. D., and Saldana, H. M. (1996). " An audiovisual test of kinematic primitives for visual speech perception.," J. Exp. Psychol. Hum. Percept. Perform. 22, 318-331.
- (1996) J. Exp. Psychol. Hum. Percept. Perform. , vol.22 , pp. 318-331
- Rosenblum, L.D.¹ Saldana, H.M.²

49
- 4544333803
- Seeing to hear better: Evidence for early audio-visual interactions in speech identification
- Schwartz, J. L., Berthommier, F., and Savariaux, C. (2004). " Seeing to hear better: Evidence for early audio-visual interactions in speech identification.," Cognition 93, 69-78.
- (2004) Cognition , vol.93 , pp. 69-78
- Schwartz, J.L.¹ Berthommier, F.² Savariaux, C.³

50
- 0036874541
- Separation of audio-visual speech sources: A new approach exploiting the audiovisual coherence of speech stimuli
- Sodoyer, D., Girin, L., Jutten, C., and Schwartz, J. L. (2002). " Separation of audio-visual speech sources: A new approach exploiting the audiovisual coherence of speech stimuli.," EURASIP J. Appl. Signal Process. 11, 1165-1173.
- (2002) EURASIP J. Appl. Signal Process. , vol.11 , pp. 1165-1173
- Sodoyer, D.¹ Girin, L.² Jutten, C.³ Schwartz, J.L.⁴

51
- 10444247388
- Further experiments on audio-visual speech source separation
- Sodoyer, D., Girin, L., Jutten, C., and Schwartz, J. L. (2004). " Further experiments on audio-visual speech source separation.," Speech Commun. 44, 113-125.
- (2004) Speech Commun. , vol.44 , pp. 113-125
- Sodoyer, D.¹ Girin, L.² Jutten, C.³ Schwartz, J.L.⁴

52
- 33947625135
- in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toulouse, France
- Sodoyer, D., Rivet, B., Girin, L., Jutten, C., and Schwartz, J. L. (2006). " An analysis of visual speech information applied to voice activity detection.," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toulouse, France, pp. 601-604.
- (2006) An Analysis of Visual Speech Information Applied to Voice Activity Detection , pp. 601-604
- Sodoyer, D.¹ Rivet, B.² Girin, L.³ Jutten, C.⁴ Schwartz, J.L.⁵

53
- 0032762471
- A statistical model based voice activity detection
- Sohn, J., Kim, N. S., and Sung, W. (1999). " A statistical model based voice activity detection.," IEEE Signal Process. Lett. 6, 1-3.
- (1999) IEEE Signal Process. Lett. , vol.6 , pp. 1-3
- Sohn, J.¹ Kim, N.S.² Sung, W.³

54
- 0001048664
- Visual contribution to speech intelligibility in noise
- Sumby, W. H., and Pollack, I. (1954). " Visual contribution to speech intelligibility in noise.," J. Acoust. Soc. Am. 26, 212-215.
- (1954) J. Acoust. Soc. Am. , vol.26 , pp. 212-215
- Sumby, W.H.¹ Pollack, I.²

55
- 0018701386
- Use of visual information for phonetic perception
- Summerfield, Q. (1979). " Use of visual information for phonetic perception.," Phonetica 36, 314-331.
- (1979) Phonetica , vol.36 , pp. 314-331
- Summerfield, Q.¹

56
- 0002028032
- Some preliminaries to a comprehensive account of audio-visual speech perception
- edited by B. Dodd and R. Campbell (Erlbaum, London)
- Summerfield, Q. (1987). " Some preliminaries to a comprehensive account of audio-visual speech perception.," in Hearing by Eye: The Psychology of Lip-Reading, edited by B. Dodd and R. Campbell (Erlbaum, London), pp. 3-51.
- (1987) Hearing by Eye: The Psychology of Lip-Reading , pp. 3-51
- Summerfield, Q.¹

57
- 0034228994
- Voice activity detection in nonstationary noise
- Tanyer, S. G., and Ozer, H. (2000). " Voice activity detection in nonstationary noise.," IEEE Trans. Speech Audio Process. 8, 478-482.
- (2000) IEEE Trans. Speech Audio Process. , vol.8 , pp. 478-482
- Tanyer, S.G.¹ Ozer, H.²

58
- 4744359141
- Contributions of oral and extraoral facial movement to visual and audiovisual speech perception
- Thomas, S. M., and Jordan, T. R. (2004). " Contributions of oral and extraoral facial movement to visual and audiovisual speech perception.," J. Exp. Psychol. 30, 873-888.
- (2004) J. Exp. Psychol. , vol.30 , pp. 873-888
- Thomas, S.M.¹ Jordan, T.R.²

59
- 33646231347
- in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Philadelphia
- Wang, W., Cosker, D., Hicks, Y., Sanei, S., and Chambers, J. A. (2005). " Video assisted speech source separation.," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Philadelphia, pp. 425-428.
- (2005) Video Assisted Speech Source Separation , pp. 425-428
- Wang, W.¹ Cosker, D.² Hicks, Y.³ Sanei, S.⁴ Chambers, J.A.⁵

60
- 0000986839
- in Proceedings of the Seminar on Speech Production: Models and Data and CREST Workshop on Models of Speech Production: Motor Planning and Articulatory Modelling, Kloster Seeon, Germany
- Yehia, H., Kuratate, T., and Vatikiotis-Bateson, E. (2000). " Facial animation and head motion driven by speech acoustics.," in Proceedings of the Seminar on Speech Production: Models and Data and CREST Workshop on Models of Speech Production: Motor Planning and Articulatory Modelling, Kloster Seeon, Germany, pp. 265-268.
- (2000) Facial Animation and Head Motion Driven by Speech Acoustics , pp. 265-268
- Yehia, H.¹ Kuratate, T.² Vatikiotis-Bateson, E.³

61
- 0032178592
- Quantitative association of vocal-tract and facial behavior
- Yehia, H., Rubin, P., and Vatikiotis-Bateson, E. (1998). " Quantitative association of vocal-tract and facial behavior.," Speech Commun. 26, 23-43.
- (1998) Speech Commun. , vol.26 , pp. 23-43
- Yehia, H.¹ Rubin, P.² Vatikiotis-Bateson, E.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.