SCOPUS Information Retrieval Platform

Volume 44, Issue 1-4 SPEC. ISS., 2004, Pages 113-125

Developing an audio-visual speech source separation algorithm

(4) Sodoyer, David (a); Girin, Laurent (a); Jutten, Christian (a); Schwartz, Jean Luc (a)

(a) GRENOBLE INSTITUTE OF TECHNOLOGY (France)

Author keywords

Audio visual coherence; Audio visual joint probability; Blind source separation; Spectral information; Speech enhancement

Indexed keywords

AUDIO-VISUAL COHERENCE; AUDIO-VISUAL JOINT PROBABILITY; SPEECH ENHANCEMENT;

ACOUSTIC NOISE; ALGORITHMS; BLIND SOURCE SEPARATION; PROBABILITY; SENSOR DATA FUSION; SIGNAL PROCESSING;

SPEECH PROCESSING;

EID: 10444247388 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2004.10.002 Document Type: Article

Times cited : (36)

References (30)

1
- 0000396062
- Natural gradient works efficiently in learning
- Amari, S.-I., 1998. Natural gradient works efficiently in learning. Neural Comput. 10, 251-276.
- (1998) Neural Comput. , vol.10 , pp. 251-276
- Amari, S.-I.¹

2
- 0002186602
- A set of visual French visemes for visual speech synthesis
- Bailly, G. et al. (Eds.), Elsevier, Amsterdam
- Benoît, C., Lallouache, M.T., Mohamadi, T., Abry, C., 1992. A set of visual French visemes for visual speech synthesis. In: Bailly, G. et al. (Eds.), Talking Machines. Elsevier, Amsterdam, pp. 485-504.
- (1992) Talking Machines , pp. 485-504
- Benoît, C.¹ Lallouache, M.T.² Mohamadi, T.³ Abry, C.⁴

3
- 0030362791
- For speech perception by humans or machines, three senses are better than one
- Bernstein, L.E., Benoît, C., 1996. For speech perception by humans or machines, three senses are better than one. In: Proc. ICSLP'96, pp. 1477-1480.
- (1996) Proc. ICSLP'96 , pp. 1477-1480
- Bernstein, L.E.¹ Benoît, C.²

4
- 84901220364
- This volume
- Bernstein, L.E., Takayanagi, S., Auer E.T., Jr., 2004. Enhanced auditory detection with AV speech: Perceptual evidence for speech and non-speech mechanisms. This volume.
- (2004) Enhanced Auditory Detection with AV Speech: Perceptual Evidence for Speech and Non-speech Mechanisms
- Bernstein, L.E.¹ Takayanagi, S.² Auer Jr., E.T.³

5
- 84901700762
- Audiovisual speech enhancement based on the association between speech envelope and video feature
- Geneva
- Berthommier, F., 2003. Audiovisual speech enhancement based on the association between speech envelope and video feature. In: Proc. Eurospeech'03, Geneva, pp. 1045-1048.
- (2003) Proc. Eurospeech'03 , pp. 1045-1048
- Berthommier, F.¹

6
- 84873571317
- This volume
- Berthommier, F., 2004. A phonetically neutral model of the low-level audiovisual interaction. This volume.
- (2004) A Phonetically Neutral Model of the Low-level Audiovisual Interaction
- Berthommier, F.¹

7
- 0027812550
- Blind beamforming for non-Gaussian signals
- Cardoso, J.F., Souloumiac, A., 1993. Blind beamforming for non-Gaussian signals. IEE Proc. - F 140, 362-370.
- (1993) IEE Proc. - F , vol.140 , pp. 362-370
- Cardoso, J.F.¹ Souloumiac, A.²

8
- 0030417779
- Equivariant adaptive source separation
- Cardoso, J.F., Laheld, B., 1996. Equivariant adaptive source separation. IEEE Trans. SP 44, 3017-3030.
- (1996) IEEE Trans. SP , vol.44 , pp. 3017-3030
- Cardoso, J.F.¹ Laheld, B.²

9
- 85009232030
- Audio-visual speech enhancement with AVCDCN (AudioVisual Code-book Dependent Cepstral Normalization)
- Deligne, S., Potamianos, G., Neti, C., 2002. Audio-visual speech enhancement with AVCDCN (AudioVisual Code-book Dependent Cepstral Normalization). In: Proc. ICSLP'2002, pp. 1449-1452.
- (2002) Proc. ICSLP'2002 , pp. 1449-1452
- Deligne, S.¹ Potamianos, G.² Neti, C.³

10
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. 39, 1-38.
- (1977) J. Roy. Statist. Soc. , vol.39 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

11
- 84873567862
- Can the visual input make the audio signal pop out in noise? A first study of the enhancement of noisy VCV acoustic sequences by audiovisual fusion
- Girin, L., Feng, G., Schwartz, J.-L., 1997. Can the visual input make the audio signal pop out in noise? A first study of the enhancement of noisy VCV acoustic sequences by audiovisual fusion. In: Proc. AVSP'97, pp. 37-40.
- (1997) Proc. AVSP'97 , pp. 37-40
- Girin, L.¹ Feng, G.² Schwartz, J.-L.³

12
- 0034974093
- Audio-visual enhancement of speech in noise
- Girin, L., Schwartz, J.L., Feng, G., 2001. Audio-visual enhancement of speech in noise. J. Acoust. Soc. Am. 109, 3007-3020.
- (2001) J. Acoust. Soc. Am. , vol.109 , pp. 3007-3020
- Girin, L.¹ Schwartz, J.L.² Feng, G.³

13
- 0036295990
- Noisy audio feature enhancement using audio-visual speech data
- Goecke, R., Potamianos, G., Neti, C., 2002. Noisy audio feature enhancement using audio-visual speech data. In: Proc. Internat. Conf. Acoust., Speech, Signal Process., pp. 2025-2028.
- (2002) Proc. Internat. Conf. Acoust., Speech, Signal Process. , pp. 2025-2028
- Goecke, R.¹ Potamianos, G.² Neti, C.³

14
- 0033822769
- The use of visible speech cues for improving auditory detection of spoken sentences
- Grant, K.W., Seitz, P., 2000. The use of visible speech cues for improving auditory detection of spoken sentences. J. Acoust. Soc. Am. 108, 1197-1208.
- (2000) J. Acoust. Soc. Am. , vol.108 , pp. 1197-1208
- Grant, K.W.¹ Seitz, P.²

15
- 84873566400
- This volume
- Grant, K.W., van Wassenhove, V., Poeppel, D., 2004. Detection of auditory and auditory-visual synchrony. This volume.
- (2004) Detection of Auditory and Auditory-visual Synchrony
- Grant, K.W.¹ Van Wassenhove, V.² Poeppel, D.³

16
- 0032629347
- Fast and robust fixed-point algorithms for Independent Component Analysis
- Hyvärinen, A., 1999. Fast and robust fixed-point algorithms for Independent Component Analysis. IEEE Trans. Neural Networks 10, 626-634.
- (1999) IEEE Trans. Neural Networks , vol.10 , pp. 626-634
- Hyvärinen, A.¹

17
- 0026191274
- Blind separation of sources. Part I: An adaptive algorithm based on a neuromimetic architecture
- Jutten, C., Herault, J., 1991. Blind separation of sources. Part I: An adaptive algorithm based on a neuromimetic architecture. Signal Process. 24, 1-10.
- (1991) Signal Process. , vol.24 , pp. 1-10
- Jutten, C.¹ Herault, J.²

18
- 85071899496
- Visible speech cues and auditory detection of spoken sentences: An effect of degree of correlation between acoustic and visual properties
- Kim, J., Davis, C., 2001. Visible speech cues and auditory detection of spoken sentences: an effect of degree of correlation between acoustic and visual properties. In: Proc. AVSP'2001, pp. 127-131.
- (2001) Proc. AVSP'2001 , pp. 127-131
- Kim, J.¹ Davis, C.²

19
- 84873567378
- This volume
- Kim, J., Davis, C., 2004. Testing the cuing hypothesis for the AV speech detection advantage. This volume.
- (2004) Testing the Cuing Hypothesis for the AV Speech Detection Advantage
- Kim, J.¹ Davis, C.²

20
- 0002605227
- Un poste 'visage-parole'. Acquisition et traitement de contours labiaux
- Montréal
- Lallouache, M.T., 1990. Un poste 'visage-parole'. Acquisition et traitement de contours labiaux. In: Proc. XVIII JEPs, Montréal, pp. 282-286.
- (1990) Proc. XVIII JEPs , pp. 282-286
- Lallouache, M.T.¹

21
- 84873570071
- This volume
- Nakadai, K., Matsuura, D., Okuno, H.G., Tsujino, H., 2004. Improvement of three simultaneous speech recognition by using AV integration and scattering theory for humanoid. This volume.
- (2004) Improvement of Three Simultaneous Speech Recognition by Using AV Integration and Scattering Theory for Humanoid
- Nakadai, K.¹ Matsuura, D.² Okuno, H.G.³ Tsujino, H.⁴

22
- 85009151501
- Separating three simultaneous speeches with two microphones by integrating auditory and visual processing
- Okuno, H.G., Nakadai, K., Lourens, T., Kitano, H., 2001. Separating three simultaneous speeches with two microphones by integrating auditory and visual processing. In: Proc. Eurospeech 2001, pp. 2643-2646.
- (2001) Proc. Eurospeech 2001 , pp. 2643-2646
- Okuno, H.G.¹ Nakadai, K.² Lourens, T.³ Kitano, H.⁴

23
- 0003699540
- Doct. Thesis, University of Illinois
- Petajan, E.D., 1984. Automatic Lipreading to Enhance Speech Recognition. Doct. Thesis, University of Illinois.
- (1984) Automatic Lipreading to Enhance Speech Recognition
- Petajan, E.D.¹

24
- 85009257811
- Audio-visual scene analysis: Evidence for a "very-early" integration process in audio-visual speech perception
- Schwartz, J.L., Berthommier, F., Savariaux, C., 2002. Audio-visual scene analysis: Evidence for a "very-early" integration process in audio-visual speech perception. In: Proc. ICSLP'2002, pp. 1937-1940.
- (2002) Proc. ICSLP'2002 , pp. 1937-1940
- Schwartz, J.L.¹ Berthommier, F.² Savariaux, C.³

25
- 4544333803
- Seeing to hear better: Evidence for early audio-visual interactions in speech identification
- Schwartz, J.L., Berthommier, F., Savariaux, C., 2004. Seeing to hear better: Evidence for early audio-visual interactions in speech identification. Cognition 93, B69-B78.
- (2004) Cognition , vol.93 , pp. B69-B78
- Schwartz, J.L.¹ Berthommier, F.² Savariaux, C.³

26
- 0036874541
- Separation of audio-visual speech sources
- Sodoyer, D., Schwartz, J.L., Girin, L., Klinkisch, J., Jutten, C., 2002. Separation of audio-visual speech sources. Eurasip JASP 2002, 1164-1173.
- (2002) Eurasip JASP , vol.2002 , pp. 1164-1173
- Sodoyer, D.¹ Schwartz, J.L.² Girin, L.³ Klinkisch, J.⁴ Jutten, C.⁵

27
- 84873567922
- Extracting an AV speech source from a mixture of signals
- Sodoyer, D., Girin, L., Jutten, C., Schwartz, J.L., 2003. Extracting an AV speech source from a mixture of signals. In: Proc. Eurospeech 2003, pp. 1393-1396.
- (2003) Proc. Eurospeech 2003 , pp. 1393-1396
- Sodoyer, D.¹ Girin, L.² Jutten, C.³ Schwartz, J.L.⁴

28
- 0001048664
- Visual contribution to speech intelligibility in noise
- Sumby, W.H., Pollack, I., 1954. Visual contribution to speech intelligibility in noise. J. Acoust. Soc. Am. 26, 212-215.
- (1954) J. Acoust. Soc. Am. , vol.26 , pp. 212-215
- Sumby, W.H.¹ Pollack, I.²

29
- 0033341734
- Source separation in postnonlinear mixtures
- Taleb, A., Jutten, C., 1999. Source separation in postnonlinear mixtures. IEEE Trans. SP 47, 2807-2820.
- (1999) IEEE Trans. SP , vol.47 , pp. 2807-2820
- Taleb, A.¹ Jutten, C.²

30
- 0032178592
- Quantitative association of vocal-tract and facial behavior
- Yehia, H., Rubin, P., Vatikiotis-Bateson, E., 1998. Quantitative association of vocal-tract and facial behavior. Speech Communication 26, 23-43.
- (1998) Speech Communication , vol.26 , pp. 23-43
- Yehia, H.¹ Rubin, P.² Vatikiotis-Bateson, E.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.