Volume 44, Issue 1-4 SPEC. ISS., 2004, Pages 113-125

Developing an audio-visual speech source separation algorithm

Author keywords

Audio visual coherence; Audio visual joint probability; Blind source separation; Spectral information; Speech enhancement

Indexed keywords

AUDIO-VISUAL COHERENCE; AUDIO-VISUAL JOINT PROBABILITY; SPEECH ENHANCEMENT;

EID: 10444247388     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2004.10.002     Document Type: Article
Times cited: 36

References (30)
  • 1
    • 0000396062
    • Natural gradient works efficiently in learning
    • Amari, S.-I., 1998. Natural gradient works efficiently in learning. Neural Comput. 10, 251-276.
    • (1998) Neural Comput. , vol.10 , pp. 251-276
    • Amari, S.-I.1
  • 2
    • 0002186602
    • A set of visual French visemes for visual speech synthesis
    • Bailly, G. et al. (Eds.), Elsevier, Amsterdam
    • Benoît, C., Lallouache, M.T., Mohamadi, T., Abry, C., 1992. A set of visual French visemes for visual speech synthesis. In: Bailly, G. et al. (Eds.), Talking Machines. Elsevier, Amsterdam, pp. 485-504.
    • (1992) Talking Machines , pp. 485-504
    • Benoît, C.1    Lallouache, M.T.2    Mohamadi, T.3    Abry, C.4
  • 3
    • 0030362791
    • For speech perception by humans or machines, three senses are better than one
    • Bernstein, L.E., Benoît, C., 1996. For speech perception by humans or machines, three senses are better than one. In: Proc. ICSLP'96, pp. 1477-1480.
    • (1996) Proc. ICSLP'96 , pp. 1477-1480
    • Bernstein, L.E.1    Benoît, C.2
  • 5
    • 84901700762
    • Audiovisual speech enhancement based on the association between speech envelope and video feature
    • Geneva
    • Berthommier, F., 2003. Audiovisual speech enhancement based on the association between speech envelope and video feature. In: Proc. Eurospeech'03, Geneva, pp. 1045-1048.
    • (2003) Proc. Eurospeech'03 , pp. 1045-1048
    • Berthommier, F.1
  • 7
    • 0027812550
    • Blind beamforming for non-Gaussian signals
    • Cardoso, J.F., Souloumiac, A., 1993. Blind beamforming for non-Gaussian signals. IEE Proc. - F 140, 362-370.
    • (1993) IEE Proc. - F , vol.140 , pp. 362-370
    • Cardoso, J.F.1    Souloumiac, A.2
  • 8
    • 0030417779
    • Equivariant adaptive source separation
    • Cardoso, J.F., Laheld, B., 1996. Equivariant adaptive source separation. IEEE Trans. SP 44, 3017-3030.
    • (1996) IEEE Trans. SP , vol.44 , pp. 3017-3030
    • Cardoso, J.F.1    Laheld, B.2
  • 9
    • 85009232030
    • Audio-visual speech enhancement with AVCDCN (AudioVisual Code-book Dependent Cepstral Normalization)
    • Deligne, S., Potamianos, G., Neti, C., 2002. Audio-visual speech enhancement with AVCDCN (AudioVisual Code-book Dependent Cepstral Normalization). In: Proc. ICSLP'2002, pp. 1449-1452.
    • (2002) Proc. ICSLP'2002 , pp. 1449-1452
    • Deligne, S.1    Potamianos, G.2    Neti, C.3
  • 10
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. 39, 1-38.
    • (1977) J. Roy. Statist. Soc. , vol.39 , pp. 1-38
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 11
    • 84873567862
    • Can the visual input make the audio signal pop out in noise? A first study of the enhancement of noisy VCV acoustic sequences by audiovisual fusion
    • Girin, L., Feng, G., Schwartz, J.-L., 1997. Can the visual input make the audio signal pop out in noise? A first study of the enhancement of noisy VCV acoustic sequences by audiovisual fusion. In: Proc. AVSP'97, pp. 37-40.
    • (1997) Proc. AVSP'97 , pp. 37-40
    • Girin, L.1    Feng, G.2    Schwartz, J.-L.3
  • 12
    • 0034974093
    • Audio-visual enhancement of speech in noise
    • Girin, L., Schwartz, J.L., Feng, G., 2001. Audio-visual enhancement of speech in noise. J. Acoust. Soc. Am. 109, 3007-3020.
    • (2001) J. Acoust. Soc. Am. , vol.109 , pp. 3007-3020
    • Girin, L.1    Schwartz, J.L.2    Feng, G.3
  • 14
    • 0033822769
    • The use of visible speech cues for improving auditory detection of spoken sentences
    • Grant, K.W., Seitz, P., 2000. The use of visible speech cues for improving auditory detection of spoken sentences. J. Acoust. Soc. Am. 108, 1197-1208.
    • (2000) J. Acoust. Soc. Am. , vol.108 , pp. 1197-1208
    • Grant, K.W.1    Seitz, P.2
  • 16
    • 0032629347
    • Fast and robust fixed-point algorithms for Independent Component Analysis
    • Hyvärinen, A., 1999. Fast and robust fixed-point algorithms for Independent Component Analysis. IEEE Trans. Neural Networks 10, 626-634.
    • (1999) IEEE Trans. Neural Networks , vol.10 , pp. 626-634
    • Hyvärinen, A.1
  • 17
    • 0026191274
    • Blind separation of sources. Part I: An adaptive algorithm based on a neuromimetic architecture
    • Jutten, C., Herault, J., 1991. Blind separation of sources. Part I: An adaptive algorithm based on a neuromimetic architecture. Signal Process. 24, 1-10.
    • (1991) Signal Process. , vol.24 , pp. 1-10
    • Jutten, C.1    Herault, J.2
  • 18
    • 85071899496
    • Visible speech cues and auditory detection of spoken sentences: An effect of degree of correlation between acoustic and visual properties
    • Kim, J., Davis, C., 2001. Visible speech cues and auditory detection of spoken sentences: an effect of degree of correlation between acoustic and visual properties. In: Proc. AVSP'2001, pp. 127-131.
    • (2001) Proc. AVSP'2001 , pp. 127-131
    • Kim, J.1    Davis, C.2
  • 20
    • 0002605227
    • Un poste 'visage-parole'. Acquisition et traitement de contours labiaux
    • Montréal
    • Lallouache, M.T., 1990. Un poste 'visage-parole'. Acquisition et traitement de contours labiaux. In: Proc. XVIII JEPs, Montréal, pp. 282-286.
    • (1990) Proc. XVIII JEPs , pp. 282-286
    • Lallouache, M.T.1
  • 22
    • 85009151501
    • Separating three simultaneous speeches with two microphones by integrating auditory and visual processing
    • Okuno, H.G., Nakadai, K., Lourens, T., Kitano, H., 2001. Separating three simultaneous speeches with two microphones by integrating auditory and visual processing. In: Proc. Eurospeech 2001, pp. 2643-2646.
    • (2001) Proc. Eurospeech 2001 , pp. 2643-2646
    • Okuno, H.G.1    Nakadai, K.2    Lourens, T.3    Kitano, H.4
  • 24
    • 85009257811
    • Audio-visual scene analysis: Evidence for a "very-early" integration process in audio-visual speech perception
    • Schwartz, J.L., Berthommier, F., Savariaux, C., 2002. Audio-visual scene analysis: Evidence for a "very-early" integration process in audio-visual speech perception. In: Proc. ICSLP'2002, pp. 1937-1940.
    • (2002) Proc. ICSLP'2002 , pp. 1937-1940
    • Schwartz, J.L.1    Berthommier, F.2    Savariaux, C.3
  • 25
    • 4544333803
    • Seeing to hear better: Evidence for early audio-visual interactions in speech identification
    • Schwartz, J.L., Berthommier, F., Savariaux, C., 2004. Seeing to hear better: Evidence for early audio-visual interactions in speech identification. Cognition 93, B69-B78.
    • (2004) Cognition , vol.93 , pp. B69-B78
    • Schwartz, J.L.1    Berthommier, F.2    Savariaux, C.3
  • 28
    • 0001048664
    • Visual contribution to speech intelligibility in noise
    • Sumby, W.H., Pollack, I., 1954. Visual contribution to speech intelligibility in noise. J. Acoust. Soc. Am. 26, 212-215.
    • (1954) J. Acoust. Soc. Am. , vol.26 , pp. 212-215
    • Sumby, W.H.1    Pollack, I.2
  • 29
    • 0033341734
    • Source separation in postnonlinear mixtures
    • Taleb, A., Jutten, C., 1999. Source separation in postnonlinear mixtures. IEEE Trans. SP 47 (10), 2807-2820.
    • (1999) IEEE Trans. SP , vol.47 , pp. 2807-2820
    • Taleb, A.1    Jutten, C.2
  • 30
    • 0032178592
    • Quantitative association of vocal-tract and facial behavior
    • Yehia, H., Rubin, P., Vatikiotis-Bateson, E., 1998. Quantitative association of vocal-tract and facial behavior. Speech Communication 26, 23-43.
    • (1998) Speech Communication , vol.26 , pp. 23-43
    • Yehia, H.1    Rubin, P.2    Vatikiotis-Bateson, E.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.