Volume 17, Issue 3, 2009, Pages 446-458

Energetic and informational masking effects in an audiovisual speech recognition system

Author keywords

Audiovisual speech recognition; Energetic masking (EM); Informational masking (IM); Source separation; Speech fragment decoding

Indexed keywords

ACOUSTIC ANALYSIS; ACOUSTIC SIGNALS; AUDIO-VISUAL SPEECH; AUDIOVISUAL SPEECH RECOGNITION; ENERGETIC MASKING; ENERGETIC MASKING (EM); INFORMATIONAL MASKING (IM); INFORMATIONAL MASKING; NONSTATIONARY NOISE; SOURCE SEPARATION; TARGET SPEECH; TWO STAGE; VISUAL CUES; VISUAL INFORMATION; VISUAL REPRESENTATIONS

EID: 70350437422     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2008.2011534     Document Type: Article
Times cited : (14)

References (37)
  • 1
    • 0025604287
    • How much masking is informational masking?
    • R. Lutfi, "How much masking is informational masking?" J. Acoust. Soc. Amer., vol. 88, pp. 2607-2610, 1990.
    • (1990) J. Acoust. Soc. Amer. , vol.88 , pp. 2607-2610
    • Lutfi, R.1
  • 2
    • 0035169173
    • Informational and energetic masking effects in the perception of multiple simultaneous talkers
    • D. S. Brungart, B. D. Simpson, M. A. Ericson, and K. R. Scott, "Informational and energetic masking effects in the perception of multiple simultaneous talkers," J. Acoust. Soc. Amer., vol. 110, no. 5, pp. 2527-2538, 2001.
    • (2001) J. Acoust. Soc. Amer. , vol.110 , Issue.5 , pp. 2527-2538
    • Brungart, D.S.1    Simpson, B.D.2    Ericson, M.A.3    Scott, K.R.4
  • 4
    • 4544290191
    • Recent advances in the automatic recognition of audiovisual speech
    • G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, "Recent advances in the automatic recognition of audiovisual speech," Proc. IEEE, vol. 91, no. 9, pp. 1306-1326, 2003.
    • (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1306-1326
    • Potamianos, G.1    Neti, C.2    Gravier, G.3    Garg, A.4    Senior, A.W.5
  • 9
    • 85009230873
    • Audio-visual speech recognition in challenging environments
    • Geneva, Switzerland
    • G. Potamianos and C. Neti, "Audio-visual speech recognition in challenging environments," in Proc. Eurospeech'03, Geneva, Switzerland, 2003, pp. 1293-1296.
    • (2003) Proc. Eurospeech'03 , pp. 1293-1296
    • Potamianos, G.1    Neti, C.2
  • 10
    • 43949091431
    • Comparison of image transform based features for visual speech recognition in clean and corrupted video
    • R. Seymour, D. Stewart, and J. Ming, "Comparison of image transform based features for visual speech recognition in clean and corrupted video," EURASIP J. Image Video Process., vol. 8, no. 2, 2008.
    • (2008) EURASIP J. Image Video Process. , vol.8 , Issue.2
    • Seymour, R.1    Stewart, D.2    Ming, J.3
  • 11
    • 4544333803
    • Seeing to hear better: Evidence for early audio-visual interactions in speech identification
    • J.-L. Schwartz, F. Berthommier, and C. Savariaux, "Seeing to hear better: Evidence for early audio-visual interactions in speech identification," Cognition, vol. 93, no. 2, pp. B69-B78, 2004.
    • (2004) Cognition , vol.93 , Issue.2
    • Schwartz, J.-L.1    Berthommier, F.2    Savariaux, C.3
  • 12
    • 0029935458
    • Enhancement of selective listening by illusory mislocation of speech sounds due to lip-reading
    • J. Driver, "Enhancement of selective listening by illusory mislocation of speech sounds due to lip-reading," Nature, vol. 381, no. 6577, pp. 66-68, 1996.
    • (1996) Nature , vol.381 , Issue.6577 , pp. 66-68
    • Driver, J.1
  • 13
    • 13544256368
    • The role of visual speech cues in reducing energetic and informational masking
    • K. S. Helfer and R. L. Freyman, "The role of visual speech cues in reducing energetic and informational masking," J. Acoust. Soc. Amer., vol. 117, no. 2, pp. 842-849, 2005.
    • (2005) J. Acoust. Soc. Amer. , vol.117 , Issue.2 , pp. 842-849
    • Helfer, K.S.1    Freyman, R.L.2
  • 14
    • 33745137835
    • Informational masking of speech in children: Auditory-visual integration
    • F. Wightman, D. Kistler, and D. Brungart, "Informational masking of speech in children: Auditory-visual integration," J. Acoust. Soc. Amer., vol. 119, no. 6, pp. 3940-3949, 2006.
    • (2006) J. Acoust. Soc. Amer. , vol.119 , Issue.6 , pp. 3940-3949
    • Wightman, F.1    Kistler, D.2    Brungart, D.3
  • 15
    • 0032178592
    • Quantitative association of vocal-tract and facial behavior
    • H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative association of vocal-tract and facial behavior," Speech Commun., vol. 26, no. 1, pp. 23-43, 1998.
    • (1998) Speech Commun. , vol.26 , Issue.1 , pp. 23-43
    • Yehia, H.1    Rubin, P.2    Vatikiotis-Bateson, E.3
  • 16
    • 0036874551
    • On the relationship between face movements, tongue movements, and speech acoustics
    • J. Jiang, A. Alwan, P. A. Keating, E. T. Auer Jr., and L. Bernstein, "On the relationship between face movements, tongue movements, and speech acoustics," EURASIP J. Adv. Signal Process., vol. 2002, no. 11, pp. 1174-1188, 2002.
    • (2002) EURASIP J. Adv. Signal Process. , vol.2002 , Issue.11 , pp. 1174-1188
    • Jiang, J.1    Alwan, A.2    Keating, P.A.3    Auer Jr., E.T.4    Bernstein, L.5
  • 17
    • 0012725678
    • Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models
    • San Francisco, CA
    • J. Barker and F. Berthommier, "Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models," in Proc. Int. Workshop Auditory-Visual Speech Process. (AVSP), San Francisco, CA, 1999.
    • (1999) Proc. Int. Workshop Auditory-visual Speech Process. (AVSP)
    • Barker, J.1    Berthommier, F.2
  • 18
    • 10444247388
    • Developing an audio-visual speech source separation algorithm
    • D. Sodoyer, L. Girin, C. Jutten, and J.-L. Schwartz, "Developing an audio-visual speech source separation algorithm," Speech Commun., vol. 44, pp. 113-125, 2004.
    • (2004) Speech Commun. , vol.44 , pp. 113-125
    • Sodoyer, D.1    Girin, L.2    Jutten, C.3    Schwartz, J.-L.4
  • 19
    • 77951429856
    • Noisy audio speech enhancement using Wiener filters derived from visual speech
    • Hilvarenbeek, The Netherlands
    • B. Milner and I. Almajai, "Noisy audio speech enhancement using Wiener filters derived from visual speech," in Proc. Int. Workshop Auditory-Visual Speech Process. (AVSP), Hilvarenbeek, The Netherlands, 2007.
    • (2007) Proc. Int. Workshop Auditory-visual Speech Process. (AVSP)
    • Milner, B.1    Almajai, I.2
  • 22
    • 33644661135
    • A glimpsing model of speech perception in noise
    • M. P. Cooke, "A glimpsing model of speech perception in noise," J. Acoust. Soc. Amer., vol. 119, no. 3, pp. 1562-1573, 2006.
    • (2006) J. Acoust. Soc. Amer. , vol.119 , Issue.3 , pp. 1562-1573
    • Cooke, M.P.1
  • 26
    • 44949219122
    • Recent advances in speech fragment decoding techniques
    • Pittsburgh, PA
    • J. Barker, A. Coy, N. Ma, and M. Cooke, "Recent advances in speech fragment decoding techniques," in Proc. Interspeech'06, Pittsburgh, PA, 2006, pp. 85-88.
    • (2006) Proc. Interspeech'06 , pp. 85-88
    • Barker, J.1    Coy, A.2    Ma, N.3    Cooke, M.4
  • 27
    • 11144316019
    • Decoding speech in the presence of other sources
    • J. P. Barker, M. P. Cooke, and D. P. W. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, pp. 5-25, 2005.
    • (2005) Speech Commun. , vol.45 , pp. 5-25
    • Barker, J.P.1    Cooke, M.P.2    Ellis, D.P.W.3
  • 28
    • 0035342414
    • Robust automatic speech recognition with missing and uncertain acoustic data
    • M. P. Cooke, P. D. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and uncertain acoustic data," Speech Commun., vol. 34, pp. 267-285, 2001.
    • (2001) Speech Commun. , vol.34 , pp. 267-285
    • Cooke, M.P.1    Green, P.D.2    Josifovski, L.3    Vizinho, A.4
  • 29
    • 0035478859
    • The auditory organization of speech and other sources in listeners and computational models
    • M. P. Cooke and D. P. W. Ellis, "The auditory organization of speech and other sources in listeners and computational models," Speech Commun., vol. 35, pp. 141-177, 2001.
    • (2001) Speech Commun. , vol.35 , pp. 141-177
    • Cooke, M.P.1    Ellis, D.P.W.2
  • 30
    • 33750368310
    • An audiovisual corpus for speech perception and automatic speech recognition
    • M. P. Cooke, J. Barker, S. P. Cunningham, and X. Shao, "An audiovisual corpus for speech perception and automatic speech recognition," J. Acoust. Soc. Amer., vol. 120, pp. 2421-2424, 2006.
    • (2006) J. Acoust. Soc. Amer. , vol.120 , pp. 2421-2424
    • Cooke, M.P.1    Barker, J.2    Cunningham, S.P.3    Shao, X.4
  • 31
    • 37849011878
    • The foreign language cocktail party problem: Energetic and informational masking effects in non-native speech perception
    • M. P. Cooke, M. L. Garcia Lecumberri, and J. Barker, "The foreign language cocktail party problem: Energetic and informational masking effects in non-native speech perception," J. Acoust. Soc. Amer., vol. 123, pp. 414-427, 2008.
    • (2008) J. Acoust. Soc. Amer. , vol.123 , pp. 414-427
    • Cooke, M.P.1    Lecumberri, M.L.G.2    Barker, J.3
  • 32
    • 0025110885
    • Derivation of auditory filter shapes from notched-noise data
    • B. R. Glasberg and B. C. J. Moore, "Derivation of auditory filter shapes from notched-noise data," Hearing Res., vol. 47, pp. 103-138, 1990.
    • (1990) Hearing Res. , vol.47 , pp. 103-138
    • Glasberg, B.R.1    Moore, B.C.J.2
  • 33
    • 40249108645
    • Stream weight estimation for multistream audio-visual speech recognition in a multispeaker environment
    • X. Shao and J. Barker, "Stream weight estimation for multistream audio-visual speech recognition in a multispeaker environment," Speech Commun., vol. 50, no. 4, pp. 337-353, 2008.
    • (2008) Speech Commun. , vol.50 , Issue.4 , pp. 337-353
    • Shao, X.1    Barker, J.2
  • 34
    • 0002858640
    • Computer video face tracking for use in a perceptual user interface
    • G. R. Bradski, "Computer video face tracking for use in a perceptual user interface," Intel Technol. J., Q2, 1998.
    • (1998) Intel Technol. J., Q2
    • Bradski, G.R.1
  • 35
    • 34748817500
    • Exploiting correlogram structure for robust automatic speech recognition with multiple speech sources
    • N. Ma, P. Green, J. Barker, and A. Coy, "Exploiting correlogram structure for robust automatic speech recognition with multiple speech sources," Speech Commun., vol. 49, no. 12, pp. 874-891, 2007.
    • (2007) Speech Commun. , vol.49 , Issue.12 , pp. 874-891
    • Ma, N.1    Green, P.2    Barker, J.3    Coy, A.4
  • 37
    • 0012668146
    • Asynchrony modeling for audio-visual speech recognition
    • San Diego, CA
    • G. Gravier, G. Potamianos, and C. Neti, "Asynchrony modeling for audio-visual speech recognition," in Proc. Human Lang. Technol. Conf., San Diego, CA, 2002, pp. 24-27.
    • (2002) Proc. Human Lang. Technol. Conf. , pp. 24-27
    • Gravier, G.1    Potamianos, G.2    Neti, C.3


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.