Volume 2002, Issue 11, 2002, Pages 1213-1227

Audio-visual speech recognition using MPEG-4 compliant visual features

Author keywords

Audio visual speech recognition; Facial animation parameters; Snake

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; ANIMATION; FEATURE EXTRACTION; GESTURE RECOGNITION; PRINCIPAL COMPONENT ANALYSIS; SIGNAL TO NOISE RATIO; VIDEO SIGNAL PROCESSING; WHITE ACOUSTIC NOISE;

EID: 0036874915; PISSN: 11108657; EISSN: None; Source Type: Journal
DOI: 10.1155/S1110865702206162; Document Type: Article
Times cited: 51

References (37)
  • 1
    • 0027128576
    • Lipreading and audio-visual speech perception
    • Q. Summerfield, "Lipreading and audio-visual speech perception," Philos. Trans. Roy. Soc. London Ser. B, vol. 335, pp. 71-78, 1992.
    • (1992) Philos. Trans. Roy. Soc. London Ser. B, vol. 335, pp. 71-78
    • Summerfield, Q.
  • 2
    • 0025767028
    • Evaluating the articulation index for auditory visual input
    • K. W. Grant and L. D. Braida, "Evaluating the articulation index for auditory visual input," Journal of the Acoustical Society of America, vol. 89, no. 6, pp. 2952-2960, 1991.
    • (1991) Journal of the Acoustical Society of America, vol. 89, no. 6, pp. 2952-2960
    • Grant, K.W.; Braida, L.D.
  • 4
    • 0036650527
    • An HMM-based speech-to-video synthesizer
    • Special Issue on Intelligent Multi-media
    • J. J. Williams and A. K. Katsaggelos, "An HMM-based speech-to-video synthesizer," IEEE Trans. on Neural Networks, vol. 13, no. 4, pp. 900-915, 2002, Special Issue on Intelligent Multi-media.
    • (2002) IEEE Trans. on Neural Networks, vol. 13, no. 4, pp. 900-915
    • Williams, J.J.; Katsaggelos, A.K.
  • 5
    • 0002028032
    • Some preliminaries to a comprehensive account of audio-visual speech perception
    • B. Dodd and R. Campbell, Eds., Lawrence Erlbaum Associates, Hillsdale, NJ, USA
    • Q. Summerfield, "Some preliminaries to a comprehensive account of audio-visual speech perception," in Hearing by Eye: The Psychology of Lip-Reading, B. Dodd and R. Campbell, Eds., pp. 97-113, Lawrence Erlbaum Associates, Hillsdale, NJ, USA, 1987.
    • (1987) Hearing by Eye: The Psychology of Lip-Reading, pp. 97-113
    • Summerfield, Q.
  • 6
    • 0031187171
    • Speech recognition by machines and humans
    • R. Lippmann, "Speech recognition by machines and humans," Speech Communication, vol. 22, no. 1, pp. 1-15, 1997.
    • (1997) Speech Communication, vol. 22, no. 1, pp. 1-15
    • Lippmann, R.
  • 7
    • 0029288202
    • Speech recognition in noisy environments: A survey
    • Y. Gong, "Speech recognition in noisy environments: A survey," Speech Communication, vol. 16, no. 3, pp. 261-291, 1995.
    • (1995) Speech Communication, vol. 16, no. 3, pp. 261-291
    • Gong, Y.
  • 9
    • 84893671339
    • An improved automatic lipreading system to enhance speech recognition
    • ACM, Washington, DC, USA
    • E. Petajan, B. Bischoff, D. Bodoff, and N. M. Brooke, "An improved automatic lipreading system to enhance speech recognition," in CHI-88, pp. 19-25, ACM, Washington, DC, USA, 1988.
    • (1988) CHI-88, pp. 19-25
    • Petajan, E.; Bischoff, B.; Bodoff, D.; Brooke, N.M.
  • 12
    • 0030421449
    • Robust face feature analysis for automatic speechreading and character animation
    • Killington, Vt, USA
    • E. Petajan and H. P. Graf, "Robust face feature analysis for automatic speechreading and character animation," in Proc. 2nd Int. Conf. Automatic Face and Gesture Recognition, pp. 357-362, Killington, Vt, USA, 1996.
    • (1996) Proc. 2nd Int. Conf. Automatic Face and Gesture Recognition, pp. 357-362
    • Petajan, E.; Graf, H.P.
  • 15
    • 0031624666
    • Discriminative training of HMM stream exponents for audio-visual speech recognition
    • Seattle, Wash, USA
    • G. Potamianos and H. P. Graf, "Discriminative training of HMM stream exponents for audio-visual speech recognition," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, vol. 6, pp. 3733-3736, Seattle, Wash, USA, 1998.
    • (1998) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, vol. 6, pp. 3733-3736
    • Potamianos, G.; Graf, H.P.
  • 18
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, 2000.
    • (2000) IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151
    • Dupont, S.; Luettin, J.
  • 19
    • 85009060634
    • Perceptual interfaces for information interaction: Joint processing of audio and visual information for human-computer interaction
    • Beijing, China, October
    • C. Neti, G. Iyengar, G. Potamianos, A. Senior, and B. Maison, "Perceptual interfaces for information interaction: Joint processing of audio and visual information for human-computer interaction," in Proc. International Conference on Spoken Language Processing, vol. III, pp. 11-14, Beijing, China, October 2000.
    • (2000) Proc. International Conference on Spoken Language Processing, vol. 3, pp. 11-14
    • Neti, C.; Iyengar, G.; Potamianos, G.; Senior, A.; Maison, B.
  • 20
    • 84957886748
    • Real-time lip tracking for audio-visual speech recognition applications
    • Cambridge, UK
    • R. Kaucic, B. Dalton, and A. Blake, "Real-time lip tracking for audio-visual speech recognition applications," in Proc. European Conference on Computer Vision, vol. 2, pp. 376-387, Cambridge, UK, 1996.
    • (1996) Proc. European Conference on Computer Vision, vol. 2, pp. 376-387
    • Kaucic, R.; Dalton, B.; Blake, A.
  • 21
    • 0031211240
    • Lipreading from color video
    • G. Chiou and J.-N. Hwang, "Lipreading from color video," IEEE Trans. Image Processing, vol. 6, no. 8, pp. 1192-1195, 1997.
    • (1997) IEEE Trans. Image Processing, vol. 6, no. 8, pp. 1192-1195
    • Chiou, G.; Hwang, J.-N.
  • 22
    • 84925639646
    • Real-time lip tracking and bimodal continuous speech recognition
    • Redondo Beach, Los Angeles, Calif, USA, December
    • M. T. Chan, Y. Zhang, and T. S. Huang, "Real-time lip tracking and bimodal continuous speech recognition," in Proc. IEEE 2nd Workshop on Multimedia Signal Processing, pp. 65-70, Redondo Beach, Los Angeles, Calif, USA, December 1998.
    • (1998) Proc. IEEE 2nd Workshop on Multimedia Signal Processing, pp. 65-70
    • Chan, M.T.; Zhang, Y.; Huang, T.S.
  • 24
    • 0012725683
    • Text for ISO/IEC FDIS 14496-2 Visual, ISO/IEC JTC1/SC29/WG11 N2502, November 1998
    • Text for ISO/IEC FDIS 14496-2 Visual, ISO/IEC JTC1/SC29/WG11 N2502, November 1998.
  • 25
    • 0012705505
    • Text for ISO/IEC FDIS 14496-1 Systems, ISO/IEC JTC1/SC29/WG11 N2502, November 1998
    • Text for ISO/IEC FDIS 14496-1 Systems, ISO/IEC JTC1/SC29/WG11 N2502, November 1998.
  • 26
    • 0035472468
    • An efficient use of MPEG-4 FAP interpolation for facial animation at 70 bits/frame
    • F. Lavagetto and R. Pockaj, "An efficient use of MPEG-4 FAP interpolation for facial animation at 70 bits/frame," IEEE Trans. Circuits and Systems for Video Technology, vol. 11, no. 10, pp. 1085-1097, 2001.
    • (2001) IEEE Trans. Circuits and Systems for Video Technology, vol. 11, no. 10, pp. 1085-1097
    • Lavagetto, F.; Pockaj, R.
  • 27
    • 0034502124
    • Approaches to visual speech processing based on the MPEG-4 face animation standard
    • New York, NY, USA, 30 July-2 August
    • E. Petajan, "Approaches to visual speech processing based on the MPEG-4 face animation standard," in Proc. IEEE International Conference on Multimedia and Expo(I), pp. 575-587, New York, NY, USA, 30 July-2 August 2000.
    • (2000) Proc. IEEE International Conference on Multimedia and Expo(I), pp. 575-587
    • Petajan, E.
  • 31
    • 0027693887
    • Finite element methods for active contour models and balloons for 2-D and 3-D images
    • L. D. Cohen and I. Cohen, "Finite element methods for active contour models and balloons for 2-D and 3-D images," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1131-1147, 1993.
    • (1993) IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1131-1147
    • Cohen, L.D.; Cohen, I.
  • 33
    • 0032028944
    • Snakes, shapes, and gradient vector flow
    • C. Xu and J. L. Prince, "Snakes, shapes, and gradient vector flow," IEEE Trans. Image Processing, vol. 7, no. 3, pp. 359-369, 1998.
    • (1998) IEEE Trans. Image Processing, vol. 7, no. 3, pp. 359-369
    • Xu, C.; Prince, J.L.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.