



Volume 2002, Issue 11, 2002, Pages 1202-1212

Statistical lip-appearance models trained automatically using audio information

Author keywords

Artificial neural networks; Audio visual corpora; Automatic lip region labeling; Dynamic time warping; Lip appearance model; Lip shape model

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; AUDIO ACOUSTICS; FEATURE EXTRACTION; IMAGE PROCESSING; NEURAL NETWORKS; SPEECH RECOGNITION; VIDEO SIGNAL PROCESSING

EID: 0036875015     PISSN: 11108657     EISSN: None     Source Type: Journal    
DOI: 10.1155/S1110865702206186     Document Type: Article
Times cited: 12

References (36)
  • 1
    • 0034825241
    • Multistream adaptive evidence combination for noise robust ASR
    • A. Morris, A. Hagen, H. Glotin, and H. Bourlard, "Multistream adaptive evidence combination for noise robust ASR," Speech Communication Journal, vol. 34, no. 1-2, pp. 25-40, 2001.
    • (2001) Speech Communication Journal , vol.34 , Issue.1-2 , pp. 25-40
    • Morris, A.1    Hagen, A.2    Glotin, H.3    Bourlard, H.4
  • 3
    • 0017199877
    • Hearing lips and seeing voices
    • December
    • H. McGurk and J. MacDonald, "Hearing lips and seeing voices," Nature, vol. 264, pp. 746-748, December 1976.
    • (1976) Nature , vol.264 , pp. 746-748
    • McGurk, H.1    MacDonald, J.2
  • 6
    • 84957886748
    • Real-time lip tracking for audio-visual speech recognition applications
    • Cambridge, UK, April
    • R. Kaucic, B. Dalton, and A. Blake, "Real-time lip tracking for audio-visual speech recognition applications," in Proc. 4th European Conference on Computer Vision (ECCV), vol. 2, pp. 376-387, Cambridge, UK, April 1996.
    • (1996) Proc. 4th European Conference on Computer Vision (ECCV) , vol.2 , pp. 376-387
    • Kaucic, R.1    Dalton, B.2    Blake, A.3
  • 7
    • 0001622390
    • Active shape models for visual speech feature extraction
    • D. G. Stork and M. E. Hennecke, Eds., NATO Advanced Science Institutes, Springer-Verlag, New York, NY, USA
    • J. Luettin, N. A. Thacker, and S. W. Beet, "Active shape models for visual speech feature extraction," in Speechreading by Humans and Machines: Models, Systems, and Applications, D. G. Stork and M. E. Hennecke, Eds., vol. 150 of NATO Advanced Science Institutes, pp. 383-390, Springer-Verlag, New York, NY, USA, 1996.
    • (1996) Speechreading by Humans and Machines: Models, Systems, and Applications , vol.150 , pp. 383-390
    • Luettin, J.1    Thacker, N.A.2    Beet, S.W.3
  • 8
    • 0012707450
    • Tech. Rep. Workshop 2000, Center for Language and Speech Processing (CLSP), Johns Hopkins University, Baltimore, Md, USA, October
    • C. Neti, G. Potamianos, J. Luettin, et al., "Audio-visual speech recognition," Tech. Rep. Workshop 2000, Center for Language and Speech Processing (CLSP), Johns Hopkins University, Baltimore, Md, USA, October 2000.
    • (2000) Audio-Visual Speech Recognition
    • Neti, C.1    Potamianos, G.2    Luettin, J.3
  • 9
    • 0032180188
    • Adaptive fusion of acoustic and visual sources for automatic speech recognition
    • A. Rogozan and P. Deléglise, "Adaptive fusion of acoustic and visual sources for automatic speech recognition," Speech Communication Journal, vol. 26, no. 1-2, pp. 149-161, 1998.
    • (1998) Speech Communication Journal , vol.26 , Issue.1-2 , pp. 149-161
    • Rogozan, A.1    Deléglise, P.2
  • 12
    • 4243927729
    • A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: New advances using multilayer perceptrons
    • Sydney, Australia, December
    • L. Girin, L. Varin, G. Feng, and J.-L. Schwartz, "A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: New advances using multilayer perceptrons," in Proc. 5th International Conference on Spoken Language Processing (ICSLP), vol. 4, pp. 1451-1454, Sydney, Australia, December 1998.
    • (1998) Proc. 5th International Conference on Spoken Language Processing (ICSLP) , vol.4 , pp. 1451-1454
    • Girin, L.1    Varin, L.2    Feng, G.3    Schwartz, J.-L.4
  • 15
    • 0001055701
    • Which components of the face do humans and machines best speechread?
    • D. G. Stork and M. E. Hennecke, Eds., NATO Advanced Science Institutes, Springer-Verlag, New York, NY, USA
    • C. Benoît, T. Guiard-Marigny, B. Le Goff, and A. Adjoudani, "Which components of the face do humans and machines best speechread?," in Speechreading by Humans and Machines: Models, Systems, and Applications, D. G. Stork and M. E. Hennecke, Eds., vol. 150 of NATO Advanced Science Institutes, pp. 315-328, Springer-Verlag, New York, NY, USA, 1996.
    • (1996) Speechreading by Humans and Machines: Models, Systems, and Applications , vol.150 , pp. 315-328
    • Benoît, C.1    Guiard-Marigny, T.2    Le Goff, B.3    Adjoudani, A.4
  • 16
    • 0012725681
    • On the production and perception of audio-visual speech by man and machine
    • Y. Wang, S. Panwar, S.-P. Kim, and H. L. Bertoni, Eds., Plenum, New York, NY, USA, October
    • C. Benoît, "On the production and perception of audio-visual speech by man and machine," in Multimedia Communications and Video Coding, Y. Wang, S. Panwar, S.-P. Kim, and H. L. Bertoni, Eds., Plenum, New York, NY, USA, October 1995.
    • (1995) Multimedia Communications and Video Coding
    • Benoît, C.1
  • 19
    • 0003231941
    • Active contours for lipreading: Combining snakes with templates
    • Juan-les-Pins, France, September
    • S. Horbelt and J.-L. Dugelay, "Active contours for lipreading: combining snakes with templates," in 15th GRETSI Symposium Signal and Image Processing, pp. 717-720, Juan-les-Pins, France, September 1995.
    • (1995) 15th GRETSI Symposium Signal and Image Processing , pp. 717-720
    • Horbelt, S.1    Dugelay, J.-L.2
  • 20
    • 84997531258
    • Model-based versus knowledge-guided representation of non-rigid objects: A case study
    • IEEE Computer Society Press, Los Alamitos, Calif, USA
    • R. Kober, J. Schiffers, and K. Schmidt, "Model-based versus knowledge-guided representation of non-rigid objects: A case study," in Proc. IEEE International Conference on Image Processing, vol. 1, pp. 973-977, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.
    • (1994) Proc. IEEE International Conference on Image Processing , vol.1 , pp. 973-977
    • Kober, R.1    Schiffers, J.2    Schmidt, K.3
  • 23
    • 78649293030
    • A new 3D lip model for analysis and synthesis of lip motion in speech production
    • Terrigal, Australia, December
    • L. Revéret and C. Benoît, "A new 3D lip model for analysis and synthesis of lip motion in speech production," in Proc. Auditory-Visual Speech Processing (AVSP), pp. 207-212, Terrigal, Australia, December 1998.
    • (1998) Proc. Auditory-Visual Speech Processing (AVSP) , pp. 207-212
    • Revéret, L.1    Benoît, C.2
  • 25
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, 2000.
    • (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 27
    • 0000134331
    • 2D deformable models for visual speech analysis
    • D. G. Stork and M. E. Hennecke, Eds., NATO Advanced Science Institutes, Springer-Verlag, New York, NY, USA
    • T. Coianiz, L. Torresani, and B. Caprile, "2D deformable models for visual speech analysis," in Speechreading by Humans and Machines: Models, Systems, and Applications, D. G. Stork and M. E. Hennecke, Eds., vol. 150 of NATO Advanced Science Institutes, pp. 391-398, Springer-Verlag, New York, NY, USA, 1996.
    • (1996) Speechreading by Humans and Machines: Models, Systems, and Applications , vol.150 , pp. 391-398
    • Coianiz, T.1    Torresani, L.2    Caprile, B.3
  • 32
    • 0000238336
    • A simplex method for function minimization
    • J. A. Nelder and R. Mead, "A simplex method for function minimization," The Computer Journal, vol. 7, no. 4, pp. 308-313, 1965.
    • (1965) The Computer Journal , vol.7 , Issue.4 , pp. 308-313
    • Nelder, J.A.1    Mead, R.2
  • 33
    • 0017930815
    • Dynamic programming algorithm optimization for spoken word recognition
    • H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 26, no. 1, pp. 43-49, 1978.
    • (1978) IEEE Trans. Acoustics, Speech, and Signal Processing , vol.26 , Issue.1 , pp. 43-49
    • Sakoe, H.1    Chiba, S.2
  • 34
    • 0012706763
    • Utilisation de l'information acoustique pour aligner deux séquences de parole audiovisuelle
    • Mons, Belgium, September
    • P. Daubias, "Utilisation de l'information acoustique pour aligner deux séquences de parole audiovisuelle," in Proc. 4th Rencontres Jeunes Chercheurs en Parole, pp. 74-77, Mons, Belgium, September 2001.
    • (2001) Proc. 4th Rencontres Jeunes Chercheurs en Parole , pp. 74-77
    • Daubias, P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.