SCOPUS Information Retrieval Platform

EURASIP Journal on Applied Signal Processing

Volume 2002, Issue 11, 2002, Pages 1202-1212

Statistical lip-appearance models trained automatically using audio information

Authors (2): Daubias, Philippe (a, b); Deléglise, Paul (a)

Affiliation (b): Lab d'Info Gr Image Modelisation (France)

Author keywords

Artificial neural networks; Audio visual corpora; Automatic lip region labeling; Dynamic time warping; Lip appearance model; Lip shape model

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; AUDIO ACOUSTICS; FEATURE EXTRACTION; IMAGE PROCESSING; NEURAL NETWORKS; SPEECH RECOGNITION; VIDEO SIGNAL PROCESSING; AUDIO-VISUAL SIGNALS; AUTOMATIC SPEECH RECOGNITION (ASR); SPEECH PROCESSING

EID: 0036875015
PISSN: 11108657
EISSN: None
Source Type: Journal
DOI: 10.1155/S1110865702206186
Document Type: Article

Times cited: 12

References (36)

1
- 0034825241
- Multistream adaptive evidence combination for noise robust ASR
- A. Morris, A. Hagen, H. Glotin, and H. Bourlard, "Multistream adaptive evidence combination for noise robust ASR," Speech Communication Journal, vol. 34, no. 1-2, pp. 25-40, 2001.
- (2001) Speech Communication Journal , vol.34 , Issue.1-2 , pp. 25-40
- Morris, A.¹ Hagen, A.² Glotin, H.³ Bourlard, H.⁴

2
- 85009135142
- Beyond the conventional statistical language models: The variable-length sequences approach
- Beijing, China, October
- I. Zitouni, K. Smaïli, and J.-P. Haton, "Beyond the conventional statistical language models: the variable-length sequences approach," in Proc. 6th International Conference on Spoken Language Processing (ICSLP), vol. 3, pp. 962-965, Beijing, China, October 2000.
- (2000) Proc. 6th International Conference on Spoken Language Processing (ICSLP) , vol.3 , pp. 962-965
- Zitouni, I.¹ Smaïli, K.² Haton, J.-P.³

3
- 0017199877
- Hearing lips and seeing voices
- December
- H. McGurk and J. MacDonald, "Hearing lips and seeing voices," Nature, vol. 264, pp. 746-748, December 1976.
- (1976) Nature , vol.264 , pp. 746-748
- McGurk, H.¹ MacDonald, J.²

4
- 0027228958
- Improving connected letter recognition by lipreading
- Minneapolis, Minn, USA, April
- C. Bregler, H. Hild, S. Manke, and A. Waibel, "Improving connected letter recognition by lipreading," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, vol. 1, pp. 557-560, Minneapolis, Minn, USA, April 1993.
- (1993) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.1 , pp. 557-560
- Bregler, C.¹ Hild, H.² Manke, S.³ Waibel, A.⁴

5
- 0028996862
- Toward movement-invariant automatic lip-reading and speech recognition
- Detroit, Mich, USA, May
- P. Duchnowski, M. Hunke, D. Büsching, U. Meier, and A. Waibel, "Toward movement-invariant automatic lip-reading and speech recognition," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, vol. 1, pp. 109-112, Detroit, Mich, USA, May 1995.
- (1995) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.1 , pp. 109-112
- Duchnowski, P.¹ Hunke, M.² Büsching, D.³ Meier, U.⁴ Waibel, A.⁵

6
- 84957886748
- Real-time lip tracking for audio-visual speech recognition applications
- Cambridge, UK, April
- R. Kaucic, B. Dalton, and A. Blake, "Real-time lip tracking for audio-visual speech recognition applications," in Proc. 4th European Conference on Computer Vision (ECCV), vol. 2, pp. 376-387, Cambridge, UK, April 1996.
- (1996) Proc. 4th European Conference on Computer Vision (ECCV) , vol.2 , pp. 376-387
- Kaucic, R.¹ Dalton, B.² Blake, A.³

7
- 0001622390
- Active shape models for visual speech feature extraction
- D. G. Stork and M. E. Hennecke, Eds., NATO Advanced Science Institutes, Springer-Verlag, New York, NY, USA
- J. Luettin, N. A. Thacker, and S. W. Beet, "Active shape models for visual speech feature extraction," in Speechreading by Humans and Machines: Models, Systems, and Applications, D. G. Stork and M. E. Hennecke, Eds., vol. 150 of NATO Advanced Science Institutes, pp. 383-390, Springer-Verlag, New York, NY, USA, 1996.
- (1996) Speechreading by Humans and Machines: Models, Systems, and Applications , vol.150 , pp. 383-390
- Luettin, J.¹ Thacker, N.A.² Beet, S.W.³

8
- 0012707450
- Tech. Rep. Workshop 2000, Center for Language and Speech Processing (CLSP), Johns Hopkins University, Baltimore, Md, USA, October
- C. Neti, G. Potamianos, J. Luettin, et al., "Audio-visual speech recognition," Tech. Rep. Workshop 2000, Center for Language and Speech Processing (CLSP), Johns Hopkins University, Baltimore, Md, USA, October 2000.
- (2000) Audio-Visual Speech Recognition
- Neti, C.¹ Potamianos, G.² Luettin, J.³

9
- 0032180188
- Adaptive fusion of acoustic and visual sources for automatic speech recognition
- A. Rogozan and P. Deléglise, "Adaptive fusion of acoustic and visual sources for automatic speech recognition," Speech Communication Journal, vol. 26, no. 1-2, pp. 149-161, 1998.
- (1998) Speech Communication Journal , vol.26 , Issue.1-2 , pp. 149-161
- Rogozan, A.¹ Deléglise, P.²

10
- 0002546123
- Lip signatures for automatic person recognition
- Washington, DC, USA, March
- R. Auckenthaler, J. Brand, J. S. D. Mason, F. Deravi, and C. C. Chibelushi, "Lip signatures for automatic person recognition," in Proc. 2nd International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA), pp. 142-147, Washington, DC, USA, March 1999.
- (1999) Proc. 2nd International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA) , pp. 142-147
- Auckenthaler, R.¹ Brand, J.² Mason, J.S.D.³ Deravi, F.⁴ Chibelushi, C.C.⁵

11
- 84947907880
- Acoustic-labial speaker verification
- J. Bigün, G. Chollet, and G. Borgefors, Eds., Springer-Verlag, Crans-Montana, Switzerland, March
- P. Jourlin, J. Luettin, D. Genoud, and H. Wassner, "Acoustic-labial speaker verification," in Proc. 1st International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA), J. Bigün, G. Chollet, and G. Borgefors, Eds., pp. 319-326, Springer-Verlag, Crans-Montana, Switzerland, March 1997.
- (1997) Proc. 1st International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA) , pp. 319-326
- Jourlin, P.¹ Luettin, J.² Genoud, D.³ Wassner, H.⁴

12
- 4243927729
- A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: New advances using multilayer perceptrons
- Sydney, Australia, December
- L. Girin, L. Varin, G. Feng, and J.-L. Schwartz, "A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: New advances using multilayer perceptrons," in Proc. 5th International Conference on Spoken Language Processing (ICSLP), vol. 4, pp. 1451-1454, Sydney, Australia, December 1998.
- (1998) Proc. 5th International Conference on Spoken Language Processing (ICSLP) , vol.4 , pp. 1451-1454
- Girin, L.¹ Varin, L.² Feng, G.³ Schwartz, J.-L.⁴

13
- 2642559942
- Towards unrestricted lip reading
- Hong Kong
- U. Meier, R. Stiefelhagen, J. Yang, and A. Waibel, "Towards unrestricted lip reading," in Proc. 2nd International Conference on Multimodal Interfaces (ICMI), Hong Kong, 1999.
- (1999) Proc. 2nd International Conference on Multimodal Interfaces (ICMI)
- Meier, U.¹ Stiefelhagen, R.² Yang, J.³ Waibel, A.⁴

14
- 0032678693
- Unsupervised lip segmentation under natural conditions
- Phoenix, Ariz, USA, March
- M. Liévin and F. Luthon, "Unsupervised lip segmentation under natural conditions," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), vol. 6, pp. 3065-3068, Phoenix, Ariz, USA, March 1999.
- (1999) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP) , vol.6 , pp. 3065-3068
- Liévin, M.¹ Luthon, F.²

15
- 0001055701
- Which components of the face do humans and machines best speechread?
- D. G. Stork and M. E. Hennecke, Eds., NATO Advanced Science Institutes, Springer-Verlag, New York, NY, USA
- C. Benoît, T. Guiard-Marigny, B. Le Goff, and A. Adjoudani, "Which components of the face do humans and machines best speechread?," in Speechreading by Humans and Machines: Models, Systems, and Applications, D. G. Stork and M. E. Hennecke, Eds., vol. 150 of NATO Advanced Science Institutes, pp. 315-328, Springer-Verlag, New York, NY, USA, 1996.
- (1996) Speechreading by Humans and Machines: Models, Systems, and Applications , vol.150 , pp. 315-328
- Benoît, C.¹ Guiard-Marigny, T.² Le Goff, B.³ Adjoudani, A.⁴

16
- 0012725681
- On the production and perception of audio-visual speech by man and machine
- Y. Wang, S. Panwar, S.-P. Kim, and H. L. Bertoni, Eds., Plenum, New York, NY, USA, October
- C. Benoît, "On the production and perception of audio-visual speech by man and machine," in Multimedia Communications and Video Coding, Y. Wang, S. Panwar, S.-P. Kim, and H. L. Bertoni, Eds., Plenum, New York, NY, USA, October 1995.
- (1995) Multimedia Communications and Video Coding
- Benoît, C.¹

17
- 0032314380
- An image transform approach for HMM based automatic lipreading
- Chicago, Ill, USA, October
- G. Potamianos, H. P. Graf, and E. Cosatto, "An image transform approach for HMM based automatic lipreading," in Proc. IEEE International Conference on Image Processing (ICIP), vol. 3, pp. 173-177, Chicago, Ill, USA, October 1998.
- (1998) Proc. IEEE International Conference on Image Processing (ICIP) , vol.3 , pp. 173-177
- Potamianos, G.¹ Graf, H.P.² Cosatto, E.³

18
- 78649238564
- Using deformable templates to infer visual speech dynamics
- Pacific Grove, Calif, USA, November
- M. E. Hennecke, K. V. Prasad, and D. G. Stork, "Using deformable templates to infer visual speech dynamics," in 28th Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, Calif, USA, November 1994.
- (1994) 28th Annual Asilomar Conference on Signals, Systems, and Computers
- Hennecke, M.E.¹ Prasad, K.V.² Stork, D.G.³

19
- 0003231941
- Active contours for lipreading: Combining snakes with templates
- Juan-les-Pins, France, September
- S. Horbelt and J.-L. Dugelay, "Active contours for lipreading: combining snakes with templates," in 15th GRETSI Symposium Signal and Image Processing, pp. 717-720, Juan-les-Pins, France, September 1995.
- (1995) 15th GRETSI Symposium Signal and Image Processing , pp. 717-720
- Horbelt, S.¹ Dugelay, J.-L.²

20
- 84997531258
- Model-based versus knowledge-guided representation of non-rigid objects: A case study
- IEEE Computer Society Press, Los Alamitos, Calif, USA
- R. Kober, J. Schiffers, and K. Schmidt, "Model-based versus knowledge-guided representation of non-rigid objects: A case study," in Proc. IEEE International Conference on Image Processing, vol. 1, pp. 973-977, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.
- (1994) Proc. IEEE International Conference on Image Processing , vol.1 , pp. 973-977
- Kober, R.¹ Schiffers, J.² Schmidt, K.³

21
- 0012755890
- Face identification by deformation measure
- Vienna, Austria, August
- B. Leroy, I. L. Herlin, and L. D. Cohen, "Face identification by deformation measure," in Proc. IEEE International Conference on Pattern Recognition (ICPR), vol. 3, pp. 633-637, Vienna, Austria, August 1996.
- (1996) Proc. IEEE International Conference on Pattern Recognition (ICPR) , vol.3 , pp. 633-637
- Leroy, B.¹ Herlin, I.L.² Cohen, L.D.³

22
- 0012704142
- Tracking of deformable contours by synthesis and match
- Vienna, Austria, August
- K. F. Lai, C. W. Ngo, and S. Chan, "Tracking of deformable contours by synthesis and match," in Proc. IEEE International Conference on Pattern Recognition (ICPR), vol. 1, pp. 657-661, Vienna, Austria, August 1996.
- (1996) Proc. IEEE International Conference on Pattern Recognition (ICPR) , vol.1 , pp. 657-661
- Lai, K.F.¹ Ngo, C.W.² Chan, S.³

23
- 78649293030
- A new 3D lip model for analysis and synthesis of lip motion in speech production
- Terrigal, Australia, December
- L. Revéret and C. Benoît, "A new 3D lip model for analysis and synthesis of lip motion in speech production," in Proc. Auditory-Visual Speech Processing (AVSP), pp. 207-212, Terrigal, Australia, December 1998.
- (1998) Proc. Auditory-Visual Speech Processing (AVSP) , pp. 207-212
- Revéret, L.¹ Benoît, C.²

24
- 0032309170
- 3D modeling and tracking of human lip motion
- Bombay, India, January
- S. Basu, N. Oliver, and A. Pentland, "3D modeling and tracking of human lip motion," in Proc. IEEE International Conference on Computer Vision (ICCV), pp. 337-343, Bombay, India, January 1998.
- (1998) Proc. IEEE International Conference on Computer Vision (ICCV) , pp. 337-343
- Basu, S.¹ Oliver, N.² Pentland, A.³

25
- 0034270644
- Audio-visual speech modeling for continuous speech recognition
- S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, 2000.
- (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
- Dupont, S.¹ Luettin, J.²

26
- 85133465985
- Lipreading using shape, shading and scale
- Terrigal, Australia, December
- I. Matthews, T. Cootes, S. Cox, R. Harvey, and J. A. Bangham, "Lipreading using shape, shading and scale," in Proc. Auditory-Visual Speech Processing (AVSP), pp. 73-78, Terrigal, Australia, December 1998.
- (1998) Proc. Auditory-Visual Speech Processing (AVSP) , pp. 73-78
- Matthews, I.¹ Cootes, T.² Cox, S.³ Harvey, R.⁴ Bangham, J.A.⁵

27
- 0000134331
- 2D deformable models for visual speech analysis
- D. G. Stork and M. E. Hennecke, Eds., NATO Advanced Science Institutes, Springer-Verlag, New York, NY, USA
- T. Coianiz, L. Torresani, and B. Caprile, "2D deformable models for visual speech analysis," in Speechreading by Humans and Machines: Models, Systems, and Applications, D. G. Stork and M. E. Hennecke, Eds., vol. 150 of NATO Advanced Science Institutes, pp. 391-398, Springer-Verlag, New York, NY, USA, 1996.
- (1996) Speechreading by Humans and Machines: Models, Systems, and Applications , vol.150 , pp. 391-398
- Coianiz, T.¹ Torresani, L.² Caprile, B.³

28
- 84941187690
- Using aerial and geometric features in automatic lip-reading
- Aalborg, Denmark, September
- J. C. Wojdel and L. J. M. Rothkrantz, "Using aerial and geometric features in automatic lip-reading," in Proc. 7th European Conference on Speech Communication and Technology (Eurospeech), vol. 4, pp. 2463-2466, Aalborg, Denmark, September 2001.
- (2001) Proc. 7th European Conference on Speech Communication and Technology (Eurospeech) , vol.4 , pp. 2463-2466
- Wojdel, J.C.¹ Rothkrantz, L.J.M.²

29
- 0036875048
- Automatic speechreading with applications to human-computer interfaces
- X. Zhang, C. C. Broun, R. M. Mersereau, and M. Clements, "Automatic speechreading with applications to human-computer interfaces," EURASIP Journal on Applied Signal Processing, vol. 2002, no. 11, pp. 1228-1247, 2002.
- (2002) EURASIP Journal on Applied Signal Processing , vol.2002 , Issue.11 , pp. 1228-1247
- Zhang, X.¹ Broun, C.C.² Mersereau, R.M.³ Clements, M.⁴

30
- 0032310760
- Accurate, real-time, unadorned lip tracking
- Bombay, India, January
- R. Kaucic and A. Blake, "Accurate, real-time, unadorned lip tracking," in Proc. IEEE International Conference on Computer Vision (ICCV), pp. 370-375, Bombay, India, January 1998.
- (1998) Proc. IEEE International Conference on Computer Vision (ICCV) , pp. 370-375
- Kaucic, R.¹ Blake, A.²

31
- 0012738498
- Ph.D. thesis, Université de Rennes I, December
- J.-M. Odobez, Estimation, détection et segmentation du mouvement: une approche robuste et Markovienne, Ph.D. thesis, Université de Rennes I, December 1994.
- (1994) Estimation, Détection et Segmentation du Mouvement: une Approche Robuste et Markovienne
- Odobez, J.-M.¹

32
- 0000238336
- A simplex method for function minimization
- J. A. Nelder and R. Mead, "A simplex method for function minimization," The Computer Journal, vol. 7, no. 4, pp. 308-313, 1965.
- (1965) The Computer Journal , vol.7 , Issue.4 , pp. 308-313
- Nelder, J.A.¹ Mead, R.²

33
- 0017930815
- Dynamic programming algorithm optimization for spoken word recognition
- H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 26, no. 1, pp. 43-49, 1978.
- (1978) IEEE Trans. Acoustics, Speech, and Signal Processing , vol.26 , Issue.1 , pp. 43-49
- Sakoe, H.¹ Chiba, S.²

34
- 0012706763
- Utilisation de l'information acoustique pour aligner deux séquences de parole audiovisuelle
- Mons, Belgium, September
- P. Daubias, "Utilisation de l'information acoustique pour aligner deux séquences de parole audiovisuelle," in Proc. 4th Rencontres Jeunes Chercheurs en Parole, pp. 74-77, Mons, Belgium, September 2001.
- (2001) Proc. 4th Rencontres Jeunes Chercheurs en Parole , pp. 74-77
- Daubias, P.¹

35
- 84858968331
- BD-SONS: Une base de données des sons du français
- Toronto, Canada
- R. Descout, J.-F. Sérignat, O. Cervantes, and R. Carré, "BD-SONS: Une base de données des sons du français," in Proc. 12th International Congress on Acoustics (ICA), Toronto, Canada, 1986.
- (1986) Proc. 12th International Congress on Acoustics (ICA)
- Descout, R.¹ Sérignat, J.-F.² Cervantes, O.³ Carré, R.⁴

36
- 85009187138
- Lip-reading based on a fully automatic statistical model
- Denver, Col, USA, September
- P. Daubias and P. Deléglise, "Lip-reading based on a fully automatic statistical model," in Proc. 7th International Conference on Spoken Language Processing (ICSLP), vol. 1, pp. 209-212, Denver, Col, USA, September 2002.
- (2002) Proc. 7th International Conference on Spoken Language Processing (ICSLP) , vol.1 , pp. 209-212
- Daubias, P.¹ Deléglise, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.