SCOPUS Information Retrieval Platform

Volume 15, Issue 10, 2006, Pages 2879-2891

Discriminative analysis of lip motion features for speaker identification and speech-reading

(4) Çetingül, H. Ertan a; Yemez, Yücel a; Erzin, Engin a; Tekalp, A. Murat a

Author keywords

Bayesian discriminative feature selection; Lip motion; Speaker identification; Speech recognition; Temporal discriminative feature selection

Indexed keywords

COMPUTATIONAL GEOMETRY; FEATURE EXTRACTION; MARKOV PROCESSES; MATHEMATICAL MODELS;

BAYESIAN DISCRIMINATIVE FEATURE SELECTION; DISCRIMINATIVE ANALYSIS; FEATURE SELECTION; LIP MOTION FEATURES; SPEAKER IDENTIFICATION; TEMPORAL DISCRIMINATIVE FEATURE SELECTION;

SPEECH RECOGNITION;

ALGORITHM; ARTICLE; ARTIFICIAL INTELLIGENCE; AUTOMATED PATTERN RECOGNITION; AUTOMATIC SPEECH RECOGNITION; BIOMETRY; COMPUTER ASSISTED DIAGNOSIS; DISCRIMINANT ANALYSIS; HISTOLOGY; HUMAN; IMAGE ENHANCEMENT; INFORMATION RETRIEVAL; LIP; LIP READING; METHODOLOGY; MOVEMENT (PHYSIOLOGY); PHYSIOLOGY; SPEECH;

ALGORITHMS; ARTIFICIAL INTELLIGENCE; BIOMETRY; DISCRIMINANT ANALYSIS; HUMANS; IMAGE ENHANCEMENT; IMAGE INTERPRETATION, COMPUTER-ASSISTED; INFORMATION STORAGE AND RETRIEVAL; LIP; LIPREADING; MOVEMENT; PATTERN RECOGNITION, AUTOMATED; SPEECH; SPEECH RECOGNITION SOFTWARE;

EID: 33749187783 PISSN: 10577149 EISSN: None Source Type: Journal
DOI: 10.1109/TIP.2006.877528 Document Type: Article

Times cited : (109)

References (42)

1
- 85013597845
- "Eigenlips for robust speech recognition"
- C. Bregler and Y. Konig, "Eigenlips for robust speech recognition," in Proc. IEEE Conf. Acoustics, Speech and Signal Processing, 1994, pp. 669-672.
- (1994) Proc. IEEE Conf. Acoustics, Speech and Signal Processing , pp. 669-672
- Bregler, C.¹ Konig, Y.²

2
- 0029747053
- "Integrating audio and visual information to provide highly robust speech recognition"
- M. J. Tomlinson, M. J. Russell, and N. M. Brooke, "Integrating audio and visual information to provide highly robust speech recognition," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, 1996, vol. II, pp. 821-824.
- (1996) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.2 , pp. 821-824
- Tomlinson, M.J.¹ Russell, M.J.² Brooke, N.M.³

3
- 4544290191
- "Recent advances in the automatic recognition of audio-visual speech"
- G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, "Recent advances in the automatic recognition of audio-visual speech," Proc. IEEE, vol. 91, no. 9, pp. 1306-1326, Sep. 2003.
- (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1306-1326
- Potamianos, G.¹ Neti, C.² Gravier, G.³ Garg, A.⁴ Senior, A.W.⁵

4
- 2542475258
- "Recognition of visual speech elements using adaptively boosted hidden Markov models"
- S. W. Foo, Y. Lian, and L. Dong, "Recognition of visual speech elements using adaptively boosted hidden Markov models," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 5, pp. 693-705, May 2004.
- (2004) IEEE Trans. Circuits Syst. Video Technol. , vol.14 , Issue.5 , pp. 693-705
- Foo, S.W.¹ Lian, Y.² Dong, L.³

5
- 85032752352
- "Audiovisual speech processing"
- T. Chen, "Audiovisual speech processing," IEEE Signal Process. Mag., vol. 18, pp. 9-21, Jan. 2001.
- (2001) IEEE Signal Process. Mag. , vol.18 , pp. 9-21
- Chen, T.¹

6
- 4344697838
- "A real-time automatic lipreading system"
- S. L. Wang, W. H. Lau, S. H. Leung, and H. Yan, "A real-time automatic lipreading system," in Proc. 2004 Int. Symp. Circuits and Systems, 2004, vol. 2, pp. 101-104.
- (2004) Proc. 2004 Int. Symp. Circuits and Systems , vol.2 , pp. 101-104
- Wang, S.L.¹ Lau, W.H.² Leung, S.H.³ Yan, H.⁴

7
- 45249099254
- "Visual speech recognition: A solution from feature extraction to words classification"
- L. G. D. Silveira, J. Facon, and D. L. Borges, "Visual speech recognition: A solution from feature extraction to words classification," in Proc. XVI Brazilian Symp. Computer Graphics and Image Processing, 2003, pp. 399-405.
- (2003) Proc. XVI Brazilian Symp. Computer Graphics and Image Processing , pp. 399-405
- Silveira, L.G.D.¹ Facon, J.² Borges, D.L.³

8
- 3142697723
- "Analysis of lip geometric features for audio-visual speech recognition"
- M. N. Kaynak, Q. Zhi, A. D. Cheok, K. Sengupta, Z. Jian, and K.C. Chung, "Analysis of lip geometric features for audio-visual speech recognition," IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 34, no. 4, pp. 564-570, Jul. 2004.
- (2004) IEEE Trans. Syst., Man, Cybern. A, Syst. Humans , vol.34 , Issue.4 , pp. 564-570
- Kaynak, M.N.¹ Zhi, Q.² Cheok, A.D.³ Sengupta, K.⁴ Jian, Z.⁵ Chung, K.C.⁶

9
- 0036875048
- "Automatic speechreading with applications to human-computer interfaces"
- X. Zhang, C. C. Broun, R. M. Mersereau, and M. A. Clements, "Automatic speechreading with applications to human-computer interfaces," EURASIP J. Appl. Signal Process., pp. 1228-1247, 2002.
- (2002) EURASIP J. Appl. Signal Process. , pp. 1228-1247
- Zhang, X.¹ Broun, C.C.² Mersereau, R.M.³ Clements, M.A.⁴

10
- 33646794355
- "Lip reading for robust speech recognition on embedded devices"
- J. F. G. Perez, A. F. Frangi, E. L. Solano, and K. Lukas, "Lip reading for robust speech recognition on embedded devices," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, 2005, vol. I, pp. 473-476.
- (2005) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.1 , pp. 473-476
- Perez, J.F.G.¹ Frangi, A.F.² Solano, E.L.³ Lukas, K.⁴

11
- 0036472941
- "Extraction of visual features for lipreading"
- I. Matthews, T. F. Cootes, J. A. Bangham, S. Cox, and R. Harvey, "Extraction of visual features for lipreading," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 2, pp. 198-213, Feb. 2002.
- (2002) IEEE Trans. Pattern Anal. Mach. Intell. , vol.24 , Issue.2 , pp. 198-213
- Matthews, I.¹ Cootes, T.F.² Bangham, J.A.³ Cox, S.⁴ Harvey, R.⁵

12
- 0036874915
- "Audiovisual speech recognition using MPEG-4 compliant visual features"
- P. S. Aleksic, J. J. Williams, Z. Wu, and A. K. Katsaggelos, "Audiovisual speech recognition using MPEG-4 compliant visual features," EURASIP J. Appl. Signal Process., pp. 1213-1227, 2002.
- (2002) EURASIP J. Appl. Signal Process. , pp. 1213-1227
- Aleksic, P.S.¹ Williams, J.J.² Wu, Z.³ Katsaggelos, A.K.⁴

13
- 0034270644
- "Audio-visual speech modeling for continuous speech recognition"
- S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000.
- (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
- Dupont, S.¹ Luettin, J.²

14
- 26844533276
- "Multimodal speaker identification using an adaptive classifier cascade based on modality reliability"
- E. Erzin, Y. Yemez, and A. M. Tekalp, "Multimodal speaker identification using an adaptive classifier cascade based on modality reliability," IEEE Trans. Multimedia, vol. 7, no. 5, pp. 840-852, Oct. 2005.
- (2005) IEEE Trans. Multimedia , vol.7 , Issue.5 , pp. 840-852
- Erzin, E.¹ Yemez, Y.² Tekalp, A.M.³

15
- 15744362948
- "Robust multi-modal person identification with tolerance of facial expression"
- N. A. Fox and R. B. Reilly, "Robust multi-modal person identification with tolerance of facial expression," in IEEE Int. Conf. Systems, Man and Cybernetics, 2004, vol. 1, pp. 580-585.
- (2004) IEEE Int. Conf. Systems, Man and Cybernetics , vol.1 , pp. 580-585
- Fox, N.A.¹ Reilly, R.B.²

16
- 0036298115
- "Automatic speechreading with application to speaker verification"
- C. C. Broun, X. Zhang, R. M. Mersereau, and M. Clements, "Automatic speechreading with application to speaker verification," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, 2002, vol. I, pp. 685-688.
- (2002) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.1 , pp. 685-688
- Broun, C.C.¹ Zhang, X.² Mersereau, R.M.³ Clements, M.⁴

17
- 4544305570
- "Lip features selection with application to person authentication"
- L. L. Mok, W. H. Lau, S. H. Leung, S. L. Wang, and H. Yan, "Lip features selection with application to person authentication," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, 2004, vol. III, pp. 397-400.
- (2004) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.3 , pp. 397-400
- Mok, L.L.¹ Lau, W.H.² Leung, S.H.³ Wang, S.L.⁴ Yan, H.⁵

18
- 0035394653
- "Adaptive fusion of speech and lip information for robust speaker identification"
- T. Wark and S. Sridharan, "Adaptive fusion of speech and lip information for robust speaker identification," Digital Signal Process., vol. 11, no. 3, pp. 169-186, Jul. 2001.
- (2001) Digital Signal Process. , vol.11 , Issue.3 , pp. 169-186
- Wark, T.¹ Sridharan, S.²

19
- 0031220766
- "Acoustic-labial speaker verification"
- P. Jourlin, J. Luettin, D. Genoud, and H. Wassner, "Acoustic-labial speaker verification," Pattern Recognit. Lett., vol. 18, no. 9, pp. 853-858, 1997.
- (1997) Pattern Recognit. Lett. , vol.18 , Issue.9 , pp. 853-858
- Jourlin, P.¹ Luettin, J.² Genoud, D.³ Wassner, H.⁴

20
- 33749178012
- "Evaluation of sensor calibration in a biometric person recognition framework based on sensor fusion"
- B. Fröba, C. Rothe, and C. Küblbeck, "Evaluation of sensor calibration in a biometric person recognition framework based on sensor fusion," in Proc. 4th IEEE Int. Conf. Automatic Face and Gesture Recognition, Mar. 2000, pp. 512-517.
- (2000) Proc. 4th IEEE Int. Conf. Automatic Face and Gesture Recognition , pp. 512-517
- Fröba, B.¹ Rothe, C.² Küblbeck, C.³

21
- 0033899298
- "Bioid: A multimodal biometric identification system"
- R. W. Frischholz and U. Dieckmann, "Bioid: A multimodal biometric identification system," IEEE Computer, vol. 33, no. 2, pp. 64-68, Feb. 2000.
- (2000) IEEE Computer , vol.33 , Issue.2 , pp. 64-68
- Frischholz, R.W.¹ Dieckmann, U.²

22
- 20444432705
- "Discriminative lip-motion features for biometric speaker identification"
- H. E. Çetingül, Y. Yemez, E. Erzin, and A. M. Tekalp, "Discriminative lip-motion features for biometric speaker identification," in Proc. Int. Conf. on Image Processing, Oct. 2004, pp. 2023-2026.
- (2004) Proc. Int. Conf. on Image Processing , pp. 2023-2026
- Çetingül, H.E.¹ Yemez, Y.² Erzin, E.³ Tekalp, A.M.⁴

23
- 33646818965
- "Robust lip-motion features for speaker identification"
- H. E. Çetingül, Y. Yemez, E. Erzin, and A. M. Tekalp, "Robust lip-motion features for speaker identification," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, Mar. 2005, vol. I, pp. 509-512.
- (2005) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.1 , pp. 509-512
- Çetingül, H.E.¹ Yemez, Y.² Erzin, E.³ Tekalp, A.M.⁴

24
- 0031233424
- "Speaker recognition: A tutorial"
- J. P. Campbell, "Speaker recognition: A tutorial," Proc. IEEE, vol. 85, no. 9, pp. 1437-1462, Sep. 1997.
- (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1437-1462
- Campbell, J.P.¹

25
- 0029355999
- "Speaker identification and verification using Gaussian mixture speaker models"
- D. A. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Commun., vol. 17, pp. 91-108, 1995.
- (1995) Speech Commun. , vol.17 , pp. 91-108
- Reynolds, D.A.¹

26
- 0029489292
- "Robust multiresolution estimation of parametric motion models"
- J.-M. Odobez and P. Bouthemy, "Robust multiresolution estimation of parametric motion models," J. Vis. Commun. Image Represent., vol. 6, no. 4, pp. 348-365, Dec. 1995.
- (1995) J. Vis. Commun. Image Represent. , vol.6 , Issue.4 , pp. 348-365
- Odobez, J.-M.¹ Bouthemy, P.²

27
- 0033738345
- "A quadratic motion-based object-oriented video codec"
- Y. Yemez, B. Sankur, and E. Anarim, "A quadratic motion-based object-oriented video codec," Signal Process.: Image Commun., vol. 15, pp. 729-766, 2000.
- (2000) Signal Process.: Image Commun. , vol.15 , pp. 729-766
- Yemez, Y.¹ Sankur, B.² Anarim, E.³

28
- 6344240885
- "Video coding using the H.264/MPEG-4 AVC compression standard"
- A. Puri, X. Chen, and A. Luthra, "Video coding using the H.264/MPEG-4 AVC compression standard," Signal Process.: Image Commun., vol. 19, pp. 793-849, 2004.
- (2004) Signal Process.: Image Commun. , vol.19 , pp. 793-849
- Puri, A.¹ Chen, X.² Luthra, A.³

29
- 0036613897
- "Modelling and segmentation of lip area in face images"
- M. Sadeghi, J. Kittler, and K. Messer, "Modelling and segmentation of lip area in face images," IEE Proc. Vis. Image Signal Process., vol. 149, no. 3, pp. 179-184, Jun. 2002.
- (2002) IEE Proc. Vis. Image Signal Process. , vol.149 , Issue.3 , pp. 179-184
- Sadeghi, M.¹ Kittler, J.² Messer, K.³

30
- 1242263389
- "Lip image segmentation using fuzzy clustering incorporating an elliptic shape function"
- S.-H. Leung, S.-L. Wang, and W.-H. Lau, "Lip image segmentation using fuzzy clustering incorporating an elliptic shape function," IEEE Trans. Image Process., vol. 13, no. 1, pp. 51-62, Jan. 2004.
- (2004) IEEE Trans. Image Process. , vol.13 , Issue.1 , pp. 51-62
- Leung, S.-H.¹ Wang, S.-L.² Lau, W.-H.³

31
- 4544299669
- "Robust lip contour extraction using separability of multi-dimensional distributions"
- T. Wakasugi, M. Nishiura, and K. Fukui, "Robust lip contour extraction using separability of multi-dimensional distributions," in Proc. 6th IEEE Int. Conf. Automatic Face and Gesture Recognition, May 2004, pp. 415-420.
- (2004) Proc. 6th IEEE Int. Conf. Automatic Face and Gesture Recognition , pp. 415-420
- Wakasugi, T.¹ Nishiura, M.² Fukui, K.³

32
- 4544254537
- "Lip segmentation with the presence of beards"
- S. L. Wang, W. H. Lau, S. H. Leung, and A. W. C. Liew, "Lip segmentation with the presence of beards," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, 2004, vol. III, pp. 529-532.
- (2004) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.3 , pp. 529-532
- Wang, S.L.¹ Lau, W.H.² Leung, S.H.³ Liew, A.W.C.⁴

33
- 2542485577
- "Accurate and quasi-automatic Lip tracking"
- N. Eveno, A. Caplier, and P.-Y. Coulon, "Accurate and quasi-automatic Lip tracking," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 5, pp. 706-715, May 2004.
- (2004) IEEE Trans. Circuits Syst. Video Technol. , vol.14 , Issue.5 , pp. 706-715
- Eveno, N.¹ Caplier, A.² Coulon, P.-Y.³

34
- 0036296044
- "Visual speech feature extraction for improved speech recognition"
- X. Zhang, R. M. Mersereau, M. A. Clements, and C. C. Broun, "Visual speech feature extraction for improved speech recognition," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, 2002, pp. 1993-1996.
- (2002) Proc. Int. Conf. Acoustics, Speech and Signal Processing , pp. 1993-1996
- Zhang, X.¹ Mersereau, R.M.² Clements, M.A.³ Broun, C.C.⁴

35
- 0036297183
- "A coupled HMM for audio-visual speech recognition"
- A. Nefian, L. Liang, X. Pi, L. Xiaoxiang, C. Mao, and K. Murphy, "A coupled HMM for audio-visual speech recognition," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, 2002, pp. 2013-2016.
- (2002) Proc. Int. Conf. Acoustics, Speech and Signal Processing , pp. 2013-2016
- Nefian, A.¹ Liang, L.² Pi, X.³ Xiaoxiang, L.⁴ Mao, C.⁵ Murphy, K.⁶

36
- 0040122109
- "Statistical lip modelling for visual speech recognition"
- J. Luettin, N. Thacker, and S. Beet, "Statistical lip modelling for visual speech recognition," in Proc. 8th Eur. Signal Processing Conf., 1996, pp. 10-13.
- (1996) Proc. 8th Eur. Signal Processing Conf. , pp. 10-13
- Luettin, J.¹ Thacker, N.² Beet, S.³

37
- 0002836012
- "An iterative image registration technique with an application to stereo vision"
- B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. DARPA IU Workshop, 1981, pp. 121-130.
- (1981) Proc. DARPA IU Workshop , pp. 121-130
- Lucas, B.¹ Kanade, T.²

38
- 85135374344
- "Integration of acoustic and visual speech for speaker recognition"
- C. C. Chibelushi, J. S. Mason, and F. Deravi, "Integration of acoustic and visual speech for speaker recognition," in Proc. 3rd Eur. Conf. Speech Communication and Technology, 1993, vol. 1, pp. 157-160.
- (1993) Proc. 3rd Eur. Conf. Speech Communication and Technology , vol.1 , pp. 157-160
- Chibelushi, C.C.¹ Mason, J.S.² Deravi, F.³

39
- 0004524499
- "An approach to statistical lip modeling for speaker identification via chromatic feature extraction"
- T. J. Wark, S. Sridharan, and V. Chandran, "An approach to statistical lip modeling for speaker identification via chromatic feature extraction," in Proc. Int. Conf. Pattern Recognition, 1998, vol. 1, pp. 123-125.
- (1998) Proc. Int. Conf. Pattern Recognition , vol.1 , pp. 123-125
- Wark, T.J.¹ Sridharan, S.² Chandran, V.³

40
- 0004056285
- Englewood Cliffs: Prentice-Hall
- X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Englewood Cliffs: Prentice-Hall, 2001.
- (2001) Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
- Huang, X.¹ Acero, A.² Hon, H.-W.³

41
- 0035248924
- "PCA versus LDA"
- A. M. Martinez and A. C. Kak, "PCA versus LDA," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 2, pp. 228-233, Feb. 2001.
- (2001) IEEE Trans. Pattern Anal. Mach. Intell. , vol.23 , Issue.2 , pp. 228-233
- Martinez, A.M.¹ Kak, A.C.²

42
- 18844422688
- New York: Springer Verlag, 2005, ch. Joint Audio-Video Processing for Robust Biometric Speaker Identification in Car
- E. Erzin, Y. Yemez, and A. M. Tekalp, DSP in Mobile and Vehicular Systems. New York: Springer Verlag, 2005, ch. Joint Audio-Video Processing for Robust Biometric Speaker Identification in Car.
- DSP in Mobile and Vehicular Systems
- Erzin, E.¹ Yemez, Y.² Tekalp, A.M.³

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.