SCOPUS Information Search Platform - View Article

Signal Processing

Volume 86, Issue 12, 2006, Pages 3549-3558

Multimodal speaker/speech recognition using lip motion, lip texture and audio

Authors (4): Cetingul, H. E.ᵃ; Erzin, E.ᵃ; Yemez, Y.ᵃ; Tekalp, A. M.ᵃ

ᵃ KOÇ UNIVERSITY (Turkey)

Author keywords

Decision fusion; Isolated word recognition; Lip motion; Lip reading; Speaker identification

Indexed keywords

DECISION FUSION; ISOLATED WORD RECOGNITION; LIP MOTION; LIP READING; SPEAKER IDENTIFICATION;

CHARACTER RECOGNITION; DECISION THEORY; MODAL ANALYSIS; MOTION ESTIMATION;

SPEECH RECOGNITION;

EID: 33749436578 PISSN: 01651684 EISSN: None Source Type: Journal
DOI: 10.1016/j.sigpro.2006.02.045 Document Type: Article

Times cited : (55)

References (34)

1
- 0031233424
- Speaker recognition: a tutorial
- Campbell J. Speaker recognition: a tutorial. Proc. IEEE 85 9 (1997) 1437-1462
- (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1437-1462
- Campbell, J.¹

2
- 0031234213
- Face recognition: eigenface, elastic matching, and neural nets
- Zhang J., Yan Y., and Lades M. Face recognition: eigenface, elastic matching, and neural nets. Proc. IEEE 85 9 (1997) 1423-1435
- (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1423-1435
- Zhang, J.¹ Yan, Y.² Lades, M.³

3
- 0026065565
- Eigenfaces for recognition
- Turk M., and Pentland A. Eigenfaces for recognition. J. Cognitive Neurosci. 3 1 (1991) 586-591
- (1991) J. Cognitive Neurosci. , vol.3 , Issue.1 , pp. 586-591
- Turk, M.¹ Pentland, A.²

4
- 0032021555
- On combining classifiers
- Kittler J., Hatef M., Duin R., and Matas J. On combining classifiers. IEEE Trans. Pattern Anal. Machine Intell. 20 3 (1998) 226-239
- (1998) IEEE Trans. Pattern Anal. Machine Intell. , vol.20 , Issue.3 , pp. 226-239
- Kittler, J.¹ Hatef, M.² Duin, R.³ Matas, J.⁴

5
- 0036472941
- Extraction of visual features for lipreading
- Matthews I., Cootes T., Bangham J., Cox S., and Harvey R. Extraction of visual features for lipreading. IEEE Trans. Pattern Anal. Machine Intell. 24 2 (2002) 198-213
- (2002) IEEE Trans. Pattern Anal. Machine Intell. , vol.24 , Issue.2 , pp. 198-213
- Matthews, I.¹ Cootes, T.² Bangham, J.³ Cox, S.⁴ Harvey, R.⁵

6
- 0036502797
- A review of speech-based bimodal recognition
- Chibelushi C., Deravi F., and Mason J. A review of speech-based bimodal recognition. IEEE Trans. Multimedia 4 1 (2002) 23-37
- (2002) IEEE Trans. Multimedia , vol.4 , Issue.1 , pp. 23-37
- Chibelushi, C.¹ Deravi, F.² Mason, J.³

7
- 0036875048
- X. Zhang, C. Broun, R. Mersereau, M. Clements, Automatic speechreading with applications to human-computer interfaces, EURASIP J. Appl. Signal Process. (2002) 1228-1247.

8
- 4544290191
- G. Potamianos, C. Neti, G. Gravier, A. Garg, A. Senior, Recent advances in the automatic recognition of audio-visual speech, Proc. IEEE 91(9).

9
- 33646794355
- J. Perez, A. Frangi, E. Solano, K. Lukas, Lip reading for robust speech recognition on embedded devices, Proceedings of the International Conference on Acoustics, Speech and Signal Processing 2005 (ICASSP '05), vol. I, 2005, pp. 473-476.

10
- 0033899298
- BioID: a multimodal biometric identification system
- Frischholz R., and Dieckmann U. BioID: a multimodal biometric identification system. J. IEEE Comput. 33 2 (2000) 64-68
- (2000) J. IEEE Comput. , vol.33 , Issue.2 , pp. 64-68
- Frischholz, R.¹ Dieckmann, U.²

11
- 26844533276
- Multimodal speaker identification using an adaptive classifier cascade based on modality reliability
- Erzin E., Yemez Y., and Tekalp A. Multimodal speaker identification using an adaptive classifier cascade based on modality reliability. IEEE Trans. Multimedia 7 5 (2005) 840-852
- (2005) IEEE Trans. Multimedia , vol.7 , Issue.5 , pp. 840-852
- Erzin, E.¹ Yemez, Y.² Tekalp, A.³

12
- 0035394653
- Adaptive fusion of speech and lip information for robust speaker identification
- Wark T., and Sridharan S. Adaptive fusion of speech and lip information for robust speaker identification. Digital Signal Process. 11 3 (2001) 169-186
- (2001) Digital Signal Process. , vol.11 , Issue.3 , pp. 169-186
- Wark, T.¹ Sridharan, S.²

13
- 0031220766
- Acoustic-labial speaker verification
- Jourlin P., Luettin J., Genoud D., and Wassner H. Acoustic-labial speaker verification. Pattern Recognition Lett. 18 9 (1997) 853-858
- (1997) Pattern Recognition Lett. , vol.18 , Issue.9 , pp. 853-858
- Jourlin, P.¹ Luettin, J.² Genoud, D.³ Wassner, H.⁴

14
- 4544305570
- L. Mok, W. Lau, S. Leung, S. Wang, H. Yan, Lip features selection with application to person authentication, Proceedings of the International Conference on Acoustics, Speech and Signal Processing 2004 (ICASSP '04), vol. III, 2004, pp. 397-400.

15
- 58149392013
- M. Civanlar, T. Chen, Password-free network security through joint use of audio and video, Proc. SPIE Photonic (1996) 120-125.

16
- 0004244302
- Prentice-Hall, Englewood Cliffs, NJ
- Rabiner L., and Juang B.-H. Fundamentals of Speech Recognition (1993), Prentice-Hall, Englewood Cliffs, NJ
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.¹ Juang, B.-H.²

17
- 0034270644
- Audio-visual speech modeling for continuous speech recognition
- Dupont S., and Luettin J. Audio-visual speech modeling for continuous speech recognition. IEEE Trans. Multimedia 2 3 (2000) 141-151
- (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
- Dupont, S.¹ Luettin, J.²

18
- 85013597845
- C. Bregler, Y. Konig, Eigenlips for robust speech recognition, Proceedings of IEEE Conference on Acoustics, Speech and Signal Processing, 1994, pp. 669-672.

19
- 20444432705
- H. Cetingul, Y. Yemez, E. Erzin, A. Tekalp, Discriminative lip-motion features for biometric speaker identification, Proceedings of the International Conference on Image Processing 2004 (ICIP 2004), 2004, pp. 2023-2026.

20
- 33750926099
- S. Lucey, S. Sridharan, V. Chandran, Initialised eigenlip estimator for fast lip tracking using linear regression, Proceedings of the 15th International Conference on Pattern Recognition 2000, vol. 3, 2000, pp. 178-181.

21
- 4344697838
- S. Wang, W. Lau, S. Leung, H. Yan, A real-time automatic lipreading system, Proceedings of the 2004 International Symposium on Circuits and Systems (ISCAS 2004), vol. 2, 2004, pp. 101-104.

22
- 4544299669
- T. Wakasugi, M. Nishiura, K. Fukui, Robust lip contour extraction using separability of multi-dimensional distributions, Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition (FGR'04), 2004, pp. 415-420.

23
- 0036874915
- P. Aleksic, J. Williams, Z. Wu, A. Katsaggelos, Audio-visual speech recognition using MPEG-4 compliant visual features, EURASIP J. Appl. Signal Process. (2002) 1213-1227.

24
- 2542485577
- Accurate and quasi-automatic lip tracking
- Eveno N., Caplier A., and Coulon P.-Y. Accurate and quasi-automatic lip tracking. IEEE Trans. Circuits Systems Video Technol. 14 5 (2004) 706-715
- (2004) IEEE Trans. Circuits Systems Video Technol. , vol.14 , Issue.5 , pp. 706-715
- Eveno, N.¹ Caplier, A.² Coulon, P.-Y.³

25
- 0036487270
- Noise compensation in a person verification system using face and multiple speech features
- Sanderson C., and Paliwal K. Noise compensation in a person verification system using face and multiple speech features. Pattern Recognition 36 2 (2003) 293-302
- (2003) Pattern Recognition , vol.36 , Issue.2 , pp. 293-302
- Sanderson, C.¹ Paliwal, K.²

26
- 0029393187
- Person identification using multiple clues
- Brunelli R., and Falavigna D. Person identification using multiple clues. IEEE Trans. Pattern Anal. Machine Intell. 17 (1995) 955-966
- (1995) IEEE Trans. Pattern Anal. Machine Intell. , vol.17 , pp. 955-966
- Brunelli, R.¹ Falavigna, D.²

27
- 33749439483
- U. Chaudhari, G. Ramaswamy, G. Potamianos, C. Neti, Information fusion and decision cascading for audio-visual speaker recognition based on time-varying stream reliability prediction, Proceedings of the International Conference on Multimedia & Expo 2003 (ICME2003), vol. 3, 2003, pp. 9-12.

28
- 0141584901
- Kluwer Academic Publishers, Dordrecht
- Zhang D. Automated Biometrics (2000), Kluwer Academic Publishers, Dordrecht
- (2000) Automated Biometrics
- Zhang, D.¹

29
- 0004056285
- Prentice-Hall, Englewood Cliffs, NJ
- Huang X., Acero A., and Hon H.-W. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development (2001), Prentice-Hall, Englewood Cliffs, NJ
- (2001) Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
- Huang, X.¹ Acero, A.² Hon, H.-W.³

30
- 0029489292
- Robust multiresolution estimation of parametric motion models
- Odobez J.-M., and Bouthemy P. Robust multiresolution estimation of parametric motion models. J. Visual Comm. Image Representation 6 4 (1995) 348-365
- (1995) J. Visual Comm. Image Representation , vol.6 , Issue.4 , pp. 348-365
- Odobez, J.-M.¹ Bouthemy, P.²

31
- 33646818965
- H. Cetingul, Y. Yemez, E. Erzin, A. Tekalp, Robust lip-motion features for speaker identification, Proceedings of the International Conference on Acoustics, Speech and Signal Processing 2005 (ICASSP '05), vol. I, 2005, pp. 509-512.

32
- 6344240885
- Video coding using the H.264/MPEG-4 AVC compression standard
- Puri A., Chen X., and Luthra A. Video coding using the H.264/MPEG-4 AVC compression standard. Signal Processing: Image Communications 19 (2004) 793-849
- (2004) Signal Processing: Image Communications , vol.19 , pp. 793-849
- Puri, A.¹ Chen, X.² Luthra, A.³

33
- 0029355999
- Speaker identification and verification using gaussian mixture speaker models
- Reynolds D. Speaker identification and verification using gaussian mixture speaker models. Speech Comm. 17 (1995) 91-108
- (1995) Speech Comm. , vol.17 , pp. 91-108
- Reynolds, D.¹

34
- 18844422688
- Joint audio-video processing for robust biometric speaker identification in car
- Springer, Berlin
- Erzin E., Yemez Y., and Tekalp A. Joint audio-video processing for robust biometric speaker identification in car. DSP in Mobile and Vehicular Systems (2005), Springer, Berlin
- (2005) DSP in Mobile and Vehicular Systems
- Erzin, E.¹ Yemez, Y.² Tekalp, A.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.