Volume 94, Issue 11, 2006, Pages 2025-2044

Audio-visual biometrics

Author keywords

Audio visual biometrics; Audio visual databases; Audio visual fusion; Audio visual person recognition; Face tracking; Hidden Markov models; Multimodal recognition; Visual feature extraction

Indexed keywords

COMPUTER SIMULATION; DATABASE SYSTEMS; FACE RECOGNITION; FEATURE EXTRACTION; HIDDEN MARKOV MODELS; INFORMATION FUSION;

EID: 33947384963     PISSN: 00189219     EISSN: None     Source Type: Journal    
DOI: 10.1109/JPROC.2006.886017     Document Type: Article
Times cited: 109

References (137)
  • 7
    • 0032594952
    • Fusion of face and speech data for person identity verification
    • S. Ben-Yacoub, Y. Abdeljaoued, and E. Mayoraz, "Fusion of face and speech data for person identity verification," IEEE Trans. Neural Networks, vol. 10, pp. 1065-1074, 1999.
    • (1999) IEEE Trans. Neural Networks , vol.10 , pp. 1065-1074
    • Ben-Yacoub, S.1    Abdeljaoued, Y.2    Mayoraz, E.3
  • 8
    • 4544228318
    • Identity verification using speech and face information
    • C. Sanderson and K. K. Paliwal, "Identity verification using speech and face information," Digital Signal Processing, vol. 14, no. 5, pp. 449-480, 2004.
    • (2004) Digital Signal Processing , vol.14 , Issue.5 , pp. 449-480
    • Sanderson, C.1    Paliwal, K.K.2
  • 13
    • 0033692608
    • The use of temporal speech and lip information for multi-modal speaker identification via multi-stream HMMs
    • Istanbul, Turkey
    • T. Wark, S. Sridharan, and V. Chandran, "The use of temporal speech and lip information for multi-modal speaker identification via multi-stream HMMs," in Proc. Int. Conf. Acoustics, Speech Signal Processing, Istanbul, Turkey, 2000, pp. 2389-2392.
    • (2000) Proc. Int. Conf. Acoustics, Speech Signal Processing , pp. 2389-2392
    • Wark, T.1    Sridharan, S.2    Chandran, V.3
  • 14
    • 33947355927
    • An audio-visual person identification and verification system using FAPs as visual features
    • Santa Barbara, CA
    • P. S. Aleksic and A. K. Katsaggelos, "An audio-visual person identification and verification system using FAPs as visual features," in Proc. Works. Multimedia User Authentication, Santa Barbara, CA, 2003, pp. 80-84.
    • (2003) Proc. Works. Multimedia User Authentication , pp. 80-84
    • Aleksic, P.S.1    Katsaggelos, A.K.2
  • 15
    • 26844468363
    • Information fusion and decision cascading for audio-visual speaker recognition based on time-varying stream reliability prediction
    • Baltimore, MD, Jul. 6-9
    • U. V. Chaudhari, G. N. Ramaswamy, G. Potamianos, and C. Neti, "Information fusion and decision cascading for audio-visual speaker recognition based on time-varying stream reliability prediction," in Proc. Int. Conf. Multimedia Expo, Baltimore, MD, Jul. 6-9, 2003, pp. 9-12.
    • (2003) Proc. Int. Conf. Multimedia Expo , pp. 9-12
    • Chaudhari, U.V.1    Ramaswamy, G.N.2    Potamianos, G.3    Neti, C.4
  • 16
    • 0031223878
    • SESAM: A biometric person identification system using sensor fusion
    • U. Dieckmann, P. Plankensteiner, and T. Wagner, "SESAM: A biometric person identification system using sensor fusion," Pattern Recogn. Lett., vol. 18, pp. 827-833, 1997.
    • (1997) Pattern Recogn. Lett , vol.18 , pp. 827-833
    • Dieckmann, U.1    Plankensteiner, P.2    Wagner, T.3
  • 17
    • 0031233424
    • Speaker recognition: A tutorial
    • Sep
    • J. P. Campbell, "Speaker recognition: A tutorial," Proc. IEEE, vol. 85, no. 9, pp. 1437-1462, Sep. 1997.
    • (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1437-1462
    • Campbell, J.P.1
  • 18
    • 1842499650
    • W.-Y. Zhao, R. Chellappa, P. J. J. Phillips, and A. Rosenfeld, "Face recognition: A literature survey," ACM Computing Survey, pp. 399-458, 2003, Dec. Issue.
  • 19
    • 0026065565
    • Eigenfaces for recognition
    • Sep
    • M. Turk and A. Pentland, "Eigenfaces for recognition," J. Cognitive Neuroscience, vol. 3, no. 1, pp. 586-591, Sep. 1991.
    • (1991) J. Cognitive Neuroscience , vol.3 , Issue.1 , pp. 586-591
    • Turk, M.1    Pentland, A.2
  • 20
    • 0025236073
    • Application of the Karhunen-Loeve procedure for the characterization of human faces
    • Jan
    • M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 1, pp. 103-108, Jan. 1990.
    • (1990) IEEE Trans. Pattern Anal. Mach. Intell , vol.12 , Issue.1 , pp. 103-108
    • Kirby, M.1    Sirovich, L.2
  • 21
    • 0031185845
    • Eigenfaces versus fisherfaces: Recognition using class specific linear projection
    • P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces versus fisherfaces: Recognition using class specific linear projection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, pp. 711-720, 1997.
    • (1997) IEEE Trans. Pattern Anal. Mach. Intell , vol.19 , pp. 711-720
    • Belhumeur, P.N.1    Hespanha, J.P.2    Kriegman, D.J.3
  • 23
    • 33947424506
    • J. Luettin, "Visual speech and speaker recognition," Ph.D. dissertation, Dept. Computer Science, Univ. Sheffield, Sheffield, U.K., 1997.
  • 24
    • 0031335829
    • Audio-visual person recognition: An evaluation of data fusion strategies
    • London, U.K
    • C. C. Chibelushi, F. Deravi, and J. S. Mason, "Audio-visual person recognition: An evaluation of data fusion strategies," in Proc. Eur. Conf. Security Detection, London, U.K., 1997, pp. 26-30.
    • (1997) Proc. Eur. Conf. Security Detection , pp. 26-30
    • Chibelushi, C.C.1    Deravi, F.2    Mason, J.S.3
  • 25
    • 0029527336
    • Automatic person recognition using acoustic and geometric features
    • R. Brunelli, D. Falavigna, T. Poggio, and L. Stringa, "Automatic person recognition using acoustic and geometric features," Machine Vision Appl., vol. 8, pp. 317-325, 1995.
    • (1995) Machine Vision Appl , vol.8 , pp. 317-325
    • Brunelli, R.1    Falavigna, D.2    Poggio, T.3    Stringa, L.4
  • 26
    • 0036487270
    • Noise compensation in a person verification system using face and multiple speech features
    • Feb
    • C. Sanderson and K. K. Paliwal, "Noise compensation in a person verification system using face and multiple speech features," Pattern Recognition, vol. 36, no. 2, pp. 293-302, Feb. 2003.
    • (2003) Pattern Recognition , vol.36 , Issue.2 , pp. 293-302
    • Sanderson, C.1    Paliwal, K.K.2
  • 33
    • 1842854568
    • Multimodal speech processing using asynchronous hidden Markov models
    • S. Bengio, "Multimodal speech processing using asynchronous hidden Markov models," Information Fusion, vol. 5, pp. 81-89, 2004.
    • (2004) Information Fusion , vol.5 , pp. 81-89
    • Bengio, S.1
  • 37
    • 0345565788
    • Multimodal speaker identification with audio-video processing
    • Barcelona, Spain
    • Y. Yemez, A. Kanak, E. Erzin, and A. M. Tekalp, "Multimodal speaker identification with audio-video processing," in Proc. Int. Conf. Image Processing, Barcelona, Spain, 2003, pp. 5-8.
    • (2003) Proc. Int. Conf. Image Processing , pp. 5-8
    • Yemez, Y.1    Kanak, A.2    Erzin, E.3    Tekalp, A.M.4
  • 39
    • 26844533276
    • Multimodal speaker identification using an adaptive classifier cascade based on modality reliability
    • Oct
    • E. Erzin, Y. Yemez, and A. M. Tekalp, "Multimodal speaker identification using an adaptive classifier cascade based on modality reliability," IEEE Trans. Multimedia, vol. 7, no. 5, pp. 840-852, Oct. 2005.
    • (2005) IEEE Trans. Multimedia , vol.7 , Issue.5 , pp. 840-852
    • Erzin, E.1    Yemez, Y.2    Tekalp, A.M.3
  • 43
    • 84953735755
    • Fusion of multiple experts in multimodal biometric personal identity verification systems
    • Switzerland
    • J. Kittler and K. Messer, "Fusion of multiple experts in multimodal biometric personal identity verification systems," in Proc. 12th IEEE Workshop Neural Networks Sig. Processing, Switzerland, 2002, pp. 3-12.
    • (2002) Proc. 12th IEEE Workshop Neural Networks Sig. Processing , pp. 3-12
    • Kittler, J.1    Messer, K.2
  • 45
    • 0033899298
    • BioID: A multimodal biometric identification system
    • R. W. Frischholz and U. Dieckmann, "BioID: A multimodal biometric identification system," Computer, vol. 33, pp. 64-68, 2000.
    • (2000) Computer , vol.33 , pp. 64-68
    • Frischholz, R.W.1    Dieckmann, U.2
  • 46
    • 33947430259
    • Methods and apparatus for audio-visual speaker recognition and utterance verification
    • U.S. Patent 6 219 640
    • S. Basu, H. S. M. Beigi, S. H. Maes, M. Ghislain, E. Benoit, C. Neti, and A. W. Senior, "Methods and apparatus for audio-visual speaker recognition and utterance verification," U.S. Patent 6 219 640, 1999.
    • (1999)
    • Basu, S.1    Beigi, H.S.M.2    Maes, S.H.3    Ghislain, M.4    Benoit, E.5    Neti, C.6    Senior, A.W.7
  • 47
    • 0032295436
    • Integrating faces and fingerprints for personal identification
    • L. Hong and A. Jain, "Integrating faces and fingerprints for personal identification," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, pp. 1295-1307, 1998.
    • (1998) IEEE Trans. Pattern Anal. Machine Intell , vol.20 , pp. 1295-1307
    • Hong, L.1    Jain, A.2
  • 48
    • 0030647922
    • An approach to speaker identification using multiple classifiers
    • Munich, Germany
    • V. Radova and J. Psutka, "An approach to speaker identification using multiple classifiers," in Proc. IEEE Conf. Acoustics, Speech Signal Processing, Munich, Germany, 1997, vol. 2, pp. 1135-1138.
    • (1997) Proc. IEEE Conf. Acoustics, Speech Signal Processing , vol.2 , pp. 1135-1138
    • Radova, V.1    Psutka, J.2
  • 49
    • 0038343934
    • Information fusion in biometrics
    • A. Ross and A. Jain, "Information fusion in biometrics," Pattern Recogn. Lett., vol. 24, pp. 2115-2125, 2003.
    • (2003) Pattern Recogn. Lett , vol.24 , pp. 2115-2125
    • Ross, A.1    Jain, A.2
  • 53
    • 0031238278
    • Biometrics: Privacy's foe or privacy's friend?
    • J. D. Woodward, "Biometrics: Privacy's foe or privacy's friend?" Proc. IEEE, vol. 85, pp. 1480-1492, 1997.
    • (1997) Proc. IEEE , vol.85 , pp. 1480-1492
    • Woodward, J.D.1
  • 54
    • 0031187171
    • Speech recognition by machines and humans
    • R. P. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, no. 1, pp. 1-15, 1997.
    • (1997) Speech Commun , vol.22 , Issue.1 , pp. 1-15
    • Lippmann, R.P.1
  • 55
    • 0032178592
    • Quantitative association of vocal-tract and facial behavior
    • H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative association of vocal-tract and facial behavior," Speech Commun., vol. 26, no. 1-2, pp. 23-43, 1998.
    • (1998) Speech Commun , vol.26 , Issue.1-2 , pp. 23-43
    • Yehia, H.1    Rubin, P.2    Vatikiotis-Bateson, E.3
  • 56
    • 0036874551
    • On the relationship between face movements, tongue movements, and speech acoustics
    • Nov
    • J. Jiang, A. Alwan, P. A. Keating, E. T. Auer, Jr., and L. E. Bernstein, "On the relationship between face movements, tongue movements, and speech acoustics," EURASIP J. Appl. Signal Processing, vol. 2002, no. 11, pp. 1174-1188, Nov. 2002.
    • (2002) EURASIP J. Appl. Signal Processing , vol.2002 , Issue.11 , pp. 1174-1188
    • Jiang, J.1    Alwan, A.2    Keating, P.A.3    Auer Jr., E.T.4    Bernstein, L.E.5
  • 57
    • 0012725678
    • Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models
    • Santa Cruz, CA
    • J. P. Barker and F. Berthommier, "Estimation of speech acoustics from visual speech features: A comparison of linear and non-linear models," in Proc. Int. Conf. Auditory Visual Speech Processing, Santa Cruz, CA, 1999, pp. 112-117.
    • (1999) Proc. Int. Conf. Auditory Visual Speech Processing , pp. 112-117
    • Barker, J.P.1    Berthommier, F.2
  • 59
    • 0034853042
    • Measuring the relation between speech acoustics and 2-D facial motion
    • Salt Lake City, UT
    • A. V. Barbosa and H. C. Yehia, "Measuring the relation between speech acoustics and 2-D facial motion," in Proc. Int. Conf. Acoustics, Speech Signal Processing, Salt Lake City, UT, 2001, vol. 1, pp. 181-184.
    • (2001) Proc. Int. Conf. Acoustics, Speech Signal Processing , vol.1 , pp. 181-184
    • Barbosa, A.V.1    Yehia, H.C.2
  • 61
    • 0002028032
    • Some preliminaries to a comprehensive account of audio-visual speech perception
    • R. Campbell and B. Dodd, Eds. London, U.K, Lawrence Erlbaum
    • A. Q. Summerfield, "Some preliminaries to a comprehensive account of audio-visual speech perception," in Hearing by Eye: The Psychology of Lip-Reading, R. Campbell and B. Dodd, Eds. London, U.K.: Lawrence Erlbaum, 1987, pp. 3-51.
    • (1987) Hearing by Eye: The Psychology of Lip-Reading , pp. 3-51
    • Summerfield, A.Q.1
  • 62
    • 0032072433
    • Speech recognition and sensory integration
    • D. W. Massaro and D. G. Stork, "Speech recognition and sensory integration," Amer. Scientist, vol. 86, no. 3, pp. 236-244, 1998.
    • (1998) Amer. Scientist , vol.86 , Issue.3 , pp. 236-244
    • Massaro, D.W.1    Stork, D.G.2
  • 64
    • 0018701386
    • Use of visual information in phonetic perception
    • Q. Summerfield, "Use of visual information in phonetic perception," Phonetica, vol. 36, pp. 314-331, 1979.
    • (1979) Phonetica , vol.36 , pp. 314-331
    • Summerfield, Q.1
  • 65
    • 0027128576
    • Lipreading and audio-visual speech perception
    • Q. Summerfield, "Lipreading and audio-visual speech perception," Phil. Trans. R. Soc. Lond. B., vol. 335, pp. 71-78, 1992.
    • (1992) Phil. Trans. R. Soc. Lond. B , vol.335 , pp. 71-78
    • Summerfield, Q.1
  • 66
    • 0025767028
    • Evaluating the articulation index for auditory-visual input
    • Jun
    • K. W. Grant and L. D. Braida, "Evaluating the articulation index for auditory-visual input," J. Acoustical Soc. Amer., vol. 89, pp. 2950-2960, Jun. 1991.
    • (1991) J. Acoustical Soc. Amer , vol.89 , pp. 2950-2960
    • Grant, K.W.1    Braida, L.D.2
  • 69
    • 0012730684
    • Articulatory-acoustic models for fricative consonants
    • Jun
    • S. Narayanan and A. Alwan, "Articulatory-acoustic models for fricative consonants," IEEE Trans. Speech Audio Processing, vol. 8, no. 3, pp. 328-344, Jun. 2000.
    • (2000) IEEE Trans. Speech Audio Processing , vol.8 , Issue.3 , pp. 328-344
    • Narayanan, S.1    Alwan, A.2
  • 70
    • 0028259480
    • Techniques for estimating vocal-tract shapes from the speech signal
    • Feb
    • J. Schroeter and M. Sondhi, "Techniques for estimating vocal-tract shapes from the speech signal," IEEE Trans. Speech Audio Processing, vol. 2, no. 1, pp. 133-150, Feb. 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , Issue.1 , pp. 133-150
    • Schroeter, J.1    Sondhi, M.2
  • 71
    • 0017199877
    • Hearing lips and seeing voices
    • H. McGurk and J. MacDonald, "Hearing lips and seeing voices," Nature, vol. 264, pp. 746-748, 1976.
    • (1976) Nature , vol.264 , pp. 746-748
    • McGurk, H.1    MacDonald, J.2
  • 72
    • 0032074310
    • Audio-visual integration in multimodal communication
    • May
    • T. Chen and R. R. Rao, "Audio-visual integration in multimodal communication," Proc. IEEE, vol. 86, no. 5, pp. 837-852, May 1998.
    • (1998) Proc. IEEE , vol.86 , Issue.5 , pp. 837-852
    • Chen, T.1    Rao, R.R.2
  • 75
    • 0036502797
    • A review of speech-based bimodal recognition
    • Mar
    • C. C. Chibelushi, F. Deravi, and J. S. D. Mason, "A review of speech-based bimodal recognition," IEEE Trans. Multimedia, vol. 4, no. 1, pp. 23-37, Mar. 2002.
    • (2002) IEEE Trans. Multimedia , vol.4 , Issue.1 , pp. 23-37
    • Chibelushi, C.C.1    Deravi, F.2    Mason, J.S.D.3
  • 76
    • 0003544881
    • D. G. Stork and M. E. Hennecke, Eds, Berlin, Germany: Springer
    • D. G. Stork and M. E. Hennecke, Eds., Speechreading by Humans and Machines. Berlin, Germany: Springer, 1996.
    • (1996) Speechreading by Humans and Machines
  • 77
    • 33947376624
    • Exploiting visual information in automatic speech processing
    • A. Bovik, Ed. New York: Academic, Jun
    • P. S. Aleksic, G. Potamianos, and A. K. Katsaggelos, "Exploiting visual information in automatic speech processing," in Handbook of Image and Video Processing, A. Bovik, Ed. New York: Academic, Jun. 2005, pp. 1263-1289.
    • (2005) Handbook of Image and Video Processing , pp. 1263-1289
    • Aleksic, P.S.1    Potamianos, G.2    Katsaggelos, A.K.3
  • 78
    • 0003699540
    • Automatic lipreading to enhance speech recognition
    • Ph.D. dissertation, Univ. Illinois at Urbana-Champaign, Urbana, IL
    • E. Petajan, "Automatic lipreading to enhance speech recognition," Ph.D. dissertation, Univ. Illinois at Urbana-Champaign, Urbana, IL, 1984.
    • (1984)
    • Petajan, E.1
  • 79
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • Sep
    • S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000.
    • (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 80
    • 4544290191
    • Recent advances in the automatic recognition of audiovisual speech
    • Sep
    • G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, "Recent advances in the automatic recognition of audiovisual speech," Proc. IEEE, vol. 91, no. 9, pp. 1306-1326, Sep. 2003.
    • (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1306-1326
    • Potamianos, G.1    Neti, C.2    Gravier, G.3    Garg, A.4    Senior, A.W.5
  • 81
    • 15044345504
    • Audio-visual automatic speech recognition: An overview
    • G. Bailly, E. Vatikiotis-Bateson, and P. Perrier, Eds. Cambridge, MA: MIT Press
    • G. Potamianos, C. Neti, J. Luettin, and I. Matthews, "Audio-visual automatic speech recognition: An overview," in Issues in Visual and Audio-Visual Speech Processing, G. Bailly, E. Vatikiotis-Bateson, and P. Perrier, Eds. Cambridge, MA: MIT Press, 2004.
    • (2004) Issues in Visual and Audio-Visual Speech Processing
    • Potamianos, G.1    Neti, C.2    Luettin, J.3    Matthews, I.4
  • 82
    • 0036874915
    • Audio-visual speech recognition using MPEG-4 compliant visual features
    • Nov
    • P. S. Aleksic, J. J. Williams, Z. Wu, and A. K. Katsaggelos, "Audio-visual speech recognition using MPEG-4 compliant visual features," EURASIP J. Appl. Signal Processing, vol. 2002, no. 11, pp. 1213-1227, Nov. 2002.
    • (2002) EURASIP J. Appl. Signal Processing , vol.2002 , Issue.11 , pp. 1213-1227
    • Aleksic, P.S.1    Williams, J.J.2    Wu, Z.3    Katsaggelos, A.K.4
  • 83
    • 85032752352
    • Audiovisual speech processing. Lip reading and lip synchronization
    • Jan
    • T. Chen, "Audiovisual speech processing. Lip reading and lip synchronization," IEEE Signal Processing Mag., vol. 18, no. 1, pp. 9-21, Jan. 2001.
    • (2001) IEEE Signal Processing Mag , vol.18 , Issue.1 , pp. 9-21
    • Chen, T.1
  • 85
    • 33947395786
    • R. Campbell, B. Dodd, and D. Burnham, Eds., Hearing by Eye II: Advances in the Psychology of Speechreading and Auditory Visual Speech. Hove, U.K.: Psychology Press, 1998.
  • 86
    • 33947426576
    • S. Young, G. Evermann, T. Hain, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book. London, U.K.: Entropic, 2005.
  • 87
    • 0012745879
    • Rationale for phoneme-viseme mapping and feature selection in visual speech recognition
    • D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer
    • A. J. Goldschen, O. N. Garcia, and E. D. Petajan, "Rationale for phoneme-viseme mapping and feature selection in visual speech recognition," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer, 1996, pp. 505-515.
    • (1996) Speechreading by Humans and Machines , pp. 505-515
    • Goldschen, A.J.1    Garcia, O.N.2    Petajan, E.D.3
  • 89
    • 0032785783
    • Auditory processing of speech signals for robust speech recognition in real-world noisy environments
    • Jan
    • D.-S. Kim, S.-Y. Lee, and R. M. Kil, "Auditory processing of speech signals for robust speech recognition in real-world noisy environments," IEEE Trans. Speech Audio Processing, vol. 7, no. 1, pp. 55-69, Jan. 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.1 , pp. 55-69
    • Kim, D.-S.1    Lee, S.-Y.2    Kil, R.M.3
  • 90
    • 84892178050
    • Spectral subband centroids features for speech recognition
    • Seattle, WA
    • K. K. Paliwal, "Spectral subband centroids features for speech recognition," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, Seattle, WA, 1998, vol. 2, pp. 617-620.
    • (1998) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.2 , pp. 617-620
    • Paliwal, K.K.1
  • 91
    • 0141702085
    • Environmental sniffing: Noise knowledge estimation for robust speech systems
    • Hong Kong, China
    • M. Akbacak and J. H. L. Hansen, "Environmental sniffing: Noise knowledge estimation for robust speech systems," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, Hong Kong, China, 2003, vol. 2, pp. 113-116.
    • (2003) Proc. Int. Conf. Acoustics, Speech and Signal Processing , vol.2 , pp. 113-116
    • Akbacak, M.1    Hansen, J.H.L.2
  • 94
    • 0031648023
    • Example-based learning for view-based human face detection
    • K. Sung and T. Poggio, "Example-based learning for view-based human face detection," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, no. 1, pp. 39-51, 1998.
    • (1998) IEEE Trans. Pattern Anal. Machine Intell , vol.20 , Issue.1 , pp. 39-51
    • Sung, K.1    Poggio, T.2
  • 98
    • 27744546990
    • On transforming statistical models for non-frontal face verification
    • C. Sanderson, S. Bengio, and Y. Gao, "On transforming statistical models for non-frontal face verification," Pattern Recognition, vol. 39, no. 2, pp. 288-302, 2006.
    • (2006) Pattern Recognition , vol.39 , Issue.2 , pp. 288-302
    • Sanderson, C.1    Bengio, S.2    Gao, Y.3
  • 99
    • 27844534088
    • A survey of approaches and challenges in 3-D and multi-modal 3-D face recognition
    • K. W. Bowyer, K. Chang, and P. Flynn, "A survey of approaches and challenges in 3-D and multi-modal 3-D face recognition," Computer Vision Image Understanding, vol. 101, no. 1, pp. 1-15, 2006.
    • (2006) Computer Vision Image Understanding , vol.101 , Issue.1 , pp. 1-15
    • Bowyer, K.W.1    Chang, K.2    Flynn, P.3
  • 101
    • 0000417467
    • Visionary speech: Looking ahead to practical speechreading systems
    • D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer
    • M. E. Hennecke, D. G. Stork, and K. V. Prasad, "Visionary speech: Looking ahead to practical speechreading systems," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer, 1996, pp. 331-349.
    • (1996) Speechreading by Humans and Machines , pp. 331-349
    • Hennecke, M.E.1    Stork, D.G.2    Prasad, K.V.3
  • 102
    • 4544329810
    • Comparison of low- and high-level visual features for audio-visual continuous automatic speech recognition
    • Montreal, Canada
    • P. S. Aleksic and A. K. Katsaggelos, "Comparison of low- and high-level visual features for audio-visual continuous automatic speech recognition," in Proc. Int. Conf. Acoustics, Speech Signal Processing, Montreal, Canada, 2004, pp. 917-920.
    • (2004) Proc. Int. Conf. Acoustics, Speech Signal Processing , pp. 917-920
    • Aleksic, P.S.1    Katsaggelos, A.K.2
  • 103
  • 105
    • 0035680116
    • Rapid object detection using a boosted cascade of simple features
    • Kauai, HI, Dec. 11-13
    • P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proc. Conf. Computer Vision Pattern Recognition, Kauai, HI, Dec. 11-13, 2001, pp. 511-518.
    • (2001) Proc. Conf. Computer Vision Pattern Recognition , pp. 511-518
    • Viola, P.1    Jones, M.2
  • 106
    • 0031361424
    • Robust recognition of faces and facial features with a multi-modal system
    • Orlando, FL
    • H. P. Graf, E. Cosatto, and G. Potamianos, "Robust recognition of faces and facial features with a multi-modal system," in Proc. Int. Conf. Systems, Man, Cybernetics, Orlando, FL, 1997, pp. 2034-2039.
    • (1997) Proc. Int. Conf. Systems, Man, Cybernetics , pp. 2034-2039
    • Graf, H.P.1    Cosatto, E.2    Potamianos, G.3
  • 107
    • 84925639646
    • Real-time lip tracking and bimodal continuous speech recognition
    • Redondo Beach, CA
    • M. T. Chan, Y. Zhang, and T. S. Huang, "Real-time lip tracking and bimodal continuous speech recognition," in Proc. Workshop Multimedia Signal Processing, Redondo Beach, CA, 1998, pp. 65-70.
    • (1998) Proc. Workshop Multimedia Signal Processing , pp. 65-70
    • Chan, M.T.1    Zhang, Y.2    Huang, T.S.3
  • 108
    • 84931090061
    • Liveness verification in audio-video authentication
    • Jeju Island, Korea
    • G. Chetty and M. Wagner, "'Liveness' verification in audio-video authentication," in Proc. Int. Conf. Spoken Language Processing, Jeju Island, Korea, 2004, pp. 2509-2512.
    • (2004) Proc. Int. Conf. Spoken Language Processing , pp. 2509-2512
    • Chetty, G.1    Wagner, M.2
  • 110
    • 0026903014
    • Feature extraction from faces using deformable templates
    • A. L. Yuille, P. W. Hallinan, and D. S. Cohen, "Feature extraction from faces using deformable templates," Int. J. Computer Vision, vol. 8, no. 2, pp. 99-111, 1992.
    • (1992) Int. J. Computer Vision , vol.8 , Issue.2 , pp. 99-111
    • Yuille, A.L.1    Hallinan, P.W.2    Cohen, D.S.3
  • 113
    • 21244474602
    • Audio-visual speaker recognition for broadcast news: Some fusion techniques
    • Copenhagen, Denmark
    • B. Maison, C. Neti, and A. Senior, "Audio-visual speaker recognition for broadcast news: Some fusion techniques," in Proc. Works. Multimedia Signal Processing, Copenhagen, Denmark, 1999, pp. 161-167.
    • (1999) Proc. Works. Multimedia Signal Processing , pp. 161-167
    • Maison, B.1    Neti, C.2    Senior, A.3
  • 114
    • 85135321224
    • See me, hear me: Integrating automatic speech recognition and lip-reading
    • Yokohama, Japan, Sep. 18-22
    • P. Duchnowski, U. Meier, and A. Waibel, "See me, hear me: Integrating automatic speech recognition and lip-reading," in Proc. Int. Conf. Spoken Lang. Processing, Yokohama, Japan, Sep. 18-22, 1994, pp. 547-550.
    • (1994) Proc. Int. Conf. Spoken Lang. Processing , pp. 547-550
    • Duchnowski, P.1    Meier, U.2    Waibel, A.3
  • 115
    • 0032314380
    • An image transform approach for HMM based automatic lipreading
    • Chicago, IL, Oct. 4-7
    • G. Potamianos, H. P. Graf, and E. Cosatto, "An image transform approach for HMM based automatic lipreading," in Proc. Int. Conf. Image Processing, Chicago, IL, Oct. 4-7, 1998, vol. 1, pp. 173-177.
    • (1998) Proc. Int. Conf. Image Processing , vol.1 , pp. 173-177
    • Potamianos, G.1    Graf, H.P.2    Cosatto, E.3
  • 116
    • 33749247429
    • Comparison of MPEG-4 facial animation parameter groups with respect to audio-visual speech recognition performance
    • Italy, Sep
    • P. S. Aleksic and A. K. Katsaggelos, "Comparison of MPEG-4 facial animation parameter groups with respect to audio-visual speech recognition performance," in Proc. Int. Conf. Image Processing, Italy, Sep. 2005, vol. 5, pp. 501-504.
    • (2005) Proc. Int. Conf. Image Processing , vol.5 , pp. 501-504
    • Aleksic, P.S.1    Katsaggelos, A.K.2
  • 117
    • 0036875048
    • Automatic speechreading with applications to human-computer interfaces
    • X. Zhang, C. C. Broun, R. M. Mersereau, and M. Clements, "Automatic speechreading with applications to human-computer interfaces," EURASIP J. Appl. Signal Processing, vol. 2002, no. 11, pp. 1228-1247, 2002.
    • (2002) EURASIP J. Appl. Signal Processing , vol.2002 , Issue.11 , pp. 1228-1247
    • Zhang, X.1    Broun, C.C.2    Mersereau, R.M.3    Clements, M.4
  • 118
    • 0036875002
    • A support vector machine-based dynamic network for visual speech recognition applications
    • M. Gordan, C. Kotropoulos, and I. Pitas, "A support vector machine-based dynamic network for visual speech recognition applications," EURASIP J. Appl. Signal Processing, vol. 2002, no. 11, pp. 1248-1259, 2002.
    • (2002) EURASIP J. Appl. Signal Processing , vol.2002 , Issue.11 , pp. 1248-1259
    • Gordan, M.1    Kotropoulos, C.2    Pitas, I.3
  • 119
    • 30344436680
    • User authentication via adapted statistical models of face images
    • Jan
    • F. Cardinaux, C. Sanderson, and S. Bengio, "User authentication via adapted statistical models of face images," IEEE Trans. Signal Processing, vol. 54, no. 1, pp. 361-373, Jan. 2006.
    • (2006) IEEE Trans. Signal Processing , vol.54 , Issue.1 , pp. 361-373
    • Cardinaux, F.1    Sanderson, C.2    Bengio, S.3
  • 120
    • 0033738539
    • The NIST speaker recognition evaluation - Overview, methodology, systems, results, perspective
    • G. R. Doddington, M. A. Przybocki, A. F. Martin, and D. A. Reynolds, "The NIST speaker recognition evaluation - Overview, methodology, systems, results, perspective," Speech Commun., vol. 31, no. 2-3, pp. 225-254, 2000.
    • (2000) Speech Commun , vol.31 , Issue.2-3 , pp. 225-254
    • Doddington, G.R.1    Przybocki, M.A.2    Martin, A.F.3    Reynolds, D.A.4
  • 122
    • 23744485282
    • The expected performance curve: A new assessment measure for person authentication
    • Toledo, OH
    • S. Bengio and J. Mariethoz, "The expected performance curve: A new assessment measure for person authentication," in Proc. Speaker Language Recognition Works. (Odyssey), Toledo, OH, 2004, pp. 279-284.
    • (2004) Proc. Speaker Language Recognition Works. (Odyssey) , pp. 279-284
    • Bengio, S.1    Mariethoz, J.2
  • 123
    • 84875984350
    • Multisensor data fusion
    • D. L. Hall and J. Llinas, Eds. Boca Raton, FL: CRC
    • D. L. Hall and J. Llinas, "Multisensor data fusion," in Handbook of Multisensor Data Fusion, D. L. Hall and J. Llinas, Eds. Boca Raton, FL: CRC, 2001, pp. 1-10.
    • (2001) Handbook of Multisensor Data Fusion , pp. 1-10
    • Hall, D.L.1    Llinas, J.2
  • 129
    • 33947384251
    • Speech and Image Processing Research Group, Dept. of Electrical and Electronic Engineering, Univ. Wales Swansea
    • C. C. Chibelushi, F. Deravi, and J. S. Mason, BT DAVID Database - Internal Rep., Speech and Image Processing Research Group, Dept. of Electrical and Electronic Engineering, Univ. Wales Swansea, 1996.
    • (1996) BT DAVID Database - Internal Rep
    • Chibelushi, C.C.1    Deravi, F.2    Mason, J.S.3
  • 130
    • 26444562315
    • The realistic multi-modal VALID database and visual speaker identification comparison experiments
    • T. Kanade, A. K. Jain, and N. K. Ratha, Eds. New York: Springer-Verlag
    • N. Fox, B. O'Mullane, and R. B. Reilly, "The realistic multi-modal VALID database and visual speaker identification comparison experiments," in Lecture Notes in Computer Science, T. Kanade, A. K. Jain, and N. K. Ratha, Eds. New York: Springer-Verlag, 2005, vol. 3546, p. 777.
    • (2005) Lecture Notes in Computer Science , vol.3546 , pp. 777
    • Fox, N.1    O'Mullane, B.2    Reilly, R.B.3
  • 133
    • 85032752352
    • Audiovisual speech processing
    • Jan
    • T. Chen, "Audiovisual speech processing," IEEE Signal Processing Mag., vol. 18, pp. 9-21, Jan. 2001.
    • (2001) IEEE Signal Processing Mag , vol.18 , pp. 9-21
    • Chen, T.1
  • 134
    • 0000886386
    • Visual speech recognition with stochastic networks
    • G. Tesauro, D. Touretzky, and T. Leen, Eds. Cambridge, MA: MIT Press
    • J. R. Movellan, "Visual speech recognition with stochastic networks," in Advances in Neural Information Processing Systems, G. Tesauro, D. Touretzky, and T. Leen, Eds. Cambridge, MA: MIT Press, 1995, vol. 7.
    • (1995) Advances in Neural Information Processing Systems , vol.7
    • Movellan, J.R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.