



Volume 35, Issue 1, 2003, Pages 41-64

Audio-visual speech recognition using red exclusion and neural networks

Author keywords

Audio Visual Speech Recognition; Feature Extraction; Neural Networks; Sensor Fusion

Indexed keywords

ACOUSTIC NOISE; LINGUISTICS; NEURAL NETWORKS; SENSOR DATA FUSION

EID: 0041624571     PISSN: 1443458X     EISSN: None     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 12

References (54)
  • 2
    • 4244194696
    • Multispectral color modeling
    • University of Pennsylvania, CIS
    • ANGELOPOULOU, E., MOLANA, R., and DANIILIDIS, K. (2001): Multispectral color modeling. Technical Report MS-CIS-01-22, University of Pennsylvania, CIS.
    • (2001) Technical Report , vol.MS-CIS-01-22
    • Angelopoulou, E.1    Molana, R.2    Daniilidis, K.3
  • 8
    • 0029304865
    • Human and machine recognition of faces: A survey
    • CHELLAPPA, R., WILSON, C., and SIROHEY, S. (1995): Human and machine recognition of faces: A survey, in Proceedings of the IEEE, 83(5): 705-739.
    • (1995) Proceedings of the IEEE , vol.83 , Issue.5 , pp. 705-739
    • Chellappa, R.1    Wilson, C.2    Sirohey, S.3
  • 11
    • 0042453480
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • WAIBEL, A. and LEE, K., editors, Morgan Kaufmann Publishers Inc., San Mateo, CA
    • DAVIS, S. and MERMELSTEIN, P. (1990): Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. In WAIBEL, A. and LEE, K., editors, Readings in Speech Recognition, 64-74. Morgan Kaufmann Publishers Inc., San Mateo, CA.
    • (1990) Readings in Speech Recognition , pp. 64-74
    • Davis, S.1    Mermelstein, P.2
  • 15
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • DUPONT, S. and LUETTIN, J. (2000): Audio-visual speech modeling for continuous speech recognition. IEEE Transactions on Multimedia, 2(3):141-151.
    • (2000) IEEE Transactions on Multimedia , vol.2 , Issue.3 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 19
    • 0041451439
    • The use of visible speech cues (speechreading) for directing auditory attention: Reducing temporal and spectral uncertainty in auditory detection of spoken utterances
    • GRANT, K. and SEITZ, P. (1998): The use of visible speech cues (speechreading) for directing auditory attention: Reducing temporal and spectral uncertainty in auditory detection of spoken utterances. In 16th International Congress on Acoustics.
    • (1998) 16th International Congress on Acoustics
    • Grant, K.1    Seitz, P.2
  • 20
    • 0000874921
    • Dynamic features for visual speechreading: A systematic comparison
    • MOZER, JORDAN, and PETSCHE, editors, MIT Press, Cambridge MA
    • GRAY, M., MOVELLAN, J., and SEJNOWSKI, T. (1997): Dynamic features for visual speechreading: A systematic comparison. In MOZER, JORDAN, and PETSCHE, editors, Advances in Neural Information Processing Systems, volume 9. MIT Press, Cambridge MA.
    • (1997) Advances in Neural Information Processing Systems , vol.9
    • Gray, M.1    Movellan, J.2    Sejnowski, T.3
  • 22
    • 0034848499
    • Optimal weighting of posteriors for audio-visual speech recognition
    • Salt Lake City, Utah
    • HECKMANN, M., BERTHOMMIER, F., and KROSCHEL, K. (2001b): Optimal weighting of posteriors for audio-visual speech recognition. In Proceedings of ICASSP 2001, Salt Lake City, Utah.
    • (2001) Proceedings of ICASSP 2001
    • Heckmann, M.1    Berthommier, F.2    Kroschel, K.3
  • 23
    • 4243462047
    • Automatic speech recognition using acoustic and visual signals
    • Ricoh California Research Centre
    • HENNECKE, M., PRASAD, K.V., and STORK, D. (1995): Automatic speech recognition using acoustic and visual signals. Technical Report CRC-TR-95-37, Ricoh California Research Centre.
    • (1995) Technical Report , vol.CRC-TR-95-37
    • Hennecke, M.1    Prasad, K.V.2    Stork, D.3
  • 26
    • 84992590661
    • Face locating and tracking for human-computer interaction
    • IEEE Computer Society, Pacific Grove, CA
    • HUNKE, M. and WAIBEL, A. (1994): Face locating and tracking for human-computer interaction. In 28th Annual Asilomar Conference on Signals, Systems, and Computers, IEEE Computer Society, Pacific Grove, CA. 2: 1277-1281.
    • (1994) 28th Annual Asilomar Conference on Signals, Systems, and Computers , vol.2 , pp. 1277-1281
    • Hunke, M.1    Waibel, A.2
  • 29
    • 1542320375
    • Lip feature extraction using red exclusion
    • EADES, P. and JIN, J., editors
    • LEWIS, T.W. and POWERS, D. (2001): Lip feature extraction using red exclusion. In EADES, P. and JIN, J., editors, CRPIT: Visualisation, 2000, 2: 61-70.
    • (2000) CRPIT: Visualisation , vol.2 , pp. 61-70
    • Lewis, T.W.1    Powers, D.2
  • 31
    • 0032072433
    • Speech recognition and sensory integration: A 240-year old theorem helps explain how people and machines can integrate auditory and visual information to understand speech
    • MASSARO, D. and STORK, D. (1998): Speech recognition and sensory integration: a 240-year old theorem helps explain how people and machines can integrate auditory and visual information to understand speech. American Scientist, 86(3): 236-245.
    • (1998) American Scientist , vol.86 , Issue.3 , pp. 236-245
    • Massaro, D.1    Stork, D.2
  • 32
    • 0017199877
    • Hearing lips and seeing voices
    • MCGURK, H. and MACDONALD, J. (1976): Hearing lips and seeing voices. Nature, 264:746-748.
    • (1976) Nature , vol.264 , pp. 746-748
    • McGurk, H.1    Macdonald, J.2
  • 35
    • 85029619676
    • Visual speech recognition with stochastic networks
    • Tesauro, G., Touretzky, D., and Leen, T., editors, MIT Press, Cambridge
    • MOVELLAN, J. (1995): Visual speech recognition with stochastic networks. In Tesauro, G., Touretzky, D., and Leen, T., editors, Advances in Neural Information Processing Systems, 7: 851-858. MIT Press, Cambridge.
    • (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 851-858
    • Movellan, J.1
  • 36
    • 0032138429
    • Robust sensor fusion: Analysis and application to audio visual speech recognition
    • MOVELLAN, J. and MINEIRO, P. (1998): Robust sensor fusion: Analysis and application to audio visual speech recognition. Machine Learning, 32: 85-100.
    • (1998) Machine Learning , vol.32 , pp. 85-100
    • Movellan, J.1    Mineiro, P.2
  • 39
    • 0010127090
    • Speaker adaptation for audio-visual speech recognition
    • Budapest
    • POTAMIANOS, G. and POTAMIANOS, A. (1999): Speaker adaptation for audio-visual speech recognition. In Proceedings of EUROSPEECH (3), 1291-1294, Budapest.
    • (1999) Proceedings of EUROSPEECH (3) , pp. 1291-1294
    • Potamianos, G.1    Potamianos, A.2
  • 40
    • 0003552976
    • Preprocessing video images for neural learning of lipreading
    • Ricoh California Research Centre
    • PRASAD, K., STORK, D., and WOLFF, G. (1993): Preprocessing video images for neural learning of lipreading. Technical Report CRC-TR-93-26, Ricoh California Research Centre.
    • (1993) Technical Report , vol.CRC-TR-93-26
    • Prasad, K.1    Stork, D.2    Wolff, G.3
  • 44
    • 0002358797
    • Discriminative learning of visual data for audiovisual speech recognition
    • ROGOZAN, A. (1999): Discriminative learning of visual data for audiovisual speech recognition. International Journal of Artificial Intelligence Tools, 8(1):43-52.
    • (1999) International Journal of Artificial Intelligence Tools , vol.8 , Issue.1 , pp. 43-52
    • Rogozan, A.1
  • 45
    • 0038133938
    • Digital representations of speech signals
    • WAIBEL, A. and LEE, K., editors, Morgan Kaufmann Publishers Inc., San Mateo, CA
    • SCHAFER, R. and RABINER, L. (1990): Digital representations of speech signals. In WAIBEL, A. and LEE, K., editors, Readings in Speech Recognition, 49-64. Morgan Kaufmann Publishers Inc., San Mateo, CA.
    • (1990) Readings in Speech Recognition , pp. 49-64
    • Schafer, R.1    Rabiner, L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.