2004, Pages 152-158

Articulatory features for robust visual speech recognition

Author keywords

Articulatory features; Audio visual speech recognition; Multimodal interfaces; Speechreading; Support vector machines; Visual feature extraction

Indexed keywords

FEATURE EXTRACTION; HUMAN COMPUTER INTERACTION; IMAGE ANALYSIS; INTERFACES (COMPUTER); SPEECH ANALYSIS; VISUAL COMMUNICATION

EID: 14944351246     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1027933.1027960     Document Type: Conference Paper
Times cited: 30

References (33)
  • 1
    • 0001432664
    • On the integration of auditory and visual parameters in HMM-based ASR
    • D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer
    • A. Adjoudani and C. Benoit, "On the integration of auditory and visual parameters in HMM-based ASR," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer, pp. 461-471, 1996.
    • (1996) Speechreading by Humans and Machines , pp. 461-471
    • Adjoudani, A.1    Benoit, C.2
  • 2
    • 0003152968
    • Speech enhancement in the 1980s: Noise suppression with pattern matching
    • Dekker
    • S. Boll, "Speech enhancement in the 1980s: noise suppression with pattern matching," In Advances in Speech Signal Processing, pp. 309-325, Dekker, 1992.
    • (1992) Advances in Speech Signal Processing , pp. 309-325
    • Boll, S.1
  • 3
    • 85013597845
    • Eigenlips for robust speech recognition
    • C. Bregler and Y. Konig, "Eigenlips for Robust Speech Recognition," In Proc. ICASSP, 1994.
    • (1994) Proc. ICASSP
    • Bregler, C.1    Konig, Y.2
  • 4
    • 84925639646
    • Real-time lip tracking and bimodal continuous speech recognition
    • Redondo Beach, CA
    • M. Chan, Y. Zhang, and T. Huang, "Real-time lip tracking and bimodal continuous speech recognition," in Proc. Works. Multimedia Signal Processing, pp. 65-70, Redondo Beach, CA, 1998.
    • (1998) Proc. Works. Multimedia Signal Processing , pp. 65-70
    • Chan, M.1    Zhang, Y.2    Huang, T.3
  • 7
    • 85009135946
    • Bimodal speech recognition using coupled hidden Markov models
    • Beijing, China
    • S. Chu and T. Huang, "Bimodal speech recognition using coupled hidden Markov models," In Proc. Int. Conf. Spoken Lang. Processing, vol. II, Beijing, China, pp. 747-750, 2000.
    • (2000) Proc. Int. Conf. Spoken Lang. Processing , vol.2 , pp. 747-750
    • Chu, S.1    Huang, T.2
  • 10
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, 2000.
    • (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 12
    • 0036875002
    • A support vector machine based dynamic network for visual speech recognition applications
    • M. Gordan, C. Kotropoulos, and I. Pitas, "A support vector machine based dynamic network for visual speech recognition applications," EURASIP J. Appl. Signal Processing, vol. 2002, no. 11, pp. 1248-1259, 2002.
    • (2002) EURASIP J. Appl. Signal Processing , vol.2002 , Issue.11 , pp. 1248-1259
    • Gordan, M.1    Kotropoulos, C.2    Pitas, I.3
  • 13
    • 0034841727
    • Application of affine-invariant Fourier descriptors to lipreading for audio-visual speech recognition
    • Salt Lake City, UT
    • S. Gurbuz, Z. Tufekci, E. Patterson, and J. Gowdy, "Application of affine-invariant Fourier descriptors to lipreading for audio-visual speech recognition," in Proc. Int. Conf. Acoust., Speech, Signal Processing, pp. 177-180, Salt Lake City, UT, 2001.
    • (2001) Proc. Int. Conf. Acoust., Speech, Signal Processing , pp. 177-180
    • Gurbuz, S.1    Tufekci, Z.2    Patterson, E.3    Gowdy, J.4
  • 16
    • 0038193561
    • Combining acoustic and articulatory-feature information for robust speech recognition
    • Sydney
    • K. Kirchhoff, G. Fink and G. Sagerer, "Combining Acoustic and Articulatory-feature Information for Robust Speech Recognition," In Proc. ICSLP, pp. 891-894, Sydney, 1998.
    • (1998) Proc. ICSLP , pp. 891-894
    • Kirchhoff, K.1    Fink, G.2    Sagerer, G.3
  • 18
    • 14944341906
    • Feature-based pronunciation modeling for speech recognition
    • Boston
    • K. Livescu and J. Glass, "Feature-based Pronunciation Modeling for Speech Recognition," In Proc. HLT/NAACL, Boston, 2004.
    • (2004) Proc. HLT/NAACL
    • Livescu, K.1    Glass, J.2
  • 19
    • 0025750892
    • Automatic lipreading by optical flow analysis
    • K. Mase and A. Pentland, "Automatic Lipreading by optical flow analysis," Systems and Computers in Japan, vol. 22, no. 6, pp. 67-76, 1991.
    • (1991) Systems and Computers in Japan , vol.22 , Issue.6 , pp. 67-76
    • Mase, K.1    Pentland, A.2
  • 21
    • 85009240321
    • A flexible stream architecture for ASR using articulatory features
    • Denver
    • F. Metze, and A. Waibel, "A Flexible Stream Architecture for ASR Using Articulatory Features," In Proc. ICSLP, Denver, 2002.
    • (2002) Proc. ICSLP
    • Metze, F.1    Waibel, A.2
  • 22
    • 84955023511
    • An analysis of perceptual confusions among some English consonants
    • G. Miller and P. Nicely, "An Analysis of Perceptual Confusions among some English Consonants," J. Acoustical Society America, vol. 27, no. 2, pp. 338-352, 1955.
    • (1955) J. Acoustical Society America , vol.27 , Issue.2 , pp. 338-352
    • Miller, G.1    Nicely, P.2
  • 23
    • 0035790960
    • Large-vocabulary audio-visual speech recognition: A summary of the Johns Hopkins summer 2000 workshop
    • Cannes, France
    • C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, and D. Vergyri, "Large-vocabulary audio-visual speech recognition: A summary of the Johns Hopkins Summer 2000 Workshop," In Proc. Works. Signal Processing, pp. 619-624, Cannes, France, 2001.
    • (2001) Proc. Works. Signal Processing , pp. 619-624
    • Neti, C.1    Potamianos, G.2    Luettin, J.3    Matthews, I.4    Glotin, H.5    Vergyri, D.6
  • 24
    • 0033676801
    • Denoising of human speech using combined acoustic and EM sensor signal processing
    • Istanbul, Turkey
    • L. Ng, G. Burnett, J. Holzrichter, and T. Gable, "Denoising of Human Speech Using Combined Acoustic and EM Sensor Signal Processing," In Proc. ICASSP, Istanbul, Turkey, 2000.
    • (2000) Proc. ICASSP
    • Ng, L.1    Burnett, G.2    Holzrichter, J.3    Gable, T.4
  • 26
    • 0021541159
    • Automatic lipreading to enhance speech recognition
    • Atlanta, GA
    • E. Petajan, "Automatic lipreading to enhance speech recognition," In Proc. Global Telecomm. Conf., pp. 265-272, Atlanta, GA, 1984.
    • (1984) Proc. Global Telecomm. Conf. , pp. 265-272
    • Petajan, E.1
  • 27
    • 85009230873
    • Audio-visual speech recognition in challenging environments
    • Geneva
    • G. Potamianos and C. Neti, "Audio-visual speech recognition in challenging environments," In Proc. Eur. Conf. Speech Comm. Tech., pp. 1293-1296, Geneva, 2003.
    • (2003) Proc. Eur. Conf. Speech Comm. Tech. , pp. 1293-1296
    • Potamianos, G.1    Neti, C.2
  • 28
  • 29
    • 0034517163
    • A cascade image transform for speaker-independent automatic speechreading
    • New York
    • G. Potamianos, A. Verma, C. Neti, G. Iyengar, and S. Basu, "A Cascade Image Transform for Speaker-Independent Automatic Speechreading," In Proc. ICME, volume II, pp. 1097-1100, New York, 2000.
    • (2000) Proc. ICME , vol.2 , pp. 1097-1100
    • Potamianos, G.1    Verma, A.2    Neti, C.3    Iyengar, G.4    Basu, S.5
  • 30
    • 0001048664
    • Visual contribution to speech intelligibility in noise
    • W. Sumby, and I. Pollack, "Visual contribution to speech intelligibility in noise," J. Acoustical Society America, vol. 26, no. 2, pp. 212-215, 1954.
    • (1954) J. Acoustical Society America , vol.26 , Issue.2 , pp. 212-215
    • Sumby, W.1    Pollack, I.2
  • 31
    • 0036165806
    • An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition
    • J. Sun and L. Deng, "An Overlapping-Feature Based Phonological Model Incorporating Linguistic Constraints: Applications to Speech Recognition", J. Acoustic Society of America, vol. 111, No. 2, pp. 1086-1101, 2002.
    • (2002) J. Acoustic Society of America , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.1    Deng, L.2
  • 32
    • 0003770986
    • Comparing models for audiovisual fusion in a noisy-vowel recognition task
    • P. Teissier, J. Robert-Ribes, and J. Schwartz, "Comparing models for audiovisual fusion in a noisy-vowel recognition task," IEEE Trans. Speech Audio Processing, vol. 7, no. 6, pp. 629-642, 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.6 , pp. 629-642
    • Teissier, P.1    Robert-Ribes, J.2    Schwartz, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.