Volume 5, Issue 2, 2004, Pages 91-101

Continuous audio-visual digit recognition using N-best decision fusion

Author keywords

Audio visual speech; Decision fusion; Lip reading; Speech recognition

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; ALGORITHMS; DATABASE SYSTEMS; MARKOV PROCESSES; MAXIMUM LIKELIHOOD ESTIMATION; SPURIOUS SIGNAL NOISE; VIDEO SIGNAL PROCESSING; WORD PROCESSING

EID: 1842854571     PISSN: 15662535     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.inffus.2003.07.001     Document Type: Article
Times cited: 26

References (37)
  • 1
    • 0031187171
    • Speech recognition by machines and humans
    • Lippmann R.P. Speech recognition by machines and humans. Speech Communication. 22:1997;1-15.
    • (1997) Speech Communication , vol.22 , pp. 1-15
    • Lippmann, R.P.1
  • 6
    • 0034848499
    • Optimal weighting of posteriors for audio-visual speech recognition
    • Salt Lake
    • M. Heckmann, F. Berthommier, K. Kroschel, Optimal weighting of posteriors for audio-visual speech recognition, in: Proceedings on ICASSP 2001, Salt Lake, 2001, pp. 161-164.
    • (2001) Proceedings on ICASSP 2001 , pp. 161-164
    • Heckmann, M.1    Berthommier, F.2    Kroschel, K.3
  • 8
    • 85009083793
    • Comparing audio- and a-posteriori-probability-based stream confidence measures for audio-visual speech recognition
    • Aalborg
    • M. Heckmann, T. Wild, F. Berthommier, K. Kroschel, Comparing audio- and a-posteriori-probability-based stream confidence measures for audio-visual speech recognition, in: Proceedings on Eurospeech 2001, Aalborg, 2001, pp. 1023-1026.
    • (2001) Proceedings on Eurospeech 2001 , pp. 1023-1026
    • Heckmann, M.1    Wild, T.2    Berthommier, F.3    Kroschel, K.4
  • 9
    • 85009153179
    • Stream confidence estimation for audio-visual speech recognition
    • Beijing
    • G. Potamianos, C. Neti, Stream confidence estimation for audio-visual speech recognition, in: Proceedings on ICSLP 2000, Beijing, 2000, pp. 746-749.
    • (2000) Proceedings on ICSLP 2000 , pp. 746-749
    • Potamianos, G.1    Neti, C.2
  • 12
    • 0020836249
    • Evaluation and integration of visual and auditory information in speech perception
    • Massaro D.W., Cohen M.M. Evaluation and integration of visual and auditory information in speech perception. Journal of Experimental Psychology: HPP. 9:1983;751-753.
    • (1983) Journal of Experimental Psychology: HPP , vol.9 , pp. 751-753
    • Massaro, D.W.1    Cohen, M.M.2
  • 13
    • 85009080413
    • Auditory visual speech processing
    • Aalborg
    • D.W. Massaro, Auditory visual speech processing, in: Proceedings on Eurospeech 2001, Aalborg, 2001, pp. 1153-1156.
    • (2001) Proceedings on Eurospeech 2001 , pp. 1153-1156
    • Massaro, D.W.1
  • 14
    • 0000789852
    • Channel separability in the audio-visual integration of speech: A Bayesian approach
    • Speechreading by Man and Machine, Models, Systems and Applications, Berlin: Springer-Verlag
    • Movellan J.R., Chadderon G. Channel separability in the audio-visual integration of speech: a Bayesian approach. Speechreading by Man and Machine, Models, Systems and Applications. NATO ASI Series. 1996;473-487 Springer-Verlag, Berlin.
    • (1996) NATO ASI Series , pp. 473-487
    • Movellan, J.R.1    Chadderon, G.2
  • 15
    • 0002028032
    • Some preliminaries to a comprehensive account of audio-visual speech perception
    • B. Dodd, & R. Campbell. Hillsdale, NJ: Lawrence Erlbaum Associates
    • Summerfield A.Q. Some preliminaries to a comprehensive account of audio-visual speech perception. Dodd B., Campbell R. Hearing by Eye, the Psychology of Lip-reading. 1987;3-51 Lawrence Erlbaum Associates, Hillsdale, NJ.
    • (1987) Hearing by Eye, the Psychology of Lip-reading , pp. 3-51
    • Summerfield, A.Q.1
  • 18
    • 0034270644
    • Audio-visual speech modelling for continuous speech recognition
    • Dupont S., Luettin J. Audio-visual speech modelling for continuous speech recognition. IEEE Transactions on Multimedia. 2:2000;141-151.
    • (2000) IEEE Transactions on Multimedia , vol.2 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 19
  • 20
    • 84987702417
    • The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
    • Beijing
    • D. Pearce, H.-G. Hirsch, The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, in: Proceedings on ICSLP'00, Beijing, vol. 4, 2000, pp. 29-32.
    • (2000) Proceedings on ICSLP'00 , vol.4 , pp. 29-32
    • Pearce, D.1    Hirsch, H.-G.2
  • 23
    • 0003552976
    • Preprocessing video images for neural learning of lipreading
    • K.V. Prasad, G. Storck, G.J. Wolf, Preprocessing video images for neural learning of lipreading, Ricoh CRC Technical Report 93-26, 1993.
    • (1993) Ricoh CRC Technical Report , vol.93 , Issue.26
    • Prasad, K.V.1    Storck, G.2    Wolf, G.J.3
  • 24
    • 85013597845
    • Eigenlips for robust speech recognition
    • Adelaide
    • C. Bregler, Y. Konig, Eigenlips for robust speech recognition, in: Proceedings on ICASSP'94, Adelaide, 1994, pp. 669-672.
    • (1994) Proceedings on ICASSP'94 , pp. 669-672
    • Bregler, C.1    Konig, Y.2
  • 29
    • 0008571982
    • PCA image coding schemes and visual speech intelligibility
    • Windermere
    • N.M. Brooke, S.D. Scott, PCA image coding schemes and visual speech intelligibility, in: Proceedings on Institute of Acoustics, Windermere, vol. 16, 1994, pp. 123-129.
    • (1994) Proceedings on Institute of Acoustics , vol.16 , pp. 123-129
    • Brooke, N.M.1    Scott, S.D.2
  • 30
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • Dupont S., Luettin J. Audio-visual speech modeling for continuous speech recognition. IEEE Transactions on Multimedia. 2:2000;141-151.
    • (2000) IEEE Transactions on Multimedia , vol.2 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 33
    • 0000813366
    • Talking heads and speech recognisers that can see: The computer processing of visual speech signals
    • D.G. Stork, & M.E. Hennecke. Berlin: Springer-Verlag
    • Brooke N.M. Talking heads and speech recognisers that can see: the computer processing of visual speech signals. Stork D.G., Hennecke M.E. Speechreading by Humans and Machines. 1996;351-371 Springer-Verlag, Berlin.
    • (1996) Speechreading by Humans and Machines , pp. 351-371
    • Brooke, N.M.1
  • 35
    • 85009284526
    • DCT-based video features for audio-visual speech recognition
    • Beijing
    • M. Heckmann, K. Kroschel, C. Savariaux, F. Berthommier, DCT-based video features for audio-visual speech recognition, in: Proceedings on ICSLP, Beijing, vol. 3, 2002, pp. 1925-1928.
    • (2002) Proceedings on ICSLP , vol.3 , pp. 1925-1928
    • Heckmann, M.1    Kroschel, K.2    Savariaux, C.3    Berthommier, F.4
  • 36
    • 0002358797
    • Discriminative learning of visual data for audiovisual speech recognition
    • Rogozan A. Discriminative learning of visual data for audiovisual speech recognition. International Journal on Artificial Intelligence Tools. 8:1999;43-52.
    • (1999) International Journal on Artificial Intelligence Tools , vol.8 , pp. 43-52
    • Rogozan, A.1


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.