SCOPUS Information Search Platform

ICMI'04 - Sixth International Conference on Multimodal Interfaces

Volume , Issue , 2004, Pages 235-242

A segment-based Audio-Visual speech recognizer: Data collection, development, and initial experiments

(4) Hazen, Timothy J. (a); Saenko, Kate (a); La, Chia Hao (a); Glass, James R. (a)

(a) Massachusetts Institute of Technology (United States)

Author keywords

Audio visual corpora; Audio visual speech recognition

Indexed keywords

APPROXIMATION THEORY; COMPUTER SIMULATION; DATA ACQUISITION; INFORMATION ANALYSIS; MARKOV PROCESSES; MATHEMATICAL MODELS;

AUDIO-VISUAL CORPORA; AUDIO-VISUAL SPEECH RECOGNITION (AVSR); VISUAL INFORMATION; VISUAL MODALITY;

SPEECH RECOGNITION;

EID: 14944353581
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (108)

References (22)

1
- 0039228740
- The intrinsic bimodality of speech communication and the synthesis of talking faces
- Hungary, September
- C. Benoit. The intrinsic bimodality of speech communication and the synthesis of talking faces. In Journal on Communications of the Scientific Society for Telecommunications, Hungary, number 43, pages 32-40, September 1992.
- (1992) Journal on Communications of the Scientific Society for Telecommunications , Issue.43 , pp. 32-40
- Benoit, C.¹

2
- 84925639646
- Real-time lip tracking and bimodal continuous speech recognition
- Redondo Beach, CA
- M. T. Chan, Y. Zhang, and T. S. Huang. Real-time lip tracking and bimodal continuous speech recognition. In Proc. of the Workshop on Multimedia Signal Processing, pp. 65-70, Redondo Beach, CA, 1998.
- (1998) Proc. of the Workshop on Multimedia Signal Processing , pp. 65-70
- Chan, M.T.¹ Zhang, Y.² Huang, T.S.³

3
- 85009135946
- Bimodal speech recognition using coupled hidden Markov models
- Beijing, October
- S. Chu and T. Huang. Bimodal speech recognition using coupled hidden Markov models. In Proc. of the International Conference on Spoken Language Processing, vol. II, Beijing, October 2000.
- (2000) Proc. of the International Conference on Spoken Language Processing , vol.2
- Chu, S.¹ Huang, T.²

4
- 0034270644
- Audio-visual speech modeling for continuous speech recognition
- September
- S. Dupont and J. Luettin. Audio-visual speech modeling for continuous speech recognition. In IEEE Transactions on Multimedia, number 2, pages 141-151, September 2000.
- (2000) IEEE Transactions on Multimedia , Issue.2 , pp. 141-151
- Dupont, S.¹ Luettin, J.²

5
- 0038359548
- A probabilistic framework for segment-based speech recognition
- To appear in
- J. Glass. A probabilistic framework for segment-based speech recognition. To appear in Computer Speech and Language, 2003.
- (2003) Computer Speech and Language
- Glass, J.¹

6
- 85128407852
- Heterogeneous measurements and multiple classifiers for speech recognition
- Sydney, Australia, November
- A. Halberstadt and J. Glass. Heterogeneous measurements and multiple classifiers for speech recognition. In Proceedings of ICSLP 98, Sydney, Australia, November 1998.
- (1998) Proceedings of ICSLP 98
- Halberstadt, A.¹ Glass, J.²

7
- 84892140515
- Using aggregation to improve the performance of mixture Gaussian acoustic models
- Seattle, May
- T. J. Hazen and A. Halberstadt. Using aggregation to improve the performance of mixture Gaussian acoustic models. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Seattle, May 1998.
- (1998) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
- Hazen, T.J.¹ Halberstadt, A.²

8
- 14944382509
- May
- IBM Research - Audio Visual Speech Technologies: Data Collection. Accessed online at http://www.research.ibm.com/AVSTG/data.html, May 2003.
- (2003) IBM Research - Audio Visual Speech Technologies: Data Collection

9
- 14944355052
- Intel's AVCSR Toolkit source code can be downloaded from http://sourceforge.net/projects/opencvlibrary/.

10
- 0024768209
- Speaker-independent phone recognition using hidden Markov models
- November
- K. F. Lee and H. W. Hon. Speaker-independent phone recognition using hidden Markov models. In IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 11, pp. 1641-1648, November 1989.
- (1989) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.37 , Issue.11 , pp. 1641-1648
- Lee, K.F.¹ Hon, H.W.²

11
- 79952493967
- Speaker independent audio-visual continuous speech recognition
- L. H. Liang, X. X. Liu, Y. Zhao, X. Pi and A.V. Nefian. Speaker independent audio-visual continuous speech recognition. In Proc. of the IEEE International Conference on Multimedia and Expo, vol.2, pp. 25-28, 2002.
- (2002) Proc. of the IEEE International Conference on Multimedia and Expo , vol.2 , pp. 25-28
- Liang, L.H.¹ Liu, X.X.² Zhao, Y.³ Pi, X.⁴ Nefian, A.V.⁵

12
- 0030355932
- Audio-visual speech recognition using multiscale nonlinear image decomposition
- Philadelphia, PA
- I. Matthews, J. A. Bangham, and S. Cox. Audio-visual speech recognition using multiscale nonlinear image decomposition. In Proc. of the International Conference on Spoken Language Processing, pp. 38-41, Philadelphia, PA, 1996.
- (1996) Proc. of the International Conference on Spoken Language Processing , pp. 38-41
- Matthews, I.¹ Bangham, J.A.² Cox, S.³

13
- 0034238554
- Towards unrestricted lip reading
- August
- U. Meier, R. Stiefelhagen, J. Yang, and A. Waibel. Towards unrestricted lip reading. In International Journal of Pattern Recognition and Artificial Intelligence, number 14, pages 571-585, August 2000.
- (2000) International Journal of Pattern Recognition and Artificial Intelligence , Issue.14 , pp. 571-585
- Meier, U.¹ Stiefelhagen, R.² Yang, J.³ Waibel, A.⁴

14
- 0001935972
- XM2VTSDB: The extended M2VTS database
- Washington, D.C., March. 16 IDIAP-RR 99-02
- K. Messer, J. Matas, J. Kittler, and K. Jonsson. XM2VTSDB: The extended M2VTS database. In Audio- and Video-based Biometric Person Authentication, AVBPA'99, pages 72-77, Washington, D.C., March 1999. 16 IDIAP-RR 99-02.
- (1999) Audio- and Video-based Biometric Person Authentication, AVBPA'99 , pp. 72-77
- Messer, K.¹ Matas, J.² Kittler, J.³ Jonsson, K.⁴

15
- 0004052871
- Audio-visual speech recognition
- Baltimore, Maryland. The Johns Hopkins University
- C. Neti, et al. Audio-visual speech recognition. In Technical Report, Center for Language and Speech Processing, Baltimore, Maryland, 2000. The Johns Hopkins University.
- (2000) Technical Report, Center for Language and Speech Processing
- Neti, C.¹

16
- 0006184263
- The M2VTS multimodal face database
- Workshop, Germany
- S. Pigeon and L. Vandendorpe. The M2VTS multimodal face database. In Proc. of the Audio- and Video-based Biometric Person Authentication Workshop, Germany, 1997.
- (1997) Proc. of the Audio- and Video-based Biometric Person Authentication
- Pigeon, S.¹ Vandendorpe, L.²

17
- 85009230873
- Audio-visual speech recognition in challenging environments
- Geneva, Switzerland, September
- G. Potamianos and C. Neti. Audio-visual speech recognition in challenging environments. In Proc. of EUROSPEECH, pp. 1293-1296, Geneva, Switzerland, September 2003.
- (2003) Proc. of EUROSPEECH , pp. 1293-1296
- Potamianos, G.¹ Neti, C.²

18
- 14944351246
- Articulatory features for robust visual speech recognition
- In these proceedings, State College, Pennsylvania
- K. Saenko, T. Darrell, and J. Glass. Articulatory features for robust visual speech recognition. In these proceedings, ICMI'04, State College, Pennsylvania, 2004.
- (2004) ICMI'04
- Saenko, K.¹ Darrell, T.² Glass, J.³

19
- 0041355006
- The VidTIMIT database
- Martigny, Switzerland
- C. Sanderson. The VidTIMIT Database. IDIAP Communication 02-06, Martigny, Switzerland, 2002.
- (2002) IDIAP Communication , vol.2 , Issue.6
- Sanderson, C.¹

20
- 0142017304
- PhD Thesis, Griffith University, Brisbane, Australia
- C. Sanderson. Automatic Person Verification Using Speech and Face Information. PhD Thesis, Griffith University, Brisbane, Australia, 2002.
- (2002) Automatic Person Verification Using Speech and Face Information
- Sanderson, C.¹

21
- 14944356145
- Acoustic modeling improvements in a segment-based speech recognizer
- Keystone, CO, December
- N. Ström, L. Hetherington, T.J. Hazen, E. Sandness, and J. Glass. Acoustic modeling improvements in a segment-based speech recognizer. In Proc. 1999 IEEE ASRU Workshop, Keystone, CO, December 1999.
- (1999) Proc. 1999 IEEE ASRU Workshop
- Ström, N.¹ Hetherington, L.² Hazen, T.J.³ Sandness, E.⁴ Glass, J.⁵

22
- 0025477640
- Speech database development: TIMIT and beyond
- V. Zue, S. Seneff, and J. Glass. Speech database development: TIMIT and beyond. Speech Communication, vol. 9, no. 4, pp. 351-356, 1990.
- (1990) Speech Communication , vol.9 , Issue.4 , pp. 351-356
- Zue, V.¹ Seneff, S.² Glass, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.