SCOPUS Information Retrieval Platform

Volume 7, Issue 3, 2005, Pages 495-506

Integration strategies for audio-visual speech processing: Applied to text-dependent speaker recognition

Authors (4): Lucey, Simon (a, b); Chen, Tsuhan (b); Sridharan, Sridha (a); Chandran, Vinod (a)

a QUEENSLAND UNIVERSITY OF TECHNOLOGY (Australia)

b CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

Audio visual speech processing (AVSP); Classifier combination; Integration strategies; Multistream hidden Markov model (HMM); Speaker recognition

Indexed keywords

ERROR ANALYSIS; GRAPH THEORY; INTEGRATION; MARKOV PROCESSES; MATHEMATICAL MODELS; SPEECH PROCESSING;

AUDIO-VISUAL SPEECH PROCESSING (AVSP); CLASSIFIER COMBINATION; INTEGRATION STRATEGIES; MULTISTREAM HIDDEN MARKOV MODELS (HMM); SPEAKER RECOGNITION;

SPEECH RECOGNITION;

EID: 20444375102 PISSN: 15209210 EISSN: None Source Type: Journal
DOI: 10.1109/TMM.2005.846777 Document Type: Article

Times cited : (33)

References (31)

1
- 0032638088
- Robust speaker verification via fusion of speech and lip modalities
- T. Wark, S. Sridharan, and V. Chandran, "Robust speaker verification via fusion of speech and lip modalities," in Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'99), vol. 6, 1999, pp. 3061-3064.
- (1999) Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'99) , vol.6 , pp. 3061-3064
- Wark, T.¹ Sridharan, S.² Chandran, V.³

2
- 0032074310
- Audio-visual integration in multimodal communication
- May
- T. Chen and R. Rao, "Audio-visual integration in multimodal communication," Proc. IEEE, vol. 86, no. 5, pp. 837-852, May 1998.
- (1998) Proc. IEEE , vol.86 , Issue.5 , pp. 837-852
- Chen, T.¹ Rao, R.²

3
- 0029270677
- Converting speech into lip movements: A multimedia telephone for hard hearing people
- Mar.
- F. Lavagetto, "Converting speech into lip movements: A multimedia telephone for hard hearing people," IEEE Trans. Rehab. Eng., vol. 3, no. 1, pp. 90-102, Mar. 1995.
- (1995) IEEE Trans. Rehab. Eng. , vol.3 , Issue.1 , pp. 90-102
- Lavagetto, F.¹

4
- 0017199877
- Hearing lips and seeing voices
- Dec.
- H. McGurk and J. MacDonald, "Hearing lips and seeing voices," Nature, pp. 746-748, Dec. 1976.
- (1976) Nature , pp. 746-748
- McGurk, H.¹ MacDonald, J.²

5
- 0003544881
- NATO ASI Series F: Computer and Systems Sciences, Eds., Springer-Verlag, New York
- Speechreading by Humans and Machines, vol. 150, NATO ASI Series F: Computer and Systems Sciences, D. G. Stork and M. E. Hennecke, Eds., Springer-Verlag, New York, 1996.
- (1996) Speechreading by Humans and Machines , vol.150
- Stork, D.G.¹ Hennecke, M.E.²

6
- 0036502797
- A review of speech-based bimodal recognition
- Mar.
- C. C. Chibelushi, F. Deravi, and J. S. D. Mason, "A review of speech-based bimodal recognition," IEEE Trans. Multimedia, vol. 4, no. 1, pp. 23-37, Mar. 2002.
- (2002) IEEE Trans. Multimedia , vol.4 , Issue.1 , pp. 23-37
- Chibelushi, C.C.¹ Deravi, F.² Mason, J.S.D.³

7
- 0034270644
- Audio-visual speech modeling for continuous speech recognition
- Sep.
- S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000.
- (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
- Dupont, S.¹ Luettin, J.²

8
- 0031624666
- Discriminative training of HMM stream exponents for audio-visual speech recognition
- G. Potamianos and H. P. Graf, "Discriminative training of HMM stream exponents for audio-visual speech recognition," in Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'98), vol. 6, 1998, pp. 3733-3736.
- (1998) Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'98) , vol.6 , pp. 3733-3736
- Potamianos, G.¹ Graf, H.P.²

9
- 0025681008
- Hidden Markov model decomposition of speech and noise
- A. P. Varga and R. K. Moore, "Hidden Markov model decomposition of speech and noise," in Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'90), vol. 2, 1990, pp. 845-848.
- (1990) Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'90) , vol.2 , pp. 845-848
- Varga, A.P.¹ Moore, R.K.²

10
- 0032021555
- On combining classifiers
- Mar.
- J. Kittler, M. Hatef, R. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, no. 3, pp. 226-239, Mar. 1998.
- (1998) IEEE Trans. Pattern Anal. Machine Intell. , vol.20 , Issue.3 , pp. 226-239
- Kittler, J.¹ Hatef, M.² Duin, R.³ Matas, J.⁴

11
- 0032097263
- New York: Academic
- K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed. New York: Academic, 1990.
- (1990) Introduction to Statistical Pattern Recognition, 2nd Ed.
- Fukunaga, K.¹

12
- 22444454265
- Combining classifiers: A theoretical framework
- J. Kittler, "Combining classifiers: A theoretical framework," Pattern Anal. and Applicat., vol. 1, no. 1, pp. 18-27, 1998.
- (1998) Pattern Anal. and Applicat. , vol.1 , Issue.1 , pp. 18-27
- Kittler, J.¹

13
- 0004473740
- Modularity and catastrophic fusion: A Bayesian approach with applications to audio-visual speech recognition
- UCSD, Dept. Cognitive Sci., San Diego, CA
- J. R. Movellan and P. Mineiro, "Modularity and Catastrophic Fusion: A Bayesian Approach with Applications to Audio-Visual Speech Recognition," UCSD, Dept. Cognitive Sci., San Diego, CA, Tech. Rep. 97.01, 1997.
- (1997) Tech. Rep. 97.01
- Movellan, J.R.¹ Mineiro, P.²

14
- 0024766457
- A family of distortion measures based upon projection operation for robust speech recognition
- Nov.
- D. Mansour and B. H. Juang, "A family of distortion measures based upon projection operation for robust speech recognition," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 11, pp. 1659-1671, Nov. 1989.
- (1989) IEEE Trans. Acoust., Speech, Signal Process. , vol.37 , Issue.11 , pp. 1659-1671
- Mansour, D.¹ Juang, B.H.²

15
- 35248829639
- Data dependence in combining classifiers
- T. Windeatt and F. Roli, Eds.
- M. S. Kamel and N. M. Wanas, "Data dependence in combining classifiers," in Multiple Classifier Systems, T. Windeatt and F. Roli, Eds., 2003, pp. 1-14.
- (2003) Multiple Classifier Systems , pp. 1-14
- Kamel, M.S.¹ Wanas, N.M.²

16
- 82055174896
- Audio-visual speech recognition compared across two architectures
- A. Adjoudani and C. Benoit, "Audio-visual speech recognition compared across two architectures," in Proc. European Conf. Speech Communication and Technology (Eurospeech'95), 1995, pp. 1563-1566.
- (1995) Proc. European Conf. Speech Communication and Technology (Eurospeech'95) , pp. 1563-1566
- Adjoudani, A.¹ Benoit, C.²

17
- 84925595128
- Combining noise compensation with visual information in speech recognition
- Rhodes, Greece
- S. Cox, I. Matthews, and J. A. Bangham, "Combining noise compensation with visual information in speech recognition," in Auditory-Visual Speech Processing (AVSP'97), Rhodes, Greece, 1997.
- (1997) Auditory-visual Speech Processing (AVSP'97)
- Cox, S.¹ Matthews, I.² Bangham, J.A.³

18
- 85135374344
- Integration of acoustic and visual speech for speaker recognition
- C. C. Chibelushi, J. S. Mason, and F. Deravi, "Integration of acoustic and visual speech for speaker recognition," in Proc. European Conf. Speech Communication and Technology (Eurospeech'93), 1993, pp. 157-160.
- (1993) Proc. European Conf. Speech Communication and Technology (Eurospeech'93) , pp. 157-160
- Chibelushi, C.C.¹ Mason, J.S.² Deravi, F.³

19
- 0022019614
- Intermodal timing relations and audio-visual speech recognition
- Feb.
- M. McGrath and Q. Summerfield, "Intermodal timing relations and audio-visual speech recognition," J. Acoust. Soc. Amer., vol. 77, no. 2, pp. 678-685, Feb. 1985.
- (1985) J. Acoust. Soc. Amer. , vol.77 , Issue.2 , pp. 678-685
- McGrath, M.¹ Summerfield, Q.²

20
- 0034842342
- Asynchronous stream modeling for large vocabulary audio-visual speech recognition
- J. Luettin, G. Potamianos, and C. Neti, "Asynchronous stream modeling for large vocabulary audio-visual speech recognition," in Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'01), vol. 1, 2001, pp. 169-172.
- (2001) Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP'01) , vol.1 , pp. 169-172
- Luettin, J.¹ Potamianos, G.² Neti, C.³

21
- 85046873967
- The DET curve in assessment of detection task performance
- A. Martin, G. Doddington, T. Kamm, M. Ordowski, and P. Przybocki, "The DET curve in assessment of detection task performance," in Proc. European Conf. Speech Communication and Technology (Eurospeech'97), vol. 4, 1997, pp. 1895-1898.
- (1997) Proc. European Conf. Speech Communication and Technology (Eurospeech'97) , vol.4 , pp. 1895-1898
- Martin, A.¹ Doddington, G.² Kamm, T.³ Ordowski, M.⁴ Przybocki, P.⁵

22
- 0003922190
- New York: Wiley
- R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
- (2001) Pattern Classification, 2nd Ed
- Duda, R.O.¹ Hart, P.E.² Stork, D.G.³

23
- 0006184263
- The M2VTS multimodal face database
- Crans-Montana, Switzerland, Mar.
- S. Pigeon and L. Vandendorpe, "The M2VTS multimodal face database," in Proc. Int. Conf. Audio and Video-based Biometric Person Authentication (AVBPA'97), Crans-Montana, Switzerland, Mar. 1997.
- (1997) Proc. Int. Conf. Audio and Video-based Biometric Person Authentication (AVBPA'97)
- Pigeon, S.¹ Vandendorpe, L.²

24
- 0031220766
- Acoustic-labial speaker verification
- P. Jourlin, J. Luettin, D. Genoud, and H. Wassner, "Acoustic-labial speaker verification," Pattern Recognit. Lett., vol. 18, no. 9, pp. 853-858, 1997.
- (1997) Pattern Recognit. Lett. , vol.18 , Issue.9 , pp. 853-858
- Jourlin, P.¹ Luettin, J.² Genoud, D.³ Wassner, H.⁴

25
- 0037360227
- Improved facial-feature detection for AVSP via unsupervised clustering and discriminant analysis
- S. Lucey, V. Chandran, and S. Sridharan, "Improved facial-feature detection for AVSP via unsupervised clustering and discriminant analysis," EURASIP J. Appl. Signal Process., no. 3, pp. 264-275, 2003.
- (2003) EURASIP J. Appl. Signal Process. , Issue.3 , pp. 264-275
- Lucey, S.¹ Chandran, V.² Sridharan, S.³

26
- 0003571976
- Cambridge, U.K.: Entropic Ltd.
- S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 2.2). Cambridge, U.K.: Entropic Ltd., 1999.
- (1999) The HTK Book (For HTK Version 2.2)
- Young, S.¹ Kershaw, D.² Odell, J.³ Ollason, D.⁴ Valtchev, V.⁵ Woodland, P.⁶

27
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb.
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

28
- 0029747053
- Integrating audio and visual information to provide highly robust speech recognition
- M. J. Tomlinson, M. J. Russell, and N. M. Brooke, "Integrating audio and visual information to provide highly robust speech recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP '96), 1996, pp. 821-824.
- (1996) Proc. Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP '96) , pp. 821-824
- Tomlinson, M.J.¹ Russell, M.J.² Brooke, N.M.³

29
- 85009268624
- A link between cepstral shrinking and the weighted product rule in audio-visual speech recognition
- S. Lucey, V. Chandran, and S. Sridharan, "A link between cepstral shrinking and the weighted product rule in audio-visual speech recognition," in Proc. Int. Conf. Spoken Language Processing (ICSLP'02), 2002, pp. 1961-1964.
- (2002) Proc. Int. Conf. Spoken Language Processing (ICSLP'02) , pp. 1961-1964
- Lucey, S.¹ Chandran, V.² Sridharan, S.³

30
- 85009126374
- An investigation of HMM classifier combination strategies for improved audio-visual speech recognition
- S. Lucey, S. Sridharan, and V. Chandran, "An investigation of HMM classifier combination strategies for improved audio-visual speech recognition," in Proc. European Conf. Speech Communication and Technology (Eurospeech'01), 2001, pp. 1185-1188.
- (2001) Proc. European Conf. Speech Communication and Technology (Eurospeech'01) , pp. 1185-1188
- Lucey, S.¹ Sridharan, S.² Chandran, V.³

31
- 0001935972
- XM2VTSDB: The extended M2VTS database
- K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, "XM2VTSDB: The extended M2VTS database," in Proc. Int. Conf. Audio and Video-Based Biometric Person Authentication (AVBPA'99), 1999, pp. 72-77.
- (1999) Proc. Int. Conf. Audio and Video-based Biometric Person Authentication (AVBPA'99) , pp. 72-77
- Messer, K.¹ Matas, J.² Kittler, J.³ Luettin, J.⁴ Maitre, G.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.