SCOPUS Information Retrieval Platform

IEEE Transactions on Pattern Analysis and Machine Intelligence

Volume 31, Issue 9, 2009, Pages 1700-1707

Multistream articulatory feature-based models for visual speech recognition

(4) Saenko, Kate a; Livescu, Karen b; Glass, James a; Darrell, Trevor c

a MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

b TOYOTA TECHNOLOGICAL INSTITUTE AT CHICAGO (United States)

c INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

Author keywords

Articulatory features; Dynamic Bayesian networks; Support vector machines; Visual speech recognition

Indexed keywords

ARTICULATORY FEATURES; ASYNCHRONY; BASELINE MODELS; DYNAMIC BAYESIAN NETWORK; DYNAMIC BAYESIAN NETWORKS; HIDDEN STATE; MULTI-STREAM; MULTIPLE SEQUENCES; OBSERVATION MODEL; VISUAL SPEECH RECOGNITION; WORD MODELS;

BAYESIAN NETWORKS; CLASSIFIERS; DISTRIBUTED PARAMETER NETWORKS; IMAGE RETRIEVAL; INFERENCE ENGINES; INTELLIGENT NETWORKS; SPEECH PROCESSING; SUPPORT VECTOR MACHINES;

SPEECH RECOGNITION;

ALGORITHM; ARTICLE; AUDIOVISUAL EQUIPMENT; AUTOMATED PATTERN RECOGNITION; AUTOMATIC SPEECH RECOGNITION; BIOLOGICAL MODEL; COMPUTER ASSISTED DIAGNOSIS; COMPUTER SIMULATION; EVALUATION; HISTOLOGY; HUMAN; IMAGE ENHANCEMENT; LIP; LIP READING; METHODOLOGY; PHYSIOLOGY; SPEECH ANALYSIS;

ALGORITHMS; COMPUTER SIMULATION; HUMANS; IMAGE ENHANCEMENT; IMAGE INTERPRETATION, COMPUTER-ASSISTED; LIP; LIPREADING; MODELS, ANATOMIC; MODELS, BIOLOGICAL; PATTERN RECOGNITION, AUTOMATED; SPEECH PRODUCTION MEASUREMENT; SPEECH RECOGNITION SOFTWARE;

EID: 67650911345 PISSN: 01628828 EISSN: None Source Type: Journal
DOI: 10.1109/TPAMI.2008.303 Document Type: Article

Times cited: (30)

References (38)

1
- 27144455475
- On Soft Evidence in Bayesian Networks
- Technical Report UWEETR-2004-00016, Electrical Eng. Dept, Univ. of Washington
- J. Bilmes, "On Soft Evidence in Bayesian Networks," Technical Report UWEETR-2004-00016, Electrical Eng. Dept., Univ. of Washington, 2004.
- (2004)
- Bilmes, J.¹

2
- 70350617187
- J. Bilmes, "The Graphical Models Toolkit," http://ssli.ee.washington.edu/people/bilmes/gmtk/, 2009.
- (2009) The Graphical Models Toolkit
- Bilmes, J.¹

3
- 85032752364
- Graphical Model Architectures for Speech Recognition
- Sept
- J.A. Bilmes and C. Bartels, "Graphical Model Architectures for Speech Recognition," IEEE Signal Processing Magazine, vol. 22, no. 5, pp. 89-100, Sept. 2005.
- (2005) IEEE Signal Processing Magazine , vol.22 , Issue.5 , pp. 89-100
- Bilmes, J.A.¹ Bartels, C.²

4
- 0027024362
- Articulatory Phonology: An Overview
- C.P. Browman and L. Goldstein, "Articulatory Phonology: An Overview," Phonetica, vol. 49, nos. 3/4, pp. 155-180, 1992.
- (1992) Phonetica , vol.49 , Issue.3-4 , pp. 155-180
- Browman, C.P.¹ Goldstein, L.²

5
- 34547497796
- O. Cetin et al., "An Articulatory Feature-Based Tandem Approach and Factored Observation Modeling," Proc. Int'l Conf. Acoustics, Speech, and Signal Proc., pp. IV-645-IV-648, Apr. 2007.

6
- 0003710380
- C.-C. Chang and C.-J. Lin, "LIBSVM A Library for Support Vector Machines," http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.
- (2001) LIBSVM A Library for Support Vector Machines
- Chang, C.-C.¹ Lin, C.-J.²

7
- 84990553353
- A Model for Reasoning About Persistence and Causation
- Feb
- T. Dean and K. Kanazawa, "A Model for Reasoning About Persistence and Causation," Computational Intelligence, vol. 5, no. 2, pp. 142-150, Feb. 1989.
- (1989) Computational Intelligence , vol.5 , Issue.2 , pp. 142-150
- Dean, T.¹ Kanazawa, K.²

8
- 0002629270
- Maximum Likelihood from Incomplete Data via the EM Algorithm
- A.P. Dempster, N.M. Laird, and D.B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm," J. Royal Statistical Soc. Series B, vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. Royal Statistical Soc. Series B , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

9
- 0031198059
- Production Models as a Structural Basis for Automatic Speech Recognition
- Aug
- L. Deng, G. Ramsay, and D. Sun, "Production Models as a Structural Basis for Automatic Speech Recognition," Speech Comm., vol. 22, nos. 2/3, pp. 93-111, Aug. 1997.
- (1997) Speech Comm , vol.22 , Issue.2-3 , pp. 93-111
- Deng, L.¹ Ramsay, G.² Sun, D.³

10
- 0036875002
- A Support Vector Machine-Based Dynamic Network for Visual Speech Recognition Applications
- M. Gordan, C. Kotropoulos, and I. Pitas, "A Support Vector Machine-Based Dynamic Network for Visual Speech Recognition Applications," EURASIP J. Applied Signal Processing, vol. 2002, no. 11, pp. 1248-1259, 2002.
- (2002) EURASIP J. Applied Signal Processing , vol.2002 , Issue.11 , pp. 1248-1259
- Gordan, M.¹ Kotropoulos, C.² Pitas, I.³

11
- 0012668146
- Asynchrony Modeling for Audio-Visual Speech Recognition
- Mar
- G. Gravier, G. Potamianos, and C. Neti, "Asynchrony Modeling for Audio-Visual Speech Recognition," Proc. Human Language Technology Conf., p. 1006, Mar. 2002.
- (2002) Proc. Human Language Technology Conf , pp. 1006
- Gravier, G.¹ Potamianos, G.² Neti, C.³

12
- 78649376063
- Audiovisual Speech Recognition with Articulator Positions as Hidden Variables
- Aug
- M. Hasegawa-Johnson, K. Livescu, P. Lal, and K. Saenko, "Audiovisual Speech Recognition with Articulator Positions as Hidden Variables," Proc. Int'l Congress on Phonetic Sciences, Aug. 2007.
- (2007) Proc. Int'l Congress on Phonetic Sciences
- Hasegawa-Johnson, M.¹ Livescu, K.² Lal, P.³ Saenko, K.⁴

13
- 14944353581
- T.J. Hazen, K. Saenko, C.-H. La, and J.R. Glass, "A Segment-Based Audio-Visual Speech Recognizer: Data Collection, Development, and Initial Experiments," Proc. Int'l Conf. Multimodal Interfaces, pp. 235-242, Oct. 2004.

14
- 0003786003
- MIT Press
- F. Jelinek, Statistical Methods for Speech Recognition. MIT Press, 1998.
- (1998) Statistical Methods for Speech Recognition
- Jelinek, F.¹

15
- 33846680938
- Speech Production Knowledge in Automatic Speech Recognition
- Feb
- S. King et al., "Speech Production Knowledge in Automatic Speech Recognition," J. Acoustical Soc. of Am., vol. 121, no. 2, pp. 723-742, Feb. 2007.
- (2007) J. Acoustical Soc. of Am , vol.121 , Issue.2 , pp. 723-742
- King, S.¹

16
- 0034297586
- Detection of Phonological Features in Continuous Speech Using Neural Networks
- Oct
- S. King and P. Taylor, "Detection of Phonological Features in Continuous Speech Using Neural Networks," Computer Speech and Language, vol. 14, no. 4, pp. 333-353, Oct. 2000.
- (2000) Computer Speech and Language , vol.14 , Issue.4 , pp. 333-353
- King, S.¹ Taylor, P.²

17
- 0036642567
- Combining Acoustic and Articulatory Feature Information for Robust Speech Recognition
- July
- K. Kirchhoff, G.A. Fink, and G. Sagerer, "Combining Acoustic and Articulatory Feature Information for Robust Speech Recognition," Speech Comm., vol. 37, nos. 3/4, pp. 303-319, July 2002.
- (2002) Speech Comm , vol.37 , Issue.3-4 , pp. 303-319
- Kirchhoff, K.¹ Fink, G.A.² Sagerer, G.³

18
- 14944340400
- Neural Architectures for Sensor Fusion in Speech Recognition
- Sept
- G. Krone, B. Talle, A. Wichert, and G. Palm, "Neural Architectures for Sensor Fusion in Speech Recognition," Proc. European Speech Comm. Assoc. Workshop Audio-Visual Speech Processing, pp. 57-60, Sept. 1997.
- (1997) Proc. European Speech Comm. Assoc. Workshop Audio-Visual Speech Processing , pp. 57-60
- Krone, G.¹ Talle, B.² Wichert, A.³ Palm, G.⁴

19
- 14944341906
- Feature-Based Pronunciation Modeling for Speech Recognition
- May
- K. Livescu and J. Glass, "Feature-Based Pronunciation Modeling for Speech Recognition," Proc. Human Language Technology Conf. North Am. Chapter of the Assoc. for Computational Linguistics, May 2004.
- (2004) Proc. Human Language Technology Conf. North Am. Chapter of the Assoc. for Computational Linguistics
- Livescu, K.¹ Glass, J.²

20
- 78651465434
- Feature-Based Pronunciation Modeling with Trainable Asynchrony Probabilities
- Oct
- K. Livescu and J. Glass, "Feature-Based Pronunciation Modeling with Trainable Asynchrony Probabilities," Proc. Int'l Conf. Spoken Language, pp. 677-680, Oct. 2004.
- (2004) Proc. Int'l Conf. Spoken Language , pp. 677-680
- Livescu, K.¹ Glass, J.²

21
- 34547548915
- Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition
- Johns Hopkins Univ, Center for Language and Speech Processing
- K. Livescu et al., "Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: JHU Summer Workshop Final Report," Johns Hopkins Univ., Center for Language and Speech Processing, 2007.
- (2007) JHU Summer Workshop Final Report
- Livescu, K.¹

22
- 0017199877
- Hearing Lips and Seeing Voices
- Dec
- H. McGurk and J. McDonald, "Hearing Lips and Seeing Voices," Nature, vol. 264, no. 5588, pp. 746-748, Dec. 1976.
- (1976) Nature , vol.264 , Issue.5588 , pp. 746-748
- McGurk, H.¹ McDonald, J.²

23
- 0029306621
- Continuous Speech Recognition
- May
- N. Morgan and H. Bourlard, "Continuous Speech Recognition," IEEE Signal Processing Magazine, vol. 12, no. 3, pp. 24-42, May 1995.
- (1995) IEEE Signal Processing Magazine , vol.12 , Issue.3 , pp. 24-42
- Morgan, N.¹ Bourlard, H.²

24
- 0013288412
- Dynamic Bayesian Networks: Representation, Inference and Learning
- PhD dissertation, Computer Science Division, Univ. of California
- K. Murphy, "Dynamic Bayesian Networks: Representation, Inference and Learning," PhD dissertation, Computer Science Division, Univ. of California, 2002.
- (2002)
- Murphy, K.¹

25
- 0036297183
- A Coupled HMM for Audio-Visual Speech Recognition
- May
- A.V. Nefian, L. Liang, X. Pi, L. Xiaoxiang, C. Mao, and K. Murphy, "A Coupled HMM for Audio-Visual Speech Recognition," Proc. Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 2013-2016, May 2002.
- (2002) Proc. Int'l Conf. Acoustics, Speech, and Signal Processing , pp. 2013-2016
- Nefian, A.V.¹ Liang, L.² Pi, X.³ Xiaoxiang, L.⁴ Mao, C.⁵ Murphy, K.⁶

26
- 84900117327
- A Feature Based Representation for Audio Visual Speech Recognition
- Aug
- P. Niyogi, E. Petajan, and J. Zhong, "A Feature Based Representation for Audio Visual Speech Recognition," Proc. Int'l Conf. Auditory-Visual Speech Processing, Aug. 1999.
- (1999) Proc. Int'l Conf. Auditory-Visual Speech Processing
- Niyogi, P.¹ Petajan, E.² Zhong, J.³

27
- 0036081023
- Modelling Asynchrony in Automatic Speech Recognition Using Loosely Coupled Hidden Markov Models
- May/June
- H. Nock and S. Young, "Modelling Asynchrony in Automatic Speech Recognition Using Loosely Coupled Hidden Markov Models," Cognitive Science, vol. 26, no. 3, pp. 283-301, May/June 2002.
- (2002) Cognitive Science , vol.26 , Issue.3 , pp. 283-301
- Nock, H.¹ Young, S.²

28
- 1542303714
- A Fused Hidden Markov Model with Application to Bimodal Speech Processing
- Mar
- H. Pan, S.E. Levinson, T.S. Huang, and Z. Liang, "A Fused Hidden Markov Model with Application to Bimodal Speech Processing," IEEE Trans. Signal Processing, vol. 52, no. 3, pp. 573-581, Mar. 2004.
- (2004) IEEE Trans. Signal Processing , vol.52 , Issue.3 , pp. 573-581
- Pan, H.¹ Levinson, S.E.² Huang, T.S.³ Liang, Z.⁴

29
- 0003391330
- Morgan Kaufmann
- J. Pearl, Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.
- (1988) Probabilistic Reasoning in Intelligent Systems
- Pearl, J.¹

30
- 0021541159
- Automatic Lipreading to Enhance Speech Recognition
- E. Petajan, "Automatic Lipreading to Enhance Speech Recognition," Proc. Global Telecomm. Conf., pp. 265-272, 1984.
- (1984) Proc. Global Telecomm. Conf , pp. 265-272
- Petajan, E.¹

31
- 0003243224
- Probabilities for SV Machines
- A.J. Smola, P.L. Bartlett, B. Schoelkopf, and D. Schuurmans, eds., MIT Press
- J. Platt, "Probabilities for SV Machines," Advances in Large Margin Classifiers, A.J. Smola, P.L. Bartlett, B. Schoelkopf, and D. Schuurmans, eds., pp. 61-73, MIT Press, 2000.
- (2000) Advances in Large Margin Classifiers , pp. 61-73
- Platt, J.¹

32
- 4544290191
- Recent Advances in the Automatic Recognition of Audiovisual Speech
- Sept
- G. Potamianos, C. Neti, G. Gravier, A. Garg, and A.W. Senior, "Recent Advances in the Automatic Recognition of Audiovisual Speech," Proc. IEEE, vol. 91, no. 9, pp. 1306-1326, Sept. 2003.
- (2003) Proc. IEEE , vol.91 , Issue.9 , pp. 1306-1326
- Potamianos, G.¹ Neti, C.² Gravier, G.³ Garg, A.⁴ Senior, A.W.⁵

33
- 0037697284
- Hidden Articulator Markov Models for Speech Recognition
- Oct
- M. Richardson, J. Bilmes, and C. Diorio, "Hidden Articulator Markov Models for Speech Recognition," Speech Comm., vol. 41, nos. 2/3, pp. 511-529, Oct. 2003.
- (2003) Speech Comm , vol.41 , Issue.2-3 , pp. 511-529
- Richardson, M.¹ Bilmes, J.² Diorio, C.³

34
- 14944351246
- Articulatory Features for Robust Visual Speech Recognition
- Oct
- K. Saenko, T. Darrell, and J.R. Glass, "Articulatory Features for Robust Visual Speech Recognition," Proc. Int'l Conf. Multimodal Interfaces, pp. 152-158, Oct. 2004.
- (2004) Proc. Int'l Conf. Multimodal Interfaces , pp. 152-158
- Saenko, K.¹ Darrell, T.² Glass, J.R.³

35
- 33646822127
- K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Production Domain Modeling of Pronunciation for Visual Speech Recognition," Proc. Int'l Conf. Acoustics, Speech, and Signal Processing, pp. v/473-v/476, Mar. 2005.

36
- 33745926831
- Visual Speech Recognition with Loosely Synchronized Feature Streams
- Oct
- K. Saenko, K. Livescu, M. Siracusa, K. Wilson, J. Glass, and T. Darrell, "Visual Speech Recognition with Loosely Synchronized Feature Streams," Proc. Int'l Conf. Computer Vision, pp. 1424-1431, Oct. 2005.
- (2005) Proc. Int'l Conf. Computer Vision , pp. 1424-1431
- Saenko, K.¹ Livescu, K.² Siracusa, M.³ Wilson, K.⁴ Glass, J.⁵ Darrell, T.⁶

37
- 0025477640
- Speech Database Development: TIMIT and Beyond
- Aug
- V. Zue, S. Seneff, and J. Glass, "Speech Database Development: TIMIT and Beyond," Speech Comm., vol. 9, no. 4, pp. 351-356, Aug. 1990.
- (1990) Speech Comm , vol.9 , Issue.4 , pp. 351-356
- Zue, V.¹ Seneff, S.² Glass, J.³

38
- 0004158153
- Speech Recognition Using Dynamic Bayesian Networks
- PhD dissertation, Computer Science Division, Univ. of California
- G. Zweig, "Speech Recognition Using Dynamic Bayesian Networks," PhD dissertation, Computer Science Division, Univ. of California, 1998.
- (1998)
- Zweig, G.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.