SCOPUS Information Retrieval Platform

ICPRAM 2012 - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods

Volume 2, 2012, Pages 322-329

Phoneme-to-viseme mapping for visual speech recognition

Author keywords

AVSR; DCT; Optical flow; PCA; Viseme

Indexed keywords

AUDIO VISUAL SPEECH RECOGNITION; AVSR; CONTINUOUS SPEECH; DATA-DRIVEN METHODS; DCT; LINGUISTIC METHODS; PCA; VISEME; VISEMES; VISUAL FEATURE; VISUAL SPEECH RECOGNITION;

CONTINUOUS SPEECH RECOGNITION; LINGUISTICS; OPTICAL FLOWS;

PATTERN RECOGNITION;

EID: 84862178164; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited: 57

References (19)

1
- 84862219678
- Pyramidal implementation of lucas kanade feature tracker
- Bouguet
- Bouguet (2002). Pyramidal Implementation of Lucas Kanade Feature Tracker. Description of the algorithm.
- (2002) Description of the Algorithm

2
- 47949087133
- Comparison of phoneme and viseme based acoustic units for speech driven realistic lip animation
- Bozkurt, Eroglu, Q., Erzin, Erdem, and Ozkan (2007). Comparison of phoneme and viseme based acoustic units for speech driven realistic lip animation. In 3DTV Conference, 2007, pages 1-4.
- (2007) 3DTV Conference, 2007 , pp. 1-4
- Bozkurt, E.Q.¹ Erzin, E.² Ozkan³

3
- 85013597845
- vol.2
- Bregler and Konig (1994). 'Eigenlips' for robust speech recognition. In Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on, volume ii, pages II/669-II/672 vol.2.
- (1994) Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on , vol.2
- Bregler¹ Konig²

4
- 78649613221
- Nostril detection for robust mouth tracking
- Cork
- Cappelletta and Harte (2010). Nostril detection for robust mouth tracking. In Irish Signals and Systems Conference, pages 239-244, Cork.
- (2010) Irish Signals and Systems Conference , pp. 239-244
- Cappelletta¹ Harte²

5
- 84862215808
- Viseme definitions comparison for visual-only speech recognition
- Cappelletta, L. and Harte, N. (2011). Viseme definitions comparison for visual-only speech recognition. In Proceedings of 19th European Signal Processing Conference (EUSIPCO), pages 2109-2113.
- (2011) Proceedings of 19th European Signal Processing Conference (EUSIPCO) , pp. 2109-2113
- Cappelletta, L.¹ Harte, N.²

6
- 85009254391
- Miketalk: A talking facial display based on morphing visemes
- Ezzat and Poggio (1998). Miketalk: a talking facial display based on morphing visemes. In Computer Animation 98. Proceedings, pages 96-102.
- (1998) Computer Animation 98. Proceedings , pp. 96-102
- Ezzat¹ Poggio²

7
- 84875584220
- Continuous optical automatic speech recognition by lipreading
- Goldschen, A. J., Garcia, O. N., and Petajan, E. (1994). Continuous optical automatic speech recognition by lipreading. In Proceedings of the 28th Asilomar Conference on Signals, Systems, and Computers, pages 572-577.
- (1994) Proceedings of the 28th Asilomar Conference on Signals, Systems, and Computers , pp. 572-577
- Goldschen, A.J.¹ Garcia, O.N.² Petajan, E.³

8
- 34047263009
- Visual model structures and synchrony constraints for audio-visual speech recognition
- Hazen (2006). Visual model structures and synchrony constraints for audio-visual speech recognition. Audio, Speech, and Language Processing, IEEE Transactions on, 14(3):1082-1089.
- (2006) Audio, Speech, and Language Processing, IEEE Transactions on , vol.14 , Issue.3 , pp. 1082-1089
- Hazen¹

9
- 14944353581
- A segment-based audio-visual speech recognizer: Data collection, development, and initial experiments
- State College, PA, USA. ACM
- Hazen, Saenko, La, and Glass (2004). A segment-based audio-visual speech recognizer: data collection, development, and initial experiments. In Proceedings of the 6th international conference on Multimodal interfaces, pages 235-242, State College, PA, USA. ACM.
- (2004) Proceedings of the 6th International Conference on Multimodal Interfaces , pp. 235-242
- Hazen, S.¹ La, G.²

10
- 85009284526
- DCT-Based video features for audio-visual speech recognition
- Denver, CO, USA
- Heckmann, Kroschel, Savariaux, and Berthommier (2002). DCT-Based Video Features for Audio-Visual Speech Recognition. In International Conference on Spoken Language Processing, volume 1, pages 1925-1928, Denver, CO, USA.
- (2002) International Conference on Spoken Language Processing , vol.1 , pp. 1925-1928
- Heckmann, K.¹ Savariaux, B.²

11
- 84860854043
- In pursuit of visemes
- Hilder, Theobald, and Harvey (2010). In pursuit of visemes. In International Conference on Auditory-Visual Speech Processing.
- (2010) International Conference on Auditory-Visual Speech Processing
- Hilder, T.¹ Harvey²

12
- 0004266328
- Charles C Thomas Pub Ltd
- Jeffers and Barley (1971). Speechreading (Lipreading). Charles C Thomas Pub Ltd.
- (1971) Speechreading (Lipreading)
- Jeffers¹ Barley²

13
- 84919327072
- Audio-to-visual conversion using hidden markov models
- Springer-Verlag
- Lee and Yook (2002). Audio-to-Visual Conversion Using Hidden Markov Models. In Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence, pages 563-570. Springer-Verlag.
- (2002) Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence , pp. 563-570
- Lee¹ Yook²

14
- 0019647180
- An iterative image registration technique with an application to stereo vision
- Lucas and Kanade (1981). An iterative image registration technique with an application to stereo vision. In Proceedings of Image Understanding Workshop.
- (1981) Proceedings of Image Understanding Workshop
- Lucas, K.¹

15
- 0004052871
- Audio-visual speech recognition
- The Johns Hopkins University, Baltimore
- Neti, Potamianos, Luettin, Matthews, Glotin, Vergyri, Sison, Mashari, and Zhou (2000). Audio-visual speech recognition. Technical report, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore.
- (2000) Technical Report, Center for Language and Speech Processing
- Neti, P.¹ Luettin, M.² Glotin, V.³ Sison, M.⁴ Zhou⁵

16
- 0344212675
- John Wiley & Sons, Inc. New York, NY, USA
- Pandzic, I. S. and Forchheimer, R. (2003). MPEG-4 Facial Animation: The Standard, Implementation and Applications. John Wiley & Sons, Inc., New York, NY, USA.
- (2003) MPEG-4 Facial Animation: The Standard, Implementation and Applications
- Pandzic, I.S.¹ Forchheimer, R.²

17
- 4544290191
- Recent advances in the automatic recognition of audio-visual speech
- Senior
- Potamianos, Neti, Gravier, Garg, and Senior (2003). Recent advances in the automatic recognition of audio-visual speech. Proceedings of the IEEE, 91(9):1306-1326.
- (2003) Proceedings of the IEEE , vol.91 , Issue.9 , pp. 1306-1326
- Neti, P.¹ Garg, G.²

18
- 84862162638
- Master's thesis, Massachusetts Institute of Technology
- Saenko, K. (2004). Articulatory Features for Robust Visual Speech Recognition. Master's thesis, Massachusetts Institute of Technology.
- (2004) Articulatory Features for Robust Visual Speech Recognition
- Saenko, K.¹

19
- 56749108674
- VDM-Verlag
- Sanderson (2008). Biometric Person Recognition: Face, Speech and Fusion. VDM-Verlag.
- (2008) Biometric Person Recognition: Face, Speech and Fusion
- Sanderson¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.