SCOPUS Information Search Platform

IEEE Transactions on Multimedia

Volume 7, Issue 2, 2005, Pages 243-252

Audio/visual mapping with cross-modal hidden Markov models

(5) Fu, Shengli a; Gutierrez Osuna, Ricardo b; Esposito, Anna c; Kakumanu, Praveen K d; Garcia, Oscar N e

a UNIVERSITY OF DELAWARE (United States)

b TEXAS A AND M UNIVERSITY (United States)

c SECOND UNIVERSITY OF NAPLES (Italy)

d Wright State University (United States)

e UNIVERSITY OF NORTH TEXAS (United States)

Author keywords

3-D audio-video processing; Joint media and multimodal processing; Speech reading and lip synchronization

Indexed keywords

ANIMATION; DATA ACQUISITION; FEATURE EXTRACTION; MAPPING; MARKOV PROCESSES; MATHEMATICAL MODELS; MODAL ANALYSIS; VIDEO SIGNAL PROCESSING;

3-D AUDIO-VIDEO PROCESSING; JOINT MEDIA AND MULTIMODAL PROCESSING; LIP SYNCHRONIZATION; SPEECH READING;

SPEECH COMMUNICATION;

EID: 16244385915 PISSN: 15209210 EISSN: None Source Type: Journal
DOI: 10.1109/TMM.2005.843341 Document Type: Review

Times cited : (59)

References (32)

1
- 0033336969
- User evaluation: Synthetic talking faces for interactive services
- I. Pandzic, J. Ostermann, and D. Millen, "User evaluation: Synthetic talking faces for interactive services," Vis. Comput., vol. 15, no. 7-8, pp. 330-340, 1999.
- (1999) Vis. Comput. , vol.15 , Issue.7-8 , pp. 330-340
- Pandzic, I.¹ Ostermann, J.² Millen, D.³

2
- 78649308717
- Recent developments in facial animation: An inside view
- Terrigal, Australia
- M. M. Cohen, J. Beskow, and D. W. Massaro, "Recent developments in facial animation: An inside view," in Proc. AVSP, Terrigal, Australia, 1998, pp. 201-206.
- (1998) Proc. AVSP , pp. 201-206
- Cohen, M.M.¹ Beskow, J.² Massaro, D.W.³

3
- 0004244302
- Englewood Cliffs, NJ
- L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ, 1993.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.R.¹ Juang, B.H.²

4
- 78650077027
- Ph.D. dissertation, Eng. and App. Sci. Dept., George Washington Univ., Washington, DC
- A. J. Goldschen, "Continuous Automatic Speech Recognition by Lipreading," Ph.D. dissertation, Eng. and App. Sci. Dept., George Washington Univ., Washington, DC, 1993.
- (1993) Continuous Automatic Speech Recognition by Lipreading
- Goldschen, A.J.¹

5
- 0242664388
- Real-time talking head driven by voice and its application to communication and entertainment
- Terrigal, Australia
- S. Morishima, "Real-time talking head driven by voice and its application to communication and entertainment," in Proc. AVSP, Terrigal, Australia, 1998, pp. 195-199.
- (1998) Proc. AVSP , pp. 195-199
- Morishima, S.¹

6
- 0030677313
- Video rewrite: Driving visual speech with audio
- C. Bregler, M. Covell, and M. Slaney, "Video rewrite: Driving visual speech with audio," in Proc. ACM SIGGRAPH'97, 1997, pp. 353-360.
- (1997) Proc. ACM SIGGRAPH'97 , pp. 353-360
- Bregler, C.¹ Covell, M.² Slaney, M.³

7
- 0026156861
- A media conversion from speech to facial image for intelligent man-machine interface
- May
- S. Morishima and H. Harashima, "A media conversion from speech to facial image for intelligent man-machine interface," IEEE J. Select. Areas Commun., vol. 9, no. 4, pp. 594-600, May 1991.
- (1991) IEEE J. Select. Areas Commun. , vol.9 , Issue.4 , pp. 594-600
- Morishima, S.¹ Harashima, H.²

8
- 0029270677
- Converting speech into lip movements: A multimedia telephone for hard of hearing people
- Mar.
- F. Lavagetto, "Converting speech into lip movements: A multimedia telephone for hard of hearing people," IEEE Trans. Rehabil. Eng., vol. 3, no. 1, pp. 90-102, Mar. 1995.
- (1995) IEEE Trans. Rehabil. Eng. , vol.3 , Issue.1 , pp. 90-102
- Lavagetto, F.¹

9
- 85133709259
- Picture my voice: Audio to visual speech synthesis using artificial neural networks
- D. W. Massaro, Ed., Santa Cruz, CA
- D. W. Massaro, J. Beskow, M. M. Cohen, C. L. Fry, and T. Rodriguez, "Picture my voice: Audio to visual speech synthesis using artificial neural networks," in Proc. AVSP, D. W. Massaro, Ed., Santa Cruz, CA, 1999, pp. 133-138.
- (1999) Proc. AVSP , pp. 133-138
- Massaro, D.W.¹ Beskow, J.² Cohen, M.M.³ Fry, C.L.⁴ Rodriguez, T.⁵

10
- 0036650837
- Real-time speech-driven face animation with expressions using neural networks
- Jul.
- P. Hong, Z. Wen, and T. S. Huang, "Real-time speech-driven face animation with expressions using neural networks," IEEE Trans. Neural Netw., vol. 13, no. 4, pp. 916-927, Jul. 2002.
- (2002) IEEE Trans. Neural Netw. , vol.13 , Issue.4 , pp. 916-927
- Hong, P.¹ Wen, Z.² Huang, T.S.³

11
- 85032752352
- Audiovisual speech processing: Lip reading and lip synchronization
- Jan.
- T. Chen, "Audiovisual speech processing: Lip reading and lip synchronization," IEEE Signal Process. Mag., vol. 18, no. 1, pp. 9-21, Jan. 2001.
- (2001) IEEE Signal Process. Mag. , vol.18 , Issue.1 , pp. 9-21
- Chen, T.¹

12
- 0035426641
- Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system
- K. Choi, Y. Luo, and J.-N. Hwang, "Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system," J. VLSI Signal Process., vol. 29, no. 1-2, pp. 51-61, 2001.
- (2001) J. VLSI Signal Process. , vol.29 , Issue.1-2 , pp. 51-61
- Choi, K.¹ Luo, Y.² Hwang, J.-N.³

13
- 84937437186
- Voice puppetry
- Los Angeles, CA
- M. Brand, "Voice puppetry," in Proc. SIGGRAPH'99, Los Angeles, CA, 1999, pp. 21-28.
- (1999) Proc. SIGGRAPH'99 , pp. 21-28
- Brand, M.¹

14
- 0032179320
- Lip movement synthesis from speech based on hidden Markov models
- E. Yamamoto, S. Nakamura, and K. Shikano, "Lip movement synthesis from speech based on hidden Markov models," Speech Commun., vol. 26, no. 1-2, pp. 105-115, 1998.
- (1998) Speech Commun. , vol.26 , Issue.1-2 , pp. 105-115
- Yamamoto, E.¹ Nakamura, S.² Shikano, K.³

15
- 0031997085
- Audio-to-visual conversion for multimedia communication
- Feb.
- R. R. Rao, T. Chen, and R. M. Mersereau, "Audio-to-visual conversion for multimedia communication," IEEE Trans. Ind. Electron., vol. 45, no. 1, pp. 15-22, Feb. 1998.
- (1998) IEEE Trans. Ind. Electron. , vol.45 , Issue.1 , pp. 15-22
- Rao, R.R.¹ Chen, T.² Mersereau, R.M.³

16
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb.
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

17
- 0033283938
- Shadow puppetry
- Corfu, Greece, Sep.
- M. Brand, "Shadow puppetry," in Proc. ICCV'99, Corfu, Greece, Sep. 1999, pp. 1237-1244.
- (1999) Proc. ICCV'99 , pp. 1237-1244
- Brand, M.¹

18
- 0003493783
- Pacific Grove, CA: Brooks/Cole
- B. N. Datta, Numerical Linear Algebra and Applications. Pacific Grove, CA: Brooks/Cole, 1995.
- (1995) Numerical Linear Algebra and Applications
- Datta, B.N.¹

19
- 16244388770
- Master's thesis, Dept. Comput. Sci. and Eng., Wright State Univ., Dayton, OH
- S. Fu, "Audio/Visual Mapping Based on Hidden Markov Models," Master's thesis, Dept. Comput. Sci. and Eng., Wright State Univ., Dayton, OH, 2002.
- (2002) Audio/Visual Mapping Based on Hidden Markov Models
- Fu, S.¹

20
- 0000497160
- Baum-Welch HMM inversion for audio-to-visual conversion
- K. Choi and J.-N. Hwang, "Baum-Welch HMM inversion for audio-to-visual conversion," in Proc. IEEE Int. Workshop Multimedia Signal Processing, 1999, pp. 175-180.
- (1999) Proc. IEEE Int. Workshop Multimedia Signal Processing , pp. 175-180
- Choi, K.¹ Hwang, J.-N.²

21
- 0031100269
- Robust speech recognition based on joint model and feature space optimization of hidden Markov models
- Mar.
- S. Moon and J.-N. Hwang, "Robust speech recognition based on joint model and feature space optimization of hidden Markov models," IEEE Trans. Neural Netw., vol. 8, no. 2, pp. 194-204, Mar. 1997.
- (1997) IEEE Trans. Neural Netw. , vol.8 , Issue.2 , pp. 194-204
- Moon, S.¹ Hwang, J.-N.²

22
- 84972571328
- Growth functions for transformations on manifolds
- L. E. Baum and G. R. Sell, "Growth functions for transformations on manifolds," Pacific J. Math., vol. 27, no. 2, pp. 211-227, 1968.
- (1968) Pacific J. Math. , vol.27 , Issue.2 , pp. 211-227
- Baum, L.E.¹ Sell, G.R.²

23
- 0037569390
- Learning dynamic audio/visual mapping with input-output hidden Markov models
- Melbourne, Australia, Jan.
- Y. Li and H.-Y. Shum, "Learning dynamic audio/visual mapping with input-output hidden Markov models," in Proc. 5th Asian Conf. on Computer Vision, Melbourne, Australia, Jan. 2002.
- (2002) Proc. 5th Asian Conf. on Computer Vision
- Li, Y.¹ Shum, H.-Y.²

24
- 0000675167
- Structure learning in conditional probability models via an entropic prior and parameter extinction
- M. Brand, "Structure learning in conditional probability models via an entropic prior and parameter extinction," Neural Comput., vol. 11, no. 5, pp. 1155-1182, 1999.
- (1999) Neural Comput. , vol.11 , Issue.5 , pp. 1155-1182
- Brand, M.¹

25
- 0003608342
- Wellesley, MA: A. K. Peters
- F. I. Parke and K. Waters, Computer Facial Animation. Wellesley, MA: A. K. Peters, 1996.
- (1996) Computer Facial Animation
- Parke, F.I.¹ Waters, K.²

26
- 13144267153
- Master's thesis, Dept. Comput. Sci. and Eng., Wright State Univ., Dayton, OH
- P. K. Kakumanu, "Audio-Visual Processing for Speech Driven Facial Animation," Master's thesis, Dept. Comput. Sci. and Eng., Wright State Univ., Dayton, OH, 2002.
- (2002) Audio-visual Processing for Speech Driven Facial Animation
- Kakumanu, P.K.¹

27
- 0003522447
- New York: IEEE
- D. O'Shaughnessy, Speech Communication: Human and Machine, 2nd ed. New York: IEEE, 2000.
- (2000) Speech Communication: Human and Machine, 2nd Ed.
- O'Shaughnessy, D.¹

28
- 0035574930
- A new text-independent method for phoneme segmentation
- G. Aversano, A. Esposito, A. Esposito, and M. Marinaro, "A new text-independent method for phoneme segmentation," in Proc. 44th IEEE Midwest Symp. Circuits and Systems, vol. 2, 2001, pp. 516-519.
- (2001) Proc. 44th IEEE Midwest Symp. Circuits and Systems , vol.2 , pp. 516-519
- Aversano, G.¹ Esposito, A.² Esposito, A.³ Marinaro, M.⁴

29
- 0032097263
- Boston, MA: Academic
- K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed. Boston, MA: Academic, 1990.
- (1990) Introduction to Statistical Pattern Recognition, 2nd Ed.
- Fukunaga, K.¹

30
- 0003509908
- ISO/IEC JTC1/SC29/WG11, Seoul, South Korea
- R. Koenen, "Overview of the MPEG-4 Standard," ISO/IEC JTC1/SC29/WG11, Seoul, South Korea, 1999.
- (1999) Overview of the MPEG-4 Standard
- Koenen, R.¹

31
- 13144278330
- Speech-driven facial animation with realistic dynamics
- Feb.
- R. Gutierrez-Osuna, P. K. Kakumanu, A. Esposito, O. N. Garcia, A. Bojorquez, J. L. Castillo, and I. J. Rudomin, "Speech-driven facial animation with realistic dynamics," IEEE Trans. Multimedia, vol. 7, no. 1, pp. 33-42, Feb. 2005.
- (2005) IEEE Trans. Multimedia , vol.7 , Issue.1 , pp. 33-42
- Gutierrez-Osuna, R.¹ Kakumanu, P.K.² Esposito, A.³ Garcia, O.N.⁴ Bojorquez, A.⁵ Castillo, J.L.⁶ Rudomin, I.J.⁷

32
- 0003419545
- Gaithersburg, MD: NIST
- J. S. Garofolo, Getting Started with the DARPA TIMIT CD-ROM: An Acoustic Phonetic Continuous Speech Database. Gaithersburg, MD: NIST, 1988.
- (1988) Getting Started with the DARPA TIMIT CD-ROM: An Acoustic Phonetic Continuous Speech Database
- Garofolo, J.S.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.