-
1
-
-
0031187171
-
Speech recognition by machines and humans
-
Lippmann R. Speech recognition by machines and humans. Speech Commun. 22 1 (1997) 1-15
-
(1997)
Speech Commun.
, vol.22
, Issue.1
, pp. 1-15
-
-
Lippmann, R.1
-
2
-
-
10044221981
-
-
J. Ostermann, A. Weissenfeld, Talking faces-technologies and applications, in: Proceedings of ICPR'04, vol. 3, 2004, pp. 826-833.
-
-
-
-
3
-
-
0001514782
-
Modeling coarticulation in synthetic visual speech
-
Magnenat-Thalmann M., and Thalmann D. (Eds), Springer, Tokyo
-
Cohen M.M., and Massaro D.W. Modeling coarticulation in synthetic visual speech. In: Magnenat-Thalmann M., and Thalmann D. (Eds). Models and Techniques in Computer Animation (1993), Springer, Tokyo 139-156
-
(1993)
Models and Techniques in Computer Animation
, pp. 139-156
-
-
Cohen, M.M.1
Massaro, D.W.2
-
4
-
-
79952193244
-
-
F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, D.H. Salesin, Synthesizing realistic facial expressions from photographs, in: Proceedings of ACM SIGGRAPH'98, vol. 3, 1998, pp. 75-84.
-
-
-
-
5
-
-
0035501711
-
Synthesizing realistic facial animations using energy minimization for model-based coding
-
Yin L., Basu A., Bernogger S., and Pinz A. Synthesizing realistic facial animations using energy minimization for model-based coding. Pattern Recognition 34 11 (2001) 2201-2213
-
(2001)
Pattern Recognition
, vol.34
, Issue.11
, pp. 2201-2213
-
-
Yin, L.1
Basu, A.2
Bernogger, S.3
Pinz, A.4
-
6
-
-
10044281988
-
Lifelike talking faces for interactive services
-
Cosatto E., Ostermann J., Graf H.P., and Schroeter J. Lifelike talking faces for interactive services. Proc. IEEE 91 9 (2003) 1406-1428
-
(2003)
Proc. IEEE
, vol.91
, Issue.9
, pp. 1406-1428
-
-
Cosatto, E.1
Ostermann, J.2
Graf, H.P.3
Schroeter, J.4
-
7
-
-
0030677313
-
-
C. Bregler, M. Covell, M. Slaney, Video rewrite: driving visual speech with audio, in: Proceedings of ACM SIGGRAPH'97, 1997.
-
-
-
-
8
-
-
0036989560
-
-
T. Ezzat, G. Geiger, T. Poggio, Trainable videorealistic speech animation, in: Proceedings of ACM SIGGRAPH, 2002, pp. 388-397.
-
-
-
-
9
-
-
84872004031
-
-
E. Cosatto, H. Graf, Sample-based synthesis of photo-realistic talking heads, in: Proceedings of IEEE Computer Animation, 1998, pp. 103-110.
-
-
-
-
10
-
-
0034271782
-
Photo-realistic talking heads from image samples
-
Cosatto E., and Graf H. Photo-realistic talking heads from image samples. IEEE Trans. Multimedia 2 3 (2000) 152-163
-
(2000)
IEEE Trans. Multimedia
, vol.2
, Issue.3
, pp. 152-163
-
-
Cosatto, E.1
Graf, H.2
-
11
-
-
0036650837
-
Real-time speech-driven face animation with expressions using neural networks
-
Hong P., Wen Z., and Huang T.S. Real-time speech-driven face animation with expressions using neural networks. IEEE Trans. Neural Networks 13 4 (2002) 916-927
-
(2002)
IEEE Trans. Neural Networks
, vol.13
, Issue.4
, pp. 916-927
-
-
Hong, P.1
Wen, Z.2
Huang, T.S.3
-
12
-
-
85017188218
-
-
F.J. Huang, T. Chen, Real-time lip-synch face animation driven by human voice, in: IEEE Second Workshop on Multimedia Signal Processing, 1998, pp. 352-357.
-
-
-
-
13
-
-
0031997085
-
Audio-to-visual conversion for multimedia communication
-
Rao R.R., Chen T., and Mersereau R.M. Audio-to-visual conversion for multimedia communication. IEEE Trans. Ind. Electron. 45 1 (1998) 15-22
-
(1998)
IEEE Trans. Ind. Electron.
, vol.45
, Issue.1
, pp. 15-22
-
-
Rao, R.R.1
Chen, T.2
Mersereau, R.M.3
-
14
-
-
0032179320
-
Lip movement synthesis from speech based on Hidden Markov Models
-
Yamamoto E., Nakamura S., and Shikano K. Lip movement synthesis from speech based on Hidden Markov Models. Speech Commun. 26 1-2 (1998) 105-115
-
(1998)
Speech Commun.
, vol.26
, Issue.1-2
, pp. 105-115
-
-
Yamamoto, E.1
Nakamura, S.2
Shikano, K.3
-
15
-
-
84937437186
-
-
M. Brand, Voice puppetry, in: SIGGRAPH'99, Los Angeles, 1999, pp. 21-28.
-
-
-
-
16
-
-
34147127210
-
-
K. Choi, J. N. Hwang, Baum-Welch hidden Markov model inversion for reliable audio-to-visual conversion, in: Proceedings of the IEEE 3rd Workshop Multimedia Signal Processing, 1999, pp. 175-180.
-
-
-
-
17
-
-
0035426641
-
Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system
-
Choi K., Luo Y., and Hwang J.N. Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system. J. VLSI Signal Process. 29 1-2 (2001) 51-61
-
(2001)
J. VLSI Signal Process.
, vol.29
, Issue.1-2
, pp. 51-61
-
-
Choi, K.1
Luo, Y.2
Hwang, J.N.3
-
18
-
-
84919327072
-
-
S. Lee, D. Yook, Audio-to-visual conversion using hidden Markov models, in: M. Ishizuka, S. A. (Eds.), Proceedings of PRICAI2002, Lecture Notes in Artificial Intelligence, Springer, Berlin, 2002, pp. 563-570.
-
-
-
-
19
-
-
33845277490
-
-
L. Xie, D.-M. Jiang, I. Ravyse, W. Verhelst, H. Sahli, V. Slavova, R.-C. Zhao, Context dependent viseme models for voice driven animation, in: The 4th EURASIP Conference on Video/Image Processing and Multimedia Communications, vol. 2, 2003, pp. 649-654.
-
-
-
-
21
-
-
0024610919
-
A tutorial on hidden Markov models and selected applications in speech recognition
-
Rabiner L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77 2 (1989) 257-286
-
(1989)
Proc. IEEE
, vol.77
, Issue.2
, pp. 257-286
-
-
Rabiner, L.R.1
-
22
-
-
16244385915
-
Audio/visual mapping with cross-modal hidden Markov models
-
Fu S., Gutierrez-Osuna R., Esposito A., Kakumanu K.P., and Garcia O.N. Audio/visual mapping with cross-modal hidden Markov models. IEEE Trans. Multimedia 7 2 (2005) 243-251
-
(2005)
IEEE Trans. Multimedia
, vol.7
, Issue.2
, pp. 243-251
-
-
Fu, S.1
Gutierrez-Osuna, R.2
Esposito, A.3
Kakumanu, K.P.4
Garcia, O.N.5
-
23
-
-
0028996864
-
-
S.Y. Moon, J.N. Hwang, Noisy speech recognition using robust inversion of hidden Markov models, in: Proceedings of ICASSP'95, 1995, pp. 145-148.
-
-
-
-
24
-
-
85009254391
-
-
T. Ezzat, T. Poggio, MikeTalk: A talking facial display based on morphing visemes, in: Proceedings of the Computer Animation Conference, 1998, pp. 96-102.
-
-
-
-
25
-
-
34147133577
-
-
D.G. Stork, M.E. Hennecke (Eds.), Speechreading by Humans and Machines, Springer, Berlin, 1996.
-
-
-
-
26
-
-
34147108960
-
-
L. Xie, Research on key issues of audio visual speech recognition, Ph.D. Thesis, Northwestern Polytechnical University, September 2004.
-
-
-
-
27
-
-
34147143176
-
-
K.W. Grant, S. Greenberg, Speech intelligibility derived from asynchronous processing of auditory-visual information, in: Proceedings of the International Conference on Auditory-Visual Speech Processing, Aalborg, Denmark, 2001, pp. 132-137.
-
-
-
-
28
-
-
0029270677
-
Converting speech into lip movements: a multimedia telephone for hard of hearing people
-
Lavagetto F. Converting speech into lip movements: a multimedia telephone for hard of hearing people. IEEE Trans. Rehabil. Eng. 3 (1995) 90-102
-
(1995)
IEEE Trans. Rehabil. Eng.
, vol.3
, pp. 90-102
-
-
Lavagetto, F.1
-
29
-
-
0022019614
-
Intermodal timing relations and audio-visual speech recognition
-
McGrath M., and Summerfield Q. Intermodal timing relations and audio-visual speech recognition. J. Acoust. Soc. Am. 77 (1985) 678-685
-
(1985)
J. Acoust. Soc. Am.
, vol.77
, pp. 678-685
-
-
McGrath, M.1
Summerfield, Q.2
-
30
-
-
4544290191
-
Recent advances in the automatic recognition of audio-visual speech
-
Potamianos G., Neti C., Gravier G., Garg A., and Senior A.W. Recent advances in the automatic recognition of audio-visual speech. Proc. IEEE 91 9 (2003) 1306-1326
-
(2003)
Proc. IEEE
, vol.91
, Issue.9
, pp. 1306-1326
-
-
Potamianos, G.1
Neti, C.2
Gravier, G.3
Garg, A.4
Senior, A.W.5
-
32
-
-
34147156067
-
-
K. Murphy, Dynamic Bayesian networks: representation, inference and learning, Ph.D. Thesis, University of California, Berkeley, 2002.
-
-
-
-
33
-
-
0030355935
-
-
H. Bourlard, S. Dupont, A new ASR approach based on independent processing and recombination of partial frequency bands, in: Proceedings of the International Conference on Spoken Language Processing, Philadelphia, 1996, pp. 426-429.
-
-
-
-
34
-
-
34147132541
-
-
B. Logan, P.J. Moreno, Factorial hidden Markov models for speech recognition: preliminary experiments, Technical Reports of Cambridge Research Lab (CRL-97-7).
-
-
-
-
35
-
-
0030685285
-
-
M. Brand, N. Oliver, A. Pentland, Coupled hidden Markov models for complex action recognition, in: IEEE International Conference on Computer Vision and Pattern Recognition, 1997, pp. 994-999.
-
-
-
-
36
-
-
0036297183
-
-
A.V. Nefian, L. Liang, X. Pi, X. Liu, C. Mao, K. Murphy, A coupled HMM for audio-visual speech recognition, in: Proceedings of ICASSP'02, 2002.
-
-
-
-
37
-
-
10044240183
-
-
F. Pernkopf, 3D surface inspection using coupled HMMs, in: Proceedings of 17th ICPR'04, 2004.
-
-
-
-
38
-
-
33646806777
-
-
S. Ananthakrishnan, S.S. Narayanan, An automatic prosody recognizer using a coupled multi-stream acoustic model and a syntactic-prosodic language model, in: Proceedings of ICASSP'05, 2005.
-
-
-
-
39
-
-
34147130506
-
-
L. Xie, Z. Ye, The JEWEL audio visual dataset for facial animation, URL 〈http://www.cityu.edu.hk/rcmt/mouth-synching/jewel.htm〉.
-
-
-
-
40
-
-
6344258662
-
-
L. Xie, X.-L. Cai, R.-C. Zhao, A robust hierarchical lip tracking approach for lipreading and audio visual speech recognition, in: The 3rd IEEE International Conference on Machine Learning and Cybernetics, vol. 6, Shanghai, China, 2004, pp. 3620-3624.
-
-
-
-
42
-
-
34147163731
-
-
S. Young, G. Evermann, D. Kershaw, J. Odell, D. Ollason, D. Povey, V. Valtchev, P. Woodland, The HTK Book (Version 3.2), Cambridge University Engineering Department, Cambridge, 2002, URL 〈http://htk.eng.cam.ac.uk/〉.
-
-
-
-
43
-
-
0034270644
-
Audio-visual speech modelling for continuous speech recognition
-
Dupont S., and Luettin J. Audio-visual speech modelling for continuous speech recognition. IEEE Trans. Multimedia 2 3 (2000) 141-151
-
(2000)
IEEE Trans. Multimedia
, vol.2
, Issue.3
, pp. 141-151
-
-
Dupont, S.1
Luettin, J.2