-
1
-
-
0011048689
-
Plateaus, catastrophes and the structuring of vowel systems
-
Abry C., Boe L.J., and Schwartz J.L. Plateaus, catastrophes and the structuring of vowel systems. J. Phonet. 17 (1989) 47-54
-
(1989)
J. Phonet.
, vol.17
, pp. 47-54
-
-
Abry, C.1
Boe, L.J.2
Schwartz, J.L.3
-
2
-
-
0033100056
-
Codebook based face point trajectory synthesis algorithm using speech input
-
Arslan L.M., and Talkin D. Codebook based face point trajectory synthesis algorithm using speech input. Speech Commun. 27 (1999) 81-93
-
(1999)
Speech Commun.
, vol.27
, pp. 81-93
-
-
Arslan, L.M.1
Talkin, D.2
-
3
-
-
84890517975
-
Least-square fitting of two 3-d point sets
-
Arun K.S., Huang T.S., and Blostein S.D. Least-square fitting of two 3-d point sets. IEEE Trans. PAMI 9 5 (1987) 698-700
-
(1987)
IEEE Trans. PAMI
, vol.9
, Issue.5
, pp. 698-700
-
-
Arun, K.S.1
Huang, T.S.2
Blostein, S.D.3
-
4
-
-
0035574930
-
-
Aversano, G., Esposito, A., Esposito, A., Marinaro, M., 2001. A new text-independent method for phoneme segmentation. In: Proc. IEEE-MWSCAS Conference, Dayton, OH, pp. 516-519.
-
-
-
-
5
-
-
33747792416
-
-
Balan, N., 2003. Analysis and Evaluation of Factors Affecting Speech Driven Facial Animation, MS Thesis, Dept. of Computer Science and Engineering, Wright State University.
-
-
-
-
6
-
-
0032178686
-
-
Benoit, C., Le Goff, B., 1998. Audio-visual speech synthesis from French text: eight years of models, designs, and evaluation at the ICP, Speech Commun. 26, 117-129.
-
-
-
-
7
-
-
0030362791
-
-
Bernstein, L.E., Benoit, C., 1996. For speech perception by humans or machines, three senses are better than one. In: Proc. ICSLP, Philadelphia 3, pp. 1477-1480.
-
-
-
-
8
-
-
33747751287
-
-
Beskow, J., 1995. Rule-based visual speech synthesis, Proc. EUROSPEECH, Madrid, Spain 1, pp. 299-302.
-
-
-
-
9
-
-
0018455310
-
Suppression of acoustic noise in speech using spectral subtraction
-
Boll S.F. Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. ASSP 27 2 (1979) 112-113
-
(1979)
IEEE Trans. ASSP
, vol.27
, Issue.2
, pp. 112-113
-
-
Boll, S.F.1
-
10
-
-
84937437186
-
-
Brand, M., 1999. Voice puppetry, Proc. SIGGRAPH, LA, California, pp. 21-28.
-
-
-
-
11
-
-
33747783965
-
-
Bregler, C., Omohundro, S., 1995. Nonlinear image interpolation using manifold learning. In: Tesauro, G., Touretzky, D., Leen, T. (Eds.), Advances in Neural Information Processing Systems 7, MIT press, Cambridge, pp. 401-408.
-
-
-
-
12
-
-
33747755815
-
-
Bryll, R., Ma, X., Quek, F., 1999. Camera calibration utility description, VisLab Tech. Rep., University of Illinois at Chicago.
-
-
-
-
13
-
-
33747764018
-
-
Caldognetto, E.M., Vagges, K., Borghese, N.A., Ferrigno, G., 1989. Automatic analysis of lip and jaw kinematics in VCV sequences. In: Proc. of EUROSPEECH, Paris 2, pp. 453-456.
-
-
-
-
14
-
-
0001514782
-
Modeling coarticulation in synthetic visual speech
-
Thalmann N.M., and Thalmann D. (Eds), Springer
-
Cohen M., and Massaro D.W. Modeling coarticulation in synthetic visual speech. In: Thalmann N.M., and Thalmann D. (Eds). Models and Techniques in Computer Animation (1993), Springer 141-155
-
(1993)
Models and Techniques in Computer Animation
, pp. 141-155
-
-
Cohen, M.1
Massaro, D.W.2
-
15
-
-
33747779045
-
-
Coianiz, T., Torresani, L., Caprile, L., 1995. 2D deformable models for visual speech analysis. In: Stork, D., Hennecke, M. (Eds.), Speech Reading by Man and Machine. Springer, pp. 391-398.
-
-
-
-
17
-
-
0016987103
-
-
Duttweiler, D., Messerschmitt, D., 1976. Nearly instantaneous companding for nonuniformly quantized PCM. In: IEEE Trans. on Comm., COM-24, pp. 864-873.
-
-
-
-
18
-
-
33747764558
-
-
Essa, I., 1995. Analysis, interpretation, and synthesis of facial expression, Ph.D. thesis, MIT Media Arts and Sciences, Cambridge, MA.
-
-
-
-
19
-
-
0034207427
-
Visual speech synthesis by morphing visemes
-
Ezzat T., and Poggio T. Visual speech synthesis by morphing visemes. J. Comput. Vis. 38 1 (2000) 45-57
-
(2000)
J. Comput. Vis.
, vol.38
, Issue.1
, pp. 45-57
-
-
Ezzat, T.1
Poggio, T.2
-
20
-
-
33747756511
-
-
Finn, K., 1986. An investigation of visible lip information to be used in automatic speech recognition, Ph.D. dissertation, Dept. CS, Georgetown University, Washington, DC.
-
-
-
-
21
-
-
33747750568
-
-
Fu, S., 2002. Visual Mapping Based on Hidden Markov Models, MS Thesis, Dept. of Computer Science and Engineering, Wright State University.
-
-
-
-
22
-
-
16244385915
-
-
Fu, S., Gutierrez-Osuna, R., Esposito, A., Kakumanu, P.K., Garcia, O.N., 2005. Audio/Visual Mapping with Cross-Modal Hidden Markov Models. IEEE Transactions on Multimedia 7, No. 2, April.
-
-
-
-
23
-
-
33747756510
-
-
Garofolo, J., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., Dahlgren, N.L., 1988. The DARPA TIMIT CDROM. Available from LDC: .
-
-
-
-
24
-
-
33747755814
-
-
Goldschen, A.J., 1993. Continuous automatic speech recognition by lipreading, Ph.D. thesis, George Washington University.
-
-
-
-
25
-
-
0345443171
-
Temporal properties of spontaneous speech-a syllable-centric perspective
-
Greenberg S., Carvey H., Hitchcock L., and Chang S. Temporal properties of spontaneous speech-a syllable-centric perspective. J. Phonet. 31 3-4 (2003) 465-485
-
(2003)
J. Phonet.
, vol.31
, Issue.3-4
, pp. 465-485
-
-
Greenberg, S.1
Carvey, H.2
Hitchcock, L.3
Chang, S.4
-
26
-
-
33747784294
-
-
Gutierrez-Osuna, R., Kakumanu, P., Esposito, A., Garcia, O.N., Bojorquez, A., Castillo, J., Rudomin, I., 2002. WSU Technical Report CS-WSU-02-03, Dayton, OH.
-
-
-
-
27
-
-
13144278330
-
Speech-driven Facial Animation with Realistic Dynamics
-
Gutierrez-Osuna R., Kakumanu P.K., Esposito A., Garcia O.N., Bojorquez A., Castillo J.L., and Rudomin I.J. Speech-driven Facial Animation with Realistic Dynamics. IEEE Trans. Multimedia 7 1 (2005) 33-42
-
(2005)
IEEE Trans. Multimedia
, vol.7
, Issue.1
, pp. 33-42
-
-
Gutierrez-Osuna, R.1
Kakumanu, P.K.2
Esposito, A.3
Garcia, O.N.4
Bojorquez, A.5
Castillo, J.L.6
Rudomin, I.J.7
-
28
-
-
0028517164
-
RASTA processing of speech
-
Hermansky H., and Morgan N. RASTA processing of speech. IEEE Trans. SAP 2 4 (1994) 578-589
-
(1994)
IEEE Trans. SAP
, vol.2
, Issue.4
, pp. 578-589
-
-
Hermansky, H.1
Morgan, N.2
-
29
-
-
0036650837
-
Real-time speech-driven face animation with expressions using neural networks
-
Hong P., Wen Z., and Huang T.S. Real-time speech-driven face animation with expressions using neural networks. IEEE Trans. Neural Networks 13 4 (2002) 916-927
-
(2002)
IEEE Trans. Neural Networks
, vol.13
, Issue.4
, pp. 916-927
-
-
Hong, P.1
Wen, Z.2
Huang, T.S.3
-
30
-
-
33747751903
-
-
Itakura, F., 1975. Line spectrum representation of linear prediction coefficients of speech signal, JASA 57, p. 535 (abstract).
-
-
-
-
33
-
-
0032778055
-
Interlacing properties of line spectrum pair frequencies
-
Kim H., and Lee H. Interlacing properties of line spectrum pair frequencies. IEEE Trans. SAP 7 (1999) 87-91
-
(1999)
IEEE Trans. SAP
, vol.7
, pp. 87-91
-
-
Kim, H.1
Lee, H.2
-
34
-
-
84989489267
-
-
Klatt, D.H., 1982. Prediction of perceived phonetic distance from critical band spectra: a first step. In: Proc. of ICASSP, Paris, pp. 1278-1281.
-
-
-
-
35
-
-
33747757580
-
-
Kühnert, B., Nolan, F., 1999. The origin of coarticulation. In: Hardcastle, W., Hewlett, N. (Eds.), Coarticulation: Theory, Data, and Techniques. Cambridge University Press, pp. 7-29.
-
-
-
-
36
-
-
0029270677
-
Converting speech into lip movements: a multimedia telephone for hard of hearing people
-
Lavagetto F. Converting speech into lip movements: a multimedia telephone for hard of hearing people. IEEE Trans. Rehab. Eng. 3 1 (1995) 90-102
-
(1995)
IEEE Trans. Rehab. Eng.
, vol.3
, Issue.1
, pp. 90-102
-
-
Lavagetto, F.1
-
37
-
-
0029182694
-
-
Lee, Y., Terzopoulos, D., Waters, K., 1995. Realistic modeling for facial animation. In: Proc. of SIGGRAPH, LA, California, pp. 55-62.
-
-
-
-
38
-
-
0032165329
-
Conversion of articulatory parameters into active shape model coefficients for lip motion representation and synthesis
-
Lepsøy S., and Curinga S. Conversion of articulatory parameters into active shape model coefficients for lip motion representation and synthesis. Signal Process. Image Commun. 13 (1998) 209-225
-
(1998)
Signal Process. Image Commun.
, vol.13
, pp. 209-225
-
-
Lepsøy, S.1
Curinga, S.2
-
39
-
-
33747756160
-
-
Luettin, J., Thacker, N.A., Beet, S.W., 1996. Active shape models for visual speech feature extraction. In: Stork, D., Hennecke, M. (Eds.), Speech-Reading by Man and Machine, vol. 150. Springer, pp. 383-390.
-
-
-
-
42
-
-
33747786384
-
-
Massaro, D.W., Beskow, J., Cohen, M.M., Fry, C.L., Rodriguez, T., 1999. Picture my voice: audio to visual speech synthesis using artificial neural networks. In: Proc. AVSP, Santa Cruz, CA, pp. 133-138.
-
-
-
-
44
-
-
0020960564
-
Physical characteristics of lips underlying vowel lipreading performances
-
Montgomery A., and Jackson P. Physical characteristics of lips underlying vowel lipreading performances. JASA 73 6 (1983) 2134-2144
-
(1983)
JASA
, vol.73
, Issue.6
, pp. 2134-2144
-
-
Montgomery, A.1
Jackson, P.2
-
45
-
-
0026156861
-
A Media conversion from speech to facial image for man-machine interface
-
Morishima S., and Harashima H. A Media conversion from speech to facial image for man-machine interface. IEEE J. Selected Areas Commun. 9 4 (1991) 594-600
-
(1991)
IEEE J. Selected Areas Commun.
, vol.9
, Issue.4
, pp. 594-600
-
-
Morishima, S.1
Harashima, H.2
-
46
-
-
0035251712
-
Speech-to-lip movements synthesis by maximizing audio-visual joint probability
-
Nakamura S., and Yamamoto E. Speech-to-lip movements synthesis by maximizing audio-visual joint probability. J. VLSI Signal Proc. 27 (2001) 119-126
-
(2001)
J. VLSI Signal Proc.
, vol.27
, pp. 119-126
-
-
Nakamura, S.1
Yamamoto, E.2
-
47
-
-
0020202671
-
Parameterized models for facial animation
-
Parke F.I. Parameterized models for facial animation. IEEE Comput. Graph. Appl. 2 9 (1982) 61-68
-
(1982)
IEEE Comput. Graph. Appl.
, vol.2
, Issue.9
, pp. 61-68
-
-
Parke, F.I.1
-
50
-
-
0027659197
-
Signal modeling techniques in speech recognition
-
Picone J.W. Signal modeling techniques in speech recognition. Proc. IEEE 81 9 (1993) 1215-1247
-
(1993)
Proc. IEEE
, vol.81
, Issue.9
, pp. 1215-1247
-
-
Picone, J.W.1
-
52
-
-
0032180188
-
Adaptive fusion of acoustic and visual sources for automatic speech recognition
-
Rogozan A., and Deléglise P. Adaptive fusion of acoustic and visual sources for automatic speech recognition. Speech Commun. 26 (1998) 149-161
-
(1998)
Speech Commun.
, vol.26
, pp. 149-161
-
-
Rogozan, A.1
Deléglise, P.2
-
53
-
-
84928837806
-
A joint synchrony/mean-rate model of auditory speech processing
-
Seneff S. A joint synchrony/mean-rate model of auditory speech processing. J. Phonetics 16 1 (1988) 55-76
-
(1988)
J. Phonetics
, vol.16
, Issue.1
, pp. 55-76
-
-
Seneff, S.1
-
54
-
-
33747807632
-
-
Sharma, S., Vermeulen, P., Hermansky, H., 1998. Combining information from multiple classifiers to speaker verification. In: Proc. RL2C, France, pp. 115-119.
-
-
-
-
55
-
-
84885499464
-
Optimal quantization of line LSP parameters
-
Soong F.K., and Juang B.H. Optimal quantization of line LSP parameters. IEEE Trans. SAP 1 (1993) 15-24
-
(1993)
IEEE Trans. SAP
, vol.1
, pp. 15-24
-
-
Soong, F.K.1
Juang, B.H.2
-
56
-
-
0018701386
-
Use of visual information for phonetic perception
-
Summerfield Q. Use of visual information for phonetic perception. Phonetica 36 (1979) 314-331
-
(1979)
Phonetica
, vol.36
, pp. 314-331
-
-
Summerfield, Q.1
-
57
-
-
0033879110
-
-
Tekalp, A.M., Ostermann, J., 2000. Face and 2-D mesh animation in MPEG-4. In: Sig. Processing: Image Comm. 15, pp. 387-421.
-
-
-
-
58
-
-
0030682291
-
-
Tibrewala, S., Hermansky, H., 1997. Sub-band based recognition of noisy speech. In: Proc. of ICASSP, Munich, Germany, pp. 1255-1258.
-
-
-
-
59
-
-
0023397578
-
A versatile camera calibration technique for high-accuracy 3D machine vision metrology
-
Tsai R.Y. A versatile camera calibration technique for high-accuracy 3D machine vision metrology. IEEE J. Robot. Automat. 3 (1987) 323-344
-
(1987)
IEEE J. Robot. Automat.
, vol.3
, pp. 323-344
-
-
Tsai, R.Y.1
-
60
-
-
0029462324
-
-
Waters, K., Frisbie, J., 1995. A coordinated muscle model for speech animation. In: Proc. of Graphics Interface, Ontario, pp. 163-170.
-
-
-
-
61
-
-
33747788055
-
-
Waters, K., Levergood, T., 1993. DECface: an automatic lip synchronization algorithm for synthetic faces, RLE, Cambridge, MA Tech. Rep. CRL 93/4.
-
-
-
-
62
-
-
0343081513
-
Reduction techniques for exemplar-based learning algorithms
-
Wilson D.R., and Martinez T.R. Reduction techniques for exemplar-based learning algorithms. Mach. Learning 38 3 (2000) 257-286
-
(2000)
Mach. Learning
, vol.38
, Issue.3
, pp. 257-286
-
-
Wilson, D.R.1
Martinez, T.R.2
-
63
-
-
0032179320
-
Lip movement synthesis from speech based on Hidden Markov models
-
Yamamoto E., Nakamura S., and Shikano K. Lip movement synthesis from speech based on Hidden Markov models. Speech Commun. 28 (1998) 105-115
-
(1998)
Speech Commun.
, vol.28
, pp. 105-115
-
-
Yamamoto, E.1
Nakamura, S.2
Shikano, K.3