-
1
-
-
0031187171
-
Speech recognition by machines and humans
-
R. P. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, pp. 1-15, 1997.
-
(1997)
Speech Commun.
, vol.22
, pp. 1-15
-
-
Lippmann, R.P.1
-
2
-
-
0026189808
-
Speech recognition in adverse environments
-
B. H. Juang, "Speech recognition in adverse environments," Comput. Speech Lang., vol. 5, pp. 275-294, 1991.
-
(1991)
Comput. Speech Lang.
, vol.5
, pp. 275-294
-
-
Juang, B.H.1
-
3
-
-
0002788784
-
Signal processing for robust speech recognition
-
C.-H. Lee, F. K. Soong, and Y. Ohshima, Eds. Norwell, MA: Kluwer, ch. 15
-
R. Stern, A. Acero, F.-H. Liu, and Y. Ohshima, "Signal processing for robust speech recognition," in Automatic Speech and Speaker Recognition. Advanced Topics, C.-H. Lee, F. K. Soong, and Y. Ohshima, Eds. Norwell, MA: Kluwer, 1997, ch. 15, pp. 357-384.
-
(1997)
Automatic Speech and Speaker Recognition. Advanced Topics
, pp. 357-384
-
-
Stern, R.1
Acero, A.2
Liu, F.-H.3
Ohshima, Y.4
-
5
-
-
0003835127
-
-
Hove, U.K.: Psychology
-
R. Campbell, B. Dodd, and D. Burnham, Eds., Hearing by Eye II. Hove, U.K.: Psychology, 1998.
-
(1998)
Hearing by Eye II
-
-
Campbell, R.1
Dodd, B.2
Burnham, D.3
-
6
-
-
0001048664
-
Visual contribution to speech intelligibility in noise
-
W. H. Sumby and I. Pollack, "Visual contribution to speech intelligibility in noise," J. Acoust. Soc. Amer., vol. 26, pp. 212-215, 1954.
-
(1954)
J. Acoust. Soc. Amer.
, vol.26
, pp. 212-215
-
-
Sumby, W.H.1
Pollack, I.2
-
7
-
-
0017199877
-
Hearing lips and seeing voices
-
H. McGurk and J. MacDonald, "Hearing lips and seeing voices," Nature, vol. 264, pp. 746-748, 1976.
-
(1976)
Nature
, vol.264
, pp. 746-748
-
-
McGurk, H.1
MacDonald, J.2
-
8
-
-
85058246934
-
Mouth movement and signed communication
-
R. Campbell, B. Dodd, and D. Burnham, Eds. Hove, U.K.: Psychology, ch. 13
-
M. Marschark, D. LePoutre, and L. Bernent, "Mouth movement and signed communication," in Hearing by Eye II, R. Campbell, B. Dodd, and D. Burnham, Eds. Hove, U.K.: Psychology, 1998, ch. 13, pp. 245-266.
-
(1998)
Hearing by Eye II
, pp. 245-266
-
-
Marschark, M.1
Lepoutre, D.2
Bernent, L.3
-
9
-
-
85069146767
-
What makes a good speechreader? First you have to find one
-
R. Campbell, B. Dodd, and D. Burnham, Eds. Hove, U.K.: Psychology, ch. 11
-
L. E. Bernstein, M. E. Demorest, and P. E. Tucker, "What makes a good speechreader? First you have to find one," in Hearing by Eye II, R. Campbell, B. Dodd, and D. Burnham, Eds. Hove, U.K.: Psychology, 1998, ch. 11, pp. 211-227.
-
(1998)
Hearing by Eye II
, pp. 211-227
-
-
Bernstein, L.E.1
Demorest, M.E.2
Tucker, P.E.3
-
10
-
-
0002028032
-
Some preliminaries to a comprehensive account of audio visual speech perception
-
R. Campbell and B. Dodd, Eds. London, U.K.: Lawrence Erlbaum
-
A. Q. Summerfield, "Some preliminaries to a comprehensive account of audio visual speech perception," in Hearing by Eye: The Psychology of Lip-Reading, R. Campbell and B. Dodd, Eds. London, U.K.: Lawrence Erlbaum, 1987, pp. 3-51.
-
(1987)
Hearing by Eye: The Psychology of Lip-reading
, pp. 3-51
-
-
Summerfield, A.Q.1
-
11
-
-
0032072433
-
Speech recognition and sensory integration
-
D. W. Massaro and D. G. Stork, "Speech recognition and sensory integration," Amer. Sci., vol. 86, pp. 236-244, 1998.
-
(1998)
Amer. Sci.
, vol.86
, pp. 236-244
-
-
Massaro, D.W.1
Stork, D.G.2
-
12
-
-
0032178592
-
Quantitative association of vocal-tract and facial behavior
-
H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative association of vocal-tract and facial behavior," Speech Commun., vol. 26, pp. 23-43, 1998.
-
(1998)
Speech Commun.
, vol.26
, pp. 23-43
-
-
Yehia, H.1
Rubin, P.2
Vatikiotis-Bateson, E.3
-
13
-
-
0012725678
-
Estimation of speech acoustics from visual speech features: A comparison of linear and nonlinear models
-
J. P. Barker and F. Berthommier, "Estimation of speech acoustics from visual speech features: A comparison of linear and nonlinear models," in Proc. Conf. Audio-Visual Speech Processing, 1999, pp. 112-117.
-
(1999)
Proc. Conf. Audio-visual Speech Processing
, pp. 112-117
-
-
Barker, J.P.1
Berthommier, F.2
-
14
-
-
0036874551
-
On the relationship between face movements, tongue movements, and speech acoustics
-
Nov.
-
J. Jiang, A. Alwan, P. A. Keating, B. Chaney, E. T. Auer Jr., and L. E. Bernstein, "On the relationship between face movements, tongue movements, and speech acoustics," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1174-1188, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1174-1188
-
-
Jiang, J.1
Alwan, A.2
Keating, P.A.3
Chaney, B.4
Auer Jr., E.T.5
Bernstein, L.E.6
-
15
-
-
0002955163
-
Lips, teeth, and the benefits of lipreading
-
H. D. Ellis and A. W. Young, Eds. Amsterdam, The Netherlands: Elsevier
-
Q. Summerfield, A. MacLeod, M. McGrath, and M. Brooke, "Lips, teeth, and the benefits of lipreading," in Handbook of Research on Face Processing, H. D. Ellis and A. W. Young, Eds. Amsterdam, The Netherlands: Elsevier, 1989, pp. 223-233.
-
(1989)
Handbook of Research on Face Processing
, pp. 223-233
-
-
Summerfield, Q.1
MacLeod, A.2
McGrath, M.3
Brooke, M.4
-
16
-
-
0002700689
-
Psychology of human speechreading
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag
-
P. M. T. Smeele, "Psychology of human speechreading," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag, 1996, pp. 3-15.
-
(1996)
Speechreading by Humans and Machines
, pp. 3-15
-
-
Smeele, P.M.T.1
-
17
-
-
0003544881
-
Visionary speech: Looking ahead to practical speechreading systems
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag
-
M. E. Hennecke, D. G. Stork, and K. V. Prasad, "Visionary speech: Looking ahead to practical speechreading systems," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag, 1996, pp. 331-349.
-
(1996)
Speechreading by Humans and Machines
, pp. 331-349
-
-
Hennecke, M.E.1
Stork, D.G.2
Prasad, K.V.3
-
18
-
-
0021541159
-
Automatic lipreading to enhance speech recognition
-
E. D. Petajan, "Automatic lipreading to enhance speech recognition," in Proc. Global Telecommunications Conf., 1984, pp. 265-272.
-
(1984)
Proc. Global Telecommunications Conf.
, pp. 265-272
-
-
Petajan, E.D.1
-
20
-
-
0036502797
-
A review of speech-based bimodal recognition
-
Mar.
-
C. C. Chibelushi, F. Deravi, and J. S. D. Mason, "A review of speech-based bimodal recognition," IEEE Trans. Multimedia, vol. 4, pp. 23-37, Mar. 2002.
-
(2002)
IEEE Trans. Multimedia
, vol.4
, pp. 23-37
-
-
Chibelushi, C.C.1
Deravi, F.2
Mason, J.S.D.3
-
21
-
-
0001432664
-
On the integration of auditory and visual parameters in an HMM-based ASR
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag
-
A. Adjoudani and C. Benoît, "On the integration of auditory and visual parameters in an HMM-based ASR," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag, 1996, pp. 461-471.
-
(1996)
Speechreading by Humans and Machines
, pp. 461-471
-
-
Adjoudani, A.1
Benoît, C.2
-
22
-
-
0030376248
-
Robust audiovisual integration using semicontinuous hidden Markov models
-
Q. Su and P. L. Silsbee, "Robust audiovisual integration using semicontinuous hidden Markov models," in Proc. Int. Conf. Spoken Language Processing, 1996, pp. 42-45.
-
(1996)
Proc. Int. Conf. Spoken Language Processing
, pp. 42-45
-
-
Su, Q.1
Silsbee, P.L.2
-
23
-
-
0000789852
-
Channel separability in the audio visual integration of speech: A Bayesian approach
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag
-
J. R. Movellan and G. Chadderdon, "Channel separability in the audio visual integration of speech: A Bayesian approach," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag, 1996, pp. 473-487.
-
(1996)
Speechreading by Humans and Machines
, pp. 473-487
-
-
Movellan, J.R.1
Chadderdon, G.2
-
24
-
-
84925595128
-
Combining noise compensation with visual information in speech recognition
-
S. Cox, I. Matthews, and A. Bangham, "Combining noise compensation with visual information in speech recognition," in Proc. Eur. Workshop Audio-Visual Speech Processing, 1997, pp. 53-56.
-
(1997)
Proc. Eur. Workshop Audio-visual Speech Processing
, pp. 53-56
-
-
Cox, S.1
Matthews, I.2
Bangham, A.3
-
25
-
-
84925639646
-
Real-time lip tracking and bimodal continuous speech recognition
-
M. T. Chan, Y. Zhang, and T. S. Huang, "Real-time lip tracking and bimodal continuous speech recognition," in Proc. Workshop Multimedia Signal Processing, 1998, pp. 65-70.
-
(1998)
Proc. Workshop Multimedia Signal Processing
, pp. 65-70
-
-
Chan, M.T.1
Zhang, Y.2
Huang, T.S.3
-
26
-
-
0034270644
-
Audio-visual speech modeling for continuous speech recognition
-
Sept.
-
S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, pp. 141-151, Sept. 2000.
-
(2000)
IEEE Trans. Multimedia
, vol.2
, pp. 141-151
-
-
Dupont, S.1
Luettin, J.2
-
27
-
-
0034841727
-
Application of affine-invariant Fourier descriptors to lipreading for audio-visual speech recognition
-
S. Gurbuz, Z. Tufekci, E. Patterson, and J. N. Gowdy, "Application of affine-invariant Fourier descriptors to lipreading for audio-visual speech recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2001, pp. 177-180.
-
(2001)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 177-180
-
-
Gurbuz, S.1
Tufekci, Z.2
Patterson, E.3
Gowdy, J.N.4
-
29
-
-
0036874999
-
Dynamic Bayesian networks for audio-visual speech recognition
-
Nov.
-
A. V. Nefian, L. Liang, X. Pi, X. Liu, and K. Murphy, "Dynamic Bayesian networks for audio-visual speech recognition," EURASIP J. Appl. Signal Process., vol. 2002, pp. 1274-1288, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Process.
, vol.2002
, pp. 1274-1288
-
-
Nefian, A.V.1
Liang, L.2
Pi, X.3
Liu, X.4
Murphy, K.5
-
30
-
-
0031624666
-
Discriminative training of HMM stream exponents for audio-visual speech recognition
-
G. Potamianos and H. P. Graf, "Discriminative training of HMM stream exponents for audio-visual speech recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1998, pp. 3733-3736.
-
(1998)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 3733-3736
-
-
Potamianos, G.1
Graf, H.P.2
-
31
-
-
0034502214
-
Speaker independent audiovisual speech recognition
-
Y. Zhang, S. Levinson, and T. Huang, "Speaker independent audiovisual speech recognition," in Proc. Int. Conf. Multimedia Expo, 2000, pp. 1073-1076.
-
(2000)
Proc. Int. Conf. Multimedia Expo
, pp. 1073-1076
-
-
Zhang, Y.1
Levinson, S.2
Huang, T.3
-
32
-
-
0003544881
-
Rationale for phoneme-viseme mapping and feature selection in visual speech recognition
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag
-
A. J. Goldschen, O. N. Garcia, and E. D. Petajan, "Rationale for phoneme-viseme mapping and feature selection in visual speech recognition," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag, 1996, pp. 505-515.
-
(1996)
Speechreading by Humans and Machines
, pp. 505-515
-
-
Goldschen, A.J.1
Garcia, O.N.2
Petajan, E.D.3
-
33
-
-
0002100804
-
Adaptive determination of audio and visual weights for automatic speech recognition
-
A. Rogozan, P. Deléglise, and M. Alissali, "Adaptive determination of audio and visual weights for automatic speech recognition," in Proc. Eur. Workshop Audio-Visual Speech Processing, 1997, pp. 61-64.
-
(1997)
Proc. Eur. Workshop Audio-visual Speech Processing
, pp. 61-64
-
-
Rogozan, A.1
Deléglise, P.2
Alissali, M.3
-
34
-
-
85013597845
-
'Eigenlips' for robust speech recognition
-
C. Bregler and Y. Konig, "'Eigenlips' for robust speech recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1994, pp. 669-672.
-
(1994)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 669-672
-
-
Bregler, C.1
Konig, Y.2
-
35
-
-
14944340400
-
Neural architectures for sensorfusion in speech recognition
-
G. Krone, B. Talle, A. Wichert, and G. Palm, "Neural architectures for sensorfusion in speech recognition," in Proc. Eur. Workshop Audio-Visual Speech Processing, 1997, pp. 57-60.
-
(1997)
Proc. Eur. Workshop Audio-visual Speech Processing
, pp. 57-60
-
-
Krone, G.1
Talle, B.2
Wichert, A.3
Palm, G.4
-
36
-
-
85009154155
-
Stream weight optimization of speech and lip image sequence for audio-visual speech recognition
-
S. Nakamura, H. Ito, and K. Shikano, "Stream weight optimization of speech and lip image sequence for audio-visual speech recognition," in Proc. Int. Conf. Spoken Language Processing, vol. 3, 2000, pp. 20-23.
-
(2000)
Proc. Int. Conf. Spoken Language Processing
, vol.3
, pp. 20-23
-
-
Nakamura, S.1
Ito, H.2
Shikano, K.3
-
37
-
-
0004052871
-
Audio-visual speech recognition
-
Johns Hopkins Univ., Baltimore, MD
-
C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, D. Vergyri, J. Sison, A. Mashari, and J. Zhou, "Audio-visual speech recognition," Center Lang. Speech Process., Johns Hopkins Univ., Baltimore, MD, 2000.
-
(2000)
Center Lang. Speech Process.
-
-
Neti, C.1
Potamianos, G.2
Luettin, J.3
Matthews, I.4
Glotin, H.5
Vergyri, D.6
Sison, J.7
Mashari, A.8
Zhou, J.9
-
39
-
-
85135321224
-
See me, hear me: Integrating automatic speech recognition and lip-reading
-
P. Duchnowski, U. Meier, and A. Waibel, "See me, hear me: Integrating automatic speech recognition and lip-reading," in Proc. Int. Conf. Spoken Language Processing, 1994, pp. 547-550.
-
(1994)
Proc. Int. Conf. Spoken Language Processing
, pp. 547-550
-
-
Duchnowski, P.1
Meier, U.2
Waibel, A.3
-
40
-
-
0032314380
-
An image transform approach for HMM based automatic lipreading
-
G. Potamianos, H. P. Graf, and E. Cosatto, "An image transform approach for HMM based automatic lipreading," in Proc. Int. Conf. Image Processing, vol. 1, 1998, pp. 173-177.
-
(1998)
Proc. Int. Conf. Image Processing
, vol.1
, pp. 173-177
-
-
Potamianos, G.1
Graf, H.P.2
Cosatto, E.3
-
42
-
-
0345166088
-
Lipreading using eigensequences
-
N. Li, S. Dettmer, and M. Shah, "Lipreading using eigensequences," in Proc. Int. Workshop Automatic Face Gesture Recognition, 1995, pp. 30-34.
-
(1995)
Proc. Int. Workshop Automatic Face Gesture Recognition
, pp. 30-34
-
-
Li, N.1
Dettmer, S.2
Shah, M.3
-
44
-
-
0035386489
-
A cascade visual front end for speaker independent automatic speechreading
-
July/Oct.
-
G. Potamianos, C. Neti, G. Iyengar, A. W. Senior, and A. Verma, "A cascade visual front end for speaker independent automatic speechreading," Int. J. Speech Technol., vol. 4, pp. 193-208, July/Oct. 2001.
-
(2001)
Int. J. Speech Technol.
, vol.4
, pp. 193-208
-
-
Potamianos, G.1
Neti, C.2
Iyengar, G.3
Senior, A.W.4
Verma, A.5
-
45
-
-
84925619981
-
Word dependent acoustic-labial weights in HMM-based speech recognition
-
P. Jourlin, "Word dependent acoustic-labial weights in HMM-based speech recognition," in Proc. Eur. Workshop Audio-Visual Speech Processing, 1997, pp. 69-72.
-
(1997)
Proc. Eur. Workshop Audio-visual Speech Processing
, pp. 69-72
-
-
Jourlin, P.1
-
46
-
-
0003770986
-
Comparing models for audiovisual fusion in a noisy-vowel recognition task
-
Nov.
-
P. Teissier, J. Robert-Ribes, and J. L. Schwartz, "Comparing models for audiovisual fusion in a noisy-vowel recognition task," IEEE Trans. Speech Audio Processing, vol. 7, pp. 629-642, Nov. 1999.
-
(1999)
IEEE Trans. Speech Audio Processing
, vol.7
, pp. 629-642
-
-
Teissier, P.1
Robert-Ribes, J.2
Schwartz, J.L.3
-
47
-
-
0036874527
-
Noise adaptive stream weighting in audio-visual speech recognition
-
Nov.
-
M. Heckmann, F. Berthommier, and K. Kroschel, "Noise adaptive stream weighting in audio-visual speech recognition," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1260-1273, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1260-1273
-
-
Heckmann, M.1
Berthommier, F.2
Kroschel, K.3
-
51
-
-
0038706765
-
Automatic speechreading using dynamic contours
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag
-
B. Dalton, R. Kaucic, and A. Blake, "Automatic speechreading using dynamic contours," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer-Verlag, 1996, pp. 373-382.
-
(1996)
Speechreading by Humans and Machines
, pp. 373-382
-
-
Dalton, B.1
Kaucic, R.2
Blake, A.3
-
52
-
-
0036874915
-
Audio-visual speech recognition using MPEG-4 compliant visual features
-
Nov.
-
P. S. Aleksic, J. J. Williams, Z. Wu, and A. K. Katsaggelos, "Audio-visual speech recognition using MPEG-4 compliant visual features," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1213-1227, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1213-1227
-
-
Aleksic, P.S.1
Williams, J.J.2
Wu, Z.3
Katsaggelos, A.K.4
-
53
-
-
0035791288
-
HMM based audio-visual speech recognition integrating geometric- and appearance-based visual features
-
M. T. Chan, "HMM based audio-visual speech recognition integrating geometric- and appearance-based visual features," in Proc. Workshop Multimedia Signal Processing, 2001, pp. 9-14.
-
(2001)
Proc. Workshop Multimedia Signal Processing
, pp. 9-14
-
-
Chan, M.T.1
-
54
-
-
84957810778
-
Active appearance models
-
T. F. Cootes, G. J. Edwards, and C. J. Taylor, "Active appearance models," in Proc. Eur. Conf. Computer Vision, 1998, pp. 484-498.
-
(1998)
Proc. Eur. Conf. Computer Vision
, pp. 484-498
-
-
Cootes, T.F.1
Edwards, G.J.2
Taylor, C.J.3
-
55
-
-
84925640716
-
A multimedia platform for audio-visual speech processing
-
A. Adjoudani, T. Guiard-Marigny, B. L. Goff, L. Reveret, and C. Benoît, "A multimedia platform for audio-visual speech processing," in Proc. Eur. Conf. Speech Communication Technology, 1997, pp. 1671-1674.
-
(1997)
Proc. Eur. Conf. Speech Communication Technology
, pp. 1671-1674
-
-
Adjoudani, A.1
Guiard-Marigny, T.2
Goff, B.L.3
Reveret, L.4
Benoît, C.5
-
56
-
-
34250090755
-
Snakes: Active contour models
-
M. Kass, A. Witkin, and D. Terzopoulos, "Snakes: Active contour models," Int. J. Comput. Vision, vol. 4, pp. 321-331, 1988.
-
(1988)
Int. J. Comput. Vision
, vol.4
, pp. 321-331
-
-
Kass, M.1
Witkin, A.2
Terzopoulos, D.3
-
57
-
-
0026903014
-
Feature extraction from faces using deformable templates
-
A. L. Yuille, P. W. Hallinan, and D. S. Cohen, "Feature extraction from faces using deformable templates," Int. J. Comput. Vision, vol. 8, pp. 99-111, 1992.
-
(1992)
Int. J. Comput. Vision
, vol.8
, pp. 99-111
-
-
Yuille, A.L.1
Hallinan, P.W.2
Cohen, D.S.3
-
58
-
-
0029182228
-
Active shape models - Their training and application
-
Jan.
-
T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, "Active shape models - their training and application," Comput. Vision Image Understanding, vol. 61, pp. 38-59, Jan. 1995.
-
(1995)
Comput. Vision Image Understanding
, vol.61
, pp. 38-59
-
-
Cootes, T.F.1
Taylor, C.J.2
Cooper, D.H.3
Graham, J.4
-
59
-
-
0031361424
-
Robust recognition of faces and facial features with a multi-modal system
-
H. P. Graf, E. Cosatto, and G. Potamianos, "Robust recognition of faces and facial features with a multi-modal system," in Proc. Int. Conf. Systems, Man, Cybernetics, 1997, pp. 2034-2039.
-
(1997)
Proc. Int. Conf. Systems, Man, Cybernetics
, pp. 2034-2039
-
-
Graf, H.P.1
Cosatto, E.2
Potamianos, G.3
-
60
-
-
0036875048
-
Automatic speechreading with applications to human-computer interfaces
-
Nov.
-
X. Zhang, C. C. Broun, R. M. Mersereau, and M. Clements, "Automatic speechreading with applications to human-computer interfaces," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1228-1247, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1228-1247
-
-
Zhang, X.1
Broun, C.C.2
Mersereau, R.M.3
Clements, M.4
-
61
-
-
0036875015
-
Automatically building and evaluating statistical models for lipreading
-
Nov.
-
P. Daubias and P. Deléglise, "Automatically building and evaluating statistical models for lipreading," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1202-1212, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1202-1212
-
-
Daubias, P.1
Deléglise, P.2
-
62
-
-
0031672526
-
Neural network-based face detection
-
Jan.
-
H. A. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, pp. 23-38, Jan. 1998.
-
(1998)
IEEE Trans. Pattern Anal. Machine Intell.
, vol.20
, pp. 23-38
-
-
Rowley, H.A.1
Baluja, S.2
Kanade, T.3
-
63
-
-
0031648023
-
Example-based learning for view-based human face detection
-
Jan.
-
K.-K. Sung and T. Poggio, "Example-based learning for view-based human face detection," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, pp. 39-51, Jan. 1998.
-
(1998)
IEEE Trans. Pattern Anal. Machine Intell.
, vol.20
, pp. 39-51
-
-
Sung, K.-K.1
Poggio, T.2
-
67
-
-
0035167625
-
Improved ROI and within frame discriminant features for lipreading
-
G. Potamianos and C. Neti, "Improved ROI and within frame discriminant features for lipreading," in Proc. Int. Conf. Image Processing, vol. 3, 2001, pp. 250-253.
-
(2001)
Proc. Int. Conf. Image Processing
, vol.3
, pp. 250-253
-
-
Potamianos, G.1
Neti, C.2
-
69
-
-
0004232640
-
-
Philadelphia, PA: SIAM
-
I. Daubechies, Wavelets. Philadelphia, PA: SIAM, 1992.
-
(1992)
Wavelets
-
-
Daubechies, I.1
-
70
-
-
0029747053
-
Integrating audio and visual information to provide highly robust speech recognition
-
M. J. Tomlinson, M. J. Russell, and N. M. Brooke, "Integrating audio and visual information to provide highly robust speech recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1996, pp. 821-824.
-
(1996)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 821-824
-
-
Tomlinson, M.J.1
Russell, M.J.2
Brooke, N.M.3
-
71
-
-
0000874921
-
Dynamic features for visual speech-reading: A systematic comparison
-
M. C. Mozer, M. I. Jordan, and T. Petsche, Eds. Cambridge, MA: MIT Press
-
M. S. Gray, J. R. Movellan, and T. J. Sejnowski, "Dynamic features for visual speech-reading: A systematic comparison," in Advances in Neural Information Processing Systems, M. C. Mozer, M. I. Jordan, and T. Petsche, Eds. Cambridge, MA: MIT Press, 1997, vol. 9, pp. 751-757.
-
(1997)
Advances in Neural Information Processing Systems
, vol.9
, pp. 751-757
-
-
Gray, M.S.1
Movellan, J.R.2
Sejnowski, T.J.3
-
72
-
-
0003822743
-
-
Cambridge, U.K.: Entropic
-
S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book. Cambridge, U.K.: Entropic, 1999.
-
(1999)
The HTK Book
-
-
Young, S.1
Kershaw, D.2
Odell, J.3
Ollason, D.4
Valtchev, V.5
Woodland, P.6
-
73
-
-
0001769235
-
Time-varying information for visual speech perception
-
R. Campbell, B. Dodd, and D. Burnham, Eds. Hove, U.K.: Psychology, ch. 3
-
L. D. Rosenblum and H. M. Saldana, "Time-varying information for visual speech perception," in Hearing by Eye II, R. Campbell, B. Dodd, and D. Burnham, Eds. Hove, U.K.: Psychology, 1998, ch. 3, pp. 61-81.
-
(1998)
Hearing by Eye II
, pp. 61-81
-
-
Rosenblum, L.D.1
Saldana, H.M.2
-
74
-
-
84892187452
-
Maximum likelihood modeling with Gaussian distributions for classification
-
R. A. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classification," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1998, pp. 661-664.
-
(1998)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 661-664
-
-
Gopinath, R.A.1
-
75
-
-
0000633724
-
Transcription of broadcast news - Some recent improvements to IBM's LVCSR system
-
L. Polymenakos, P. Olsen, D. Kanevsky, R. A. Gopinath, P. Gopalakrishnan, and S. Chen, "Transcription of broadcast news - some recent improvements to IBM's LVCSR system," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1998, pp. 901-904.
-
(1998)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 901-904
-
-
Polymenakos, L.1
Olsen, P.2
Kanevsky, D.3
Gopinath, R.A.4
Gopalakrishnan, P.5
Chen, S.6
-
76
-
-
0029725863
-
Adaptive bimodal sensor fusion for automatic speechreading
-
U. Meier, W. Hurst, and P. Duchnowski, "Adaptive bimodal sensor fusion for automatic speechreading," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1996, pp. 833-836.
-
(1996)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 833-836
-
-
Meier, U.1
Hurst, W.2
Duchnowski, P.3
-
77
-
-
0003216515
-
HMM-based visual speech recognition using intensity and location normalization
-
O. Vanegas, A. Tanaka, K. Tokuda, and T. Kitamura, "HMM-based visual speech recognition using intensity and location normalization," in Proc. Int. Conf. Spoken Language Processing, 1998, pp. 289-292.
-
(1998)
Proc. Int. Conf. Spoken Language Processing
, pp. 289-292
-
-
Vanegas, O.1
Tanaka, A.2
Tokuda, K.3
Kitamura, T.4
-
78
-
-
85032752352
-
Audiovisual speech processing. Lip reading and lip synchronization
-
Jan.
-
T. Chen, "Audiovisual speech processing. Lip reading and lip synchronization," IEEE Signal Processing Mag., vol. 18, pp. 9-21, Jan. 2001.
-
(2001)
IEEE Signal Processing Mag.
, vol.18
, pp. 9-21
-
-
Chen, T.1
-
79
-
-
0036875002
-
A support vector machine-based dynamic network for visual speech recognition applications
-
Nov.
-
M. Gordan, C. Kotropoulos, and I. Pitas, "A support vector machine-based dynamic network for visual speech recognition applications," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1248-1259, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1248-1259
-
-
Gordan, M.1
Kotropoulos, C.2
Pitas, I.3
-
80
-
-
0036295989
-
Audio-visual speech modeling using coupled hidden Markov models
-
S. M. Chu and T. S. Huang, "Audio-visual speech modeling using coupled hidden Markov models," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2002, pp. 2009-2012.
-
(2002)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 2009-2012
-
-
Chu, S.M.1
Huang, T.S.2
-
81
-
-
0036874756
-
Moving talker, speaker-independent feature study, and baseline results using the CUAVE multimodal speech corpus
-
Nov.
-
E. K. Patterson, S. Gurbuz, Z. Tufekci, and J. N. Gowdy, "Moving talker, speaker-independent feature study, and baseline results using the CUAVE multimodal speech corpus," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1189-1201, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1189-1201
-
-
Patterson, E.K.1
Gurbuz, S.2
Tufekci, Z.3
Gowdy, J.N.4
-
82
-
-
0002358797
-
Discriminative learning of visual data for audiovisual speech recognition
-
A. Rogozan, "Discriminative learning of visual data for audiovisual speech recognition," Int. J. Artif. Intell. Tools, vol. 8, pp. 43-52, 1999.
-
(1999)
Int. J. Artif. Intell. Tools
, vol.8
, pp. 43-52
-
-
Rogozan, A.1
-
83
-
-
0030247984
-
Computer lipreading for improved accuracy in automatic speech recognition
-
Sept.
-
P. L. Silsbee and A. C. Bovik, "Computer lipreading for improved accuracy in automatic speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 337-351, Sept. 1996.
-
(1996)
IEEE Trans. Speech Audio Processing
, vol.4
, pp. 337-351
-
-
Silsbee, P.L.1
Bovik, A.C.2
-
84
-
-
0002629270
-
Maximum likelihood from incomplete data via the EM algorithm
-
A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statist. Soc., vol. 39, pp. 1-38, 1977.
-
(1977)
J. Royal Statist. Soc.
, vol.39
, pp. 1-38
-
-
Dempster, A.P.1
Laird, N.M.2
Rubin, D.B.3
-
85
-
-
0022890536
-
Maximum mutual information estimation of hidden Markov model parameters for speech recognition
-
L. R. Bahl, P. F. Brown, P. V. DeSouza, and R. L. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1986, pp. 49-52.
-
(1986)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 49-52
-
-
Bahl, L.R.1
Brown, P.F.2
DeSouza, P.V.3
Mercer, R.L.4
-
86
-
-
0006132736
-
A minimum error rate pattern recognition approach to speech recognition
-
Jan.
-
W. Chou, B.-H. Juang, C.-H. Lee, and F. Soong, "A minimum error rate pattern recognition approach to speech recognition," Int. J. Pattern Recognit. Artif. Intell., vol. 8, pp. 5-31, Jan. 1994.
-
(1994)
Int. J. Pattern Recognit. Artif. Intell.
, vol.8
, pp. 5-31
-
-
Chou, W.1
Juang, B.-H.2
Lee, C.-H.3
Soong, F.4
-
87
-
-
0009626553
-
Noisy speech enhancement with filters estimated from the speaker's lips
-
L. Girin, G. Feng, and J.-L. Schwartz, "Noisy speech enhancement with filters estimated from the speaker's lips," in Proc. Eur. Conf. Speech Communication Technology, 1995, pp. 1559-1562.
-
(1995)
Proc. Eur. Conf. Speech Communication Technology
, pp. 1559-1562
-
-
Girin, L.1
Feng, G.2
Schwartz, J.-L.3
-
88
-
-
0034974093
-
Audio-visual enhancement of speech in noise
-
L. Girin, J.-L. Schwartz, and G. Feng, "Audio-visual enhancement of speech in noise," J. Acoust. Soc. Amer., vol. 109, pp. 3007-3020, 2001.
-
(2001)
J. Acoust. Soc. Amer.
, vol.109
, pp. 3007-3020
-
-
Girin, L.1
Schwartz, J.-L.2
Feng, G.3
-
89
-
-
0036295990
-
Noisy audio feature enhancement using audio-visual speech data
-
R. Goecke, G. Potamianos, and C. Neti, "Noisy audio feature enhancement using audio-visual speech data," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2002, pp. 2025-2028.
-
(2002)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 2025-2028
-
-
Goecke, R.1
Potamianos, G.2
Neti, C.3
-
90
-
-
85009232030
-
Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization)
-
S. Deligne, G. Potamianos, and C. Neti, "Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization)," in Proc. Int. Conf. Spoken Language Processing, 2002, pp. 1449-1452.
-
(2002)
Proc. Int. Conf. Spoken Language Processing
, pp. 1449-1452
-
-
Deligne, S.1
Potamianos, G.2
Neti, C.3
-
91
-
-
0026860706
-
Methods of combining multiple classifiers and their applications in handwritten recognition
-
May/June
-
L. Xu, A. Krzyzak, and C. Y. Suen, "Methods of combining multiple classifiers and their applications in handwritten recognition," IEEE Trans. Syst., Man, Cybern., vol. 22, pp. 418-435, May/June 1992.
-
(1992)
IEEE Trans. Syst., Man, Cybern.
, vol.22
, pp. 418-435
-
-
Xu, L.1
Krzyzak, A.2
Suen, C.Y.3
-
92
-
-
0032021555
-
On combining classifiers
-
Mar.
-
J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, pp. 226-239, Mar. 1998.
-
(1998)
IEEE Trans. Pattern Anal. Machine Intell.
, vol.20
, pp. 226-239
-
-
Kittler, J.1
Hatef, M.2
Duin, R.P.W.3
Matas, J.4
-
93
-
-
0033640646
-
Statistical pattern recognition: A review
-
Jan.
-
A. K. Jain, R. P. W. Duin, and J. Mao, "Statistical pattern recognition: A review," IEEE Trans. Pattern Anal. Machine Intell., vol. 22, pp. 4-37, Jan. 2000.
-
(2000)
IEEE Trans. Pattern Anal. Machine Intell.
, vol.22
, pp. 4-37
-
-
Jain, A.K.1
Duin, R.P.W.2
Mao, J.3
-
94
-
-
4544258314
-
Introduction to biometrics
-
A. Jain, R. Bolle, and S. Pankanti, Eds. Norwell, MA: Kluwer, ch. 1
-
A. Jain, R. Bolle, and S. Pankanti, "Introduction to biometrics," in Biometrics. Personal Identification in Networked Society, A. Jain, R. Bolle, and S. Pankanti, Eds. Norwell, MA: Kluwer, 1999, ch. 1, pp. 1-41.
-
(1999)
Biometrics. Personal Identification in Networked Society
, pp. 1-41
-
-
Jain, A.1
Bolle, R.2
Pankanti, S.3
-
95
-
-
0030355935
-
A new ASR approach based on independent processing and recombination of partial frequency bands
-
H. Bourlard and S. Dupont, "A new ASR approach based on independent processing and recombination of partial frequency bands," in Proc. Int. Conf. Spoken Language Processing, 1996, pp. 426-429.
-
(1996)
Proc. Int. Conf. Spoken Language Processing
, pp. 426-429
-
-
Bourlard, H.1
Dupont, S.2
-
96
-
-
84960942890
-
Test of several external posterior weighting functions for multiband full combination ASR
-
H. Glotin and F. Berthommier, "Test of several external posterior weighting functions for multiband full combination ASR," in Proc. Int. Conf. Spoken Language Processing, vol. 1, 2000, pp. 333-336.
-
(2000)
Proc. Int. Conf. Spoken Language Processing
, vol.1
, pp. 333-336
-
-
Glotin, H.1
Berthommier, F.2
-
97
-
-
85009091822
-
Audio-visual speech recognition using MCE-based HMM's and model-dependent stream weights
-
C. Miyajima, K. Tokuda, and T. Kitamura, "Audio-visual speech recognition using MCE-based HMM's and model-dependent stream weights," in Proc. Int. Conf. Spoken Language Processing, vol. 2, 2000, pp. 1023-1026.
-
(2000)
Proc. Int. Conf. Spoken Language Processing
, vol.2
, pp. 1023-1026
-
-
Miyajima, C.1
Tokuda, K.2
Kitamura, T.3
-
98
-
-
0031625499
-
Discriminative model combination
-
P. Beyerlein, "Discriminative model combination," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1998, pp. 481-484.
-
(1998)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 481-484
-
-
Beyerlein, P.1
-
99
-
-
0034842451
-
Weighting schemes for audio-visual fusion in speech recognition
-
H. Glotin, D. Vergyri, C. Neti, G. Potamianos, and J. Luettin, "Weighting schemes for audio-visual fusion in speech recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2001, pp. 173-176.
-
(2001)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 173-176
-
-
Glotin, H.1
Vergyri, D.2
Neti, C.3
Potamianos, G.4
Luettin, J.5
-
100
-
-
0025681008
-
Hidden Markov model decomposition of speech and noise
-
P. Varga and R. K. Moore, "Hidden Markov model decomposition of speech and noise," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1990, pp. 845-848.
-
(1990)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 845-848
-
-
Varga, P.1
Moore, R.K.2
-
101
-
-
0030685285
-
Coupled hidden Markov models for complex action recognition
-
M. Brand, N. Oliver, and A. Pentland, "Coupled hidden Markov models for complex action recognition," in Proc. Conf. Computer Vision Pattern Recognition, 1997, pp. 994-999.
-
(1997)
Proc. Conf. Computer Vision Pattern Recognition
, pp. 994-999
-
-
Brand, M.1
Oliver, N.2
Pentland, A.3
-
102
-
-
85133343575
-
Speech intelligibility derived from asynchronous processing of auditory-visual information
-
K. W. Grant and S. Greenberg, "Speech intelligibility derived from asynchronous processing of auditory-visual information," in Proc. Conf. Audio-Visual Speech Processing, 2001, pp. 132-137.
-
(2001)
Proc. Conf. Audio-visual Speech Processing
, pp. 132-137
-
-
Grant, K.W.1
Greenberg, S.2
-
103
-
-
85009257778
-
Audio-visual continuous speech recognition using a coupled hidden Markov model
-
X. Liu, Y. Zhao, X. Pi, L. Liang, and A. V. Nefian, "Audio-visual continuous speech recognition using a coupled hidden Markov model," in Proc. Int. Conf. Spoken Language Processing, 2002, pp. 213-216.
-
(2002)
Proc. Int. Conf. Spoken Language Processing
, pp. 213-216
-
-
Liu, X.1
Zhao, Y.2
Pi, X.3
Liang, L.4
Nefian, A.V.5
-
104
-
-
0012668146
-
Asynchrony modeling for audio-visual speech recognition
-
G. Gravier, G. Potamianos, and C. Neti, "Asynchrony modeling for audio-visual speech recognition," in Proc. Human Language Technology Conf., 2002, pp. 1-6.
-
(2002)
Proc. Human Language Technology Conf.
, pp. 1-6
-
-
Gravier, G.1
Potamianos, G.2
Neti, C.3
-
105
-
-
82055176921
-
Fusion of audio-visual information for integrated speech processing
-
J. Bigun and F. Smeraldi, Eds. Berlin, Germany: Springer-Verlag
-
S. Nakamura, "Fusion of audio-visual information for integrated speech processing," in Audio- and Video-Based Biometric Person Authentication, J. Bigun and F. Smeraldi, Eds. Berlin, Germany: Springer-Verlag, 2001, pp. 127-143.
-
(2001)
Audio- and Video-based Biometric Person Authentication
, pp. 127-143
-
-
Nakamura, S.1
-
106
-
-
17344376380
-
Maximum entropy and MCE based HMM stream weight estimation for audio-visual ASR
-
G. Gravier, S. Axelrod, G. Potamianos, and C. Neti, "Maximum entropy and MCE based HMM stream weight estimation for audio-visual ASR," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2002, pp. 853-856.
-
(2002)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 853-856
-
-
Gravier, G.1
Axelrod, S.2
Potamianos, G.3
Neti, C.4
-
107
-
-
0000238336
-
A simplex method for function minimization
-
J. A. Nelder and R. Mead, "A simplex method for function minimization," Comput. J., vol. 7, pp. 308-313, 1965.
-
(1965)
Comput. J.
, vol.7
, pp. 308-313
-
-
Nelder, J.A.1
Mead, R.2
-
108
-
-
0001437767
-
A new SNR-feature mapping for robust multistream speech recognition
-
F. Berthommier and H. Glotin, "A new SNR-feature mapping for robust multistream speech recognition," in Proc. Int. Congress Phonetic Sciences, 1999, pp. 711-715.
-
(1999)
Proc. Int. Congress Phonetic Sciences
, pp. 711-715
-
-
Berthommier, F.1
Glotin, H.2
-
109
-
-
85009153179
-
Stream confidence estimation for audio-visual speech recognition
-
G. Potamianos and C. Neti, "Stream confidence estimation for audio-visual speech recognition," in Proc. Int. Conf. Spoken Language Processing, vol. 3, 2000, pp. 746-749.
-
(2000)
Proc. Int. Conf. Spoken Language Processing
, vol.3
, pp. 746-749
-
-
Potamianos, G.1
Neti, C.2
-
110
-
-
0028419019
-
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
-
Apr.
-
J.-L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Processing, vol. 2, pp. 291-298, Apr. 1994.
-
(1994)
IEEE Trans. Speech Audio Processing
, vol.2
, pp. 291-298
-
-
Gauvain, J.-L.1
Lee, C.-H.2
-
111
-
-
0029288633
-
Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
-
C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
-
(1995)
Comput. Speech Lang.
, vol.9
, pp. 171-185
-
-
Leggetter, C.J.1
Woodland, P.C.2
-
112
-
-
85135155427
-
A comparative study of speaker adaptation techniques
-
L. Neumeyer, A. Sankar, and V. Digalakis, "A comparative study of speaker adaptation techniques," in Proc. Eur. Conf. Speech Communication Technology, 1995, pp. 1127-1130.
-
(1995)
Proc. Eur. Conf. Speech Communication Technology
, pp. 1127-1130
-
-
Neumeyer, L.1
Sankar, A.2
Digalakis, V.3
-
113
-
-
0030677475
-
Speaker adaptive training: A maximum likelihood approach to speaker normalization
-
T. Anastasakos, J. McDonough, and J. Makhoul, "Speaker adaptive training: A maximum likelihood approach to speaker normalization," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1997, pp. 1043-1046.
-
(1997)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 1043-1046
-
-
Anastasakos, T.1
McDonough, J.2
Makhoul, J.3
-
116
-
-
84947917954
-
The M2VTS multimodal face database
-
J. Bigün, G. Chollet, and G. Borgefors, Eds. Berlin, Germany: Springer-Verlag
-
S. Pigeon and L. Vandendorpe, "The M2VTS multimodal face database," in Audio- and Video-based Biometric Person Authentication, J. Bigün, G. Chollet, and G. Borgefors, Eds. Berlin, Germany: Springer-Verlag, 1997, pp. 403-409.
-
(1997)
Audio- and Video-based Biometric Person Authentication
, pp. 403-409
-
-
Pigeon, S.1
Vandendorpe, L.2
-
117
-
-
0001935972
-
XM2VTS: The extended M2VTS database
-
K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, "XM2VTS: The extended M2VTS database," in Proc. Int. Conf. Audio Video-based Biometric Person Authentication, 1999, pp. 72-76.
-
(1999)
Proc. Int. Conf. Audio Video-based Biometric Person Authentication
, pp. 72-76
-
-
Messer, K.1
Matas, J.2
Kittler, J.3
Luettin, J.4
Maitre, G.5
-
119
-
-
0002068237
-
A fast approximate acoustic match for large vocabulary speech recognition
-
Jan.
-
L. R. Bahl, S. V. De Gennaro, P. S. Gopalakrishnan, and R. L. Mercer, "A fast approximate acoustic match for large vocabulary speech recognition," IEEE Trans. Speech Audio Processing, vol. 1, pp. 59-67, Jan. 1993.
-
(1993)
IEEE Trans. Speech Audio Processing
, vol.1
, pp. 59-67
-
-
Bahl, L.R.1
De Gennaro, S.V.2
Gopalakrishnan, P.S.3
Mercer, R.L.4
-
120
-
-
0032074310
-
Audio-visual integration in multimodal communication
-
May
-
T. Chen and R. R. Rao, "Audio-visual integration in multimodal communication," Proc. IEEE, vol. 86, pp. 837-852, May 1998.
-
(1998)
Proc. IEEE
, vol.86
, pp. 837-852
-
-
Chen, T.1
Rao, R.R.2
-
121
-
-
0031640392
-
A syntactic approach to automatic lip feature extraction for speaker identification
-
T. Wark and S. Sridharan, "A syntactic approach to automatic lip feature extraction for speaker identification," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1998, pp. 3693-3696.
-
(1998)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 3693-3696
-
-
Wark, T.1
Sridharan, S.2
-
122
-
-
0002347773
-
Multisensor biometric person recognition in an access control system
-
B. Fröba, C. Küblbeck, C. Rothe, and P. Plankensteiner, "Multisensor biometric person recognition in an access control system," in Proc. Int. Conf. Audio Video-based Biometric Person Authentication, 1999, pp. 55-59.
-
(1999)
Proc. Int. Conf. Audio Video-based Biometric Person Authentication
, pp. 55-59
-
-
Fröba, B.1
Küblbeck, C.2
Rothe, C.3
Plankensteiner, P.4
-
123
-
-
21244474602
-
Audio-visual speaker recognition for broadcast news: Some fusion techniques
-
B. Maison, C. Neti, and A. Senior, "Audio-visual speaker recognition for broadcast news: Some fusion techniques," in Proc. Workshop Multimedia Signal Processing, 1999, pp. 161-167.
-
(1999)
Proc. Workshop Multimedia Signal Processing
, pp. 161-167
-
-
Maison, B.1
Neti, C.2
Senior, A.3
-
124
-
-
33747294606
-
What can visual speech synthesis tell visual speech recognition?
-
Pacific Grove, CA
-
M. M. Cohen and D. W. Massaro, "What can visual speech synthesis tell visual speech recognition?," presented at the Asilomar Conf. Signals, Systems, Computers, Pacific Grove, CA, 1994.
-
(1994)
Asilomar Conf. Signals, Systems, Computers
-
-
Cohen, M.M.1
Massaro, D.W.2
-
125
-
-
0029291072
-
Lip synchronization using speech-assisted video processing
-
Apr.
-
T. Chen, H. P. Graf, and K. Wang, "Lip synchronization using speech-assisted video processing," IEEE Signal Processing Lett., vol. 2, pp. 57-59, Apr. 1995.
-
(1995)
IEEE Signal Processing Lett.
, vol.2
, pp. 57-59
-
-
Chen, T.1
Graf, H.P.2
Wang, K.3
-
126
-
-
85069404424
-
Audio-visual unit selection for the synthesis of photo-realistic talking-heads
-
E. Cosatto, G. Potamianos, and H. P. Graf, "Audio-visual unit selection for the synthesis of photo-realistic talking-heads," in Proc. Int. Conf. Multimedia Expo, 2000, pp. 1097-1100.
-
(2000)
Proc. Int. Conf. Multimedia Expo
, pp. 1097-1100
-
-
Cosatto, E.1
Potamianos, G.2
Graf, H.P.3
-
127
-
-
0034271782
-
Photo-realistic talking-heads from image samples
-
Sept.
-
E. Cosatto and H. P. Graf, "Photo-realistic talking-heads from image samples," IEEE Trans. Multimedia, vol. 2, pp. 152-163, Sept. 2000.
-
(2000)
IEEE Trans. Multimedia
, vol.2
, pp. 152-163
-
-
Cosatto, E.1
Graf, H.P.2
-
129
-
-
0033708494
-
Audio-visual intent to speak detection for human computer interaction
-
P. De Cuetos, C. Neti, and A. Senior, "Audio-visual intent to speak detection for human computer interaction," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2000, pp. 1325-1328.
-
(2000)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 1325-1328
-
-
De Cuetos, P.1
Neti, C.2
Senior, A.3
-
130
-
-
0003342953
-
Integration of multimodal features for video scene classification based on HMM
-
J. Huang, Z. Liu, Y. Wang, Y. Chen, and E. Wong, "Integration of multimodal features for video scene classification based on HMM," in Proc. Workshop Multimedia Signal Processing, 1999, pp. 53-58.
-
(1999)
Proc. Workshop Multimedia Signal Processing
, pp. 53-58
-
-
Huang, J.1
Liu, Z.2
Wang, Y.3
Chen, Y.4
Wong, E.5
-
131
-
-
84925591950
-
Audiovisual speech coder: Using vector quantization to exploit the audio/video correlation
-
E. Foucher, L. Girin, and G. Feng, "Audiovisual speech coder: Using vector quantization to exploit the audio/video correlation," in Proc. Conf. Audio-Visual Speech Processing, 1998, pp. 67-71.
-
(1998)
Proc. Conf. Audio-visual Speech Processing
, pp. 67-71
-
-
Foucher, E.1
Girin, L.2
Feng, G.3
-
132
-
-
0036874541
-
Separation of audio-visual speech sources: A new approach exploiting the audio-visual coherence of speech stimuli
-
Nov.
-
D. Sodoyer, J.-L. Schwartz, L. Girin, J. Klinkisch, and C. Jutten, "Separation of audio-visual speech sources: A new approach exploiting the audio-visual coherence of speech stimuli," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1165-1173, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1165-1173
-
-
Sodoyer, D.1
Schwartz, J.-L.2
Girin, L.3
Klinkisch, J.4
Jutten, C.5
-
133
-
-
0028997041
-
Knowing who to listen to in speech recognition: Visually guided beamforming
-
U. Bub, M. Hunke, and A. Waibel, "Knowing who to listen to in speech recognition: Visually guided beamforming," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 1995, pp. 848-851.
-
(1995)
Proc. Int. Conf. Acoustics, Speech, and Signal Processing
, pp. 848-851
-
-
Bub, U.1
Hunke, M.2
Waibel, A.3
-
134
-
-
0036874485
-
Joint audio-visual tracking using particle filters
-
Nov.
-
D. N. Zotkin, R. Duraiswami, and L. S. Davis, "Joint audio-visual tracking using particle filters," EURASIP J. Appl. Signal Processing, vol. 2002, pp. 1154-1164, Nov. 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, pp. 1154-1164
-
-
Zotkin, D.N.1
Duraiswami, R.2
Davis, L.S.3