-
1
-
-
0001432664
-
On the integration of auditory and visual parameters in HMM-based ASR
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer
-
A. Adjoudani and C. Benoit, "On the integration of auditory and visual parameters in HMM-based ASR," in Speechreading by Humans and Machines, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer, pp. 461-471, 1996.
-
(1996)
Speechreading by Humans and Machines
, pp. 461-471
-
-
Adjoudani, A.1
Benoit, C.2
-
2
-
-
0003152968
-
Speech enhancement in the 1980s: Noise suppression with pattern matching
-
Dekker
-
S. Boll, "Speech enhancement in the 1980s: noise suppression with pattern matching," In Advances in Speech Signal Processing, pp. 309-325, Dekker, 1992.
-
(1992)
Advances in Speech Signal Processing
, pp. 309-325
-
-
Boll, S.1
-
3
-
-
85013597845
-
Eigenlips for robust speech recognition
-
C. Bregler and Y. Konig, "Eigenlips for Robust Speech Recognition," In Proc. ICASSP, 1994.
-
(1994)
Proc. ICASSP
-
-
Bregler, C.1
Konig, Y.2
-
4
-
-
84925639646
-
Real-time lip tracking and bimodal continuous speech recognition
-
Redondo Beach, CA
-
M. Chan, Y. Zhang, and T. Huang, "Real-time lip tracking and bimodal continuous speech recognition," in Proc. Works. Multimedia Signal Processing, pp. 65-70, Redondo Beach, CA, 1998.
-
(1998)
Proc. Works. Multimedia Signal Processing
, pp. 65-70
-
-
Chan, M.1
Zhang, Y.2
Huang, T.3
-
7
-
-
85009135946
-
Bimodal speech recognition using coupled hidden Markov models
-
Beijing, China
-
S. Chu and T. Huang, "Bimodal speech recognition using coupled hidden Markov models," In Proc. Int. Conf. Spoken Lang. Processing, vol. II, Beijing, China, pp. 747-750, 2000.
-
(2000)
Proc. Int. Conf. Spoken Lang. Processing
, vol.2
, pp. 747-750
-
-
Chu, S.1
Huang, T.2
-
8
-
-
84957810778
-
Active appearance models
-
Germany
-
T. Cootes, G. Edwards, and C. Taylor, "Active appearance models," In Proc. Europ. Conf. Computer Vision, Germany, pp. 484-498, 1998.
-
(1998)
Proc. Europ. Conf. Computer Vision
, pp. 484-498
-
-
Cootes, T.1
Edwards, G.2
Taylor, C.3
-
9
-
-
0029182228
-
Active shape models - Their training and application
-
T. Cootes, C. Taylor, D. Cooper, and J. Graham, "Active shape models - their training and application," Computer Vision Image Understanding, vol. 61, no. 1, pp. 38-59, 1995.
-
(1995)
Computer Vision Image Understanding
, vol.61
, Issue.1
, pp. 38-59
-
-
Cootes, T.1
Taylor, C.2
Cooper, D.3
Graham, J.4
-
10
-
-
0034270644
-
Audio-visual speech modeling for continuous speech recognition
-
S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, 2000.
-
(2000)
IEEE Trans. Multimedia
, vol.2
, Issue.3
, pp. 141-151
-
-
Dupont, S.1
Luettin, J.2
-
12
-
-
0036875002
-
A support vector machine based dynamic network for visual speech recognition applications
-
M. Gordan, C. Kotropoulos, and I. Pitas, "A support vector machine based dynamic network for visual speech recognition applications," EURASIP J. Appl. Signal Processing, vol. 2002, no. 11, pp. 1248-1259, 2002.
-
(2002)
EURASIP J. Appl. Signal Processing
, vol.2002
, Issue.11
, pp. 1248-1259
-
-
Gordan, M.1
Kotropoulos, C.2
Pitas, I.3
-
13
-
-
0034841727
-
Application of affine-invariant Fourier descriptors to lipreading for audio-visual speech recognition
-
Salt Lake City, UT
-
S. Gurbuz, Z. Tufekci, E. Patterson, and J. Gowdy, "Application of affine-invariant Fourier descriptors to lipreading for audio-visual speech recognition," in Proc. Int. Conf. Acoust., Speech, Signal Processing, pp. 177-180, Salt Lake City, UT, 2001.
-
(2001)
Proc. Int. Conf. Acoust., Speech, Signal Processing
, pp. 177-180
-
-
Gurbuz, S.1
Tufekci, Z.2
Patterson, E.3
Gowdy, J.4
-
14
-
-
34250090755
-
Snakes: Active contour models
-
M. Kass, A. Witkin, and D. Terzopoulos, "Snakes: Active contour models," Int. J. Computer Vision, vol. 1, no. 4, pp. 321-331, 1988.
-
(1988)
Int. J. Computer Vision
, vol.1
, Issue.4
, pp. 321-331
-
-
Kass, M.1
Witkin, A.2
Terzopoulos, D.3
-
15
-
-
79952968027
-
Speech recognition via phonetically featured syllables
-
Sydney
-
S. King, T. Stephenson, S. Isard, P. Taylor and A. Strachan, "Speech recognition via phonetically featured syllables," In Proc. ICSLP, Sydney, 1998.
-
(1998)
Proc. ICSLP
-
-
King, S.1
Stephenson, T.2
Isard, S.3
Taylor, P.4
Strachan, A.5
-
16
-
-
0038193561
-
Combining acoustic and articulatory-feature information for robust speech recognition
-
Sydney
-
K. Kirchhoff, G. Fink and G. Sagerer, "Combining Acoustic and Articulatory-feature Information for Robust Speech Recognition," In Proc. ICSLP, pp. 891-894, Sydney, 1998.
-
(1998)
Proc. ICSLP
, pp. 891-894
-
-
Kirchhoff, K.1
Fink, G.2
Sagerer, G.3
-
17
-
-
14944340400
-
Neural architectures for sensor fusion in speech recognition
-
Greece
-
G. Krone, B. Talle, A. Wichert, and G. Palm, "Neural architectures for sensor fusion in speech recognition," In Proc. Europ. Tut. Works. Audio-Visual Speech Processing, pp. 57-60, Greece, 1997.
-
(1997)
Proc. Europ. Tut. Works. Audio-Visual Speech Processing
, pp. 57-60
-
-
Krone, G.1
Talle, B.2
Wichert, A.3
Palm, G.4
-
18
-
-
14944341906
-
Feature-based pronunciation modeling for speech recognition
-
Boston
-
K. Livescu and J. Glass, "Feature-based Pronunciation Modeling for Speech Recognition," In Proc. HLT/NAACL, Boston, 2004.
-
(2004)
Proc. HLT/NAACL
-
-
Livescu, K.1
Glass, J.2
-
19
-
-
0025750892
-
Automatic lipreading by optical flow analysis
-
K. Mase and A. Pentland, "Automatic Lipreading by optical flow analysis," Systems and Computers in Japan, vol. 22, no. 6, pp. 67-76, 1991.
-
(1991)
Systems and Computers in Japan
, vol.22
, Issue.6
, pp. 67-76
-
-
Mase, K.1
Pentland, A.2
-
20
-
-
0036472941
-
Extraction of visual features for lipreading
-
I. Matthews, T. Cootes, A. Bangham, S. Cox and R. Harvey, "Extraction of Visual Features for Lipreading," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 2, 2002.
-
(2002)
IEEE Transactions on Pattern Analysis and Machine Intelligence
, vol.24
, Issue.2
-
-
Matthews, I.1
Cootes, T.2
Bangham, A.3
Cox, S.4
Harvey, R.5
-
21
-
-
85009240321
-
A flexible stream architecture for ASR using articulatory features
-
Denver
-
F. Metze and A. Waibel, "A Flexible Stream Architecture for ASR Using Articulatory Features," In Proc. ICSLP, Denver, 2002.
-
(2002)
Proc. ICSLP
-
-
Metze, F.1
Waibel, A.2
-
22
-
-
84955023511
-
An analysis of perceptual confusions among some English consonants
-
G. Miller and P. Nicely, "An Analysis of Perceptual Confusions among some English Consonants," J. Acoustical Society America, vol. 27, no. 2, pp. 338-352, 1955.
-
(1955)
J. Acoustical Society America
, vol.27
, Issue.2
, pp. 338-352
-
-
Miller, G.1
Nicely, P.2
-
23
-
-
0035790960
-
Large-vocabulary audio-visual speech recognition: A summary of the Johns Hopkins summer 2000 workshop
-
Cannes, France
-
C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, and D. Vergyri, "Large-vocabulary audio-visual speech recognition: A summary of the Johns Hopkins Summer 2000 Workshop," In Proc. Works. Signal Processing, pp. 619-624, Cannes, France, 2001.
-
(2001)
Proc. Works. Signal Processing
, pp. 619-624
-
-
Neti, C.1
Potamianos, G.2
Luettin, J.3
Matthews, I.4
Glotin, H.5
Vergyri, D.6
-
24
-
-
0033676801
-
Denoising of human speech using combined acoustic and EM sensor signal processing
-
Istanbul, Turkey
-
L. Ng, G. Burnett, J. Holzrichter, and T. Gable, "Denoising of Human Speech Using Combined Acoustic and EM Sensor Signal Processing," In Proc. ICASSP, Istanbul, Turkey, 2000.
-
(2000)
Proc. ICASSP
-
-
Ng, L.1
Burnett, G.2
Holzrichter, J.3
Gable, T.4
-
25
-
-
84900117327
-
Feature based representation for audio-visual speech recognition
-
Santa Cruz, CA
-
P. Niyogi, E. Petajan, and J. Zhong, "Feature Based Representation for Audio-Visual Speech Recognition", Proceedings of the Audio Visual Speech Conference, Santa Cruz, CA, 1999.
-
(1999)
Proceedings of the Audio Visual Speech Conference
-
-
Niyogi, P.1
Petajan, E.2
Zhong, J.3
-
26
-
-
0021541159
-
Automatic lipreading to enhance speech recognition
-
Atlanta, GA
-
E. Petajan, "Automatic lipreading to enhance speech recognition," In Proc. Global Telecomm. Conf., pp. 265-272, Atlanta, GA, 1984.
-
(1984)
Proc. Global Telecomm. Conf.
, pp. 265-272
-
-
Petajan, E.1
-
27
-
-
85009230873
-
Audio-visual speech recognition in challenging environments
-
Geneva
-
G. Potamianos and C. Neti, "Audio-visual speech recognition in challenging environments," In Proc. Eur. Conf. Speech Comm. Tech., pp. 1293-1296, Geneva, 2003.
-
(2003)
Proc. Eur. Conf. Speech Comm. Tech.
, pp. 1293-1296
-
-
Potamianos, G.1
Neti, C.2
-
28
-
-
4544290191
-
Recent advances in the automatic recognition of audio-visual speech
-
G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. Senior, "Recent Advances in the Automatic Recognition of Audio-Visual Speech", In Proc. IEEE, 2003.
-
(2003)
Proc. IEEE
-
-
Potamianos, G.1
Neti, C.2
Gravier, G.3
Garg, A.4
Senior, A.5
-
29
-
-
0034517163
-
A cascade image transform for speaker-independent automatic speechreading
-
New York
-
G. Potamianos, A. Verma, C. Neti, G. Iyengar, and S. Basu, "A Cascade Image Transform for Speaker-Independent Automatic Speechreading," In Proc. ICME, volume II, pp. 1097-1100, New York, 2000.
-
(2000)
Proc. ICME
, vol.2
, pp. 1097-1100
-
-
Potamianos, G.1
Verma, A.2
Neti, C.3
Iyengar, G.4
Basu, S.5
-
30
-
-
0001048664
-
Visual contribution to speech intelligibility in noise
-
W. Sumby and I. Pollack, "Visual contribution to speech intelligibility in noise," J. Acoustical Society America, vol. 26, no. 2, pp. 212-215, 1954.
-
(1954)
J. Acoustical Society America
, vol.26
, Issue.2
, pp. 212-215
-
-
Sumby, W.1
Pollack, I.2
-
31
-
-
0036165806
-
An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition
-
J. Sun and L. Deng, "An Overlapping-Feature Based Phonological Model Incorporating Linguistic Constraints: Applications to Speech Recognition", J. Acoustical Society of America, vol. 111, No. 2, pp. 1086-1101, 2002.
-
(2002)
J. Acoustical Society of America
, vol.111
, Issue.2
, pp. 1086-1101
-
-
Sun, J.1
Deng, L.2
-
32
-
-
0003770986
-
Comparing models for audiovisual fusion in a noisy-vowel recognition task
-
P. Teissier, J. Robert-Ribes, and J. Schwartz, "Comparing models for audiovisual fusion in a noisy-vowel recognition task," IEEE Trans. Speech Audio Processing, vol. 7, no. 6, pp. 629-642, 1999.
-
(1999)
IEEE Trans. Speech Audio Processing
, vol.7
, Issue.6
, pp. 629-642
-
-
Teissier, P.1
Robert-Ribes, J.2
Schwartz, J.3