-
1
-
-
0021541159
-
Automatic lipreading to enhance speech recognition
-
Atlanta, GA, Nov
-
E. D. Petajan, "Automatic lipreading to enhance speech recognition," in Proc. Global Telecommunications Conf., Atlanta, GA, Nov. 1984, pp. 265-272.
-
(1984)
Proc. Global Telecommunications Conf
, pp. 265-272
-
-
Petajan, E.D.1
-
2
-
-
0036502797
-
A review of speech-based bimodal recognition
-
Mar
-
C. C. Chibelushi, F. Deravi, and J. S. D. Mason, "A review of speech-based bimodal recognition," IEEE Trans. Multimedia, vol. 4, no. 1, pp. 23-37, Mar. 2002.
-
(2002)
IEEE Trans. Multimedia
, vol.4
, Issue.1
, pp. 23-37
-
-
Chibelushi, C.C.1
Deravi, F.2
Mason, J.S.D.3
-
3
-
-
0030830419
-
Sensor fusion potential exploitation: Innovative architectures and illustrative applications
-
Jan
-
B. V. Dasarathy, "Sensor fusion potential exploitation: Innovative architectures and illustrative applications," Proc. IEEE, vol. 85, pp. 24-38, Jan. 1997.
-
(1997)
Proc. IEEE
, vol.85
, pp. 24-38
-
-
Dasarathy, B.V.1
-
5
-
-
34548139784
-
Training hidden Markov models by hybrid simulated annealing for visual speech recognition
-
Taipei, Taiwan, R.O.C, Oct
-
J.-S. Lee and C. H. Park, "Training hidden Markov models by hybrid simulated annealing for visual speech recognition," in Proc. IEEE Int. Conf. Systems, Man, Cybernetics, Taipei, Taiwan, R.O.C., Oct. 2006, pp. 198-202.
-
(2006)
Proc. IEEE Int. Conf. Systems, Man, Cybernetics
, pp. 198-202
-
-
Lee, J.-S.1
Park, C.H.2
-
6
-
-
0004056285
-
-
Upper Saddle River, NJ: Prentice Hall
-
X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Upper Saddle River, NJ: Prentice Hall, 2001.
-
(2001)
Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
-
-
Huang, X.1
Acero, A.2
Hon, H.-W.3
-
7
-
-
0027957839
-
Effect of temporal envelope smearing on speech reception
-
Feb
-
R. Drullman, J. M. Festen, and R. Plomp, "Effect of temporal envelope smearing on speech reception," J. Acoust. Soc. Amer., vol. 95, no. 2, pp. 1053-1064, Feb. 1994.
-
(1994)
J. Acoust. Soc. Amer
, vol.95
, Issue.2
, pp. 1053-1064
-
-
Drullman, R.1
Festen, J.M.2
Plomp, R.3
-
8
-
-
84892184580
-
Speech intelligibility in the presence of cross-channel spectral asynchrony
-
Seattle, WA
-
T. Arai and S. Greenberg, "Speech intelligibility in the presence of cross-channel spectral asynchrony," in Proc. ICASSP, Seattle, WA, 1998, vol. 2, pp. 933-936.
-
(1998)
Proc. ICASSP
, vol.2
, pp. 933-936
-
-
Arai, T.1
Greenberg, S.2
-
9
-
-
0022667694
-
Speaker-independent isolated word recognition using dynamic features of speech spectrum
-
Feb
-
S. Furui, "Speaker-independent isolated word recognition using dynamic features of speech spectrum," IEEE Trans. Acoust., Speech, Signal Process., vol. 34, no. 1, pp. 52-59, Feb. 1986.
-
(1986)
IEEE Trans. Acoust., Speech, Signal Process
, vol.34
, Issue.1
, pp. 52-59
-
-
Furui, S.1
-
11
-
-
33846242179
-
Focused state transition information in ASR
-
San Juan, PR, Nov
-
C. Bartels and J. Bilmes, "Focused state transition information in ASR," in Proc. Workshop on Automatic Speech Recognition and Understanding, San Juan, PR, Nov. 2005, pp. 191-196.
-
(2005)
Proc. Workshop on Automatic Speech Recognition and Understanding
, pp. 191-196
-
-
Bartels, C.1
Bilmes, J.2
-
12
-
-
45949121309
-
Fast simulated annealing
-
June
-
H. H. Szu and R. L. Hartley, "Fast simulated annealing," Phys. Lett. A, vol. 122, no. 3-4, pp. 157-162, June 1987.
-
(1987)
Phys. Lett. A
, vol.122
, Issue.3-4
, pp. 157-162
-
-
Szu, H.H.1
Hartley, R.L.2
-
13
-
-
0022227186
-
Training of HMM recognizers by simulated annealing
-
Tampa, FL, Mar
-
D. Paul, "Training of HMM recognizers by simulated annealing," in Proc. ICASSP, Tampa, FL, Mar. 1985, pp. 13-16.
-
(1985)
Proc. ICASSP
, pp. 13-16
-
-
Paul, D.1
-
15
-
-
10444288769
-
n-dimensional Cauchy neighbor generation for the fast simulated annealing
-
Nov
-
D. Nam, J.-S. Lee, and C. H. Park, "n-dimensional Cauchy neighbor generation for the fast simulated annealing," IEICE Trans. Inf. Syst., vol. E87-D, no. 11, pp. 2499-2502, Nov. 2004.
-
(2004)
IEICE Trans. Inf. Syst
, vol.E87-D
, Issue.11
, pp. 2499-2502
-
-
Nam, D.1
Lee, J.-S.2
Park, C.H.3
-
16
-
-
5744249209
-
Equation of state calculations by fast computing machines
-
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., vol. 21, no. 6, pp. 1087-1092, 1953.
-
(1953)
J. Chem. Phys
, vol.21
, Issue.6
, pp. 1087-1092
-
-
Metropolis, N.1
Rosenbluth, A.W.2
Rosenbluth, M.N.3
Teller, A.H.4
Teller, E.5
-
17
-
-
47649094767
-
Audio-Visual Speech Recognition: Stochastic Optimization of Hidden Markov Models, Modeling of Interframe Correlations and Integration With Neural Networks
-
Ph.D. dissertation, Dept. Elect. Eng. Comput. Science, KAIST, Daejeon, Korea
-
J.-S. Lee, "Audio-Visual Speech Recognition: Stochastic Optimization of Hidden Markov Models, Modeling of Interframe Correlations and Integration With Neural Networks," Ph.D. dissertation, Dept. Elect. Eng. Comput. Science, KAIST, Daejeon, Korea, 2006.
-
(2006)
-
-
Lee, J.-S.1
-
18
-
-
0041568115
-
Schur complements and statistics
-
Mar
-
D. V. Ouellette, "Schur complements and statistics," Linear Algebra Appl., vol. 36, pp. 187-295, Mar. 1981.
-
(1981)
Linear Algebra Appl
, vol.36
, pp. 187-295
-
-
Ouellette, D.V.1
-
20
-
-
0026368826
-
Regression features for recognition of speech in quiet and in noise
-
Toronto, ON, Canada, Apr
-
T. H. Applebaum and B. A. Hanson, "Regression features for recognition of speech in quiet and in noise," in Proc. ICASSP, Toronto, ON, Canada, Apr. 1991, vol. 2, pp. 985-988.
-
(1991)
Proc. ICASSP
, vol.2
, pp. 985-988
-
-
Applebaum, T.H.1
Hanson, B.A.2
-
21
-
-
0003408774
-
-
Natick, MA: The Mathworks, Inc
-
Optimization Toolbox User's Guide. Natick, MA: The Mathworks, Inc., 2005.
-
(2005)
Optimization Toolbox User's Guide
-
-
-
23
-
-
34247172408
-
Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments
-
L. A. Ross, D. Saint-Amour, V. M. Leavitt, D. C. Javitt, and J. J. Foxe, "Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments," Cerebral Cortex, vol. 17, no. 5, pp. 1147-1153, 2007.
-
(2007)
Cerebral Cortex
, vol.17
, Issue.5
, pp. 1147-1153
-
-
Ross, L.A.1
Saint-Amour, D.2
Leavitt, V.M.3
Javitt, D.C.4
Foxe, J.J.5
-
24
-
-
0035347346
-
Bisensory augmentation: A speechreading advantage when speech is clearly audible and intact
-
P. Arnold and F. Hill, "Bisensory augmentation: A speechreading advantage when speech is clearly audible and intact," Brit. J. Psychol., vol. 92, pp. 339-355, 2001.
-
(2001)
Brit. J. Psychol
, vol.92
, pp. 339-355
-
-
Arnold, P.1
Hill, F.2
-
25
-
-
34047262788
-
The intrinsic bimodality of speech communication and the synthesis of talking faces
-
M. M. Taylor, F. Nel, and D. Bouwhuis, Eds. Amsterdam, The Netherlands: John Benjamins
-
C. Benoît, "The intrinsic bimodality of speech communication and the synthesis of talking faces," in The Structure of Multimodal Dialogue II, M. M. Taylor, F. Nel, and D. Bouwhuis, Eds. Amsterdam, The Netherlands: John Benjamins, 2000, pp. 485-502.
-
(2000)
The Structure of Multimodal Dialogue II
, pp. 485-502
-
-
-
26
-
-
33745102745
-
Auditory-visual speech perception and synchrony detection for speech and nonspeech signals
-
June
-
B. Conrey and D. B. Pisoni, "Auditory-visual speech perception and synchrony detection for speech and nonspeech signals," J. Acoust. Soc. Amer., vol. 119, no. 6, pp. 4065-4073, June 2006.
-
(2006)
J. Acoust. Soc. Amer
, vol.119
, Issue.6
, pp. 4065-4073
-
-
Conrey, B.1
Pisoni, D.B.2
-
27
-
-
0036874527
-
Noise adaptive stream weighting in audio-visual speech recognition
-
M. Heckmann, F. Berthommier, and K. Kroschel, "Noise adaptive stream weighting in audio-visual speech recognition," EURASIP J. Appl. Signal Process., vol. 11, pp. 1260-1273, 2002.
-
(2002)
EURASIP J. Appl. Signal Process
, vol.11
, pp. 1260-1273
-
-
Heckmann, M.1
Berthommier, F.2
Kroschel, K.3
-
28
-
-
34547497793
-
Dynamic stream weight modeling for audio-visual speech recognition
-
Honolulu, HI, Apr
-
E. Marcheret, V. Libal, and G. Potamianos, "Dynamic stream weight modeling for audio-visual speech recognition," in Proc. ICASSP, Honolulu, HI, Apr. 2007, vol. 4, pp. 945-948.
-
(2007)
Proc. ICASSP
, vol.4
, pp. 945-948
-
-
Marcheret, E.1
Libal, V.2
Potamianos, G.3
-
29
-
-
0032180188
-
Adaptive fusion of acoustic and visual sources for automatic speech recognition
-
Oct
-
A. Rogozan and P. Deléglise, "Adaptive fusion of acoustic and visual sources for automatic speech recognition," Speech Commun., vol. 26, no. 1-2, pp. 149-161, Oct. 1998.
-
(1998)
Speech Commun
, vol.26
, Issue.1-2
, pp. 149-161
-
-
Rogozan, A.1
Deléglise, P.2
-
30
-
-
28444493889
-
Sensor fusion weighting measures in audio-visual speech recognition
-
Dunedin, New Zealand
-
T. W. Lewis and D. M. W. Powers, "Sensor fusion weighting measures in audio-visual speech recognition," in Proc. 27th Conf. Australasian Computer Science, Dunedin, New Zealand, 2004, pp. 305-314.
-
(2004)
Proc. 27th Conf. Australasian Computer Science
, pp. 305-314
-
-
Lewis, T.W.1
Powers, D.M.W.2
-
31
-
-
34047263009
-
Visual model structures and synchrony constraints for audio-visual speech recognition
-
May
-
T. J. Hazen, "Visual model structures and synchrony constraints for audio-visual speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp. 1082-1089, May 2006.
-
(2006)
IEEE Trans. Audio, Speech, Lang. Process
, vol.14
, Issue.3
, pp. 1082-1089
-
-
Hazen, T.J.1
-
32
-
-
0042954451
-
Late integration in audiovisual continuous speech recognition
-
Keystone, CO, Dec
-
A. Verma, T. Faruquie, C. Neti, and S. Basu, "Late integration in audiovisual continuous speech recognition," in Proc. Workshop on Automatic Speech Recognition and Understanding, Keystone, CO, Dec. 1999, pp. 71-74.
-
(1999)
Proc. Workshop on Automatic Speech Recognition and Understanding
, pp. 71-74
-
-
Verma, A.1
Faruquie, T.2
Neti, C.3
Basu, S.4
-
33
-
-
1842854571
-
Continuous audiovisual digit recognition using N-best decision fusion
-
June
-
G. F. Meyer, J. B. Mulligan, and S. M. Wuerger, "Continuous audiovisual digit recognition using N-best decision fusion," Inform. Fusion, vol. 5, no. 2, pp. 91-101, June 2004.
-
(2004)
Inform. Fusion
, vol.5
, Issue.2
, pp. 91-101
-
-
Meyer, G.F.1
Mulligan, J.B.2
Wuerger, S.M.3
-
34
-
-
33646814706
-
A stream-weight optimization method for multi-stream HMMs based on likelihood value normalization
-
Philadelphia, PA, Mar
-
S. Tamura, K. Iwano, and S. Furui, "A stream-weight optimization method for multi-stream HMMs based on likelihood value normalization," in Proc. ICASSP, Philadelphia, PA, Mar. 2005, vol. 1, pp. 469-472.
-
(2005)
Proc. ICASSP
, vol.1
, pp. 469-472
-
-
Tamura, S.1
Iwano, K.2
Furui, S.3
-
35
-
-
0001432664
-
On the integration of auditory and visual parameters in an HMM-based ASR
-
D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer
-
A. Adjoudani and C. Benoît, "On the integration of auditory and visual parameters in an HMM-based ASR," in Speechreading by Humans and Machines: Models, Systems and Applications, ser. NATO ASI Series, D. G. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer, 1996, pp. 461-472.
-
(1996)
Speechreading by Humans and Machines: Models, Systems and Applications, ser. NATO ASI Series
, pp. 461-472
-
-
-
38
-
-
0027623210
-
Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems
-
A. Varga and H. J. M. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Commun., vol. 12, no. 3, pp. 247-251, 1993.
-
(1993)
Speech Commun
, vol.12
, Issue.3
, pp. 247-251
-
-
Varga, A.1
Steeneken, H.J.M.2
-
39
-
-
0034270644
-
Audio-visual speech modeling for continuous speech recognition
-
Sep
-
S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000.
-
(2000)
IEEE Trans. Multimedia
, vol.2
, Issue.3
, pp. 141-151
-
-
Dupont, S.1
Luettin, J.2
-
40
-
-
84880887921
-
Multimodal integration - A biological view
-
Seattle, WA
-
M. H. Coen, "Multimodal integration - A biological view," in Proc. Int. Joint Conf. Artificial Intelligence, Seattle, WA, 2001, pp. 1417-1424.
-
(2001)
Proc. Int. Joint Conf. Artificial Intelligence
, pp. 1417-1424
-
-
Coen, M.H.1
-
41
-
-
2342451199
-
Multimedia content processing through cross-modal association
-
Berkeley, CA, Nov
-
D. Li, N. Dimitrova, M. Li, and I. K. Sethi, "Multimedia content processing through cross-modal association," in Proc. ACM Int. Conf. Multimedia, Berkeley, CA, Nov. 2003, pp. 604-611.
-
(2003)
Proc. ACM Int. Conf. Multimedia
, pp. 604-611
-
-
Li, D.1
Dimitrova, N.2
Li, M.3
Sethi, I.K.4