-
[1] W.H. Sumby and I. Pollack (1954), "Visual contribution to speech intelligibility in noise," in J. Acoust. Soc. Am., 26: 212-215.
-
[2] T. Chen and R.R. Rao (1998), "Audio-visual integration in multi-modal communication," in Proc. IEEE, 86(5): 837-852.
-
[4] S. Dupont and J. Luettin (2000), "Audio-visual speech modeling for continuous speech recognition," in IEEE Trans. Multimedia, 2(3): 141-151.
-
[5] C.C. Chibelushi, F. Deravi, and J.S.D. Mason (2002), "A review of speech-based bimodal recognition," in IEEE Trans. Multimedia, 4(1): 23-37.
-
[6] M. Heckmann, F. Berthommier, and K. Kroschel (2002), "Noise adaptive stream weighting in audio-visual speech recognition," in EURASIP J. Appl. Signal Process., 2002(11): 1260-1273.
-
[7] G. Potamianos, C. Neti, G. Gravier, A. Garg, and A.W. Senior (2003), "Recent advances in the automatic recognition of audio-visual speech," in Proc. IEEE, 91(9): 1306-1326.
-
[8] G. Potamianos (2006), "Audio-Visual Speech Recognition," in Encyclopedia of Language and Linguistics, Second Edition (Speech Technology Section: Computer Understanding of Speech), K. Brown (Ed. in Chief), Elsevier, Oxford, United Kingdom, ISBN: 0-08-044299-4.
-
[9] J. Huang, G. Potamianos, J. Connell, and C. Neti (2004), "Audio-visual speech recognition using an infrared headset," in Speech Communication, 44(4): 83-96.
-
[10] E. Marcheret, S. Chu, V. Goel, and G. Potamianos (2004), "Efficient Likelihood Computation in Multi-Stream HMM Based Audio-Visual Speech Recognition," in Int. Conf. Spoken Language Processing.
-
[11] D. Povey and P. C. Woodland (2002), "Minimum Phone Error and I-smoothing for Improved Discriminative Training," in Proceedings of ICASSP.
-
[12] D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau, and G. Zweig (2005), "fMPE: Discriminatively trained features for speech recognition," in Proceedings of ICASSP.
-
[13] J. Huang and D. Povey (2005), "Discriminatively Trained Features Using fMPE for Multi-Stream Audio-Visual Speech Recognition," in Proceedings of Interspeech.
-
[14] J. Huang and K. Visweswariah (2009), "Combined Discriminative Training for Multi-Stream HMM-based Audio-Visual Speech Recognition," in Proceedings of Interspeech.
-
[15] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury (2012), "Deep neural networks for acoustic modeling in speech recognition," in IEEE Signal Processing Magazine, 29(6): 82-97.
-
[16] A. Mohamed, G. Hinton, and G. Penn (2012), "Understanding how Deep Belief Networks perform acoustic modelling," in Proceedings of ICASSP.
-
[17] P. Duchnowski, U. Meier, and A. Waibel (1994), "See me, hear me: Integrating automatic speech recognition and lipreading," in Proceedings of ICSLP.
-
[20] M. Kim, J. Ryu, and E. Kim (2005), "Speech Recognition by Integrating Audio, Visual and Contextual Features Based on Neural Networks," in Advances in Natural Computation, Lecture Notes in Computer Science.
-
[21] J. Ngiam, A. Khosla, J. Nam, H. Lee, and A. Ng (2011), "Multimodal Deep Learning," in International Conference on Machine Learning.
-
[22] A. Garg, G. Potamianos, C. Neti, and T. Huang (2003), "Frame-Dependent Multi-Stream Reliability Indicators for Audio-Visual Speech Recognition," in Int. Conf. Acoustics, Speech and Signal Processing.
-
[24] G. E. Hinton, S. Osindero, and Y. Teh (2006), "A Fast Learning Algorithm for Deep Belief Nets," in Neural Computation, 18: 1527-1554.