-
1
-
-
4544290191
-
Recent advances in the automatic recognition of audio-visual speech
-
Sep
-
G. Potamianos, C. Neti, G. Gravier, and A. Garg, "Recent advances in the automatic recognition of audio-visual speech, " Proc. IEEE, vol. 91, no. 9, pp. 1306-1326, Sep. 2003.
-
(2003)
Proc. IEEE
, vol.91
, Issue.9
, pp. 1306-1326
-
-
Potamianos, G.1
Neti, C.2
Gravier, G.3
Garg, A.4
-
2
-
-
0142216141
-
Audiovisual speech synthesis
-
Oct
-
G. Bailly, M. Bérar, F. Elisei, and M. Odisio, "Audiovisual speech synthesis, " Int. J. Speech Technol., vol. 6, no. 4, pp. 331-346, Oct. 2003.
-
(2003)
Int. J. Speech Technol.
, vol.6
, Issue.4
, pp. 331-346
-
-
Bailly, G.1
Bérar, M.2
Elisei, F.3
Odisio, M.4
-
3
-
-
0032178592
-
Quantitative association of vocal-tract and facial behavior
-
H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative association of vocal-tract and facial behavior, " Speech Commun., vol. 26, pp. 23-43, 1998.
-
(1998)
Speech Commun
, vol.26
, pp. 23-43
-
-
Yehia, H.1
Rubin, P.2
Vatikiotis-Bateson, E.3
-
4
-
-
0017199877
-
Hearing lips and seeing voices
-
H. McGurk and J. MacDonald, "Hearing lips and seeing voices, " Nature, vol. 264, pp. 746-748, 1976.
-
(1976)
Nature
, vol.264
, pp. 746-748
-
-
McGurk, H.1
MacDonald, J.2
-
5
-
-
0028259480
-
Techniques for estimating vocal-tract shapes from the speech signal
-
Jan
-
J. Schroeter and M. Sondhi, "Techniques for estimating vocal-tract shapes from the speech signal, " IEEE Trans. Speech Audio Process., vol. 2, no. 1, pp. 133-150, Jan. 1994.
-
(1994)
IEEE Trans. Speech Audio Process
, vol.2
, Issue.1
, pp. 133-150
-
-
Schroeter, J.1
Sondhi, M.2
-
6
-
-
33846680938
-
Speech production knowledge in automatic speech recognition
-
Feb
-
S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond, and M. Wester, "Speech production knowledge in automatic speech recognition, " J. Acoust. Soc. Amer., vol. 121, no. 2, pp. 723-742, Feb. 2007.
-
(2007)
J. Acoust. Soc. Amer.
, vol.121
, Issue.2
, pp. 723-742
-
-
King, S.1
Frankel, J.2
Livescu, K.3
McDermott, E.4
Richmond, K.5
Wester, M.6
-
7
-
-
0001736204
-
Speech coding based on physiological models of speech production
-
S. Furui and M. M. Sondhi, Eds. New York: Marcel Dekker
-
J. Schroeter and M. M. Sondhi, "Speech coding based on physiological models of speech production, " in Advances in Speech Signal Processing, S. Furui and M. M. Sondhi, Eds. New York: Marcel Dekker, 1992.
-
(1992)
Advances in Speech Signal Processing
-
-
Schroeter, J.1
Sondhi, M.M.2
-
8
-
-
84894560828
-
Designing the user interface of the computer-based speech training system ARTUR based on early user tests
-
O. Engwall, O. Bälter, A.-M. Öster, and H. Sidenbladh-Kjellström, "Designing the user interface of the computer-based speech training system ARTUR based on early user tests, " J. Behavior Inf. Technol., vol. 25, no. 4, pp. 353-365, 2006.
-
(2006)
J. Behavior Inf. Technol.
, vol.25
, Issue.4
, pp. 353-365
-
-
Engwall, O.1
Bälter, O.2
Öster, A.-M.3
Sidenbladh-Kjellström, H.4
-
9
-
-
22144465830
-
Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion
-
S. Ouni and Y. Laprie, "Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion, " J. Acoust. Soc. Amer., vol. 118, no. 1, pp. 444-460, 2005.
-
(2005)
J. Acoust. Soc. Amer.
, vol.118
, Issue.1
, pp. 444-460
-
-
Ouni, S.1
Laprie, Y.2
-
10
-
-
0038359547
-
Modelling the uncertainty in recovering articulation from acoustics
-
K. Richmond, S. King, and P. Taylor, "Modelling the uncertainty in recovering articulation from acoustics, " Comput. Speech Lang., vol. 17, pp. 153-172, 2003.
-
(2003)
Comput. Speech Lang
, vol.17
, pp. 153-172
-
-
Richmond, K.1
King, S.2
Taylor, P.3
-
11
-
-
38649140222
-
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
-
T. Toda, A. W. Black, and K. Tokuda, "Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model, " Speech Commun., vol. 50, pp. 215-227, 2008.
-
(2008)
Speech Commun.
, vol.50
, pp. 215-227
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
12
-
-
2142659020
-
Estimation of articulatory movements from speech acoustics using an HMM-based speech production model
-
S. Hiroya and M. Honda, "Estimation of articulatory movements from speech acoustics using an HMM-based speech production model, " IEEE Trans. Speech Audio Process., vol. 12, no. 2, pp. 175-185, Mar. 2004.
-
(2004)
IEEE Trans. Speech Audio Process
, vol.12
, Issue.2
, pp. 175-185
-
-
Hiroya, S.1
Honda, M.2
-
13
-
-
85032752352
-
Audiovisual speech processing
-
Jan
-
T. Chen, "Audiovisual speech processing, " IEEE Signal Process. Mag., vol. 18, no. 1, pp. 9-21, Jan. 2001.
-
(2001)
IEEE Signal Process. Mag.
, vol.18
, Issue.1
, pp. 9-21
-
-
Chen, T.1
-
14
-
-
0032179320
-
Lip movement synthesis from speech based on hidden Markov models
-
E. Yamamoto, S. Nakamura, and K. Shikano, "Lip movement synthesis from speech based on hidden Markov models, " Speech Commun., vol. 26, pp. 105-115, 1998.
-
(1998)
Speech Commun
, vol.26
, pp. 105-115
-
-
Yamamoto, E.1
Nakamura, S.2
Shikano, K.3
-
16
-
-
0035426641
-
Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system
-
K. Choi, Y. Luo, and J.-N. Hwang, "Hidden Markov model inversion for audio-to-visual conversion in an MPEG-4 facial animation system, " J. VLSI Signal Process., vol. 29, pp. 51-61, 2001.
-
(2001)
J. VLSI Signal Process.
, vol.29
, pp. 51-61
-
-
Choi, K.1
Luo, Y.2
Hwang, J.-N.3
-
17
-
-
33947583073
-
Realistic mouth-synching for speech-driven talking face using articulatory modeling
-
Apr
-
L. Xie and Z.-Q. Liu, "Realistic mouth-synching for speech-driven talking face using articulatory modeling, " IEEE Trans. Multimedia, vol. 9, no. 3, pp. 500-510, Apr. 2007.
-
(2007)
IEEE Trans. Multimedia
, vol.9
, Issue.3
, pp. 500-510
-
-
Xie, L.1
Liu, Z.-Q.2
-
18
-
-
0036874551
-
On the relationship between face movements, tongue movements, and speech acoustics
-
J. Jiang, A. Alwan, P. A. Keating, E. T. Auer, and L. E. Bernstein, "On the relationship between face movements, tongue movements, and speech acoustics, " EURASIP J. Appl. Signal Process., vol. 11, pp. 1174-1188, 2002.
-
(2002)
EURASIP J. Appl. Signal Process
, vol.11
, pp. 1174-1188
-
-
Jiang, J.1
Alwan, A.2
Keating, P.A.3
Auer, E.T.4
Bernstein, L.E.5
-
19
-
-
33745183111
-
Introducing visual cues in acoustic-to-articulatory inversion
-
O. Engwall, "Introducing visual cues in acoustic-to-articulatory inversion, " in Proc. Int. Conf. Spoken Lang. Process., 2005, pp. 3205-3208.
-
(2005)
Proc. Int. Conf. Spoken Lang. Process
, pp. 3205-3208
-
-
Engwall, O.1
-
20
-
-
34548378893
-
Reconstructing tongue movements from audio and video
-
H. Kjellström, O. Engwall, and O. Bälter, "Reconstructing tongue movements from audio and video, " in Proc. Int. Conf. Spoken Lang. Process., 2006, pp. 2238-2241.
-
(2006)
Proc. Int. Conf. Spoken Lang. Process
, pp. 2238-2241
-
-
Kjellström, H.1
Engwall, O.2
Bälter, O.3
-
21
-
-
48149084421
-
Audiovisual-to-articulatory speech inversion using HMMs
-
A. Katsamanis, G. Papandreou, and P. Maragos, "Audiovisual-to-articulatory speech inversion using HMMs, " in Proc. Int. Workshop Multimedia Signal Process. (MMSP), 2007, pp. 457-460.
-
(2007)
Proc. Int. Workshop Multimedia Signal Process. (MMSP)
, pp. 457-460
-
-
Katsamanis, A.1
Papandreou, G.2
Maragos, P.3
-
22
-
-
51449089369
-
Audiovisual-to-articulatory speech inversion using active appearance models for the face and hidden Markov models for the dynamics
-
A. Katsamanis, G. Papandreou, and P. Maragos, "Audiovisual-to-articulatory speech inversion using active appearance models for the face and hidden Markov models for the dynamics, " in Proc. Int. Conf. Acoust., Speech, Signal Process., 2008, pp. 2237-2240.
-
(2008)
Proc. Int. Conf. Acoust., Speech, Signal Process
, pp. 2237-2240
-
-
Katsamanis, A.1
Papandreou, G.2
Maragos, P.3
-
23
-
-
0035363218
-
Active appearance models
-
Jun
-
T. F. Cootes, G. J. Edwards, and C. J. Taylor, "Active appearance models, " IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 6, pp. 681-685, Jun. 2001.
-
(2001)
IEEE Trans. Pattern Anal. Mach. Intell
, vol.23
, Issue.6
, pp. 681-685
-
-
Cootes, T.F.1
Edwards, G.J.2
Taylor, C.J.3
-
24
-
-
0035680116
-
Rapid object detection using a boosted cascade of simple features
-
P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features, " in Proc. IEEE Int. Conf. Comp. Vision Pattern Recog., 2001, vol. I, pp. 511-518.
-
(2001)
Proc. IEEE Int. Conf. Comp. Vision Pattern Recog
, vol.I
, pp. 511-518
-
-
Viola, P.1
Jones, M.2
-
25
-
-
0010424152
-
Acoustic-to-articulatory inversion using dynamical and phonological constraints
-
S. Dusan and L. Deng, "Acoustic-to-articulatory inversion using dynamical and phonological constraints, " in Proc. Seminar Speech Production, 2000, pp. 237-240.
-
(2000)
Proc. Seminar Speech Production
, pp. 237-240
-
-
Dusan, S.1
Deng, L.2
-
26
-
-
0034270644
-
Audio-visual speech modeling for continuous speech recognition
-
Sep
-
S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition, " IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000.
-
(2000)
IEEE Trans. Multimedia
, vol.2
, Issue.3
, pp. 141-151
-
-
Dupont, S.1
Luettin, J.2
-
27
-
-
0037503670
-
A multichannel articulatory speech database and its application for automatic speech recognition
-
A. Wrench and W. Hardcastle, "A multichannel articulatory speech database and its application for automatic speech recognition, " in Proc. 5th Seminar Speech Production, Kloster Seeon, Bavaria, 2000, pp. 305-308.
-
(2000)
Proc. 5th Seminar Speech Production
, pp. 305-308
-
-
Wrench, A.1
Hardcastle, W.2
-
30
-
-
0032023788
-
Wiener filters in canonical coordinates for transform coding, filtering, and quantizing
-
May
-
L. L. Scharf and J. K. Thomas, "Wiener filters in canonical coordinates for transform coding, filtering, and quantizing, " IEEE Trans. Signal Process., vol. 46, no. 3, pp. 647-654, May 1998.
-
(1998)
IEEE Trans. Signal Process.
, vol.46
, Issue.3
, pp. 647-654
-
-
Scharf, L.L.1
Thomas, J.K.2
-
31
-
-
0000927638
-
Predicting multivariate responses in multiple linear regression
-
L. Breiman and J. H. Friedman, "Predicting multivariate responses in multiple linear regression, " J. Roy. Statist. Soc. (B), vol. 59, no. 1, pp. 3-54, 1997.
-
(1997)
J. Roy. Statist. Soc. (B)
, vol.59
, Issue.1
, pp. 3-54
-
-
Breiman, L.1
Friedman, J.H.2
-
33
-
-
84863731362
-
Audiovisual speech inversion by switching dynamical modeling governed by a hidden Markov process
-
CD-ROM
-
A. Katsamanis, G. Ananthakrishnan, G. Papandreou, P. Maragos, and O. Engwall, "Audiovisual speech inversion by switching dynamical modeling governed by a hidden Markov process, " in Proc. Eur. Signal Process. Conf. (EUSIPCO), 2008, CD-ROM.
-
(2008)
Proc. Eur. Signal Process. Conf. (EUSIPCO)
-
-
Katsamanis, A.1
Ananthakrishnan, G.2
Papandreou, G.3
Maragos, P.4
Engwall, O.5
-
35
-
-
0001237218
-
A maximum likelihood methodology for clusterwise linear regression
-
W. DeSarbo and W. Cron, "A maximum likelihood methodology for clusterwise linear regression, " J. Classification, vol. 5, pp. 249-282, 1988.
-
(1988)
J. Classification
, vol.5
, pp. 249-282
-
-
DeSarbo, W.1
Cron, W.2
-
36
-
-
0003544881
-
-
D. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer
-
Speechreading by Humans and Machines, D. Stork and M. E. Hennecke, Eds. Berlin, Germany: Springer, 1996.
-
(1996)
Speechreading by Humans and Machines
-
-
-
37
-
-
0034842342
-
Asynchronous stream modeling for large vocabulary audio-visual speech recognition
-
J. Luettin, G. Potamianos, and C. Neti, "Asynchronous stream modeling for large vocabulary audio-visual speech recognition, " in Proc. Int. Conf. Acoust., Speech, Signal Process., 2001, pp. 169-172.
-
(2001)
Proc. Int. Conf. Acoust., Speech, Signal Process
, pp. 169-172
-
-
Luettin, J.1
Potamianos, G.2
Neti, C.3
-
39
-
-
0032660758
-
Direct least square fitting of ellipses
-
May
-
A. Fitzgibbon, M. Pilu, and R. Fisher, "Direct least square fitting of ellipses, " IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 5, pp. 476-480, May 1999.
-
(1999)
IEEE Trans. Pattern Anal. Mach. Intell.
, vol.21
, Issue.5
, pp. 476-480
-
-
Fitzgibbon, A.1
Pilu, M.2
Fisher, R.3
-
40
-
-
57549101447
-
Audiovisual synchronization and fusion using canonical correlation analysis
-
Nov
-
M. E. Sargin, Y. Yemez, E. Erzin, and M. Tekalp, "Audiovisual synchronization and fusion using canonical correlation analysis, " IEEE Trans. Multimedia, vol. 9, no. 7, pp. 1396-1403, Nov. 2007.
-
(2007)
IEEE Trans. Multimedia
, vol.9
, Issue.7
, pp. 1396-1403
-
-
Sargin, M.E.1
Yemez, Y.2
Erzin, E.3
Tekalp, M.4
-
41
-
-
0003822743
-
-
for HTK version 3.2, Cambridge Univ. Eng. Dept. Tech. Rep
-
S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for HTK version 3.2). Cambridge Univ. Eng. Dept., Tech. Rep., 2002.
-
(2002)
The HTK Book
-
-
Young, S.1
Evermann, G.2
Kershaw, D.3
Moore, G.4
Odell, J.5
Ollason, D.6
Povey, D.7
Valtchev, V.8
Woodland, P.9
-
42
-
-
0002557614
-
Line spectrum pair and speech data compression
-
F. K. Soong and B.-H. Juang, "Line spectrum pair and speech data compression, " in Proc. Int. Conf. Acoust., Speech, Signal Process., 1984, vol. 9, pp. 37-40.
-
(1984)
Proc. Int. Conf. Acoust., Speech, Signal Process
, vol.9
, pp. 37-40
-
-
Soong, F.K.1
Juang, B.-H.2
-
44
-
-
34047263009
-
Visual model structures and synchrony constraints for audio-visual speech recognition
-
May
-
T. J. Hazen, "Visual model structures and synchrony constraints for audio-visual speech recognition, " IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 3, pp. 1082-1089, May 2006.
-
(2006)
IEEE Trans. Audio, Speech, Lang. Process
, vol.14
, Issue.3
, pp. 1082-1089
-
-
Hazen, T.J.1
-
45
-
-
0000807171
-
Reduced-rank regression and canonical analysis
-
M.-S. Tso, "Reduced-rank regression and canonical analysis, " J. R. Statist. Soc. (B), vol. 43, pp. 183-189, 1981.
-
(1981)
J. R. Statist. Soc. (B)
, vol.43
, pp. 183-189
-
-
Tso, M.-S.1