-
1
-
-
33646779170
-
Smart room: Participant and speaker localization and identification
-
C. Busso, S. Hernanz, C. Chu, S. Kwon, S. Lee, P. Georgiou, I. Cohen, and S. Narayanan, "Smart room: Participant and speaker localization and identification", in Proc. IEEE ICASSP, 2005.
-
(2005)
Proc. IEEE ICASSP
-
-
Busso, C.1
Hernanz, S.2
Chu, C.3
Kwon, S.4
Lee, S.5
Georgiou, P.6
Cohen, I.7
Narayanan, S.8
-
2
-
-
0038715064
-
Distributed meetings: A meeting capture and broadcasting system
-
R. Cutler, Y. Rui, A. Gupta, J. Cadiz, I. Tashev, L. He, A. Colburn, Z. Zhang, Z. Liu, and S. Silverberg, "Distributed meetings: A meeting capture and broadcasting system", in Proc. ACM Conf. Multimedia, 2002.
-
(2002)
Proc. ACM Conf. Multimedia
-
-
Cutler, R.1
Rui, Y.2
Gupta, A.3
Cadiz, J.4
Tashev, I.5
He, L.6
Colburn, A.7
Zhang, Z.8
Liu, Z.9
Silverberg, S.10
-
3
-
-
34250699813
-
Audio-Visual Localization of Multiple Speakers in a Video Teleconferencing Setting
-
Canada
-
B. Kapralos, M. Jenkin, and E. Milios, "Audio-Visual Localization of Multiple Speakers in a Video Teleconferencing Setting", Tech. Rep. York University, Canada, 2002.
-
(2002)
Tech. Rep. York University
-
-
Kapralos, B.1
Jenkin, M.2
Milios, E.3
-
4
-
-
0038038521
-
A multimodal speaker detection and tracking system for teleconferencing
-
B. Yoshimi and G. Pingali, "A multimodal speaker detection and tracking system for teleconferencing", in Proc. ACM Conf. Multimedia, 2002.
-
(2002)
Proc. ACM Conf. Multimedia
-
-
Yoshimi, B.1
Pingali, G.2
-
5
-
-
2142812371
-
Robust real-time face detection
-
P. Viola and M. Jones, "Robust real-time face detection", Int. J. Comput. Vis., vol. 57, no. 2, pp. 137-154, 2004.
-
(2004)
Int. J. Comput. Vis
, vol.57
, Issue.2
, pp. 137-154
-
-
Viola, P.1
Jones, M.2
-
6
-
-
0031385284
-
Voice source localization for automatic camera pointing system in videoconferencing
-
H. Wang and P. Chu, "Voice source localization for automatic camera pointing system in videoconferencing", in Proc. IEEE ICASSP, 1997.
-
(1997)
Proc. IEEE ICASSP
-
-
Wang, H.1
Chu, P.2
-
7
-
-
33646794986
-
Sound source localization for circular arrays of directional microphones
-
Y. Rui, D. Florencio, W. Lam, and J. Su, "Sound source localization for circular arrays of directional microphones", in Proc. IEEE ICASSP, 2005.
-
(2005)
Proc. IEEE ICASSP
-
-
Rui, Y.1
Florencio, D.2
Lam, W.3
Su, J.4
-
8
-
-
0001432664
-
On the integration of auditory and visual parameters in an HMM-based ASR
-
Berlin, Germany: Springer
-
A. Adjoudani and C. Benoit, "On the integration of auditory and visual parameters in an HMM-based ASR", in Speechreading by Humans and Machines. Berlin, Germany: Springer, 1996, pp. 461-471.
-
(1996)
Speechreading by Humans and Machines
, pp. 461-471
-
-
Adjoudani, A.1
Benoit, C.2
-
9
-
-
0034270644
-
Audio-visual speech modeling for continuous speech recognition
-
Sep
-
S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition", IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000.
-
(2000)
IEEE Trans. Multimedia
, vol.2
, Issue.3
, pp. 141-151
-
-
Dupont, S.1
Luettin, J.2
-
10
-
-
8844259704
-
Discovery and fusion of salient multi-modal features towards news story segmentation
-
W. Hsu, S.-F. Chang, C.-W. Huang, L. Kennedy, C.-Y. Lin, and G. Iyengar, "Discovery and fusion of salient multi-modal features towards news story segmentation", in SPIE Electronic Imaging, 2004.
-
(2004)
SPIE Electronic Imaging
-
-
Hsu, W.1
Chang, S.-F.2
Huang, C.-W.3
Kennedy, L.4
Lin, C.-Y.5
Iyengar, G.6
-
11
-
-
0030685285
-
Coupled hidden Markov models for complex action recognition
-
M. Brand, N. Oliver, and A. Pentland, "Coupled hidden Markov models for complex action recognition", in Proc. IEEE CVPR, 1997.
-
(1997)
Proc. IEEE CVPR
-
-
Brand, M.1
Oliver, N.2
Pentland, A.3
-
12
-
-
84908294933
-
Duration dependent input output Markov models for audio-visual event detection
-
M. Naphade, A. Garg, and T. Huang, "Duration dependent input output Markov models for audio-visual event detection", in Proc. IEEE ICME, 2001.
-
(2001)
Proc. IEEE ICME
-
-
Naphade, M.1
Garg, A.2
Huang, T.3
-
13
-
-
57849091620
-
Speaker change detection using joint audiovisual statistics
-
G. Iyengar and C. Neti, "Speaker change detection using joint audiovisual statistics", in Int. RIAO Conf., 2000.
-
(2000)
Int. RIAO Conf
-
-
Iyengar, G.1
Neti, C.2
-
14
-
-
34250714316
-
Information Theoretic Optimization of Audio Features for Multimodal Speaker Detection
-
EPFL, Lausanne, Switzerland
-
P. Besson and M. Kunt, "Information Theoretic Optimization of Audio Features for Multimodal Speaker Detection", Tech. Rep. Signal Processing Institute, EPFL, Lausanne, Switzerland, 2005.
-
(2005)
Tech. Rep. Signal Processing Institute
-
-
Besson, P.1
Kunt, M.2
-
15
-
-
0009622481
-
Learning joint statistical models for audio-visual fusion and segregation
-
J. Fisher, III, T. Darrell, W. Freeman, and P. Viola, "Learning joint statistical models for audio-visual fusion and segregation", in NIPS, 2000, pp. 772-778.
-
(2000)
NIPS
, pp. 772-778
-
-
Fisher III, J.1
Darrell, T.2
Freeman, W.3
Viola, P.4
-
17
-
-
34250762971
-
Multimodal speaker detection using error feedback dynamic Bayesian networks
-
V. Pavlović, A. Garg, J. Rehg, and T. Huang, "Multimodal speaker detection using error feedback dynamic Bayesian networks", in Proc. IEEE CVPR, 2001.
-
(2001)
Proc. IEEE CVPR
-
-
Pavlović, V.1
Garg, A.2
Rehg, J.3
Huang, T.4
-
18
-
-
0036874485
-
Joint audio-visual tracking using particle filters
-
D. N. Zotkin, R. Duraiswami, and L. S. Davis, "Joint audio-visual tracking using particle filters", EURASIP J. Appl. Signal Process., vol. 2002, no. 11, pp. 1154-1164, 2002.
-
(2002)
EURASIP J. Appl. Signal Process
, vol.2002
, Issue.11
, pp. 1154-1164
-
-
Zotkin, D.N.1
Duraiswami, R.2
Davis, L.S.3
-
19
-
-
0034844366
-
Sequential Monte Carlo fusion of sound and vision for speaker tracking
-
J. Vermaak, M. Gangnet, A. Blake, and P. Pérez, "Sequential Monte Carlo fusion of sound and vision for speaker tracking", in Proc. IEEE ICCV, 2001.
-
(2001)
Proc. IEEE ICCV
-
-
Vermaak, J.1
Gangnet, M.2
Blake, A.3
Pérez, P.4
-
20
-
-
20444478554
-
Speaker localisation using audiovisual synchrony: An empirical study
-
H. Nock, G. Iyengar, and C. Neti, "Speaker localisation using audiovisual synchrony: An empirical study", in Proc. CIVR, 2003.
-
(2003)
Proc. CIVR
-
-
Nock, H.1
Iyengar, G.2
Neti, C.3
-
21
-
-
0034507915
-
Look who's talking: Speaker detection using video and audio correlation
-
R. Cutler and L. Davis, "Look who's talking: Speaker detection using video and audio correlation", in Proc. IEEE ICME, 2000.
-
(2000)
Proc. IEEE ICME
-
-
Cutler, R.1
Davis, L.2
-
22
-
-
0344044776
-
Audio-video sensor fusion with probabilistic graphical models
-
M. Beal, H. Attias, and N. Jojic, "Audio-video sensor fusion with probabilistic graphical models", in Proc. ECCV, 2002.
-
(2002)
Proc. ECCV
-
-
Beal, M.1
Attias, H.2
Jojic, N.3
-
23
-
-
21244492850
-
Real-time speaker tracking using particle filter sensor fusion
-
Y. Chen and Y. Rui, "Real-time speaker tracking using particle filter sensor fusion", Proc. IEEE, vol. 92, pp. 485-494, 2004.
-
(2004)
Proc. IEEE
, vol.92
, pp. 485-494
-
-
Chen, Y.1
Rui, Y.2
-
24
-
-
32344434992
-
A joint particle filter for audio-visual speaker tracking
-
K. Nickel, T. Gehrig, R. Stiefelhagen, and J. McDonough, "A joint particle filter for audio-visual speaker tracking", in ICMI, 2005.
-
(2005)
ICMI
-
-
Nickel, K.1
Gehrig, T.2
Stiefelhagen, R.3
McDonough, J.4
-
25
-
-
34547516010
-
Maximum likelihood sound source localization for multiple directional microphones
-
C. Zhang, Z. Zhang, and D. Florêncio, "Maximum likelihood sound source localization for multiple directional microphones", in ICASSP, 2007.
-
(2007)
ICASSP
-
-
Zhang, C.1
Zhang, Z.2
Florêncio, D.3
-
26
-
-
4644273800
-
Source localization in reverberant environments: Performance bounds and ML estimation
-
T. Gustafsson, B. Rao, and M. Trivedi, "Source localization in reverberant environments: Performance bounds and ML estimation", in Proc. ICASSP, 2001.
-
(2001)
Proc. ICASSP
-
-
Gustafsson, T.1
Rao, B.2
Trivedi, M.3
-
27
-
-
0030701369
-
A robust method for speech signal time-delay estimation in reverberant rooms
-
M. Brandstein and H. Silverman, "A robust method for speech signal time-delay estimation in reverberant rooms", in Proc. ICASSP, 1997.
-
(1997)
Proc. ICASSP
-
-
Brandstein, M.1
Silverman, H.2
-
28
-
-
0003660631
-
-
Tech. Rep. Stanford Univ, Dept. Statistics, Stanford, CA
-
J. Friedman, T. Hastie, and R. Tibshirani, "Additive Logistic Regression: A Statistical View of Boosting", Tech. Rep. Stanford Univ., Dept. Statistics, Stanford, CA, 1998.
-
(1998)
Additive Logistic Regression: A Statistical View of Boosting
-
-
Friedman, J.1
Hastie, T.2
Tibshirani, R.3
-
29
-
-
0033281701
-
Improved boosting algorithms using confidence-rated predictions
-
R. Schapire and Y. Singer, "Improved boosting algorithms using confidence-rated predictions", Mach. Learn., vol. 37, no. 3, pp. 297-336, 1999.
-
(1999)
Mach. Learn
, vol.37
, Issue.3
, pp. 297-336
-
-
Schapire, R.1
Singer, Y.2
-
30
-
-
0344983340
-
Detecting pedestrians using patterns of motion and appearance
-
P. Viola, M. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance", in Proc. IEEE ICCV, 2003.
-
(2003)
Proc. IEEE ICCV
-
-
Viola, P.1
Jones, M.2
Snow, D.3
-
31
-
-
0003396283
-
A System for Video Surveillance and Monitoring
-
Robotics Inst, Pittsburgh, PA
-
R. Collins, A. Lipton, T. Kanade, H. Fujiyoshi, D. Duggins, Y. Tsin, D. Tolliver, N. Enomoto, and O. Hasegawa, "A System for Video Surveillance and Monitoring", Tech. Rep. Carnegie Mellon Univ., Robotics Inst., Pittsburgh, PA, 2000.
-
(2000)
Tech. Rep. Carnegie Mellon Univ
-
-
Collins, R.1
Lipton, A.2
Kanade, T.3
Fujiyoshi, H.4
Duggins, D.5
Tsin, Y.6
Tolliver, D.7
Enomoto, N.8
Hasegawa, O.9
-
32
-
-
0031187308
-
Pfinder: Realtime tracking of the human body
-
Jul
-
C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, "Pfinder: Realtime tracking of the human body", IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 780-785, Jul. 1997.
-
(1997)
IEEE Trans. Pattern Anal. Mach. Intell
, vol.19
, Issue.7
, pp. 780-785
-
-
Wren, C.1
Azarbayejani, A.2
Darrell, T.3
Pentland, A.4
-
33
-
-
0003486467
-
Multimedia sensor fusion for intelligent camera control and human computer interaction
-
Ph.D. dissertation, Dept. Elect. Eng, North Carolina State Univ, Raleigh
-
S. Goodridge, "Multimedia sensor fusion for intelligent camera control and human computer interaction", Ph.D. dissertation, Dept. Elect. Eng., North Carolina State Univ., Raleigh, 1997.
-
(1997)
-
-
Goodridge, S.1
-
34
-
-
0036643072
-
Logistic regression, Adaboost and Bregman distances
-
M. Collins, R. Schapire, and Y. Singer, "Logistic regression, Adaboost and Bregman distances", Mach. Learn., vol. 48, no. 1-3, pp. 253-285, 2002.
-
(2002)
Mach. Learn
, vol.48
, Issue.1-3
, pp. 253-285
-
-
Collins, M.1
Schapire, R.2
Singer, Y.3
-
35
-
-
84898978212
-
Boosting algorithms as gradient descent
-
L. Mason, J. Baxter, P. Bartlett, and M. Frean, "Boosting algorithms as gradient descent", in NIPS, 2000.
-
(2000)
NIPS
-
-
Mason, L.1
Baxter, J.2
Bartlett, P.3
Frean, M.4
-
36
-
-
70450187477
-
Multiple-instance pruning for learning efficient cascade detectors
-
C. Zhang and P. Viola, "Multiple-instance pruning for learning efficient cascade detectors", in NIPS, 2007.
-
(2007)
NIPS
-
-
Zhang, C.1
Viola, P.2
-
37
-
-
33645146449
-
Histograms of oriented gradients for human detection
-
N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection", in Proc. CVPR, 2005.
-
(2005)
Proc. CVPR
-
-
Dalal, N.1
Triggs, B.2