SCOPUS Information Search Platform

ICMI'08: Proceedings of the 10th International Conference on Multimodal Interfaces

2008, Pages 217-224

Detection and localization of 3D Audio-Visual objects using unsupervised clustering

Authors (5): Khalidov, Vasil (a); Forbes, Florence (a); Hansard, Miles (a); Arnaud, Elise (a, b); Horaud, Radu (a)

a INRIA RHÔNE ALPES (France)

b UNIV GRENOBLE ALPES (France)

Author keywords

Audio visual clustering; Binaural hearing; Mixture models; Stereo vision

Indexed keywords

AUDITION; INFERENCE ENGINES; INTERACTIVE COMPUTER SYSTEMS; MIXTURES; STEREO IMAGE PROCESSING; STEREO VISION;

3D REPRESENTATIONS; AUDIO-VISUAL; BINAURAL HEARING; DETECTION AND LOCALIZATION; EXPECTATION-MAXIMIZATION ALGORITHMS; MIXTURE MODEL; UNSUPERVISED CLUSTERING; VISUAL OBSERVATIONS;

OBJECT DETECTION;

EID: 63449109271; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1145/1452392.1452438; Document Type: Conference Paper

Times cited (6)

References (26)

1
- 0036874527
- M. Heckmann, F. Berthommier, and K. Kroschel. Noise adaptive stream weighting in audio-visual speech recognition. EURASIP J. Applied Signal Proc., 11:1260-1273, 2002.

2
- 0042349407
- A graphical model for audiovisual object tracking
- M. Beal, N. Jojic, and H. Attias. A graphical model for audiovisual object tracking. IEEE Trans. PAMI, 25(7):828-836, 2003.
- (2003) IEEE Trans. PAMI , vol.25 , Issue.7 , pp. 828-836
- Beal, M.¹ Jojic, N.² Attias, H.³

3
- 34047229439
- Audio-visual speaker localization using graphical models
- A. Kushal, M. Rahurkar, L. Fei-Fei, J. Ponce, and T. Huang. Audio-visual speaker localization using graphical models. In Proc. 18th ICPR., pages 291-294, 2006.
- (2006) Proc. 18th ICPR , pp. 291-294
- Kushal, A.¹ Rahurkar, M.² Fei-Fei, L.³ Ponce, J.⁴ Huang, T.⁵

4
- 0036874485
- Joint audio-visual tracking using particle filters
- D. N. Zotkin, R. Duraiswami, and L. S. Davis. Joint audio-visual tracking using particle filters. EURASIP Journal on Applied Signal Processing, 11:1154-1164, 2002.
- (2002) EURASIP Journal on Applied Signal Processing , vol.11 , pp. 1154-1164
- Zotkin, D.N.¹ Duraiswami, R.² Davis, L.S.³

5
- 0034844366
- Sequential Monte Carlo fusion of sound and vision for speaker tracking
- J. Vermaak, M. Gangnet, A. Blake, and P. Pérez. Sequential Monte Carlo fusion of sound and vision for speaker tracking. In Proc. IEEE ICCV, pages 741-746, 2001.
- (2001) Proc. IEEE ICCV , pp. 741-746
- Vermaak, J.¹ Gangnet, M.² Blake, A.³ Pérez, P.⁴

6
- 13344250690
- Data fusion for visual tracking with particles
- P. Perez, J. Vermaak, and A. Blake. Data fusion for visual tracking with particles. Proc. of IEEE, 92(3):495-513, 2004.
- (2004) Proc. of IEEE , vol.92 , Issue.3 , pp. 495-513
- Perez, P.¹ Vermaak, J.² Blake, A.³

7
- 21244492850
- Real-time speaker tracking using particle filter sensor fusion
- Y. Chen and Y. Rui. Real-time speaker tracking using particle filter sensor fusion. Proc. of IEEE, 92(3):485-494, 2004.
- (2004) Proc. of IEEE , vol.92 , Issue.3 , pp. 485-494
- Chen, Y.¹ Rui, Y.²

8
- 32344434992
- A joint particle filter for audio-visual speaker tracking
- K. Nickel, T. Gehrig, R. Stiefelhagen, and J. McDonough. A joint particle filter for audio-visual speaker tracking. In Proc. 7th International Conference on Multimodal Interfaces, pages 61-68, 2005.
- (2005) Proc. 7th International Conference on Multimodal Interfaces , pp. 61-68
- Nickel, K.¹ Gehrig, T.² Stiefelhagen, R.³ McDonough, J.⁴

9
- 63449118893
- Structure inference for Bayesian multisensory perception and tracking
- T. Hospedales, J. Cartwright, and S. Vijayakumar. Structure inference for Bayesian multisensory perception and tracking. In Proc. International Joint Conference on Artificial Intelligence, pages 2122-2128, 2007.
- (2007) Proc. International Joint Conference on Artificial Intelligence , pp. 2122-2128
- Hospedales, T.¹ Cartwright, J.² Vijayakumar, S.³

10
- 4544347587
- Multiple person and speaker activity tracking with a particle filter
- N. Checka, K. Wilson, M. Siracusa, and T. Darrell. Multiple person and speaker activity tracking with a particle filter. In IEEE Conf. Acoust. Sp. Sign. Proc., pages 881-884, 2004.
- (2004) IEEE Conf. Acoust. Sp. Sign. Proc , pp. 881-884
- Checka, N.¹ Wilson, K.² Siracusa, M.³ Darrell, T.⁴

11
- 64149093817
- Audiovisual probabilistic tracking of multiple speakers in meetings
- D. Gatica-Perez, G. Lathoud, J.-M. Odobez, and I. McCowan. Audiovisual probabilistic tracking of multiple speakers in meetings. IEEE Trans. on ASLP, 15(2):601-616, 2007.
- (2007) IEEE Trans. on ASLP , vol.15 , Issue.2 , pp. 601-616
- Gatica-Perez, D.¹ Lathoud, G.² Odobez, J.-M.³ McCowan, I.⁴

12
- 37849022114
- Audio-visual multi-person tracking and identification for smart environments
- K. Bernardin and R. Stiefelhagen. Audio-visual multi-person tracking and identification for smart environments. In Proc. 15th International ACM Conference on Multimedia, pages 661-670, 2007.
- (2007) Proc. 15th International ACM Conference on Multimedia , pp. 661-670
- Bernardin, K.¹ Stiefelhagen, R.²

13
- 38049107298
- A generative approach to audio-visual person tracking
- R. Brunelli, A. Brutti, P. Chippendale, O. Lanz, M. Omologo, P. Svaizer, and F. Tobia. A generative approach to audio-visual person tracking. In Multimodal Technologies for Perception of Humans: Proc. 1st International CLEAR Evaluation Workshop, pages 55-68, 2007.
- (2007) Multimodal Technologies for Perception of Humans: Proc. 1st International CLEAR Evaluation Workshop , pp. 55-68
- Brunelli, R.¹ Brutti, A.² Chippendale, P.³ Lanz, O.⁴ Omologo, M.⁵ Svaizer, P.⁶ Tobia, F.⁷

14
- 2642562769
- Speaker association with signal-level audiovisual fusion
- J. Fisher and T. Darrell. Speaker association with signal-level audiovisual fusion. IEEE Trans. on Multimedia, 6(3):406-413, 2004.
- (2004) IEEE Trans. on Multimedia , vol.6 , Issue.3 , pp. 406-413
- Fisher, J.¹ Darrell, T.²

15
- 34948829598
- Harmony in motion
- Z. Barzelay and Y.Y. Schechner. Harmony in motion. In Proc. of IEEE CVPR, pages 1-8, 2007.
- (2007) Proc. of IEEE CVPR , pp. 1-8
- Barzelay, Z.¹ Schechner, Y.Y.²

16
- 49949092708
- Patterns of binocular disparity for a fixating observer
- M. Hansard and R.P. Horaud. Patterns of binocular disparity for a fixating observer. In Advances in Brain, Vision, & AI, 2nd Int. Symp., pages 308-317. Springer, 2007.
- (2007) Advances in Brain, Vision, & AI, 2nd Int. Symp , pp. 308-317
- Hansard, M.¹ Horaud, R.P.²

17
- 0000789852
- Channel separability in the audio-visual integration of speech: A Bayesian approach
- D.G. Stork and M.E. Hennecke, editors, Speech Reading by Humans and Machines: Models, Systems and Applications, Springer, Berlin
- J.R. Movellan and G. Chadderdon. Channel separability in the audio-visual integration of speech: A Bayesian approach. In D.G. Stork and M.E. Hennecke, editors, Speech Reading by Humans and Machines: Models, Systems and Applications, NATO ASI Series, pages 473-487. Springer, Berlin, 1996.
- (1996) NATO ASI Series , pp. 473-487
- Movellan, J.R.¹ Chadderdon, G.²

18
- 0032072433
- Speech recognition and sensory integration
- D.W. Massaro and D.G. Stork. Speech recognition and sensory integration. American Scientist, 86(3):236-244, 1998.
- (1998) American Scientist , vol.86 , Issue.3 , pp. 236-244
- Massaro, D.W.¹ Stork, D.G.²

19
- 0037209490
- EM procedures using mean-field approximations for Markov model-based image segmentation
- G. Celeux, F. Forbes, and N. Peyrard. EM procedures using mean-field approximations for Markov model-based image segmentation. Pattern Recognition, 36:131-144, 2003.
- (2003) Pattern Recognition , vol.36 , pp. 131-144
- Celeux, G.¹ Forbes, F.² Peyrard, N.³

20
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm (with discussion)
- A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. Roy. Statist. Soc. Ser. B, 39(1):1-38, 1977.
- (1977) J. Roy. Statist. Soc. Ser. B , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

21
- 33846516584
- C.M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
- (2006) Pattern Recognition and Machine Learning
- Bishop, C.M.¹

22
- 0000120766
- Estimating the dimension of a model
- G. Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461-464, March 1978.
- (1978) The Annals of Statistics , vol.6 , Issue.2 , pp. 461-464
- Schwarz, G.¹

23
- 63449110580
- The CAVA corpus: Synchronized stereoscopic and binaural datasets with head movements
- E. Arnaud, H. Christensen, Y.C. Lu, J. Barker, V. Khalidov, M. Hansard, B. Holveck, H. Mathieu, R. Narasimha, F. Forbes, and R. Horaud. The CAVA corpus: Synchronized stereoscopic and binaural datasets with head movements. In Proc. of ICMI 2008, 2008.
- (2008) Proc. of ICMI 2008
- Arnaud, E.¹ Christensen, H.² Lu, Y.C.³ Barker, J.⁴ Khalidov, V.⁵ Hansard, M.⁶ Holveck, B.⁷ Mathieu, H.⁸ Narasimha, R.⁹ Forbes, F.¹⁰ Horaud, R.¹¹

24
- 0001466773
- A combined corner and edge detector
- C. Harris and M. Stephens. A combined corner and edge detector. In Proc. 4th Alvey Vision Conference, pages 147-151, 1988.
- (1988) Proc. 4th Alvey Vision Conference , pp. 147-151
- Harris, C.¹ Stephens, M.²

25
- 84872705453
- Intel OpenCV Computer Vision library. http://www.intel.com/technology/computing/opencv.
- Intel OpenCV Computer Vision library

26
- 57849093600
- Integrating pitch and localisation cues at a speech fragment level
- H. Christensen, N. Ma, S.N. Wrigley, and J. Barker. Integrating pitch and localisation cues at a speech fragment level. In Proc. of Interspeech 2007, pages 2769-2772, 2007.
- (2007) Proc. of Interspeech 2007 , pp. 2769-2772
- Christensen, H.¹ Ma, N.² Wrigley, S.N.³ Barker, J.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.