SCOPUS Information Retrieval Platform

IEEE Transactions on Pattern Analysis and Machine Intelligence

Volume 25, Issue 7, 2003, Pages 828-836

A graphical model for audiovisual object tracking

(3) Beal, Matthew J a Jojic, Nebojsa b Attias, Hagai b

Author keywords

Audio; Audiovisual; Automatic calibrations; Bayesian inference; Cameras; Expectation maximization (EM) algorithm; Generative models; Graphical models; Microphone arrays; Multimedia; Multimodal; Probabilistic inference; Speaker modeling; Speech; Tracking; Variational methods; Video; Vision

Indexed keywords

ALGORITHMS; CALIBRATION; CAMERAS; COMPUTER GRAPHICS; COMPUTER SIMULATION; MICROPHONES; MULTIMEDIA SYSTEMS; STATISTICAL METHODS; VARIATIONAL TECHNIQUES;

AUDIOVISUAL OBJECT TRACKING; BAYESIAN INFERENCE; GRAPHICAL MODEL; PROBABILISTIC INFERENCE;

OBJECT RECOGNITION;

EID: 0042349407 PISSN: 01628828 EISSN: None Source Type: Journal
DOI: 10.1109/TPAMI.2003.1206512 Document Type: Article

Times cited: 88

References (34)

1
- 84977901887
- A new method for speech denoising and robust speech recognition using probabilistic models for clean speech and for noise
- H. Attias, L. Deng, A. Acero, and J.C. Platt, "A New Method for Speech Denoising and Robust Speech Recognition Using Probabilistic Models for Clean Speech and for Noise," Proc. Eurospeech, 2001.
- Proc. Eurospeech, 2001
- Attias, H.¹ Deng, L.² Acero, A.³ Platt, J.C.⁴

2
- 0032528695
- Blind source separation and deconvolution: The dynamic component analysis algorithm
- H. Attias and C.E. Schreiner, "Blind Source Separation and Deconvolution: The Dynamic Component Analysis Algorithm," Neural Computation, vol. 10, 1998.
- (1998) Neural Computation , vol.10
- Attias, H.¹ Schreiner, C.E.²

3
- 0032675797
- Audio-visual person verification
- S. Ben-Yacoub, J. Luttin, K. Jonsson, J. Matas, and J. Kittler, "Audio-Visual Person Verification," Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2000.
- Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2000
- Ben-Yacoub, S.¹ Luttin, J.² Jonsson, K.³ Matas, J.⁴ Kittler, J.⁵

4
- 0004269916
- Springer
- A. Blake and M. Isard, Active Contours. Springer, 1998.
- (1998) Active Contours
- Blake, A.¹ Isard, M.²

5
- 0009590598
- M. Brandstein and D. Ward, eds. Springer
- Microphone Arrays, M. Brandstein and D. Ward, eds. Springer, 2001.
- (2001) Microphone Arrays

6
- 0032918933
- Time-delay estimation of reverberant speech exploiting harmonic structure
- M.S. Brandstein, "Time-Delay Estimation of Reverberant Speech Exploiting Harmonic Structure," J. Acoustical Soc. Am., vol. 105, no. 5, pp. 2914-2919, 1999.
- (1999) J. Acoustical Soc. Am. , vol.105 , Issue.5 , pp. 2914-2919
- Brandstein, M.S.¹

7
- 85013597845
- Eigenlips for robust speech recognition
- C. Bregler and Y. Konig, "Eigenlips for Robust Speech Recognition," Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 1994.
- Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 1994
- Bregler, C.¹ Konig, Y.²

8
- 0042456403
- A multisensor-based collision avoidance system with application to military HMMWV
- K. Cheok, G. Smid, and D. McCune, "A Multisensor-Based Collision Avoidance System with Application to Military HMMWV," Proc. IEEE Conf. Intelligent Transportation Systems, 2000.
- Proc. IEEE Conf. Intelligent Transportation Systems, 2000
- Cheok, K.¹ Smid, G.² McCune, D.³

9
- 0034507915
- Look who's talking: Speaker detection using video and audio correlation
- R. Cutler and L. Davis, "Look Who's Talking: Speaker Detection Using Video and Audio Correlation," Proc. IEEE Conf. Multimedia and Expo, 2000.
- Proc. IEEE Conf. Multimedia and Expo, 2000
- Cutler, R.¹ Davis, L.²

10
- 0038715064
- Distributed meetings: A meeting capture and broadcasting system
- R. Cutler, Y. Rui, A. Gupta, J.J. Cadiz, I. Tashev, L.-W. He, A. Colburn, Z. Zhang, Z. Liu, and S. Silverberg, "Distributed Meetings: A Meeting Capture and Broadcasting System," Proc. ACM Multimedia, 2002.
- Proc. ACM Multimedia, 2002
- Cutler, R.¹ Rui, Y.² Gupta, A.³ Cadiz, J.J.⁴ Tashev, I.⁵ He, L.-W.⁶ Colburn, A.⁷ Zhang, Z.⁸ Liu, Z.⁹ Silverberg, S.¹⁰

11
- 0034842488
- Active speech source localization by a dual coarse-to-fine search
- R. Duraiswami, D. Zotkin, and L. Davis, "Active Speech Source Localization by a Dual Coarse-to-Fine Search," Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 2001.
- Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 2001
- Duraiswami, R.¹ Zotkin, D.² Davis, L.³

12
- 84898998791
- Fast, large-scale transformation-invariant clustering
- B. Frey and N. Jojic, "Fast, Large-Scale Transformation-Invariant Clustering," Proc. Advances in Neural Information Processing Systems 2001, vol. 14, 2002.
- (2002) Proc. Advances in Neural Information Processing Systems 2001 , vol.14
- Frey, B.¹ Jojic, N.²

13
- 0344077914
- Advances in algorithms for inference and learning in complex probability models
- pending publication
- B.J. Frey and N. Jojic, "Advances in Algorithms for Inference and Learning in Complex Probability Models," IEEE Trans. Pattern Analysis and Machine Intelligence, pending publication.
- IEEE Trans. Pattern Analysis and Machine Intelligence
- Frey, B.J.¹ Jojic, N.²

14
- 0037250978
- Transformation-invariant clustering using the EM algorithm
- Jan.
- B.J. Frey and N. Jojic, "Transformation-Invariant Clustering Using the EM Algorithm," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 1, Jan. 2003.
- (2003) IEEE Trans. Pattern Analysis and Machine Intelligence , vol.25 , Issue.1
- Frey, B.J.¹ Jojic, N.²

15
- 84905395655
- Audio-visual speaker detection using dynamic bayesian networks
- A. Garg, V. Pavlovic, and J.M. Rehg, "Audio-Visual Speaker Detection Using Dynamic Bayesian Networks," Proc. IEEE Conf. Automatic Face and Gesture Recognition, 2000.
- Proc. IEEE Conf. Automatic Face and Gesture Recognition, 2000
- Garg, A.¹ Pavlovic, V.² Rehg, J.M.³

16
- 84910034222
- Stereo vision lip-tracking for audio-video speech processing
- R. Goecke, J.B. Millar, A. Zelinsky, and J. Robert-Ribes, "Stereo Vision Lip-Tracking for Audio-Video Speech Processing," Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 2001.
- Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 2001
- Goecke, R.¹ Millar, J.B.² Zelinsky, A.³ Robert-Ribes, J.⁴

17
- 0042456392
- Audio-visual speech separation using hidden Markov models
- J. Hershey and M. Case, "Audio-Visual Speech Separation Using Hidden Markov Models," Proc. Advances in Neural Information Processing Systems 2001, vol. 14, 2002.
- (2002) Proc. Advances in Neural Information Processing Systems 2001 , vol.14
- Hershey, J.¹ Case, M.²

18
- 84899028297
- Using audio-visual synchrony to locate sounds
- S.A. Solla, T.K. Leen, and K.-R. Muller, eds.
- J. Hershey and J.R. Movellan, "Using Audio-Visual Synchrony to Locate Sounds," Proc. Advances in Neural Information Processing Systems 1999, S.A. Solla, T.K. Leen, and K.-R. Muller, eds., vol. 12, 2000.
- (2000) Proc. Advances in Neural Information Processing Systems 1999 , vol.12
- Hershey, J.¹ Movellan, J.R.²

19
- 84898954418
- Learning joint statistical models for audio-visual fusion and segregation
- J.W. Fisher III, T. Darrell, W.T. Freeman, and P.A. Viola, "Learning Joint Statistical Models for Audio-Visual Fusion and Segregation," Proc. Advances in Neural Information Processing Systems 2000, vol. 14, 2001.
- (2001) Proc. Advances in Neural Information Processing Systems 2000 , vol.14
- Fisher J.W. III¹ Darrell, T.² Freeman, W.T.³ Viola, P.A.⁴

20
- 0035680076
- Robust, on-line appearance models for vision tracking
- A.D. Jepson, D.J. Fleet, and T. El-Maraghi, "Robust, On-Line Appearance Models for Vision Tracking," Proc. IEEE Conf. Computer Vision and Pattern Recognition, Dec. 2001.
- Proc. IEEE Conf. Computer Vision and Pattern Recognition, Dec. 2001
- Jepson, A.D.¹ Fleet, D.J.² El-Maraghi, T.³

21
- 0035686705
- Learning flexible sprites in video layers
- N. Jojic and B.J. Frey, "Learning Flexible Sprites in Video Layers," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2001.
- Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2001
- Jojic, N.¹ Frey, B.J.²

22
- 0033698724
- Transformed hidden Markov models: Estimating mixture models of images and inferring spatial transformations in video sequences
- N. Jojic, N. Petrovic, B.J. Frey, and T.S. Huang, "Transformed Hidden Markov Models: Estimating Mixture Models of Images and Inferring Spatial Transformations in Video Sequences," Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2000.
- Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2000
- Jojic, N.¹ Petrovic, N.² Frey, B.J.³ Huang, T.S.⁴

23
- 0000935895
- An introduction to variational methods for graphical models
- M.I. Jordan, ed. Norwell Mass.: Kluwer Academic Publishers
- M.I. Jordan, Z. Ghahramani, T.S. Jaakkola, and L.K. Saul, "An Introduction to Variational Methods for Graphical Models," Learning in Graphical Models, M.I. Jordan, ed. Norwell Mass.: Kluwer Academic Publishers, 1998.
- (1998) Learning in Graphical Models
- Jordan, M.I.¹ Ghahramani, Z.² Jaakkola, T.S.³ Saul, L.K.⁴

24
- 84880877816
- Real-time auditory and visual multiple-object tracking for robots
- K. Nakadai, K. Hidai, H. Mizoguchi, H.G. Okuno, and H. Kitano, "Real-Time Auditory and Visual Multiple-Object Tracking for Robots," Proc. Int'l Joint Conf. Artificial Intelligence, 2001.
- Proc. Int'l Joint Conf. Artificial Intelligence, 2001
- Nakadai, K.¹ Hidai, K.² Mizoguchi, H.³ Okuno, H.G.⁴ Kitano, H.⁵

25
- 0002788893
- A view of the EM algorithm that justifies incremental, sparse, and other variants
- M.I. Jordan, ed.; Norwell Mass.: Kluwer Academic Publishers
- R.M. Neal and G.E. Hinton, "A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants," Learning in Graphical Models, M.I. Jordan, ed. pp. 355-368, Norwell Mass.: Kluwer Academic Publishers, 1998.
- (1998) Learning in Graphical Models , pp. 355-368
- Neal, R.M.¹ Hinton, G.E.²

26
- 0347502024
- Social interaction of humanoid robot based on audio-visual tracking
- H.G. Okuno, K. Nakadai, and H. Kitano, "Social Interaction of Humanoid Robot Based on Audio-Visual Tracking," Proc. Int'l Conf. Industrial and Eng. Applications of Artificial Intelligence and Expert Systems, 2002.
- Proc. Int'l Conf. Industrial and Eng. Applications of Artificial Intelligence and Expert Systems, 2002
- Okuno, H.G.¹ Nakadai, K.² Kitano, H.³

27
- 0033279153
- Audio-visual tracking for natural interfaces
- G. Pingali, G. Tunali, and I. Carlborn, "Audio-Visual Tracking for Natural Interfaces," Proc. ACM Multimedia, 1999.
- Proc. ACM Multimedia, 1999
- Pingali, G.¹ Tunali, G.² Carlborn, I.³

28
- 0035691549
- Better proposal distributions: Object tracking using unscented particle filter
- Y. Rui and Y. Chen, "Better Proposal Distributions: Object Tracking Using Unscented Particle Filter," Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2000.
- Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2000
- Rui, Y.¹ Chen, Y.²

29
- 84898931254
- Facesync: A linear operator for measuring synchronization of video facial images and audio tracks
- M. Slaney and M. Covell, "Facesync: A Linear Operator for Measuring Synchronization of Video Facial Images and Audio Tracks," Proc. Advances in Neural Information Processing Systems 2000, vol. 14, 2001.
- (2001) Proc. Advances in Neural Information Processing Systems 2000 , vol.14
- Slaney, M.¹ Covell, M.²

30
- 0030681710
- Tracking multiple talkers using microphone-array measurements
- D.E. Sturim, M.S. Brandstein, and H.F. Silverman, "Tracking Multiple Talkers Using Microphone-Array Measurements," Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 1997.
- Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 1997
- Sturim, D.E.¹ Brandstein, M.S.² Silverman, H.F.³

31
- 0034844366
- Sequential Monte Carlo fusion of sound and vision for speaker tracking
- J. Vermaak, M. Gangnet, A. Blake, and P. Perez, "Sequential Monte Carlo Fusion of Sound and Vision for Speaker Tracking," Proc. IEEE Int'l Conf. Computer Vision, 2001.
- Proc. IEEE Int'l Conf. Computer Vision, 2001
- Vermaak, J.¹ Gangnet, M.² Blake, A.³ Perez, P.⁴

32
- 0031385284
- Voice source localization for automatic camera pointing system in videoconferencing
- H. Wang and P. Chu, "Voice Source Localization for Automatic Camera Pointing System in Videoconferencing," Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 1997.
- Proc. IEEE Conf. Acoustics, Speech, and Signal Processing, 1997
- Wang, H.¹ Chu, P.²

33
- 33846628402
- Audio-video array source localization for perceptual user interfaces
- K. Wilson, N. Checka, D. Demirdjian, and T. Darrell, "Audio-Video Array Source Localization for Perceptual User Interfaces," Proc. Workshop Perceptive User Interfaces, 2001.
- Proc. Workshop Perceptive User Interfaces, 2001
- Wilson, K.¹ Checka, N.² Demirdjian, D.³ Darrell, T.⁴

34
- 0036874485
- Joint audio-visual tracking using particle filters
- D.N. Zotkin, R. Duraiswami, and L.S. Davis, "Joint Audio-Visual Tracking Using Particle Filters," EURASIP J. Applied Signal Processing, vol. 11, pp. 1154-1164, 2002.
- (2002) EURASIP J. Applied Signal Processing , vol.11 , pp. 1154-1164
- Zotkin, D.N.¹ Duraiswami, R.² Davis, L.S.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.