Volume 10, Issue 8, 2008, Pages 1541-1551

Boosting-based multimodal speaker detection for distributed meeting videos

Author keywords

Audiovisual fusion; Boosting; Speaker detection

Indexed keywords

FEATURE EXTRACTION; MAGNETOPLASMA; SPEECH RECOGNITION; VISUAL COMMUNICATION

EID: 57849134738     PISSN: 15209210     EISSN: None     Source Type: Journal    
DOI: 10.1109/TMM.2008.2007344     Document Type: Article
Times cited: 37

References (37)
  • 3
    • 34250699813
    • Audio-Visual Localization of Multiple Speakers in a Video Teleconferencing Setting
    • Canada
    • B. Kapralos, M. Jenkin, and E. Milios, "Audio-Visual Localization of Multiple Speakers in a Video Teleconferencing Setting", Tech. Rep. York University, Canada, 2002.
    • (2002) Tech. Rep. York University
    • Kapralos, B.1    Jenkin, M.2    Milios, E.3
  • 4
    • 0038038521
    • A multimodal speaker detection and tracking system for teleconferencing
    • B. Yoshimi and G. Pingali, "A multimodal speaker detection and tracking system for teleconferencing", in Proc. ACM Conf. Multimedia, 2002.
    • (2002) Proc. ACM Conf. Multimedia
    • Yoshimi, B.1    Pingali, G.2
  • 5
    • 2142812371
    • Robust real-time face detection
    • P. Viola and M. Jones, "Robust real-time face detection", Int. J. Comput. Vis., vol. 57, no. 2, pp. 137-154, 2004.
    • (2004) Int. J. Comput. Vis , vol.57 , Issue.2 , pp. 137-154
    • Viola, P.1    Jones, M.2
  • 6
    • 0031385284
    • Voice source localization for automatic camera pointing system in videoconferencing
    • H. Wang and P. Chu, "Voice source localization for automatic camera pointing system in videoconferencing", in Proc. IEEE ICASSP, 1997.
    • (1997) Proc. IEEE ICASSP
    • Wang, H.1    Chu, P.2
  • 7
    • 33646794986
    • Sound source localization for circular arrays of directional microphones
    • Y. Rui, D. Florencio, W. Lam, and J. Su, "Sound source localization for circular arrays of directional microphones", in Proc. IEEE ICASSP, 2005.
    • (2005) Proc. IEEE ICASSP
    • Rui, Y.1    Florencio, D.2    Lam, W.3    Su, J.4
  • 8
    • 0001432664
    • On the integration of auditory and visual parameters in an HMM-based ASR
    • Berlin, Germany: Springer
    • A. Adjoudani and C. Benoit, "On the integration of auditory and visual parameters in an HMM-based ASR", in Speechreading by Humans and Machines. Berlin, Germany: Springer, 1996, pp. 461-471.
    • (1996) Speechreading by Humans and Machines , pp. 461-471
    • Adjoudani, A.1    Benoit, C.2
  • 9
    • 0034270644
    • Audio-visual speech modeling for continuous speech recognition
    • Sep
    • S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition", IEEE Trans. Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000.
    • (2000) IEEE Trans. Multimedia , vol.2 , Issue.3 , pp. 141-151
    • Dupont, S.1    Luettin, J.2
  • 11
    • 0030685285
    • Coupled hidden Markov models for complex action recognition
    • M. Brand, N. Oliver, and A. Pentland, "Coupled hidden Markov models for complex action recognition", in Proc. IEEE CVPR, 1997.
    • (1997) Proc. IEEE CVPR
    • Brand, M.1    Oliver, N.2    Pentland, A.3
  • 12
    • 84908294933
    • Duration dependent input output Markov models for audio-visual event detection
    • M. Naphade, A. Garg, and T. Huang, "Duration dependent input output Markov models for audio-visual event detection", in Proc. IEEE ICME, 2001.
    • (2001) Proc. IEEE ICME
    • Naphade, M.1    Garg, A.2    Huang, T.3
  • 13
    • 57849091620
    • Speaker change detection using joint audiovisual statistics
    • G. Iyengar and C. Neti, "Speaker change detection using joint audiovisual statistics", in Int. RIAO Conf., 2000.
    • (2000) Int. RIAO Conf
    • Iyengar, G.1    Neti, C.2
  • 14
    • 34250714316
    • Information Theoretic Optimization of Audio Features for Multimodal Speaker Detection
    • EPFL, Lausanne, Switzerland
    • P. Besson and M. Kunt, "Information Theoretic Optimization of Audio Features for Multimodal Speaker Detection", Tech. Rep. Signal Processing Institute, EPFL, Lausanne, Switzerland, 2005.
    • (2005) Tech. Rep. Signal Processing Institute
    • Besson, P.1    Kunt, M.2
  • 15
    • 0009622481
    • Learning joint statistical models for audio-visual fusion and segregation
    • J. Fisher, III, T. Darrell, W. Freeman, and P. Viola, "Learning joint statistical models for audio-visual fusion and segregation", in NIPS, 2000, pp. 772-778.
    • (2000) NIPS , pp. 772-778
    • Fisher III, J.1    Darrell, T.2    Freeman, W.3    Viola, P.4
  • 17
    • 34250762971
    • Multimodal speaker detection using error feedback dynamic Bayesian networks
    • V. Pavlović, A. Garg, J. Rehg, and T. Huang, "Multimodal speaker detection using error feedback dynamic Bayesian networks", in Proc. IEEE CVPR, 2001.
    • (2001) Proc. IEEE CVPR
    • Pavlović, V.1    Garg, A.2    Rehg, J.3    Huang, T.4
  • 18
    • 0036874485
    • Joint audio-visual tracking using particle filters
    • D. N. Zotkin, R. Duraiswami, and L. S. Davis, "Joint audio-visual tracking using particle filters", EURASIP J. Appl. Signal Process., vol. 2002, no. 11, pp. 1154-1164, 2002.
    • (2002) EURASIP J. Appl. Signal Process , vol.2002 , Issue.11 , pp. 1154-1164
    • Zotkin, D.N.1    Duraiswami, R.2    Davis, L.S.3
  • 19
    • 0034844366
    • Sequential Monte Carlo fusion of sound and vision for speaker tracking
    • J. Vermaak, M. Gangnet, A. Blake, and P. Pérez, "Sequential Monte Carlo fusion of sound and vision for speaker tracking", in Proc. IEEE ICCV, 2001.
    • (2001) Proc. IEEE ICCV
    • Vermaak, J.1    Gangnet, M.2    Blake, A.3    Pérez, P.4
  • 20
    • 20444478554
    • Speaker localisation using audiovisual synchrony: An empirical study
    • H. Nock, G. Iyengar, and C. Neti, "Speaker localisation using audiovisual synchrony: An empirical study", in Proc. CIVR, 2003.
    • (2003) Proc. CIVR
    • Nock, H.1    Iyengar, G.2    Neti, C.3
  • 21
    • 0034507915
    • Look who's talking: Speaker detection using video and audio correlation
    • R. Cutler and L. Davis, "Look who's talking: Speaker detection using video and audio correlation", in Proc. IEEE ICME, 2000.
    • (2000) Proc. IEEE ICME
    • Cutler, R.1    Davis, L.2
  • 22
    • 0344044776
    • Audio-video sensor fusion with probabilistic graphical models
    • M. Beal, H. Attias, and N. Jojic, "Audio-video sensor fusion with probabilistic graphical models", in Proc. ECCV, 2002.
    • (2002) Proc. ECCV
    • Beal, M.1    Attias, H.2    Jojic, N.3
  • 23
    • 21244492850
    • Real-time speaker tracking using particle filter sensor fusion
    • Y. Chen and Y. Rui, "Real-time speaker tracking using particle filter sensor fusion", Proc. IEEE, vol. 92, pp. 485-494, 2004.
    • (2004) Proc. IEEE , vol.92 , pp. 485-494
    • Chen, Y.1    Rui, Y.2
  • 24
  • 25
    • 34547516010
    • Maximum likelihood sound source localization for multiple directional microphones
    • C. Zhang, Z. Zhang, and D. Florêncio, "Maximum likelihood sound source localization for multiple directional microphones", in ICASSP, 2007.
    • (2007) ICASSP
    • Zhang, C.1    Zhang, Z.2    Florêncio, D.3
  • 26
    • 4644273800
    • Source localization in reverberant environments: Performance bounds and ML estimation
    • T. Gustafsson, B. Rao, and M. Trivedi, "Source localization in reverberant environments: Performance bounds and ML estimation", in Proc. ICASSP, 2001.
    • (2001) Proc. ICASSP
    • Gustafsson, T.1    Rao, B.2    Trivedi, M.3
  • 27
    • 0030701369
    • A robust method for speech signal time-delay estimation in reverberant rooms
    • M. Brandstein and H. Silverman, "A robust method for speech signal time-delay estimation in reverberant rooms", in Proc. ICASSP, 1997.
    • (1997) Proc. ICASSP
    • Brandstein, M.1    Silverman, H.2
  • 29
    • 0033281701
    • Improved boosting algorithms using confidence-rated predictions
    • R. Schapire and Y. Singer, "Improved boosting algorithms using confidence-rated predictions", Mach. Learn., vol. 37, no. 3, pp. 297-336, 1999.
    • (1999) Mach. Learn , vol.37 , Issue.3 , pp. 297-336
    • Schapire, R.1    Singer, Y.2
  • 30
    • 0344983340
    • Detecting pedestrians using patterns of motion and appearance
    • P. Viola, M. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance", in Proc. IEEE ICCV, 2003.
    • (2003) Proc. IEEE ICCV
    • Viola, P.1    Jones, M.2    Snow, D.3
  • 33
    • 0003486467
    • Multimedia sensor fusion for intelligent camera control and human computer interaction
    • Ph.D. dissertation, Dept. Elect. Eng., North Carolina State Univ., Raleigh
    • S. Goodridge, "Multimedia sensor fusion for intelligent camera control and human computer interaction", Ph.D. dissertation, Dept. Elect. Eng., North Carolina State Univ., Raleigh, 1997.
    • (1997)
    • Goodridge, S.1
  • 34
    • 0036643072
    • Logistic regression, Adaboost and Bregman distances
    • M. Collins, R. Schapire, and Y. Singer, "Logistic regression, Adaboost and Bregman distances", Mach. Learn., vol. 48, no. 1-3, pp. 253-285, 2002.
    • (2002) Mach. Learn , vol.48 , Issue.1-3 , pp. 253-285
    • Collins, M.1    Schapire, R.2    Singer, Y.3
  • 36
    • 70450187477
    • Multiple-instance pruning for learning efficient cascade detectors
    • C. Zhang and P. Viola, "Multiple-instance pruning for learning efficient cascade detectors", in NIPS, 2007.
    • (2007) NIPS
    • Zhang, C.1    Viola, P.2
  • 37
    • 33645146449
    • Histograms of oriented gradients for human detection
    • N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection", in Proc. CVPR, 2005.
    • (2005) Proc. CVPR
    • Dalal, N.1    Triggs, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.