Volume 5, 2003, Pages 772-775

Audio-visual synchrony for detection of monologues in video archives

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; CORRELATION METHODS; NEURAL NETWORKS; SPEECH RECOGNITION

EID: 0141631499     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 16

References (11)
  • 2
    • 0037860595
    • Look Who's Talking: Speaker Detection using Video and Audio Correlation
    • Ross Cutler and Larry Davis, "Look Who's Talking: Speaker Detection using Video and Audio Correlation," in Proc. ICME, 2000.
    • (2000) Proc. ICME
    • Cutler, R.; Davis, L.
  • 3
    • 0037700834
    • Assessing face and speech consistency for monologue detection in video
    • Harriet J. Nock, Giridharan Iyengar, and Chalapathy Neti, "Assessing face and speech consistency for monologue detection in video," in Proc. ACM Multimedia, 2002.
    • (2002) Proc. ACM Multimedia
    • Nock, H.J.; Iyengar, G.; Neti, C.
  • 4
    • 0141826698
    • Audio-visual speaker recognition for video broadcast news: Some fusion techniques
    • Denmark, September
    • Benoit Maison, Chalapathy Neti, and Andrew Senior, "Audio-visual speaker recognition for video broadcast news: some fusion techniques," in IEEE Multimedia Signal Processing (MMSP99), Denmark, September 1999.
    • (1999) IEEE Multimedia Signal Processing (MMSP99)
    • Maison, B.; Neti, C.; Senior, A.
  • 5
    • 0009622482
    • Using audio-visual synchrony to locate sounds
    • John Hershey and Javier Movellan, "Using audio-visual synchrony to locate sounds," in Proc. NIPS, 1999.
    • (1999) Proc. NIPS
    • Hershey, J.; Movellan, J.
  • 6
    • 84898954418
    • Learning Joint Statistical Models for Audio-Visual Fusion and Segregation
    • JW Fisher III, T Darrell, WT Freeman, and P Viola, "Learning Joint Statistical Models for Audio-Visual Fusion and Segregation," in Proc. NIPS, 2001.
    • (2001) Proc. NIPS
    • Fisher, J.W., III; Darrell, T.; Freeman, W.T.; Viola, P.
  • 7
    • 0036293478
    • Informative sub-spaces for audiovisual processing: High-level function from low-level fusion
    • John W Fisher III and Trevor Darrell, "Informative sub-spaces for audiovisual processing: High-level function from low-level fusion," in Proc. ICASSP, 2002.
    • (2002) Proc. ICASSP
    • Fisher, J.W., III; Darrell, T.
  • 8
    • 84898931254
    • Facesync: A linear operator for measuring synchronization of video facial images and audio tracks
    • Malcolm Slaney and Michele Covell, "Facesync: a linear operator for measuring synchronization of video facial images and audio tracks," in Proc. NIPS, 2001.
    • (2001) Proc. NIPS
    • Slaney, M.; Covell, M.
  • 10
    • 17344389852
    • Robust Speech Recognition in Noisy Environments: The IBM Spine-2 Evaluation System
    • Brian Kingsbury, George Saon, Lidia Mangu, Mukund Padmanabhan, and Ruhi Sarikaya, "Robust Speech Recognition in Noisy Environments: The IBM Spine-2 Evaluation System," in Proc. ICASSP, 2002.
    • (2002) Proc. ICASSP
    • Kingsbury, B.; Saon, G.; Mangu, L.; Padmanabhan, M.; Sarikaya, R.
  • 11
    • 0002595416
    • Speaker, environment and channel change detection and clustering via the Bayesian information criterion
    • Scott S. Chen and P. S. Gopalakrishnan, "Speaker, environment and channel change detection and clustering via the Bayesian information criterion," Intl. Conf. On Acoust., Sp., and Sig. Proc., 1998.
    • (1998) Intl. Conf. On Acoust., Sp., and Sig. Proc.
    • Chen, S.S.; Gopalakrishnan, P.S.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.