2011, Pages 349-352

Improving acoustic event detection using generalizable visual features and multi-modality modeling

Author keywords

acoustic event detection; coupled hidden Markov models; hidden Markov models; multi stream HMM; optical flow

Indexed keywords

ACOUSTIC EVENT CLASSIFICATION; ACOUSTIC EVENTS; ASYNCHRONY; AUDIO-VISUAL; COUPLED HIDDEN MARKOV MODELS; DATA RESOURCES; DATA SETS; JOINT MODELING; LOCALIZATION INFORMATION; MULTI-MODALITY; MULTI-STREAM; MULTI-STREAM HMM; STATE-SPACE; TIME STAMPS; VIDEO DATA; VIDEO STREAMS; VISUAL FEATURE; VISUAL REPRESENTATIONS

EID: 80051652444     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2011.5946412     Document Type: Conference Paper
Times cited: 7

References (17)
  • 9
    • 4544228318
    • Identity verification using speech and face information
    • C. Sanderson and K. K. Paliwal, "Identity verification using speech and face information," Digital Signal Processing, vol. 14, no. 5, pp. 449-480, 2004.
    • (2004) Digital Signal Processing , vol.14 , Issue.5 , pp. 449-480
    • Sanderson, C.1    Paliwal, K.K.2
  • 12
    • 33845572523
    • Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories
    • S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories," in CVPR, 2006.
    • (2006) CVPR
    • Lazebnik, S.1    Schmid, C.2    Ponce, J.3
  • 13
    • 0001432664
    • On the integration of auditory and visual parameters in an HMM-based ASR
    • D. G. Stork and M. E. Hennecke (Eds.), Berlin: Springer-Verlag
    • A. Adjoudani and C. Benoit, "On the integration of auditory and visual parameters in an HMM-based ASR," In D. G. Stork and M. E. Hennecke (Eds.), Speechreading by Humans and Machines. Berlin: Springer-Verlag, pp. 461-471, 1996.
    • (1996) Speechreading by Humans and Machines , pp. 461-471
    • Adjoudani, A.1    Benoit, C.2
  • 14
  • 16
    • 0036650148
    • Statistical multimodal integration for audio-visual speech processing
    • S. Nakamura, "Statistical multimodal integration for audio-visual speech processing," IEEE Transactions on Neural Networks, vol. 13, no. 4, 2002.
    • (2002) IEEE Transactions on Neural Networks , vol.13 , Issue.4
    • Nakamura, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.