2015, Pages 1107-1110

Deep multimodal speaker naming

Author keywords

CNN; Deep Learning; Multimodal; Speaker Naming

Indexed keywords

FEATURE EXTRACTION;

EID: 84962840571     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2733373.2806293     Document Type: Conference Paper
Times cited : (55)

References (21)
  • 1
    • 33947167478
    • Face description with local binary patterns: Application to face recognition
    • T. Ahonen, A. Hadid, and M. Pietikainen. Face description with local binary patterns: Application to face recognition. TPAMI, 2006.
    • (2006) TPAMI
    • Ahonen, T.1    Hadid, A.2    Pietikainen, M.3
  • 2
    • 84887366672
    • Semi-supervised learning with constraints for person identification in multimedia data
    • M. Bauml, M. Tapaswi, and R. Stiefelhagen. Semi-supervised learning with constraints for person identification in multimedia data. In CVPR, 2013.
    • (2013) CVPR
    • Bauml, M.1    Tapaswi, M.2    Stiefelhagen, R.3
  • 3
    • 0031185845
    • Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection
    • P. Belhumeur, J. Hespanha, and D. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. TPAMI, 1997.
    • (1997) TPAMI
    • Belhumeur, P.1    Hespanha, J.2    Kriegman, D.3
  • 5
    • 79955702502
    • LIBSVM: A library for support vector machines
    • C. Chang and C. Lin. LIBSVM: A library for support vector machines. TIST, 2011.
    • (2011) TIST
    • Chang, C.1    Lin, C.2
  • 6
    • 84898027861
    • "Hello! My name is... Buffy" - Automatic naming of characters in TV video
    • M. Everingham, J. Sivic, and A. Zisserman. "Hello! My name is... Buffy" - Automatic naming of characters in TV video. In BMVC, 2006.
    • (2006) BMVC
    • Everingham, M.1    Sivic, J.2    Zisserman, A.3
  • 7
    • 78650993562
    • Dynamic captioning: Video accessibility enhancement for hearing impairment
    • R. Hong, M. Wang, M. Xu, S. Yan, and T. Chua. Dynamic captioning: Video accessibility enhancement for hearing impairment. In MM, 2010.
    • (2010) MM
    • Hong, R.1    Wang, M.2    Xu, M.3    Yan, S.4    Chua, T.5
  • 8
    • 84962814970
    • Speaker-following video subtitles
    • Y. Hu, J. Kautz, Y. Yu, and W. Wang. Speaker-following video subtitles. TOMM, 2014.
    • (2014) TOMM
    • Hu, Y.1    Kautz, J.2    Yu, Y.3    Wang, W.4
  • 10
    • 84876231242
    • Imagenet classification with deep convolutional neural networks
    • A. Krizhevsky et al. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
    • (2012) NIPS
    • Krizhevsky, A.1
  • 11
    • 70350647441
    • Naming faces in broadcast news video by image google
    • C. Liu, S. Jiang, and Q. Huang. Naming faces in broadcast news video by image google. In MM, 2008.
    • (2008) MM
    • Liu, C.1    Jiang, S.2    Huang, Q.3
  • 12
    • 84911449395
    • Learning and transferring mid-level image representations using convolutional neural networks
    • M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Learning and transferring mid-level image representations using convolutional neural networks. In CVPR, 2014.
    • (2014) CVPR
    • Oquab, M.1    Bottou, L.2    Laptev, I.3    Sivic, J.4
  • 13
    • 84962798777
    • On vectorization of deep convolutional neural networks for vision tasks
    • J. Ren and L. Xu. On vectorization of deep convolutional neural networks for vision tasks. In AAAI, 2015.
    • (2015) AAAI
    • Ren, J.1    Xu, L.2
  • 14
    • 84856609002
    • Design, analysis and experimental evaluation of block based transformation in mfcc computation for speaker recognition
    • M. Sahidullah and G. Saha. Design, analysis and experimental evaluation of block based transformation in mfcc computation for speaker recognition. SC, 2012.
    • (2012) SC
    • Sahidullah, M.1    Saha, G.2
  • 15
    • 70450202706
    • "Who are you?" - Learning person specific classifiers from video
    • J. Sivic et al. "Who are you?" - Learning person specific classifiers from video. In CVPR, 2009.
    • (2009) CVPR
    • Sivic, J.1
  • 16
    • 84871360424
    • Drive video summarization based on double articulation structure of driving behavior
    • K. Takenaka, T. Bando, S. Nagasaka, et al. Drive video summarization based on double articulation structure of driving behavior. In MM, 2012.
    • (2012) MM
    • Takenaka, K.1    Bando, T.2    Nagasaka, S.3
  • 17
    • 84866659479
    • "Knock! Knock! Who is it?" Probabilistic person identification in TV-series
    • M. Tapaswi, M. Bauml, and R. Stiefelhagen. "Knock! Knock! Who is it?" Probabilistic person identification in TV-series. In CVPR, 2012.
    • (2012) CVPR
    • Tapaswi, M.1    Bauml, M.2    Stiefelhagen, R.3
  • 18
    • 0026065565
    • Eigenfaces for recognition
    • M. Turk and A. Pentland. Eigenfaces for recognition. JCN, 1991.
    • (1991) JCN
    • Turk, M.1    Pentland, A.2
  • 19
    • 13444303935
    • Naming every individual in news video monologues
    • J. Yang and A. Hauptmann. Naming every individual in news video monologues. In MM, 2004.
    • (2004) MM
    • Yang, J.1    Hauptmann, A.2
  • 20
    • 33746589042
    • Multiple instance learning for labeling faces in broadcasting news video
    • J. Yang, R. Yan, and A. Hauptmann. Multiple instance learning for labeling faces in broadcasting news video. In MM, 2005.
    • (2005) MM
    • Yang, J.1    Yan, R.2    Hauptmann, A.3
  • 21
    • 84919678786
    • Attribute-Augmented semantic hierarchy
    • H. Zhang, Z. Zha, Y. Yang, S. Yan, et al. Attribute-Augmented semantic hierarchy. In MM, 2013.
    • (2013) MM
    • Zhang, H.1    Zha, Z.2    Yang, Y.3    Yan, S.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.