SCOPUS Information Search Platform

MM 2015 - Proceedings of the 2015 ACM Multimedia Conference

Volume , Issue , 2015, Pages 1107-1110

Deep multimodal speaker naming

Authors (6): Hu, Yongtao (a); Ren, Jimmy Sj (b); Dai, Jingwen (c); Yuan, Chang (d); Xu, Li (b); Wang, Wenping (a)

a University of Hong Kong (Hong Kong)

b SenseTime Research (China)

c Xim Industry Inc (Hong Kong)

d Lenovo Group Ltd (United States)

Author keywords

CNN; Deep Learning; Multimodal; Speaker Naming

Indexed keywords

FEATURE EXTRACTION; DEEP LEARNING; FUSION FUNCTIONS; LANDMARK LOCALIZATION; LEARNING FRAMEWORKS; MULTI-MODAL; MULTI-MODAL APPROACH; MULTIMODAL FEATURE EXTRACTIONS; SPEAKER NAMING; SPEECH RECOGNITION

EID: 84962840571 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2733373.2806293 Document Type: Conference Paper

Times cited: 55

References (21)

1
- 33947167478
- Face description with local binary patterns: Application to face recognition
- T. Ahonen, A. Hadid, and M. Pietikäinen. Face description with local binary patterns: Application to face recognition. TPAMI, 2006.
- (2006) TPAMI
- Ahonen, T.¹ Hadid, A.² Pietikäinen, M.³

2
- 84887366672
- Semi-supervised learning with constraints for person identification in multimedia data
- M. Bauml, M. Tapaswi, and R. Stiefelhagen. Semi-supervised learning with constraints for person identification in multimedia data. In CVPR, 2013.
- (2013) CVPR
- Bauml, M.¹ Tapaswi, M.² Stiefelhagen, R.³

3
- 0031185845
- Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection
- P. Belhumeur, J. Hespanha, and D. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. TPAMI, 1997.
- (1997) TPAMI
- Belhumeur, P.¹ Hespanha, J.² Kriegman, D.³

4
- 84898792367
- Finding actors and actions in movies
- P. Bojanowski, F. Bach, I. Laptev, J. Ponce, et al. Finding actors and actions in movies. In ICCV, 2013.
- (2013) ICCV
- Bojanowski, P.¹ Bach, F.² Laptev, I.³ Ponce, J.⁴

5
- 79955702502
- LIBSVM: A library for support vector machines
- C. Chang and C. Lin. LIBSVM: A library for support vector machines. TIST, 2011.
- (2011) TIST
- Chang, C.¹ Lin, C.²

6
- 84898027861
- "Hello! My name is... Buffy" - Automatic naming of characters in TV video
- M. Everingham, J. Sivic, and A. Zisserman. "Hello! My name is... Buffy" - Automatic naming of characters in TV video. In BMVC, 2006.
- (2006) BMVC
- Everingham, M.¹ Sivic, J.² Zisserman, A.³

7
- 78650993562
- Dynamic captioning: Video accessibility enhancement for hearing impairment
- R. Hong, M. Wang, M. Xu, S. Yan, and T. Chua. Dynamic captioning: Video accessibility enhancement for hearing impairment. In MM, 2010.
- (2010) MM
- Hong, R.¹ Wang, M.² Xu, M.³ Yan, S.⁴ Chua, T.⁵

8
- 84962814970
- Speaker-following video subtitles
- Y. Hu, J. Kautz, Y. Yu, and W. Wang. Speaker-following video subtitles. TOMM, 2014.
- (2014) TOMM
- Hu, Y.¹ Kautz, J.² Yu, Y.³ Wang, W.⁴

9
- 84893716958
- Open source biometric recognition
- J. Klontz, B. Klare, S. Klum, A. Jain, and M. Burge. Open source biometric recognition. In BTAS, 2013.
- (2013) BTAS
- Klontz, J.¹ Klare, B.² Klum, S.³ Jain, A.⁴ Burge, M.⁵

10
- 84876231242
- Imagenet classification with deep convolutional neural networks
- A. Krizhevsky et al. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
- (2012) NIPS
- Krizhevsky, A.¹

11
- 70350647441
- Naming faces in broadcast news video by image google
- C. Liu, S. Jiang, and Q. Huang. Naming faces in broadcast news video by image google. In MM, 2008.
- (2008) MM
- Liu, C.¹ Jiang, S.² Huang, Q.³

12
- 84911449395
- Learning and transferring mid-level image representations using convolutional neural networks
- M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Learning and transferring mid-level image representations using convolutional neural networks. In CVPR, 2014.
- (2014) CVPR
- Oquab, M.¹ Bottou, L.² Laptev, I.³ Sivic, J.⁴

13
- 84962798777
- On vectorization of deep convolutional neural networks for vision tasks
- J. Ren and L. Xu. On vectorization of deep convolutional neural networks for vision tasks. In AAAI, 2015.
- (2015) AAAI
- Ren, J.¹ Xu, L.²

14
- 84856609002
- Design, analysis and experimental evaluation of block based transformation in mfcc computation for speaker recognition
- M. Sahidullah and G. Saha. Design, analysis and experimental evaluation of block based transformation in mfcc computation for speaker recognition. SC, 2012.
- (2012) SC
- Sahidullah, M.¹ Saha, G.²

15
- 70450202706
- "Who are you?" - learning person specific classifiers from video
- J. Sivic et al. "Who are you?" - learning person specific classifiers from video. In CVPR, 2009.
- (2009) CVPR
- Sivic, J.¹

16
- 84871360424
- Drive video summarization based on double articulation structure of driving behavior
- K. Takenaka, T. Bando, S. Nagasaka, et al. Drive video summarization based on double articulation structure of driving behavior. In MM, 2012.
- (2012) MM
- Takenaka, K.¹ Bando, T.² Nagasaka, S.³

17
- 84866659479
- Knock! Knock! Who is it? Probabilistic person identification in TV-series
- M. Tapaswi, M. Bauml, and R. Stiefelhagen. "Knock! Knock! Who is it?" Probabilistic person identification in TV-series. In CVPR, 2012.
- (2012) CVPR
- Tapaswi, M.¹ Bauml, M.² Stiefelhagen, R.³

18
- 0026065565
- Eigenfaces for recognition
- M. Turk and A. Pentland. Eigenfaces for recognition. JCN, 1991.
- (1991) JCN
- Turk, M.¹ Pentland, A.²

19
- 13444303935
- Naming every individual in news video monologues
- J. Yang and A. Hauptmann. Naming every individual in news video monologues. In MM, 2004.
- (2004) MM
- Yang, J.¹ Hauptmann, A.²

20
- 33746589042
- Multiple instance learning for labeling faces in broadcasting news video
- J. Yang, R. Yan, and A. Hauptmann. Multiple instance learning for labeling faces in broadcasting news video. In MM, 2005.
- (2005) MM
- Yang, J.¹ Yan, R.² Hauptmann, A.³

21
- 84919678786
- Attribute-Augmented semantic hierarchy
- H. Zhang, Z. Zha, Y. Yang, S. Yan, et al. Attribute-Augmented semantic hierarchy. In MM, 2013.
- (2013) MM
- Zhang, H.¹ Zha, Z.² Yang, Y.³ Yan, S.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.