SCOPUS Information Search Platform

MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

Volume , Issue , 2016, Pages 1082-1086

Early embedding and late reranking for video captioning

(5) Dong, Jianfeng a; Li, Xirong b; Lan, Weiyu b; Huo, Yujia b; Snoek, Cees G M c

a ZHEJIANG UNIVERSITY (China)

b RENMIN UNIVERSITY OF CHINA (China)

c UNIVERSITY OF AMSTERDAM (Netherlands)

Author keywords

MSR video to language challenge; Sentence reranking; Tag embedding; Video captioning

Indexed keywords

HUMAN LIKENESS; IMAGE CAPTIONING; MSR VIDEO TO LANGUAGE CHALLENGE; NOCV1; NON-TRIVIAL; PERFORMANCE METRICS; RE-RANKING; TAG EMBEDDING; VIDEO CAPTIONING

EID: 84994631269 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2964284.2984064 Document Type: Conference Paper

Times cited : (88)

References (16)

1
- 84994683919
- CoRR, abs/1604.06838
- J. Dong, X. Li, and C. Snoek. Word2VisualVec: Cross-media retrieval by visual feature prediction. CoRR, abs/1604.06838, 2016.
- (2016) Word2VisualVec: Cross-media Retrieval by Visual Feature Prediction
- Dong, J.¹ Li, X.² Snoek, C.³

2
- 84959250180
- From captions to visual concepts and back
- H. Fang, S. Gupta, F. Iandola, R. K. Srivastava, L. Deng, P. Dollár, J. Gao, X. He, M. Mitchell, J. C. Platt, et al. From captions to visual concepts and back. In CVPR, 2015.
- (2015) CVPR
- Fang, H.¹ Gupta, S.² Iandola, F.³ Srivastava, R.K.⁴ Deng, L.⁵ Dollár, P.⁶ Gao, J.⁷ He, X.⁸ Mitchell, M.⁹ Platt, J.C.¹⁰

3
- 84962835109
- CoRR, abs/1502.07209
- Y.-G. Jiang, Z. Wu, J. Wang, X. Xue, and S.-F. Chang. Exploiting feature and class relationships in video categorization with regularized deep neural networks. CoRR, abs/1502.07209, 2015.
- (2015) Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks
- Jiang, Y.-G.¹ Wu, Z.² Wang, J.³ Xue, X.⁴ Chang, S.-F.⁵

4
- 84994639031
- Improving image captioning by concept-based sentence reranking
- X. Li and Q. Jin. Improving image captioning by concept-based sentence reranking. In PCM, 2016.
- (2016) PCM
- Li, X.¹ Jin, Q.²

5
- 70350333307
- Learning social tag relevance by neighbor voting
- X. Li, C. Snoek, and M. Worring. Learning social tag relevance by neighbor voting. IEEE Trans. Multimedia, 11(7):1310-1322, 2009.
- (2009) IEEE Trans. Multimedia, vol. 11, issue 7, pp. 1310-1322
- Li, X.¹ Snoek, C.² Worring, M.³

6
- 84975263305
- Socializing the semantic gap: A comparative survey on image tag assignment, refinement, and retrieval
- X. Li, T. Uricchio, L. Ballan, M. Bertini, C. Snoek, and A. D. Bimbo. Socializing the semantic gap: A comparative survey on image tag assignment, refinement, and retrieval. ACM Computing Surveys, 49(1):14:1-14:39, 2016.
- (2016) ACM Computing Surveys, vol. 49, issue 1, pp. 14:1-14:39
- Li, X.¹ Uricchio, T.² Ballan, L.³ Bertini, M.⁴ Snoek, C.⁵ Bimbo, A.D.⁶

7
- 84978696136
- The ImageNet shuffle: Reorganized pre-training for video event detection
- P. Mettes, D. Koelma, and C. Snoek. The ImageNet shuffle: Reorganized pre-training for video event detection. In ICMR, 2016.
- (2016) ICMR
- Mettes, P.¹ Koelma, D.² Snoek, C.³

8
- 85083951332
- Efficient estimation of word representations in vector space
- T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. In ICLR, 2013.
- (2013) ICLR
- Mikolov, T.¹ Chen, K.² Corrado, G.³ Dean, J.⁴

9
- 84986332702
- Jointly modeling embedding and translation to bridge video and language
- Y. Pan, T. Mei, T. Yao, H. Li, and Y. Rui. Jointly modeling embedding and translation to bridge video and language. In CVPR, 2016.
- (2016) CVPR
- Pan, Y.¹ Mei, T.² Yao, T.³ Li, H.⁴ Rui, Y.⁵

10
- 84937522268
- Going deeper with convolutions
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
- (2015) CVPR
- Szegedy, C.¹ Liu, W.² Jia, Y.³ Sermanet, P.⁴ Reed, S.⁵ Anguelov, D.⁶ Erhan, D.⁷ Vanhoucke, V.⁸ Rabinovich, A.⁹

11
- 84973865953
- Learning spatiotemporal features with 3d convolutional networks
- D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3d convolutional networks. In ICCV, 2015.
- (2015) ICCV
- Tran, D.¹ Bourdev, L.² Fergus, R.³ Torresani, L.⁴ Paluri, M.⁵

12
- 84956980995
- Cider: Consensus-based image description evaluation
- R. Vedantam, C. L. Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. In CVPR, 2015.
- (2015) CVPR
- Vedantam, R.¹ Zitnick, C.L.² Parikh, D.³

13
- 84973882730
- Sequence to sequence-video to text
- S. Venugopalan, M. Rohrbach, J. Donahue, R. Mooney, T. Darrell, and K. Saenko. Sequence to sequence-video to text. In ICCV, 2015.
- (2015) ICCV
- Venugopalan, S.¹ Rohrbach, M.² Donahue, J.³ Mooney, R.⁴ Darrell, T.⁵ Saenko, K.⁶

14
- 84946747440
- Show and tell: A neural image caption generator
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
- (2015) CVPR
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

15
- 84986260127
- MSR-VTT: A large video description dataset for bridging video and language
- J. Xu, T. Mei, T. Yao, and Y. Rui. MSR-VTT: A large video description dataset for bridging video and language. In CVPR, 2016.
- (2016) CVPR
- Xu, J.¹ Mei, T.² Yao, T.³ Rui, Y.⁴

16
- 84986275061
- Video paragraph captioning using hierarchical recurrent neural networks
- H. Yu, J. Wang, Z. Huang, Y. Yang, and W. Xu. Video paragraph captioning using hierarchical recurrent neural networks. In CVPR, 2016.
- (2016) CVPR
- Yu, H.¹ Wang, J.² Huang, Z.³ Yang, Y.⁴ Xu, W.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.