2016, Pages 1082-1086

Early embedding and late reranking for video captioning

Author keywords

MSR video to language challenge; Sentence reranking; Tag embedding; Video captioning

Indexed keywords

HUMAN LIKENESS; IMAGE CAPTIONING; MSR VIDEO TO LANGUAGE CHALLENGE; NOCV1; NON-TRIVIAL; PERFORMANCE METRICS; RE-RANKING; TAG EMBEDDING; VIDEO CAPTIONING

EID: 84994631269     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2964284.2984064     Document Type: Conference Paper
Times cited: 88

References (16)
  • 4
    • EID: 84994639031
    • Improving image captioning by concept-based sentence reranking
    • X. Li and Q. Jin. Improving image captioning by concept-based sentence reranking. In PCM, 2016.
    • (2016) PCM
    • Li, X.; Jin, Q.
  • 5
    • EID: 70350333307
    • Learning social tag relevance by neighbor voting
    • X. Li, C. Snoek, and M. Worring. Learning social tag relevance by neighbor voting. IEEE Trans. Multimedia, 11(7):1310-1322, 2009.
    • (2009) IEEE Trans. Multimedia, vol. 11, issue 7, pp. 1310-1322
    • Li, X.; Snoek, C.; Worring, M.
  • 6
    • EID: 84975263305
    • Socializing the semantic gap: A comparative survey on image tag assignment, refinement, and retrieval
    • X. Li, T. Uricchio, L. Ballan, M. Bertini, C. Snoek, and A. D. Bimbo. Socializing the semantic gap: A comparative survey on image tag assignment, refinement, and retrieval. ACM Computing Surveys, 49(1):14:1-14:39, 2016.
    • (2016) ACM Computing Surveys, vol. 49, issue 1, pp. 14:1-14:39
    • Li, X.; Uricchio, T.; Ballan, L.; Bertini, M.; Snoek, C.; Bimbo, A.D.
  • 7
    • EID: 84978696136
    • The ImageNet shuffle: Reorganized pre-training for video event detection
    • P. Mettes, D. Koelma, and C. Snoek. The ImageNet shuffle: Reorganized pre-training for video event detection. In ICMR, 2016.
    • (2016) ICMR
    • Mettes, P.; Koelma, D.; Snoek, C.
  • 8
    • EID: 85083951332
    • Efficient estimation of word representations in vector space
    • T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. In ICLR, 2013.
    • (2013) ICLR
    • Mikolov, T.; Chen, K.; Corrado, G.; Dean, J.
  • 9
    • EID: 84986332702
    • Jointly modeling embedding and translation to bridge video and language
    • Y. Pan, T. Mei, T. Yao, H. Li, and Y. Rui. Jointly modeling embedding and translation to bridge video and language. In CVPR, 2016.
    • (2016) CVPR
    • Pan, Y.; Mei, T.; Yao, T.; Li, H.; Rui, Y.
  • 11
    • EID: 84973865953
    • Learning spatiotemporal features with 3D convolutional networks
    • D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3D convolutional networks. In ICCV, 2015.
    • (2015) ICCV
    • Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M.
  • 12
    • EID: 84956980995
    • CIDEr: Consensus-based image description evaluation
    • R. Vedantam, C. L. Zitnick, and D. Parikh. CIDEr: Consensus-based image description evaluation. In CVPR, 2015.
    • (2015) CVPR
    • Vedantam, R.; Zitnick, C.L.; Parikh, D.
  • 14
    • EID: 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
    • (2015) CVPR
    • Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.
  • 15
    • EID: 84986260127
    • MSR-VTT: A large video description dataset for bridging video and language
    • J. Xu, T. Mei, T. Yao, and Y. Rui. MSR-VTT: A large video description dataset for bridging video and language. In CVPR, 2016.
    • (2016) CVPR
    • Xu, J.; Mei, T.; Yao, T.; Rui, Y.
  • 16
    • EID: 84986275061
    • Video paragraph captioning using hierarchical recurrent neural networks
    • H. Yu, J. Wang, Z. Huang, Y. Yang, and W. Xu. Video paragraph captioning using hierarchical recurrent neural networks. In CVPR, 2016.
    • (2016) CVPR
    • Yu, H.; Wang, J.; Huang, Z.; Yang, Y.; Xu, W.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.