Volume 2017-January, 2017, Pages 984-992

Video captioning with transferred semantic attributes

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; SEMANTICS;

EID: 85029372390     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.111     Document Type: Conference Paper
Times cited: 308

References (38)
  • 1
    • 85083954507
    • Delving deeper into convolutional networks for learning video representations
    • N. Ballas, L. Yao, C. Pal, and A. Courville. Delving deeper into convolutional networks for learning video representations. In ICLR, 2016.
    • (2016) ICLR
    • Ballas, N.1    Yao, L.2    Pal, C.3    Courville, A.4
  • 2
    • 85116156579
    • Meteor: An automatic metric for mt evaluation with improved correlation with human judgments
    • S. Banerjee and A. Lavie. Meteor: An automatic metric for mt evaluation with improved correlation with human judgments. In ACL workshop, 2005.
    • (2005) ACL Workshop
    • Banerjee, S.1    Lavie, A.2
  • 3
    • 84859089502
    • Collecting highly parallel data for paraphrase evaluation
    • D. L. Chen and W. B. Dolan. Collecting highly parallel data for paraphrase evaluation. In ACL, 2011.
    • (2011) ACL
    • Chen, D.L.1    Dolan, W.B.2
  • 7
    • 84986296735
    • You lead, we exceed: Labor-free video concept learning by jointly exploiting web videos and images
    • C. Gan, T. Yao, K. Yang, Y. Yang, and T. Mei. You lead, we exceed: Labor-free video concept learning by jointly exploiting web videos and images. In CVPR, 2016.
    • (2016) CVPR
    • Gan, C.1    Yao, T.2    Yang, K.3    Yang, Y.4    Mei, T.5
  • 11
    • 0036843382
    • Natural language description of human activities from video images based on concept hierarchy of actions
    • A. Kojima, T. Tamura, and K. Fukunaga. Natural language description of human activities from video images based on concept hierarchy of actions. IJCV, 2002.
    • (2002) IJCV
    • Kojima, A.1    Tamura, T.2    Fukunaga, K.3
  • 14
    • 85006171438
    • Learning deep intrinsic video representation by exploring temporal coherence and graph structure
    • Y. Pan, Y. Li, T. Yao, T. Mei, H. Li, and Y. Rui. Learning deep intrinsic video representation by exploring temporal coherence and graph structure. In IJCAI, 2016.
    • (2016) IJCAI
    • Pan, Y.1    Li, Y.2    Yao, T.3    Mei, T.4    Li, H.5    Rui, Y.6
  • 15
    • 84986332702
    • Jointly modeling embedding and translation to bridge video and language
    • Y. Pan, T. Mei, T. Yao, H. Li, and Y. Rui. Jointly modeling embedding and translation to bridge video and language. In CVPR, 2016.
    • (2016) CVPR
    • Pan, Y.1    Mei, T.2    Yao, T.3    Li, H.4    Rui, Y.5
  • 16
    • 85133336275
    • Bleu: A method for automatic evaluation of machine translation
    • K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: a method for automatic evaluation of machine translation. In ACL, 2002.
    • (2002) ACL
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.-J.4
  • 17
  • 18
    • 84973887740
    • The long-short story of movie description
    • A. Rohrbach, M. Rohrbach, and B. Schiele. The long-short story of movie description. In GCPR, 2015.
    • (2015) GCPR
    • Rohrbach, A.1    Rohrbach, M.2    Schiele, B.3
  • 22
    • 84977650097
    • Video captioning with recurrent networks based on frame-and video-level features and visual content classification
    • R. Shetty and J. Laaksonen. Video captioning with recurrent networks based on frame-and video-level features and visual content classification. In ICCV workshop, 2015.
    • (2015) ICCV Workshop
    • Shetty, R.1    Laaksonen, J.2
  • 23
    • 85083953063
    • Very deep convolutional networks for large-scale image recognition
    • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
    • (2015) ICLR
    • Simonyan, K.1    Zisserman, A.2
  • 24
    • 84928547704
    • Sequence to sequence learning with neural networks
    • I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014.
    • (2014) NIPS
    • Sutskever, I.1    Vinyals, O.2    Le, Q.V.3
  • 27
    • 84973865953
    • Learning spatiotemporal features with 3d convolutional networks
    • D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3d convolutional networks. In ICCV, 2015.
    • (2015) ICCV
    • Tran, D.1    Bourdev, L.2    Fergus, R.3    Torresani, L.4    Paluri, M.5
  • 28
    • 84956980995
    • Cider: Consensus-based image description evaluation
    • R. Vedantam, C. Lawrence Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. In CVPR, 2015.
    • (2015) CVPR
    • Vedantam, R.1    Lawrence Zitnick, C.2    Parikh, D.3
  • 31
    • 84986301177
    • What value do explicit high level concepts have in vision to language problems?
    • Q. Wu, C. Shen, L. Liu, A. Dick, and A. v. d. Hengel. What value do explicit high level concepts have in vision to language problems? In CVPR, 2016.
    • (2016) CVPR
    • Wu, Q.1    Shen, C.2    Liu, L.3    Dick, A.4    van den Hengel, A.5
  • 32
    • 84986260127
    • MSR-VTT: A large video description dataset for bridging video and language
    • J. Xu, T. Mei, T. Yao, and Y. Rui. MSR-VTT: A large video description dataset for bridging video and language. In CVPR, 2016.
    • (2016) CVPR
    • Xu, J.1    Mei, T.2    Yao, T.3    Rui, Y.4
  • 33
    • 84952349307
    • Jointly modeling deep video and compositional text to bridge vision and language in a unified framework
    • R. Xu, C. Xiong, W. Chen, and J. J. Corso. Jointly modeling deep video and compositional text to bridge vision and language in a unified framework. In AAAI, 2015.
    • (2015) AAAI
    • Xu, R.1    Xiong, C.2    Chen, W.3    Corso, J.J.4
  • 36
    • 84986317307
    • Image captioning with semantic attention
    • Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo. Image captioning with semantic attention. In CVPR, 2016.
    • (2016) CVPR
    • You, Q.1    Jin, H.2    Wang, Z.3    Fang, C.4    Luo, J.5
  • 37
    • 84986275061
    • Video paragraph captioning using hierarchical recurrent neural networks
    • H. Yu, J. Wang, Z. Huang, Y. Yang, and W. Xu. Video paragraph captioning using hierarchical recurrent neural networks. In CVPR, 2016.
    • (2016) CVPR
    • Yu, H.1    Wang, J.2    Huang, Z.3    Yang, Y.4    Xu, W.5
  • 38
    • 84864049528
    • Multiple instance boosting for object detection
    • C. Zhang, J. C. Platt, and P. A. Viola. Multiple instance boosting for object detection. In NIPS, 2005.
    • (2005) NIPS
    • Zhang, C.1    Platt, J.C.2    Viola, P.A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.