2016, Pages 357-361

Attention-based LSTM with semantic consistency for videos captioning

Author keywords

Attention mechanism; LSTM; Multimodal embedding; Semantic consistence; Video description

Indexed keywords

NEURAL NETWORKS

EID: 84994560125     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2964284.2967242     Document Type: Conference Paper
Times cited: 57

References (27)
  • 2
    • 84859089502
    • Collecting highly parallel data for paraphrase evaluation
    • Association for Computational Linguistics
    • D. L. Chen and W. B. Dolan. Collecting highly parallel data for paraphrase evaluation. In ACL, pages 190-200. Association for Computational Linguistics, 2011.
    • (2011) ACL, pp. 190-200
    • Chen, D.L.; Dolan, W.B.
  • 4
    • 84946763507
    • Describing multimedia content using attention-based encoder-decoder networks
    • K. Cho, A. Courville, and Y. Bengio. Describing multimedia content using attention-based encoder-decoder networks. IEEE Transactions on Multimedia, 17(11):1875-1886, 2015.
    • (2015) IEEE Transactions on Multimedia, vol. 17, issue 11, pp. 1875-1886
    • Cho, K.; Courville, A.; Bengio, Y.
  • 7
    • 84959233699
    • Optimal graph learning with partial tags and multiple features for image and video annotation
    • L. Gao, J. Song, F. Nie, Y. Yan, N. Sebe, and H. T. Shen. Optimal graph learning with partial tags and multiple features for image and video annotation. In CVPR, pages 4371-4379, 2015.
    • (2015) CVPR, pp. 4371-4379
    • Gao, L.; Song, J.; Nie, F.; Yan, Y.; Sebe, N.; Shen, H.T.
  • 8
    • 84994636856
    • Graph-without-cut: An ideal graph learning for image segmentation
    • L. Gao, J. Song, F. Nie, F. Zou, N. Sebe, and H. T. Shen. Graph-without-cut: An ideal graph learning for image segmentation. In AAAI, pages 1188-1194, 2016.
    • (2016) AAAI, pp. 1188-1194
    • Gao, L.; Song, J.; Nie, F.; Zou, F.; Sebe, N.; Shen, H.T.
  • 10
    • 84937843643
    • Deep fragment embeddings for bidirectional image sentence mapping
    • A. Karpathy, A. Joulin, and L. Fei-Fei. Deep fragment embeddings for bidirectional image sentence mapping. In NIPS, pages 1889-1897, 2014.
    • (2014) NIPS, pp. 1889-1897
    • Karpathy, A.; Joulin, A.; Fei-Fei, L.
  • 11
    • 84962850062
    • Summarization-based video caption via deep neural networks
    • ACM
    • G. Li, S. Ma, and Y. Han. Summarization-based video caption via deep neural networks. In ACM Multimedia, pages 1191-1194. ACM, 2015.
    • (2015) ACM Multimedia, pp. 1191-1194
    • Li, G.; Ma, S.; Han, Y.
  • 13
    • 85133336275
    • Bleu: A method for automatic evaluation of machine translation
    • Association for Computational Linguistics
    • K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: A method for automatic evaluation of machine translation. In ACL, pages 311-318. Association for Computational Linguistics, 2002.
    • (2002) ACL, pp. 311-318
    • Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.-J.
  • 14
    • 84888343222
    • Effective multiple feature hashing for large-scale near-duplicate video retrieval
    • J. Song, Y. Yang, Z. Huang, H. T. Shen, and J. Luo. Effective multiple feature hashing for large-scale near-duplicate video retrieval. IEEE Trans. Multimedia, 15(8):1997-2008, 2013.
    • (2013) IEEE Trans. Multimedia, vol. 15, issue 8, pp. 1997-2008
    • Song, J.; Yang, Y.; Huang, Z.; Shen, H.T.; Luo, J.
  • 15
    • 84880548516
    • Inter-media hashing for large-scale retrieval from heterogeneous data sources
    • J. Song, Y. Yang, Y. Yang, Z. Huang, and H. T. Shen. Inter-media hashing for large-scale retrieval from heterogeneous data sources. In SIGMOD, pages 785-796, 2013.
    • (2013) SIGMOD, pp. 785-796
    • Song, J.; Yang, Y.; Yang, Y.; Huang, Z.; Shen, H.T.
  • 19
    • 84956980995
    • Cider: Consensus-based image description evaluation
    • R. Vedantam, C. L. Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. In CVPR, pages 4566-4575, 2015.
    • (2015) CVPR, pp. 4566-4575
    • Vedantam, R.; Zitnick, C.L.; Parikh, D.
  • 22
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, pages 3156-3164, 2015.
    • (2015) CVPR, pp. 3156-3164
    • Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.
  • 25
    • 84940762015
    • Jointly modeling deep video and compositional text to bridge vision and language in a unified framework
    • Citeseer
    • R. Xu, C. Xiong, W. Chen, and J. J. Corso. Jointly modeling deep video and compositional text to bridge vision and language in a unified framework. In AAAI, pages 2346-2352. Citeseer, 2015.
    • (2015) AAAI, pp. 2346-2352
    • Xu, R.; Xiong, C.; Chen, W.; Corso, J.J.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.