Volume 2017-January, 2017, Pages 3242-3250

Knowing when to look: Adaptive attention via a visual sentinel for image captioning

Author keywords

[No Author keywords available]

Indexed keywords

BEHAVIORAL RESEARCH; COMPUTER VISION; DECODING; PATTERN RECOGNITION; STATISTICAL TESTS

EID: 85041910666     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.345     Document Type: Conference Paper
Times cited: 1427

References (36)
  • 1
    • 85021678581
    • Spice: Semantic propositional image caption evaluation
    • P. Anderson, B. Fernando, M. Johnson, and S. Gould. Spice: Semantic propositional image caption evaluation. In ECCV, 2016.
    • (2016) ECCV
    • Anderson, P.1    Fernando, B.2    Johnson, M.3    Gould, S.4
  • 3
    • 84957029470
    • Mind's eye: A recurrent visual representation for image caption generation
    • X. Chen and C. Lawrence Zitnick. Mind's eye: A recurrent visual representation for image caption generation. In CVPR, 2015.
    • (2015) CVPR
    • Chen, X.1    Lawrence Zitnick, C.2
  • 10
    • 84986274465
    • Deep residual learning for image recognition
    • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
    • (2016) CVPR
    • He, K.1    Zhang, X.2    Ren, S.3    Sun, J.4
  • 11
    • 84946734827
    • Deep visual-semantic alignments for generating image descriptions
    • A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
    • (2015) CVPR
    • Karpathy, A.1    Fei-Fei, L.2
  • 15
    • 78650200194
    • Rouge: A package for automatic evaluation of summaries
    • C.-Y. Lin. Rouge: A package for automatic evaluation of summaries. In ACL 2004 Workshop, 2004.
    • (2004) ACL 2004 Workshop
    • Lin, C.-Y.1
  • 17
    • 85018917850
    • Hierarchical question-image co-attention for visual question answering
    • J. Lu, J. Yang, D. Batra, and D. Parikh. Hierarchical question-image co-attention for visual question answering. In NIPS, 2016.
    • (2016) NIPS
    • Lu, J.1    Yang, J.2    Batra, D.3    Parikh, D.4
  • 18
    • 85083950512
    • Deep captioning with multimodal recurrent neural networks (m-rnn)
    • J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-rnn). In ICLR, 2015.
    • (2015) ICLR
    • Mao, J.1    Xu, W.2    Yang, Y.3    Wang, J.4    Huang, Z.5    Yuille, A.6
  • 21
    • 85133336275
    • Bleu: A method for automatic evaluation of machine translation
    • K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: A method for automatic evaluation of machine translation. In ACL, 2002.
    • (2002) ACL
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.-J.4
  • 24
    • 84928547704
    • Sequence to sequence learning with neural networks
    • I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014.
    • (2014) NIPS
    • Sutskever, I.1    Vinyals, O.2    Le, Q.V.3
  • 26
    • 84956980995
    • Cider: Consensus-based image description evaluation
    • R. Vedantam, C. Lawrence Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. In CVPR, 2015.
    • (2015) CVPR
    • Vedantam, R.1    Lawrence Zitnick, C.2    Parikh, D.3
  • 27
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
    • (2015) CVPR
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 29
    • 84999008900
    • Dynamic memory networks for visual and textual question answering
    • C. Xiong, S. Merity, and R. Socher. Dynamic memory networks for visual and textual question answering. In ICML, 2016.
    • (2016) ICML
    • Xiong, C.1    Merity, S.2    Socher, R.3
  • 31
    • 84986334021
    • Stacked attention networks for image question answering
    • Z. Yang, X. He, J. Gao, L. Deng, and A. Smola. Stacked attention networks for image question answering. In CVPR, 2016.
    • (2016) CVPR
    • Yang, Z.1    He, X.2    Gao, J.3    Deng, L.4    Smola, A.5
  • 32
    • 85030211479
    • Encode, review, and decode: Reviewer module for caption generation
    • Z. Yang, Y. Yuan, Y. Wu, R. Salakhutdinov, and W. W. Cohen. Encode, review, and decode: Reviewer module for caption generation. In NIPS, 2016.
    • (2016) NIPS
    • Yang, Z.1    Yuan, Y.2    Wu, Y.3    Salakhutdinov, R.4    Cohen, W.W.5
  • 34
    • 84986317307
    • Image captioning with semantic attention
    • Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo. Image captioning with semantic attention. In CVPR, 2016.
    • (2016) CVPR
    • You, Q.1    Jin, H.2    Wang, Z.3    Fang, C.4    Luo, J.5
  • 35
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. In ACL, 2014.
    • (2014) ACL
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.