



Volume 2017-January, 2017, Pages 1988-1997

CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; STATISTICAL TESTS

EID: 85041904911     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.215     Document Type: Conference Paper
Times cited: 1888

References (50)
  • 1
    • 85072842417
    • Analyzing the behavior of visual question answering models
    • A. Agrawal, D. Batra, and D. Parikh. Analyzing the behavior of visual question answering models. In EMNLP, 2016.
    • (2016) EMNLP
    • Agrawal, A.1    Batra, D.2    Parikh, D.3
  • 2
    • 84993660571
    • Learning to compose neural networks for question answering
    • J. Andreas, M. Rohrbach, T. Darrell, and D. Klein. Learning to compose neural networks for question answering. In NAACL, 2016.
    • (2016) NAACL
    • Andreas, J.1    Rohrbach, M.2    Darrell, T.3    Klein, D.4
  • 5
    • 84879854889
    • Representation learning: A review and new perspectives
    • Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. TPAMI, 35(8):1798-1828, 2014.
    • (2014) TPAMI , vol.35 , Issue.8 , pp. 1798-1828
    • Bengio, Y.1    Courville, A.2    Vincent, P.3
  • 6
    • 84992615443
    • Blender Foundation, Blender Institute, Amsterdam
    • Blender Online Community. Blender - a 3D modelling and rendering package. Blender Foundation, Blender Institute, Amsterdam, 2016.
    • (2016) Blender - A 3D Modelling and Rendering Package
  • 7
    • 84959908834
    • Deja image-captions: A corpus of expressive image descriptions in repetition
    • J. Chen, P. Kuznetsova, D. Warren, and Y. Choi. Deja image-captions: A corpus of expressive image descriptions in repetition. In NAACL, 2015.
    • (2015) NAACL
    • Chen, J.1    Kuznetsova, P.2    Warren, D.3    Choi, Y.4
  • 10
    • 84965148420
    • Are you talking to a machine? Dataset and methods for multilingual image question answering
    • H. Gao, J. Mao, J. Zhou, Z. Huang, L. Wang, and W. Xu. Are you talking to a machine? Dataset and methods for multilingual image question answering. In NIPS, 2015.
    • (2015) NIPS
    • Gao, H.1    Mao, J.2    Zhou, J.3    Huang, Z.4    Wang, L.5    Xu, W.6
  • 14
    • 84986274465
    • Deep residual learning for image recognition
    • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
    • (2016) CVPR
    • He, K.1    Zhang, X.2    Ren, S.3    Sun, J.4
  • 16
    • 85041926703
    • Revisiting visual question answering baselines
    • A. Jabri, A. Joulin, and L. van der Maaten. Revisiting visual question answering baselines. In ECCV, 2016.
    • (2016) ECCV
    • Jabri, A.1    Joulin, A.2    Van Der Maaten, L.3
  • 18
    • 84965117324
    • Inferring algorithmic patterns with stack-augmented recurrent nets
    • A. Joulin and T. Mikolov. Inferring algorithmic patterns with stack-augmented recurrent nets. In NIPS, 2015.
    • (2015) NIPS
    • Joulin, A.1    Mikolov, T.2
  • 19
    • 84943540775
    • Referitgame: Referring to objects in photographs of natural scenes
    • S. Kazemzadeh, V. Ordonez, M. Matten, and T. Berg. Referitgame: Referring to objects in photographs of natural scenes. In EMNLP, 2014.
    • (2014) EMNLP
    • Kazemzadeh, S.1    Ordonez, V.2    Matten, M.3    Berg, T.4
  • 20
    • 85083951076
    • Adam: A method for stochastic optimization
    • D. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
    • (2015) ICLR
    • Kingma, D.1    Ba, J.2
  • 24
    • 85018917850
    • Hierarchical question-image co-attention for visual question answering
    • J. Lu, J. Yang, D. Batra, and D. Parikh. Hierarchical question-image co-attention for visual question answering. In NIPS, 2016.
    • (2016) NIPS
    • Lu, J.1    Yang, J.2    Batra, D.3    Parikh, D.4
  • 25
    • 85007153677
    • Learning to answer questions from image using convolutional neural network
    • L. Ma, Z. Lu, and H. Li. Learning to answer questions from image using convolutional neural network. In AAAI, 2016.
    • (2016) AAAI
    • Ma, L.1    Lu, Z.2    Li, H.3
  • 26
    • 84937822746
    • A multi-world approach to question answering about real-world scenes based on uncertain input
    • M. Malinowski and M. Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In NIPS, 2014.
    • (2014) NIPS
    • Malinowski, M.1    Fritz, M.2
  • 28
    • 84973896625
    • Ask your neurons: A neural-based approach to answering questions about images
    • M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In ICCV, 2015.
    • (2015) ICCV
    • Malinowski, M.1    Rohrbach, M.2    Fritz, M.3
  • 31
    • 85072826753
    • Question relevance in VQA: Identifying non-visual and false-premise questions
    • A. Ray, G. Christie, M. Bansal, D. Batra, and D. Parikh. Question relevance in VQA: Identifying non-visual and false-premise questions. In EMNLP, 2016.
    • (2016) EMNLP
    • Ray, A.1    Christie, G.2    Bansal, M.3    Batra, D.4    Parikh, D.5
  • 32
    • 84965170394
    • Exploring models and data for image question answering
    • M. Ren, R. Kiros, and R. Zemel. Exploring models and data for image question answering. In NIPS, 2015.
    • (2015) NIPS
    • Ren, M.1    Kiros, R.2    Zemel, R.3
  • 33
    • 84986327457
    • Where to look: Focus regions for visual question answering
    • K. Shih, S. Singh, and D. Hoiem. Where to look: Focus regions for visual question answering. In CVPR, 2016.
    • (2016) CVPR
    • Shih, K.1    Singh, S.2    Hoiem, D.3
  • 34
    • 84904163933
    • Dropout: A simple way to prevent neural networks from overfitting
    • N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. JMLR, 15(1):1929-1958, 2014.
    • (2014) JMLR , vol.15 , Issue.1 , pp. 1929-1958
    • Srivastava, N.1    Hinton, G.E.2    Krizhevsky, A.3    Sutskever, I.4    Salakhutdinov, R.5
  • 35
    • 84907449171
    • A simple method to determine if a music information retrieval system is a horse
    • B. Sturm. A simple method to determine if a music information retrieval system is a horse. IEEE Transactions on Multimedia, 16(6):1636-1644, 2014.
    • (2014) IEEE Transactions on Multimedia , vol.16 , Issue.6 , pp. 1636-1644
    • Sturm, B.1
  • 42
    • 84999008900
    • Dynamic memory networks for visual and textual question answering
    • C. Xiong, S. Merity, and R. Socher. Dynamic memory networks for visual and textual question answering. In ICML, 2016.
    • (2016) ICML
    • Xiong, C.1    Merity, S.2    Socher, R.3
  • 43
    • 85035008367
    • Ask, attend, and answer: Exploring question-guided spatial attention for visual question answering
    • H. Xu and K. Saenko. Ask, attend, and answer: Exploring question-guided spatial attention for visual question answering. In ECCV, 2016.
    • (2016) ECCV
    • Xu, H.1    Saenko, K.2
  • 44
    • 84986334021
    • Stacked attention networks for image question answering
    • Z. Yang, X. He, J. Gao, L. Deng, and A. Smola. Stacked attention networks for image question answering. In CVPR, 2016.
    • (2016) CVPR
    • Yang, Z.1    He, X.2    Gao, J.3    Deng, L.4    Smola, A.5
  • 45
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. In TACL, pages 67-78, 2014.
    • (2014) TACL , pp. 67-78
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4
  • 46
    • 84959862697
    • Visual madlibs: Fill in the blank image generation and question answering
    • L. Yu, E. Park, A. Berg, and T. Berg. Visual madlibs: Fill in the blank image generation and question answering. In ICCV, 2015.
    • (2015) ICCV
    • Yu, L.1    Park, E.2    Berg, A.3    Berg, T.4
  • 49
    • 84986275767
    • Visual7w: Grounded question answering in images
    • Y. Zhu, O. Groth, M. Bernstein, and L. Fei-Fei. Visual7w: Grounded question answering in images. In CVPR, 2016.
    • (2016) CVPR
    • Zhu, Y.1    Groth, O.2    Bernstein, M.3    Fei-Fei, L.4
  • 50
    • 84887338442
    • Bringing semantics into focus using visual abstraction
    • C. Zitnick and D. Parikh. Bringing semantics into focus using visual abstraction. In CVPR, 2013.
    • (2013) CVPR
    • Zitnick, C.1    Parikh, D.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.