SCOPUS Information Search Platform

Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017

Volume 2017-January, 2017, Pages 3233-3241

Graph-structured representations for visual question answering

(3) Teney, Damienᵃ Liu, Lingqiaoᵃ Van Den Hengel, Antonᵃ

ᵃ UNIVERSITY OF ADELAIDE (Australia)

Author keywords

[No Author keywords available]

Indexed keywords

ABSTRACTING; COMPUTER VISION; DEEP NEURAL NETWORKS; IMAGE ENHANCEMENT;

FEATURE VECTORS; LANGUAGE STRUCTURE; MULTIPLE CHOICE; MULTIPLE OBJECTS; QUESTION ANSWERING; SCENE OBJECT; STATE OF THE ART; VECTOR REPRESENTATIONS;

PATTERN RECOGNITION;

EID: 85040312182 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CVPR.2017.344 Document Type: Conference Paper

Times cited: (372)

References (31)

1
- 85044312568
- VQA Challenge leaderboard
- VQA Challenge leaderboard. http://visualqa.org/challenge.html.

2
- 84993660571
- Learning to compose neural networks for question answering
- J. Andreas, M. Rohrbach, T. Darrell, and D. Klein. Learning to compose neural networks for question answering. In Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2016.
- (2016) Annual Conference of the North American Chapter of the Association for Computational Linguistics
- Andreas, J.¹ Rohrbach, M.² Darrell, T.³ Klein, D.⁴

3
- 84986272553
- Neural module networks
- J. Andreas, M. Rohrbach, T. Darrell, and D. Klein. Neural Module Networks. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
- (2016) Proc. IEEE Conf. Comp. Vis. Patt. Recogn
- Andreas, J.¹ Rohrbach, M.² Darrell, T.³ Klein, D.⁴

4
- 84973890960
- Vqa: Visual question answering
- S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. VQA: Visual Question Answering. In Proc. IEEE Int. Conf. Comp. Vis., 2015.
- (2015) Proc. IEEE Int. Conf. Comp. Vis
- Antol, S.¹ Agrawal, A.² Lu, J.³ Mitchell, M.⁴ Batra, D.⁵ Zitnick, C.L.⁶ Parikh, D.⁷

5
- 84986262382
- K. Chen, J. Wang, L.-C. Chen, H. Gao, W. Xu, and R. Nevatia. ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering. arXiv preprint arXiv: 1511.05960, 2015.
- (2015) ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
- Chen, K.¹ Wang, J.² Chen, L.-C.³ Gao, H.⁴ Xu, W.⁵ Nevatia, R.⁶

6
- 84961291190
- Learning phrase representations using rnn encoder-decoder for statistical machine translation
- K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. Conf. Empirical Methods in Natural Language Processing, 2014.
- (2014) Proc. Conf. Empirical Methods in Natural Language Processing
- Cho, K.¹ Van Merrienboer, B.² Gulcehre, C.³ Bougares, F.⁴ Schwenk, H.⁵ Bengio, Y.⁶

7
- 70549102500
- The stanford typed dependencies representation
- M.-C. De Marneffe and C. D. Manning. The stanford typed dependencies representation. In COLING Workshop on Cross-framework and Cross-domain Parser Evaluation, 2008.
- (2008) COLING Workshop on Cross-framework and Cross-domain Parser Evaluation
- De Marneffe, M.-C.¹ Manning, C.D.²

8
- 85015387152
- D. K. Duvenaud, D. Maclaurin, J. Aguilera-Iparraguirre, R. Gómez-Bombarelli, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams. Convolutional networks on graphs for learning molecular fingerprints. arXiv preprint arXiv: 1509.09292, 2015.
- (2015) Convolutional Networks On Graphs for Learning Molecular Fingerprints
- Duvenaud, D.K.¹ Maclaurin, D.² Aguilera-Iparraguirre, J.³ Gómez-Bombarelli, R.⁴ Hirzel, T.⁵ Aspuru-Guzik, A.⁶ Adams, R.P.⁷

9
- 84990060711
- A. Fukui, D. H. Park, D. Yang, A. Rohrbach, T. Darrell, and M. Rohrbach. Multimodal compact bilinear pooling for visual question answering and visual grounding. arXiv preprint arXiv: 1606.01847, 2016.
- (2016) Multimodal Compact Bilinear Pooling for Visual Question Answering And Visual Grounding
- Fukui, A.¹ Park, D.H.² Yang, D.³ Rohrbach, A.⁴ Darrell, T.⁵ Rohrbach, M.⁶

10
- 84862277874
- Understanding the difficulty of training deep feedforward neural networks
- X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proc. Int. Conf. Artificial Intell. & Stat., pages 249-256, 2010.
- (2010) Proc. Int. Conf. Artificial Intell. & Stat, pp. 249-256
- Glorot, X.¹ Bengio, Y.²

11
- 84986292208
- Structuralrnn: Deep learning on spatio-temporal graphs
- A. Jain, A. R. Zamir, S. Savarese, and A. Saxena. Structuralrnn: Deep learning on spatio-temporal graphs. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
- (2016) Proc. IEEE Conf. Comp. Vis. Patt. Recogn
- Jain, A.¹ Zamir, A.R.² Savarese, S.³ Saxena, A.⁴

12
- 85037344457
- A. Jiang, F. Wang, F. Porikli, and Y. Li. Compositional Memory for Visual Question Answering. arXiv preprint arXiv: 1511.05676, 2015.
- (2015) Compositional Memory for Visual Question Answering
- Jiang, A.¹ Wang, F.² Porikli, F.³ Li, Y.⁴

13
- 85020657410
- J.-H. Kim, S.-W. Lee, D.-H. Kwak, M.-O. Heo, J. Kim, J.-W. Ha, and B.-T. Zhang. Multimodal residual learning for visual qa. arXiv preprint arXiv: 1606.01455, 2016.
- (2016) Multimodal Residual Learning for Visual Qa
- Kim, J.-H.¹ Lee, S.-W.² Kwak, D.-H.³ Heo, M.-O.⁴ Kim, J.⁵ Ha, J.-W.⁶ Zhang, B.-T.⁷

14
- 84978730111
- R. Krishna, Y. Zhu, O. Groth, J. Johnson, K. Hata, J. Kravitz, S. Chen, Y. Kalantidis, L.-J. Li, D. A. Shamma, M. Bernstein, and L. Fei-Fei. Visual genome: Connecting language and vision using crowdsourced dense image annotations. arXiv preprint arXiv: 1602.07332, 2016.
- (2016) Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
- Krishna, R.¹ Zhu, Y.² Groth, O.³ Johnson, J.⁴ Hata, K.⁵ Kravitz, J.⁶ Chen, S.⁷ Kalantidis, Y.⁸ Li, L.-J.⁹ Shamma, D.A.¹⁰ Bernstein, M.¹¹ Fei-Fei, L.¹²

15
- 84998642250
- Y. Li, D. Tarlow, M. Brockschmidt, and R. S. Zemel. Gated graph sequence neural networks. arXiv preprint arXiv: 1511.05493, 2015.
- (2015) Gated Graph Sequence Neural Networks
- Li, Y.¹ Tarlow, D.² Brockschmidt, M.³ Zemel, R.S.⁴

16
- 84990020800
- J. Lu, J. Yang, D. Batra, and D. Parikh. Hierarchical question-image co-attention for visual question answering. arXiv preprint arXiv: 1606.00061, 2016.
- (2016) Hierarchical Question-Image Co-Attention for Visual Question Answering
- Lu, J.¹ Yang, J.² Batra, D.³ Parikh, D.⁴

17
- 84937822746
- A multi-world approach to question answering about real-world scenes based on uncertain input
- M. Malinowski and M. Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In Proc. Advances in Neural Inf. Process. Syst., pages 1682-1690, 2014.
- (2014) Proc. Advances in Neural Inf. Process. Syst, pp. 1682-1690
- Malinowski, M.¹ Fritz, M.²

18
- 84942666203
- J. Pennington, R. Socher, and C. Manning. Glove: Global vectors for word representation. http://nlp.stanford.edu/projects/glove/.
- Glove: Global vectors for Word Representation
- Pennington, J.¹ Socher, R.² Manning, C.³

19
- 85044307368
- J. Pennington, R. Socher, and C. Manning. Stanford dependency parser website. http://nlp.stanford.edu/software/stanford-dependencies.shtml.
- Stanford Dependency Parser Website
- Pennington, J.¹ Socher, R.² Manning, C.³

20
- 84961289992
- Glove: Global vectors for word representation
- J. Pennington, R. Socher, and C. Manning. Glove: Global Vectors for Word Representation. In Conference on Empirical Methods in Natural Language Processing, 2014.
- (2014) Conference on Empirical Methods in Natural Language Processing
- Pennington, J.¹ Socher, R.² Manning, C.³

21
- 84962816362
- Image question answering: A visual semantic embedding model and a new dataset
- M. Ren, R. Kiros, and R. Zemel. Image Question Answering: A Visual Semantic Embedding Model and a New Dataset. In Proc. Advances in Neural Inf. Process. Syst., 2015.
- (2015) Proc. Advances in Neural Inf. Process. Syst
- Ren, M.¹ Kiros, R.² Zemel, R.³

22
- 85031713628
- K. Saito, A. Shin, Y. Ushiku, and T. Harada. Dualnet: Domain-invariant network for visual question answering. arXiv preprint arXiv: 1606.06108, 2016.
- (2016) Dualnet: Domain-Invariant Network For Visual Question Answering
- Saito, K.¹ Shin, A.² Ushiku, Y.³ Harada, T.⁴

23
- 84986327457
- Where to look: Focus regions for visual question answering
- K. J. Shih, S. Singh, and D. Hoiem. Where to look: Focus regions for visual question answering. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
- (2016) Proc. IEEE Conf. Comp. Vis. Patt. Recogn
- Shih, K.J.¹ Singh, S.² Hoiem, D.³

24
- 84988676748
- O. Vinyals, S. Bengio, and M. Kudlur. Order matters: Sequence to sequence for sets. arXiv preprint arXiv: 1511.06391, 2015.
- (2015) Order Matters: Sequence To Sequence For Sets
- Vinyals, O.¹ Bengio, S.² Kudlur, M.³

25
- 85044300052
- Q. Wu, D. Teney, P. Wang, C. Shen, A. Dick, and A. Van Den Hengel. Visual Question Answering: A Survey of Methods and Datasets. arXiv preprint arXiv: 1607.05910, 2016.
- (2016) Visual Question Answering: A Survey of Methods and Datasets
- Wu, Q.¹ Teney, D.² Wang, P.³ Shen, C.⁴ Dick, A.⁵ Van Den Hengel, A.⁶

26
- 84999008900
- Dynamic memory networks for visual and textual question answering
- C. Xiong, S. Merity, and R. Socher. Dynamic memory networks for visual and textual question answering. In Proc. Int. Conf. Mach. Learn., 2016.
- (2016) Proc. Int. Conf. Mach. Learn
- Xiong, C.¹ Merity, S.² Socher, R.³

27
- 84990044633
- H. Xu and K. Saenko. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering. arXiv preprint arXiv: 1511.05234, 2015.
- (2015) Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
- Xu, H.¹ Saenko, K.²

28
- 84986334021
- Stacked attention networks for image question answering
- Z. Yang, X. He, J. Gao, L. Deng, and A. Smola. Stacked Attention Networks for Image Question Answering. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
- (2016) Proc. IEEE Conf. Comp. Vis. Patt. Recogn
- Yang, Z.¹ He, X.² Gao, J.³ Deng, L.⁴ Smola, A.⁵

29
- 84969736572
- M. D. Zeiler. ADADELTA: an adaptive learning rate method. arXiv preprint arXiv: 1212.5701, 2012.
- (2012) ADADELTA: An Adaptive Learning Rate Method
- Zeiler, M.D.¹

30
- 84986278354
- Yin and yang: Balancing and answering binary visual questions
- P. Zhang, Y. Goyal, D. Summers-Stay, D. Batra, and D. Parikh. Yin and yang: Balancing and answering binary visual questions. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
- (2016) Proc. IEEE Conf. Comp. Vis. Patt. Recogn
- Zhang, P.¹ Goyal, Y.² Summers-Stay, D.³ Batra, D.⁴ Parikh, D.⁵

31
- 84986275767
- Visual7w: Grounded question answering in images
- Y. Zhu, O. Groth, M. Bernstein, and L. Fei-Fei. Visual7W: Grounded Question Answering in Images. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
- (2016) Proc. IEEE Conf. Comp. Vis. Patt. Recogn
- Zhu, Y.¹ Groth, O.² Bernstein, M.³ Fei-Fei, L.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.