-
1
-
-
84993660571
-
Learning to compose neural networks for question answering
-
J. Andreas, M. Rohrbach, T. Darrell, and D. Klein. Learning to compose neural networks for question answering. In NAACL, 2016.
-
(2016)
NAACL
-
-
Andreas, J.1
Rohrbach, M.2
Darrell, T.3
Klein, D.4
-
3
-
-
84973890960
-
VQA: Visual question answering
-
S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. Zitnick, and D. Parikh. VQA: Visual question answering. In ICCV, 2015.
-
(2015)
ICCV
-
-
Antol, S.1
Agrawal, A.2
Lu, J.3
Mitchell, M.4
Batra, D.5
Zitnick, C.6
Parikh, D.7
-
4
-
-
85088225001
-
DeepCoder: Learning to write programs
-
M. Balog, A. Gaunt, M. Brockschmidt, S. Nowozin, and D. Tarlow. DeepCoder: Learning to write programs. In ICLR, 2017.
-
(2017)
ICLR
-
-
Balog, M.1
Gaunt, A.2
Brockschmidt, M.3
Nowozin, S.4
Tarlow, D.5
-
5
-
-
85063566337
-
Making neural programming architectures generalize via recursion
-
J. Cai, R. Shin, and D. Song. Making neural programming architectures generalize via recursion. In ICLR, 2017.
-
(2017)
ICLR
-
-
Cai, J.1
Shin, R.2
Song, D.3
-
6
-
-
85041927710
-
Visual dialog
-
A. Das, S. Kottur, K. Gupta, A. Singh, D. Yadav, J. Moura, D. Parikh, and D. Batra. Visual dialog. In CVPR, 2017.
-
(2017)
CVPR
-
-
Das, A.1
Kottur, S.2
Gupta, K.3
Singh, A.4
Yadav, D.5
Moura, J.6
Parikh, D.7
Batra, D.8
-
7
-
-
84965102873
-
-
arXiv preprint arXiv:1505.04467
-
J. Devlin, S. Gupta, R. Girshick, M. Mitchell, and C. L. Zitnick. Exploring nearest neighbor approaches for image captioning. arXiv preprint arXiv:1505.04467, 2015.
-
(2015)
Exploring Nearest Neighbor Approaches for Image Captioning
-
-
Devlin, J.1
Gupta, S.2
Girshick, R.3
Mitchell, M.4
Zitnick, C.L.5
-
8
-
-
84990060711
-
-
arXiv:1606.01847
-
A. Fukui, D. H. Park, D. Yang, A. Rohrbach, T. Darrell, and M. Rohrbach. Multimodal compact bilinear pooling for visual question answering and visual grounding. arXiv:1606.01847, 2016.
-
(2016)
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
-
-
Fukui, A.1
Park, D.H.2
Yang, D.3
Rohrbach, A.4
Darrell, T.5
Rohrbach, M.6
-
10
-
-
85029359197
-
Fast R-CNN
-
R. Girshick. Fast R-CNN. In ICCV, 2015.
-
(2015)
ICCV
-
-
Girshick, R.1
-
11
-
-
85041900002
-
Making the V in VQA matter: Elevating the role of image understanding in visual question answering
-
Y. Goyal, T. Khot, D. Summers-Stay, D. Batra, and D. Parikh. Making the V in VQA matter: Elevating the role of image understanding in visual question answering. In CVPR, 2017.
-
(2017)
CVPR
-
-
Goyal, Y.1
Khot, T.2
Summers-Stay, D.3
Batra, D.4
Parikh, D.5
-
13
-
-
84993949467
-
Hybrid computing using a neural network with dynamic external memory
-
A. Graves, G. Wayne, M. Reynolds, T. Harley, I. Danihelka, A. Grabska-Barwinska, S. Colmenarejo, E. Grefenstette, T. Ramalho, J. Agapiou, A. Badia, K. Hermann, Y. Zwols, G. Ostrovski, A. Cain, H. King, C. Summerfield, P. Blunsom, K. Kavukcuoglu, and D. Hassabis. Hybrid computing using a neural network with dynamic external memory. Nature, 2016.
-
(2016)
Nature
-
-
Graves, A.1
Wayne, G.2
Reynolds, M.3
Harley, T.4
Danihelka, I.5
Grabska-Barwinska, A.6
Colmenarejo, S.7
Grefenstette, E.8
Ramalho, T.9
Agapiou, J.10
Badia, A.11
Hermann, K.12
Zwols, Y.13
Ostrovski, G.14
Cain, A.15
King, H.16
Summerfield, C.17
Blunsom, P.18
Kavukcuoglu, K.19
Hassabis, D.20
-
14
-
-
84986274465
-
Deep residual learning for image recognition
-
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
-
(2016)
CVPR
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
16
-
-
85041904328
-
Learning to reason: End-to-end module networks for visual question answering
-
R. Hu, J. Andreas, M. Rohrbach, T. Darrell, and K. Saenko. Learning to reason: End-to-end module networks for visual question answering. In ICCV, 2017.
-
(2017)
ICCV
-
-
Hu, R.1
Andreas, J.2
Rohrbach, M.3
Darrell, T.4
Saenko, K.5
-
17
-
-
85041929043
-
Modeling relationships in referential expressions with compositional modular networks
-
R. Hu, M. Rohrbach, J. Andreas, T. Darrell, and K. Saenko. Modeling relationships in referential expressions with compositional modular networks. In CVPR, 2017.
-
(2017)
CVPR
-
-
Hu, R.1
Rohrbach, M.2
Andreas, J.3
Darrell, T.4
Saenko, K.5
-
18
-
-
85041926703
-
Revisiting visual question answering baselines
-
A. Jabri, A. Joulin, and L. van der Maaten. Revisiting visual question answering baselines. In ECCV, 2016.
-
(2016)
ECCV
-
-
Jabri, A.1
Joulin, A.2
Van der Maaten, L.3
-
19
-
-
85041904911
-
CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning
-
J. Johnson, B. Hariharan, L. van der Maaten, L. Fei-Fei, C. L. Zitnick, and R. Girshick. CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning. In CVPR, 2017.
-
(2017)
CVPR
-
-
Johnson, J.1
Hariharan, B.2
Van der Maaten, L.3
Fei-Fei, L.4
Zitnick, C.L.5
Girshick, R.6
-
20
-
-
84965117324
-
Inferring algorithmic patterns with stack-augmented recurrent nets
-
A. Joulin and T. Mikolov. Inferring algorithmic patterns with stack-augmented recurrent nets. In NIPS, 2015.
-
(2015)
NIPS
-
-
Joulin, A.1
Mikolov, T.2
-
22
-
-
85083953090
-
Neural GPUs learn algorithms
-
L. Kaiser and I. Sutskever. Neural GPUs learn algorithms. In ICLR, 2016.
-
(2016)
ICLR
-
-
Kaiser, L.1
Sutskever, I.2
-
23
-
-
85146417759
-
Accurate unlexicalized parsing
-
D. Klein and C. D. Manning. Accurate unlexicalized parsing. In ACL, pages 423-430, 2003.
-
(2003)
ACL
, pp. 423-430
-
-
Klein, D.1
Manning, C.D.2
-
24
-
-
85011596790
-
Visual genome: Connecting language and vision using crowdsourced dense image annotations
-
R. Krishna, Y. Zhu, O. Groth, J. Johnson, K. Hata, J. Kravitz, S. Chen, Y. Kalantidis, L.-J. Li, D. A. Shamma, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations. IJCV, 2017.
-
(2017)
IJCV
-
-
Krishna, R.1
Zhu, Y.2
Groth, O.3
Johnson, J.4
Hata, K.5
Kravitz, J.6
Chen, S.7
Kalantidis, Y.8
Li, L.-J.9
Shamma, D.A.10
-
25
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
-
(2012)
NIPS
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
28
-
-
85040923863
-
Neural symbolic machines: Learning semantic parsers on freebase with weak supervision
-
C. Liang, J. Berant, Q. Le, K. D. Forbus, and N. Lao. Neural symbolic machines: Learning semantic parsers on freebase with weak supervision. ACL, 2017.
-
(2017)
ACL
-
-
Liang, C.1
Berant, J.2
Le, Q.3
Forbus, K.D.4
Lao, N.5
-
29
-
-
84859072748
-
Learning dependency-based compositional semantics
-
P. Liang, M. I. Jordan, and D. Klein. Learning dependency-based compositional semantics. In ACL, 2011.
-
(2011)
ACL
-
-
Liang, P.1
Jordan, M.I.2
Klein, D.3
-
30
-
-
85018917850
-
Hierarchical question-image co-attention for visual question answering
-
J. Lu, J. Yang, D. Batra, and D. Parikh. Hierarchical question-image co-attention for visual question answering. In NIPS, 2016.
-
(2016)
NIPS
-
-
Lu, J.1
Yang, J.2
Batra, D.3
Parikh, D.4
-
31
-
-
84937822746
-
A multi-world approach to question answering about real-world scenes based on uncertain input
-
M. Malinowski and M. Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In NIPS, 2014.
-
(2014)
NIPS
-
-
Malinowski, M.1
Fritz, M.2
-
32
-
-
84973896625
-
Ask your neurons: A neural-based approach to answering questions about images
-
M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In ICCV, 2015.
-
(2015)
ICCV
-
-
Malinowski, M.1
Rohrbach, M.2
Fritz, M.3
-
33
-
-
85041909637
-
Learning models for actions and person-object interactions with transfer to question answering
-
A. Mallya and S. Lazebnik. Learning models for actions and person-object interactions with transfer to question answering. In ECCV, 2016.
-
(2016)
ECCV
-
-
Mallya, A.1
Lazebnik, S.2
-
34
-
-
85083953004
-
Neural programmer: Inducing latent programs with gradient descent
-
A. Neelakantan, Q. V. Le, and I. Sutskever. Neural programmer: Inducing latent programs with gradient descent. In ICLR, 2016.
-
(2016)
ICLR
-
-
Neelakantan, A.1
Le, Q.V.2
Sutskever, I.3
-
35
-
-
85083952850
-
Neural programmer-interpreters
-
S. Reed and N. De Freitas. Neural programmer-interpreters. In ICLR, 2016.
-
(2016)
ICLR
-
-
Reed, S.1
De Freitas, N.2
-
36
-
-
84947041871
-
ImageNet large scale visual recognition challenge
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. IJCV, 2015.
-
(2015)
IJCV
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
-
38
-
-
84928547704
-
Sequence to sequence learning with neural networks
-
I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014.
-
(2014)
NIPS
-
-
Sutskever, I.1
Vinyals, O.2
Le, Q.V.3
-
39
-
-
84986296727
-
MovieQA: Understanding stories in movies through question-answering
-
M. Tapaswi, Y. Zhu, R. Stiefelhagen, A. Torralba, R. Urtasun, and S. Fidler. MovieQA: Understanding stories in movies through question-answering. In CVPR, 2016.
-
(2016)
CVPR
-
-
Tapaswi, M.1
Zhu, Y.2
Stiefelhagen, R.3
Torralba, A.4
Urtasun, R.5
Fidler, S.6
-
41
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8 (23), 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.23
-
-
Williams, R.J.1
-
43
-
-
85035003427
-
-
arXiv preprint arXiv:1607.05910
-
Q. Wu, D. Teney, P. Wang, C. Shen, A. Dick, and A. van den Hengel. Visual question answering: A survey of methods and datasets. arXiv preprint arXiv:1607.05910, 2016.
-
(2016)
Visual Question Answering: A Survey of Methods and Datasets
-
-
Wu, Q.1
Teney, D.2
Wang, P.3
Shen, C.4
Dick, A.5
Van den Hengel, A.6
-
44
-
-
84999008900
-
Dynamic memory networks for visual and textual question answering
-
C. Xiong, S. Merity, and R. Socher. Dynamic memory networks for visual and textual question answering. ICML, 2016.
-
(2016)
ICML
-
-
Xiong, C.1
Merity, S.2
Socher, R.3
-
45
-
-
84986334021
-
Stacked attention networks for image question answering
-
Z. Yang, X. He, J. Gao, L. Deng, and A. Smola. Stacked attention networks for image question answering. In CVPR, 2016.
-
(2016)
CVPR
-
-
Yang, Z.1
He, X.2
Gao, J.3
Deng, L.4
Smola, A.5
-
49
-
-
84986278354
-
Yin and yang: Balancing and answering binary visual questions
-
P. Zhang, Y. Goyal, D. Summers-Stay, D. Batra, and D. Parikh. Yin and yang: Balancing and answering binary visual questions. In CVPR, 2016.
-
(2016)
CVPR
-
-
Zhang, P.1
Goyal, Y.2
Summers-Stay, D.3
Batra, D.4
Parikh, D.5