-
1
-
-
85044312568
-
-
VQA Challenge leaderboard
-
VQA Challenge leaderboard. http://visualqa.org/challenge.html.
-
-
-
-
4
-
-
84973890960
-
VQA: Visual question answering
-
S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. VQA: Visual Question Answering. In Proc. IEEE Int. Conf. Comp. Vis., 2015.
-
(2015)
Proc. IEEE Int. Conf. Comp. Vis
-
-
Antol, S.1
Agrawal, A.2
Lu, J.3
Mitchell, M.4
Batra, D.5
Zitnick, C.L.6
Parikh, D.7
-
5
-
-
84986262382
-
-
K. Chen, J. Wang, L.-C. Chen, H. Gao, W. Xu, and R. Nevatia. ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering. arXiv preprint arXiv: 1511.05960, 2015.
-
(2015)
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
-
-
Chen, K.1
Wang, J.2
Chen, L.-C.3
Gao, H.4
Xu, W.5
Nevatia, R.6
-
6
-
-
84961291190
-
Learning phrase representations using RNN encoder-decoder for statistical machine translation
-
K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. Conf. Empirical Methods in Natural Language Processing, 2014.
-
(2014)
Proc. Conf. Empirical Methods in Natural Language Processing
-
-
Cho, K.1
Van Merrienboer, B.2
Gulcehre, C.3
Bougares, F.4
Schwenk, H.5
Bengio, Y.6
-
8
-
-
85015387152
-
-
D. K. Duvenaud, D. Maclaurin, J. Aguilera-Iparraguirre, R. Gómez-Bombarelli, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams. Convolutional networks on graphs for learning molecular fingerprints. arXiv preprint arXiv: 1509.09292, 2015.
-
(2015)
Convolutional Networks on Graphs for Learning Molecular Fingerprints
-
-
Duvenaud, D.K.1
Maclaurin, D.2
Aguilera-Iparraguirre, J.3
Gómez-Bombarelli, R.4
Hirzel, T.5
Aspuru-Guzik, A.6
Adams, R.P.7
-
9
-
-
84990060711
-
-
A. Fukui, D. H. Park, D. Yang, A. Rohrbach, T. Darrell, and M. Rohrbach. Multimodal compact bilinear pooling for visual question answering and visual grounding. arXiv preprint arXiv: 1606.01847, 2016.
-
(2016)
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
-
-
Fukui, A.1
Park, D.H.2
Yang, D.3
Rohrbach, A.4
Darrell, T.5
Rohrbach, M.6
-
10
-
-
84862277874
-
Understanding the difficulty of training deep feedforward neural networks
-
X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proc. Int. Conf. Artificial Intell. & Stat., pages 249-256, 2010.
-
(2010)
Proc. Int. Conf. Artificial Intell. & Stat
, pp. 249-256
-
-
Glorot, X.1
Bengio, Y.2
-
13
-
-
85020657410
-
-
J.-H. Kim, S.-W. Lee, D.-H. Kwak, M.-O. Heo, J. Kim, J.-W. Ha, and B.-T. Zhang. Multimodal residual learning for visual QA. arXiv preprint arXiv: 1606.01455, 2016.
-
(2016)
Multimodal Residual Learning for Visual QA
-
-
Kim, J.-H.1
Lee, S.-W.2
Kwak, D.-H.3
Heo, M.-O.4
Kim, J.5
Ha, J.-W.6
Zhang, B.-T.7
-
14
-
-
84978730111
-
-
R. Krishna, Y. Zhu, O. Groth, J. Johnson, K. Hata, J. Kravitz, S. Chen, Y. Kalantidis, L.-J. Li, D. A. Shamma, M. Bernstein, and L. Fei-Fei. Visual genome: Connecting language and vision using crowdsourced dense image annotations. arXiv preprint arXiv: 1602.07332, 2016.
-
(2016)
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
-
-
Krishna, R.1
Zhu, Y.2
Groth, O.3
Johnson, J.4
Hata, K.5
Kravitz, J.6
Chen, S.7
Kalantidis, Y.8
Li, L.-J.9
Shamma, D.A.10
Bernstein, M.11
Fei-Fei, L.12
-
17
-
-
84937822746
-
A multi-world approach to question answering about real-world scenes based on uncertain input
-
M. Malinowski and M. Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In Proc. Advances in Neural Inf. Process. Syst., pages 1682-1690, 2014.
-
(2014)
Proc. Advances in Neural Inf. Process. Syst
, pp. 1682-1690
-
-
Malinowski, M.1
Fritz, M.2
-
25
-
-
85044300052
-
-
Q. Wu, D. Teney, P. Wang, C. Shen, A. Dick, and A. Van Den Hengel. Visual Question Answering: A Survey of Methods and Datasets. arXiv preprint arXiv: 1607.05910, 2016.
-
(2016)
Visual Question Answering: A Survey of Methods and Datasets
-
-
Wu, Q.1
Teney, D.2
Wang, P.3
Shen, C.4
Dick, A.5
Van Den Hengel, A.6
-
28
-
-
84986334021
-
Stacked attention networks for image question answering
-
Z. Yang, X. He, J. Gao, L. Deng, and A. Smola. Stacked Attention Networks for Image Question Answering. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
-
(2016)
Proc. IEEE Conf. Comp. Vis. Patt. Recogn
-
-
Yang, Z.1
He, X.2
Gao, J.3
Deng, L.4
Smola, A.5
-
30
-
-
84986278354
-
Yin and yang: Balancing and answering binary visual questions
-
P. Zhang, Y. Goyal, D. Summers-Stay, D. Batra, and D. Parikh. Yin and yang: Balancing and answering binary visual questions. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
-
(2016)
Proc. IEEE Conf. Comp. Vis. Patt. Recogn
-
-
Zhang, P.1
Goyal, Y.2
Summers-Stay, D.3
Batra, D.4
Parikh, D.5