-
1
-
-
84973890960
-
VQA: Visual question answering
-
S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. VQA: Visual Question Answering. In Proc. IEEE Int. Conf. Comp. Vis., 2015.
-
(2015)
Proc. IEEE Int. Conf. Comp. Vis.
-
-
Antol, S.1
Agrawal, A.2
Lu, J.3
Mitchell, M.4
Batra, D.5
Zitnick, C.L.6
Parikh, D.7
-
2
-
-
70350086542
-
-
Springer
-
S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives. DBpedia: A nucleus for a web of open data. Springer, 2007.
-
(2007)
DBpedia: A Nucleus for a Web of Open Data
-
-
Auer, S.1
Bizer, C.2
Kobilarov, G.3
Lehmann, J.4
Cyganiak, R.5
Ives, Z.6
-
3
-
-
84904308637
-
Semantic parsing on Freebase from question-answer pairs
-
J. Berant, A. Chou, R. Frostig, and P. Liang. Semantic Parsing on Freebase from Question-Answer Pairs. In Proc. Conf. Empirical Methods in Natural Language Processing, pages 1533-1544, 2013.
-
(2013)
Proc. Conf. Empirical Methods in Natural Language Processing
, pp. 1533-1544
-
-
Berant, J.1
Chou, A.2
Frostig, R.3
Liang, P.4
-
4
-
-
57149137628
-
Freebase: A collaboratively created graph database for structuring human knowledge
-
ACM
-
K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor. Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 1247-1250. ACM, 2008.
-
(2008)
Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
, pp. 1247-1250
-
-
Bollacker, K.1
Evans, C.2
Paritosh, P.3
Sturge, T.4
Taylor, J.5
-
5
-
-
84952349295
-
-
arXiv:1504.00325
-
X. Chen, H. Fang, T.-Y. Lin, R. Vedantam, S. Gupta, P. Dollar, and C. L. Zitnick. Microsoft COCO captions: Data collection and evaluation server. arXiv:1504.00325, 2015.
-
(2015)
Microsoft COCO Captions: Data Collection and Evaluation Server
-
-
Chen, X.1
Fang, H.2
Lin, T.-Y.3
Vedantam, R.4
Gupta, S.5
Dollar, P.6
Zitnick, C.L.7
-
7
-
-
84961291190
-
Learning phrase representations using RNN encoder-decoder for statistical machine translation
-
K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. Conf. Empirical Methods in Natural Language Processing, 2014.
-
(2014)
Proc. Conf. Empirical Methods in Natural Language Processing
-
-
Cho, K.1
Van Merrienboer, B.2
Gulcehre, C.3
Bougares, F.4
Schwenk, H.5
Bengio, Y.6
-
8
-
-
72449136144
-
ImageNet: A large-scale hierarchical image database
-
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2009.
-
(2009)
Proc. IEEE Conf. Comp. Vis. Patt. Recogn.
-
-
Deng, J.1
Dong, W.2
Socher, R.3
Li, L.-J.4
Li, K.5
Fei-Fei, L.6
-
9
-
-
84959236502
-
Long-term recurrent convolutional networks for visual recognition and description
-
J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2015.
-
(2015)
Proc. IEEE Conf. Comp. Vis. Patt. Recogn.
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
10
-
-
79953685181
-
Building Watson: An overview of the DeepQA project
-
D. Ferrucci, E. Brown, J. Chu-Carroll, J. Fan, D. Gondek, A. A. Kalyanpur, A. Lally, J. W. Murdock, E. Nyberg, J. Prager, et al. Building Watson: An overview of the DeepQA project. AI Magazine, 31(3):59-79, 2010.
-
(2010)
AI Magazine
, vol.31
, Issue.3
, pp. 59-79
-
-
Ferrucci, D.1
Brown, E.2
Chu-Carroll, J.3
Fan, J.4
Gondek, D.5
Kalyanpur, A.A.6
Lally, A.7
Murdock, J.W.8
Nyberg, E.9
Prager, J.10
-
11
-
-
84965148420
-
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
-
H. Gao, J. Mao, J. Zhou, Z. Huang, L. Wang, and W. Xu. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering. In Proc. Advances in Neural Inf. Process. Syst., 2015.
-
(2015)
Proc. Advances in Neural Inf. Process. Syst.
-
-
Gao, H.1
Mao, J.2
Zhou, J.3
Huang, Z.4
Wang, L.5
Xu, W.6
-
12
-
-
84925422907
-
Visual Turing test for computer vision systems
-
D. Geman, S. Geman, N. Hallonquist, and L. Younes. Visual Turing test for computer vision systems. Proceedings of the National Academy of Sciences, 112(12):3618-3623, 2015.
-
(2015)
Proceedings of the National Academy of Sciences
, vol.112
, Issue.12
, pp. 3618-3623
-
-
Geman, D.1
Geman, S.2
Hallonquist, N.3
Younes, L.4
-
16
-
-
84959227898
-
Don't just listen, use your imagination: Leveraging visual common sense for non-visual tasks
-
June
-
X. Lin and D. Parikh. Don't Just Listen, Use Your Imagination: Leveraging Visual Common Sense for Non-Visual Tasks. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., June 2015.
-
(2015)
Proc. IEEE Conf. Comp. Vis. Patt. Recogn.
-
-
Lin, X.1
Parikh, D.2
-
18
-
-
84937822746
-
A multi-world approach to question answering about real-world scenes based on uncertain input
-
M. Malinowski and M. Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In Proc. Advances in Neural Inf. Process. Syst., pages 1682-1690, 2014.
-
(2014)
Proc. Advances in Neural Inf. Process. Syst.
, pp. 1682-1690
-
-
Malinowski, M.1
Fritz, M.2
-
21
-
-
85083950512
-
Deep captioning with multimodal recurrent neural networks (m-RNN)
-
J. Mao, W. Xu, Y. Yang, J. Wang, and A. Yuille. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN). In Proc. Int. Conf. Learn. Representations, 2015.
-
(2015)
Proc. Int. Conf. Learn. Representations
-
-
Mao, J.1
Xu, W.2
Yang, Y.3
Wang, J.4
Yuille, A.5
-
22
-
-
84973900209
-
-
arXiv:1503.00848, March
-
J. Pont-Tuset, P. Arbeláez, J. Barron, F. Marques, and J. Malik. Multiscale Combinatorial Grouping for Image Segmentation and Object Proposal Generation. arXiv:1503.00848, March 2015.
-
(2015)
Multiscale Combinatorial Grouping for Image Segmentation and Object Proposal Generation
-
-
Pont-Tuset, J.1
Arbeláez, P.2
Barron, J.3
Marques, F.4
Malik, J.5
-
24
-
-
84959184467
-
VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases
-
June
-
F. Sadeghi, S. K. Divvala, and A. Farhadi. VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., June 2015.
-
(2015)
Proc. IEEE Conf. Comp. Vis. Patt. Recogn.
-
-
Sadeghi, F.1
Divvala, S.K.2
Farhadi, A.3
-
27
-
-
84901405262
-
Joint video and text parsing for understanding events and answering queries
-
K. Tu, M. Meng, M. W. Lee, T. E. Choe, and S.-C. Zhu. Joint video and text parsing for understanding events and answering queries. IEEE Trans. Multimedia, 21(2):42-70, 2014.
-
(2014)
IEEE Trans. Multimedia
, vol.21
, Issue.2
, pp. 42-70
-
-
Tu, K.1
Meng, M.2
Lee, M.W.3
Choe, T.E.4
Zhu, S.-C.5
-
29
-
-
84938908409
-
-
arXiv:1406.5726
-
Y. Wei, W. Xia, J. Huang, B. Ni, J. Dong, Y. Zhao, and S. Yan. CNN: Single-label to multi-label. arXiv:1406.5726, 2014.
-
(2014)
CNN: Single-label to Multi-label
-
-
Wei, Y.1
Xia, W.2
Huang, J.3
Ni, B.4
Dong, J.5
Zhao, Y.6
Yan, S.7
-
30
-
-
84986301177
-
What Value Do Explicit High Level Concepts Have in Vision to Language Problems?
-
Q. Wu, C. Shen, A. van den Hengel, L. Liu, and A. Dick. What Value Do Explicit High Level Concepts Have in Vision to Language Problems? In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2016.
-
(2016)
Proc. IEEE Conf. Comp. Vis. Patt. Recogn.
-
-
Wu, Q.1
Shen, C.2
Van Den Hengel, A.3
Liu, L.4
Dick, A.5
-
32
-
-
84965160010
-
-
arXiv:1502.08029
-
L. Yao, A. Torabi, K. Cho, N. Ballas, C. Pal, H. Larochelle, and A. Courville. Describing videos by exploiting temporal structure. arXiv:1502.08029, 2015.
-
(2015)
Describing Videos by Exploiting Temporal Structure
-
-
Yao, L.1
Torabi, A.2
Cho, K.3
Ballas, N.4
Pal, C.5
Larochelle, H.6
Courville, A.7