[1] Keras: Theano-based deep learning library. https://github.com/fchollet/keras.git, 2015.
[2] S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. VQA: Visual Question Answering. In ICCV, 2015.
[3] S. Antol, C. L. Zitnick, and D. Parikh. Zero-Shot Learning via Visual Abstraction. In ECCV, 2014.
[4] J. P. Bigham, C. Jayant, H. Ji, G. Little, A. Miller, R. C. Miller, R. Miller, A. Tatarowicz, B. White, S. White, and T. Yeh. VizWiz: Nearly Real-time Answers to Visual Questions. In User Interface Software and Technology (UIST), 2010.
[5] A. Biswas and D. Parikh. Simultaneous active learning of classifiers & attributes via relative feedback. In CVPR, 2013.
[6] X. Chen and C. L. Zitnick. Mind's Eye: A Recurrent Visual Representation for Image Caption Generation. In CVPR, 2015.
[7] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR, 2009.
[8] J. Devlin, S. Gupta, R. B. Girshick, M. Mitchell, and C. L. Zitnick. Exploring Nearest Neighbor Approaches for Image Captioning. CoRR, abs/1505.04467, 2015.
[9] J. Donahue and K. Grauman. Annotator rationales for visual recognition. In ICCV, 2011.
[10] J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term Recurrent Convolutional Networks for Visual Recognition and Description. In CVPR, 2015.
[11] A. Fader, S. Soderland, and O. Etzioni. Identifying relations for open information extraction. In EMNLP, 2011.
[12] H. Fang, S. Gupta, F. N. Iandola, R. Srivastava, L. Deng, P. Dollár, J. Gao, X. He, M. Mitchell, J. C. Platt, C. L. Zitnick, and G. Zweig. From Captions to Visual Concepts and Back. In CVPR, 2015.
[13] D. F. Fouhey and C. L. Zitnick. Predicting object dynamics in scenes. In CVPR, 2014.
[14] H. Gao, J. Mao, J. Zhou, Z. Huang, and A. Yuille. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering. In NIPS, 2015.
[17] P. Halácsy, A. Kornai, and C. Oravecz. HunPos: an open source trigram tagger. In ACL, 2007.
[19] A. Karpathy and L. Fei-Fei. Deep Visual-Semantic Alignments for Generating Image Descriptions. In CVPR, 2015.
[20] A. Karpathy, A. Joulin, and F.-F. Li. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping. In NIPS, 2014.
[21] R. Kiros, R. Salakhutdinov, and R. S. Zemel. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models. TACL, 2015.
[22] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common Objects in Context. In ECCV, 2014.
[23] X. Lin and D. Parikh. Don't Just Listen, Use Your Imagination: Leveraging Visual Common Sense for Non-Visual Tasks. In CVPR, 2015.
[24] M. Malinowski and M. Fritz. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input. In NIPS, 2014.
[25] M. Malinowski, M. Rohrbach, and M. Fritz. Ask Your Neurons: A Neural-based Approach to Answering Questions about Images. In ICCV, 2015.
[26] J. Mao, W. Xu, Y. Yang, J. Wang, and A. L. Yuille. Explain Images with Multimodal Recurrent Neural Networks. CoRR, abs/1410.1090, 2014.
[27] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed Representations of Words and Phrases and their Compositionality. In NIPS, 2013.
[28] A. Parkash and D. Parikh. Attributes for classifier feedback. In ECCV, 2012.
[29] M. Ren, R. Kiros, and R. Zemel. Exploring Models and Data for Image Question Answering. In NIPS, 2015.
[30] F. Sadeghi, S. K. Divvala, and A. Farhadi. VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases. In CVPR, 2015.
[32] A. Torralba and A. Efros. Unbiased Look at Dataset Bias. In CVPR, 2011.
[33] K. Tu, M. Meng, M. W. Lee, T. E. Choe, and S.-C. Zhu. Joint Video and Text Parsing for Understanding Events and Answering Queries. IEEE MultiMedia, 2014.
[34] R. Vedantam, X. Lin, T. Batra, C. L. Zitnick, and D. Parikh. Learning Common Sense through Visual Abstraction. In ICCV, 2015.
[36] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. In ICML, 2015.
[37] O. Zaidan, J. Eisner, and C. Piatko. Using Annotator Rationales to Improve Machine Learning for Text Categorization. In NAACL-HLT, 2007.
[38] C. L. Zitnick and D. Parikh. Bringing Semantics Into Focus Using Visual Abstraction. In CVPR, 2013.
[39] C. L. Zitnick, D. Parikh, and L. Vanderwende. Learning the Visual Interpretation of Sentences. In ICCV, 2013.
[40] C. L. Zitnick, R. Vedantam, and D. Parikh. Adopting Abstract Images for Semantic Scene Understanding. PAMI, 2015.