-
2
-
-
84973890960
-
VQA: Visual question answering
-
S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. VQA: visual question answering. In 2015 IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, December 7-13, 2015, pages 2425-2433, 2015.
-
(2015)
2015 IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, December 7-13, 2015
, pp. 2425-2433
-
-
Antol, S.1
Agrawal, A.2
Lu, J.3
Mitchell, M.4
Batra, D.5
Zitnick, C.L.6
Parikh, D.7
-
4
-
-
84867872703
-
Semantic segmentation with second-order pooling
-
J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu. Semantic segmentation with second-order pooling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 7578 LNCS, pages 430-443, 2012.
-
(2012)
Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7578 LNCS
, pp. 430-443
-
-
Carreira, J.1
Caseiro, R.2
Batista, J.3
Sminchisescu, C.4
-
7
-
-
85029078673
-
Counting everyday objects in everyday scenes
-
P. Chattopadhyay, R. Vedantam, R. S. Ramprasaath, D. Batra, and D. Parikh. Counting everyday objects in everyday scenes. CoRR, abs/1604.03505, 2016.
-
(2016)
CoRR, abs/1604.03505
-
-
Chattopadhyay, P.1
Vedantam, R.2
Ramprasaath, R.S.3
Batra, D.4
Parikh, D.5
-
8
-
-
0012036878
-
Subitizing: What is it? Why teach it?
-
D. H. Clements. Subitizing: What is it? Why teach it? Teaching Children Mathematics, 5(7):400, 1999.
-
(1999)
Teaching Children Mathematics
, vol.5
, Issue.7
, pp. 400
-
-
Clements, D.H.1
-
10
-
-
84870917824
-
Subitizing and visual short-term memory in human and non-human species: A common shared system?
-
S. Cutini and M. Bonato. Subitizing and visual short-term memory in human and non-human species: a common shared system? Frontiers in Psychology, 3, 2012.
-
(2012)
Frontiers in Psychology
, vol.3
-
-
Cutini, S.1
Bonato, M.2
-
11
-
-
45849104230
-
Log or linear? Distinct intuitions of the number scale in Western and Amazonian indigene cultures
-
S. Dehaene, V. Izard, E. Spelke, and P. Pica. Log or linear? Distinct intuitions of the number scale in Western and Amazonian indigene cultures. Science, 320(5880):1217-1220, 2008.
-
(2008)
Science
, vol.320
, Issue.5880
, pp. 1217-1220
-
-
Dehaene, S.1
Izard, V.2
Spelke, E.3
Pica, P.4
-
12
-
-
85044305973
-
-
J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. Decaf: A deep convolutional activation feature for generic visual recognition. 10 2013.
-
(2013)
Decaf: A Deep Convolutional Activation Feature for Generic Visual Recognition
, vol.10
-
-
Donahue, J.1
Jia, Y.2
Vinyals, O.3
Hoffman, J.4
Zhang, N.5
Tzeng, E.6
Darrell, T.7
-
13
-
-
84921069139
-
The pascal visual object classes challenge: A retrospective
-
Jan
-
M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The pascal visual object classes challenge: A retrospective. International Journal of Computer Vision, 111(1):98-136, Jan. 2015.
-
(2015)
International Journal of Computer Vision
, vol.111
, Issue.1
, pp. 98-136
-
-
Everingham, M.1
Eslami, S.M.A.2
Van Gool, L.3
Williams, C.K.I.4
Winn, J.5
Zisserman, A.6
-
14
-
-
51949101231
-
A discriminatively trained, multiscale, deformable part model
-
P. Felzenszwalb, D. McAllester, and D. Ramanan. A discriminatively trained, multiscale, deformable part model. In 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2008.
-
(2008)
26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR
-
-
Felzenszwalb, P.1
McAllester, D.2
Ramanan, D.3
-
15
-
-
85044506279
-
Multimodal compact bilinear pooling for visual question answering and visual grounding
-
A. Fukui, D. H. Park, D. Yang, A. Rohrbach, T. Darrell, and M. Rohrbach. Multimodal compact bilinear pooling for visual question answering and visual grounding. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, pages 457-468, 2016.
-
(2016)
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016
, pp. 457-468
-
-
Fukui, A.1
Park, D.H.2
Yang, D.3
Rohrbach, A.4
Darrell, T.5
Rohrbach, M.6
-
19
-
-
84887356947
-
Multi-source multi-scale counting in extremely dense crowd images
-
Washington, DC, USA, IEEE Computer Society
-
H. Idrees, I. Saleemi, C. Seibert, and M. Shah. Multi-source multi-scale counting in extremely dense crowd images. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, CVPR '13, pages 2547-2554, Washington, DC, USA, 2013. IEEE Computer Society.
-
(2013)
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, CVPR '13
, pp. 2547-2554
-
-
Idrees, H.1
Saleemi, I.2
Seibert, C.3
Shah, M.4
-
22
-
-
85041904911
-
CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning
-
J. Johnson, B. Hariharan, L. van der Maaten, L. Fei-Fei, C. L. Zitnick, and R. Girshick. CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning. In CVPR, 2017.
-
(2017)
CVPR
-
-
Johnson, J.1
Hariharan, B.2
van der Maaten, L.3
Fei-Fei, L.4
Zitnick, C.L.5
Girshick, R.6
-
23
-
-
84943540775
-
Referit game: Referring to objects in photographs of natural scenes
-
S. Kazemzadeh, V. Ordonez, M. Matten, and T. L. Berg. Referit game: Referring to objects in photographs of natural scenes. In EMNLP, 2014.
-
(2014)
EMNLP
-
-
Kazemzadeh, S.1
Ordonez, V.2
Matten, M.3
Berg, T.L.4
-
24
-
-
85083951076
-
Adam: A method for stochastic optimization
-
D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.
-
(2014)
CoRR, abs/1412.6980
-
-
Kingma, D.P.1
Ba, J.2
-
28
-
-
84937834115
-
Microsoft COCO: Common objects in context
-
T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014.
-
(2014)
ECCV
-
-
Lin, T.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollár, P.7
Zitnick, C.L.8
-
29
-
-
85007570504
-
SSD: Single shot multibox detector
-
W. Liu, D. Anguelov, D. Erhan, C. Szegedy, and S. E. Reed. SSD: single shot multibox detector. CoRR, abs/1512.02325, 2015.
-
(2015)
CoRR, abs/1512.02325
-
-
Liu, W.1
Anguelov, D.2
Erhan, D.3
Szegedy, C.4
Reed, S.E.5
-
32
-
-
84898956512
-
Distributed representations of words and phrases and their compositionality
-
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed Representations of Words and Phrases and their Compositionality. In Advances in Neural Information Processing Systems, pages 3111-3119, 2013.
-
(2013)
Advances in Neural Information Processing Systems
, pp. 3111-3119
-
-
Mikolov, T.1
Sutskever, I.2
Chen, K.3
Corrado, G.S.4
Dean, J.5
-
33
-
-
85021624882
-
Towards perspective-free object counting with deep learning
-
D. Oñoro Rubio and R. J. López-Sastre. Towards perspective-free object counting with deep learning. In ECCV, 2016.
-
(2016)
ECCV
-
-
Oñoro Rubio, D.1
López-Sastre, R.J.2
-
34
-
-
84986308404
-
You only look once: Unified, real-time object detection
-
June
-
J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. You only look once: Unified, real-time object detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
-
(2016)
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
-
-
Redmon, J.1
Divvala, S.2
Girshick, R.3
Farhadi, A.4
-
36
-
-
84990028830
-
End-to-end instance segmentation and counting with recurrent attention
-
M. Ren and R. S. Zemel. End-to-end instance segmentation and counting with recurrent attention. CoRR, abs/1605.09410, 2016.
-
(2016)
CoRR, abs/1605.09410
-
-
Ren, M.1
Zemel, R.S.2
-
37
-
-
84960980241
-
Faster r-cnn: Towards real-time object detection with region proposal networks
-
C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Curran Associates, Inc
-
S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 91-99. Curran Associates, Inc., 2015.
-
(2015)
Advances in Neural Information Processing Systems
, vol.28
, pp. 91-99
-
-
Ren, S.1
He, K.2
Girshick, R.3
Sun, J.4
-
38
-
-
84947041871
-
Imagenet large scale visual recognition challenge
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211-252, 2015.
-
(2015)
International Journal of Computer Vision (IJCV)
, vol.115
, Issue.3
, pp. 211-252
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
Berg, A.C.11
Fei-Fei, L.12
-
39
-
-
84887384357
-
It's not polite to point: Describing people with uncertain attributes
-
IEEE
-
A. Sadovnik, A. C. Gallagher, and T. Chen. It's not polite to point: Describing people with uncertain attributes. In CVPR, pages 3089-3096. IEEE, 2013.
-
(2013)
CVPR
, pp. 3089-3096
-
-
Sadovnik, A.1
Gallagher, A.C.2
Chen, T.3
-
43
-
-
0035680116
-
Rapid object detection using a boosted cascade of simple features
-
P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. Computer Vision and Pattern Recognition (CVPR), 1:I-511-I-518, 2001.
-
(2001)
Computer Vision and Pattern Recognition (CVPR)
, vol.1
, pp. I-511-I-518
-
-
Viola, P.1
Jones, M.2
-
44
-
-
84959203164
-
End-to-end integration of a convolution network, deformable parts model and non-maximum suppression
-
L. Wan, D. Eigen, and R. Fergus. End-to-end integration of a convolution network, deformable parts model and non-maximum suppression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 851-859, 2015.
-
(2015)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, pp. 851-859
-
-
Wan, L.1
Eigen, D.2
Fergus, R.3
-
45
-
-
84959214343
-
Cross-scene crowd counting via deep convolutional neural networks
-
C. Zhang, H. Li, X. Wang, and X. Yang. Cross-Scene Crowd Counting via Deep Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 833-841, 2015.
-
(2015)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, pp. 833-841
-
-
Zhang, C.1
Li, H.2
Wang, X.3
Yang, X.4
-
46
-
-
84959205754
-
Salient object subitizing
-
J. Zhang, S. Ma, M. Sameki, S. Sclaroff, M. Betke, Z. Lin, X. Shen, B. Price, and R. Měch. Salient object subitizing. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
-
(2015)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
-
-
Zhang, J.1
Ma, S.2
Sameki, M.3
Sclaroff, S.4
Betke, M.5
Lin, Z.6
Shen, X.7
Price, B.8
Mech, R.9