-
2
-
-
84887378604
-
Fast, accurate detection of 100,000 object classes on a single machine
-
IEEE
-
T. Dean, M. Ruzon, M. Segal, J. Shlens, S. Vijayanarasimhan, J. Yagnik, et al. Fast, accurate detection of 100,000 object classes on a single machine. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 1814-1821. IEEE, 2013.
-
(2013)
Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on
, pp. 1814-1821
-
-
Dean, T.1
Ruzon, M.2
Segal, M.3
Shlens, J.4
Vijayanarasimhan, S.5
Yagnik, J.6
-
3
-
-
85198028989
-
ImageNet: A large-scale hierarchical image database
-
IEEE
-
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 248-255. IEEE, 2009.
-
(2009)
Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on
, pp. 248-255
-
-
Deng, J.1
Dong, W.2
Socher, R.3
Li, L.-J.4
Li, K.5
Fei-Fei, L.6
-
4
-
-
84959236502
-
Long-term recurrent convolutional networks for visual recognition and description
-
J. Donahue, L. Anne Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2625-2634, 2015.
-
(2015)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, pp. 2625-2634
-
-
Donahue, J.1
Anne Hendricks, L.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
5
-
-
77649188328
-
The segmented and annotated IAPR TC-12 benchmark
-
H. J. Escalante, C. A. Hernández, J. A. Gonzalez, A. López-López, M. Montes, E. F. Morales, L. E. Sucar, L. Villasenor, and M. Grubinger. The segmented and annotated IAPR TC-12 benchmark. Computer Vision and Image Understanding, 114(4):419-428, 2010.
-
(2010)
Computer Vision and Image Understanding
, vol.114
, Issue.4
, pp. 419-428
-
-
Escalante, H.J.1
Hernández, C.A.2
Gonzalez, J.A.3
López-López, A.4
Montes, M.5
Morales, E.F.6
Sucar, L.E.7
Villasenor, L.8
Grubinger, M.9
-
6
-
-
84898958665
-
DeViSE: A deep visual-semantic embedding model
-
A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, T. Mikolov, et al. DeViSE: A deep visual-semantic embedding model. In Advances in Neural Information Processing Systems, pages 2121-2129, 2013.
-
(2013)
Advances in Neural Information Processing Systems
, pp. 2121-2129
-
-
Frome, A.1
Corrado, G.S.2
Shlens, J.3
Bengio, S.4
Dean, J.5
Mikolov, T.6
-
8
-
-
84911400494
-
Rich feature hierarchies for accurate object detection and semantic segmentation
-
IEEE
-
R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 580-587. IEEE, 2014.
-
(2014)
Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on
, pp. 580-587
-
-
Girshick, R.1
Donahue, J.2
Darrell, T.3
Malik, J.4
-
9
-
-
38049183286
-
The IAPR TC-12 benchmark: A new evaluation resource for visual information systems
-
M. Grubinger, P. Clough, H. Müller, and T. Deselaers. The IAPR TC-12 benchmark: A new evaluation resource for visual information systems. In International Workshop OntoImage, pages 13-23, 2006.
-
(2006)
International Workshop OntoImage
, pp. 13-23
-
-
Grubinger, M.1
Clough, P.2
Müller, H.3
Deselaers, T.4
-
10
-
-
85131224768
-
Open-vocabulary object retrieval
-
S. Guadarrama, E. Rodner, K. Saenko, N. Zhang, R. Farrell, J. Donahue, and T. Darrell. Open-vocabulary object retrieval. In Robotics: Science and Systems, 2014.
-
(2014)
Robotics: Science and Systems
-
-
Guadarrama, S.1
Rodner, E.2
Saenko, K.3
Zhang, N.4
Farrell, R.5
Donahue, J.6
Darrell, T.7
-
12
-
-
84924803045
-
LSDA: Large scale detection through adaptation
-
J. Hoffman, S. Guadarrama, E. S. Tzeng, R. Hu, J. Donahue, R. Girshick, T. Darrell, and K. Saenko. LSDA: Large scale detection through adaptation. In Advances in Neural Information Processing Systems, pages 3536-3544, 2014.
-
(2014)
Advances in Neural Information Processing Systems
, pp. 3536-3544
-
-
Hoffman, J.1
Guadarrama, S.2
Tzeng, E.S.3
Hu, R.4
Donahue, J.5
Girshick, R.6
Darrell, T.7
Saenko, K.8
-
13
-
-
33845594193
-
Learning distance metrics with contextual constraints for image retrieval
-
IEEE
-
S. C. Hoi, W. Liu, M. R. Lyu, and W.-Y. Ma. Learning distance metrics with contextual constraints for image retrieval. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, volume 2, pages 2072-2078. IEEE, 2006.
-
(2006)
Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on
, vol.2
, pp. 2072-2078
-
-
Hoi, S.C.1
Liu, W.2
Lyu, M.R.3
Ma, W.-Y.4
-
15
-
-
84986302997
-
-
arXiv preprint arXiv:1511.04164
-
R. Hu, H. Xu, M. Rohrbach, J. Feng, K. Saenko, and T. Darrell. Natural language object retrieval. arXiv preprint arXiv:1511.04164, 2015.
-
(2015)
Natural Language Object Retrieval
-
-
Hu, R.1
Xu, H.2
Rohrbach, M.3
Feng, J.4
Saenko, K.5
Darrell, T.6
-
16
-
-
84913580146
-
Caffe: Convolutional architecture for fast feature embedding
-
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. B. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. In ACM Multimedia, volume 2, page 4, 2014.
-
(2014)
ACM Multimedia
, vol.2
, pp. 4
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.B.6
Guadarrama, S.7
Darrell, T.8
-
19
-
-
84943540775
-
ReferItGame: Referring to objects in photographs of natural scenes
-
S. Kazemzadeh, V. Ordonez, M. Matten, and T. L. Berg. ReferItGame: Referring to objects in photographs of natural scenes. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 787-798, 2014.
-
(2014)
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
, pp. 787-798
-
-
Kazemzadeh, S.1
Ordonez, V.2
Matten, M.3
Berg, T.L.4
-
22
-
-
84911370987
-
What are you talking about? Text-to-image coreference
-
IEEE
-
C. Kong, D. Lin, M. Bansal, R. Urtasun, and S. Fidler. What are you talking about? text-to-image coreference. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 3558-3565. IEEE, 2014.
-
(2014)
Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on
, pp. 3558-3565
-
-
Kong, C.1
Lin, D.2
Bansal, M.3
Urtasun, R.4
Fidler, S.5
-
23
-
-
84906493406
-
Microsoft COCO: Common objects in context
-
Springer
-
T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In Computer Vision-ECCV 2014, pages 740-755. Springer, 2014.
-
(2014)
Computer Vision-ECCV 2014
, pp. 740-755
-
-
Lin, T.-Y.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollár, P.7
Zitnick, C.L.8
-
24
-
-
84986260074
-
Generation and comprehension of unambiguous object descriptions
-
J. Mao, J. Huang, A. Toshev, O. Camburu, A. Yuille, and K. Murphy. Generation and comprehension of unambiguous object descriptions. Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on, 2016.
-
(2016)
Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on
-
-
Mao, J.1
Huang, J.2
Toshev, A.3
Camburu, O.4
Yuille, A.5
Murphy, K.6
-
25
-
-
85083950512
-
Deep captioning with multimodal recurrent neural networks (m-RNN)
-
J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). In Proceedings of the International Conference on Learning Representations, 2015.
-
(2015)
Proceedings of the International Conference on Learning Representations
-
-
Mao, J.1
Xu, W.2
Yang, Y.3
Wang, J.4
Huang, Z.5
Yuille, A.6
-
26
-
-
84973856017
-
Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models
-
B. Plummer, L. Wang, C. Cervantes, J. Caicedo, J. Hockenmaier, and S. Lazebnik. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015.
-
(2015)
Proceedings of the IEEE International Conference on Computer Vision (ICCV)
-
-
Plummer, B.1
Wang, L.2
Cervantes, C.3
Caicedo, J.4
Hockenmaier, J.5
Lazebnik, S.6
-
28
-
-
84986327251
-
-
arXiv preprint arXiv:1511.03745
-
A. Rohrbach, M. Rohrbach, R. Hu, T. Darrell, and B. Schiele. Grounding of textual phrases in images by reconstruction. arXiv preprint arXiv:1511.03745, 2015.
-
(2015)
Grounding of Textual Phrases in Images by Reconstruction
-
-
Rohrbach, A.1
Rohrbach, M.2
Hu, R.3
Darrell, T.4
Schiele, B.5
-
29
-
-
84945944033
-
ImageNet large scale visual recognition challenge
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, pages 1-42, 2014.
-
(2014)
International Journal of Computer Vision
, pp. 1-42
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
-
31
-
-
84946747440
-
Show and tell: A neural image caption generator
-
O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3156-3164, 2015.
-
(2015)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, pp. 3156-3164
-
-
Vinyals, O.1
Toshev, A.2
Bengio, S.3
Erhan, D.4
-
32
-
-
84970002232
-
Show, attend and tell: Neural image caption generation with visual attention
-
K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. Proceedings of the International Conference on Machine Learning (ICML), 2015.
-
(2015)
Proceedings of the International Conference on Machine Learning (ICML)
-
-
Xu, K.1
Ba, J.2
Kiros, R.3
Courville, A.4
Salakhutdinov, R.5
Zemel, R.6
Bengio, Y.7