-
3
-
-
85021776053
-
Toward an architecture for never-ending language learning
-
A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. Hruschka Jr, and T. M. Mitchell. Toward an architecture for never-ending language learning. In AAAI, 2010
-
(2010)
AAAI
-
-
Carlson, A.1
Betteridge, J.2
Kisiel, B.3
Settles, B.4
Hruschka, E.R.5
Mitchell, T.M.6
-
4
-
-
84952349295
-
-
arXiv preprint arXiv:1504.00325
-
X. Chen, H. Fang, T. Lin, R. Vedantam, S. Gupta, P. Dollár, and C. L. Zitnick. Microsoft coco captions: Data collection and evaluation server. arXiv preprint arXiv:1504.00325, 2015
-
(2015)
Microsoft Coco Captions: Data Collection and Evaluation Server
-
-
Chen, X.1
Fang, H.2
Lin, T.3
Vedantam, R.4
Gupta, S.5
Dollár, P.6
Zitnick, C.L.7
-
5
-
-
84898803720
-
Neil: Extracting visual knowledge from web data
-
X. Chen, A. Shrivastava, and A. Gupta. Neil: Extracting visual knowledge from web data. In ICCV, 2013
-
(2013)
ICCV
-
-
Chen, X.1
Shrivastava, A.2
Gupta, A.3
-
6
-
-
84957029470
-
Mind's eye: A recurrent visual representation for image caption generation
-
X. Chen and C. L. Zitnick. Mind's eye: A recurrent visual representation for image caption generation. CVPR, 2015
-
(2015)
CVPR
-
-
Chen, X.1
Zitnick, C.L.2
-
7
-
-
85198028989
-
ImageNet: A large-scale hierarchical image database
-
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009
-
(2009)
CVPR
-
-
Deng, J.1
Dong, W.2
Socher, R.3
Li, L.-J.4
Li, K.5
Fei-Fei, L.6
-
8
-
-
84911368326
-
Learning everything about anything: Webly-supervised visual concept learning
-
S. Divvala, A. Farhadi, and C. Guestrin. Learning everything about anything: Webly-supervised visual concept learning. In CVPR, 2014
-
(2014)
CVPR
-
-
Divvala, S.1
Farhadi, A.2
Guestrin, C.3
-
9
-
-
84959236502
-
Long-term recurrent convolutional networks for visual recognition and description
-
J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. CVPR, 2015
-
(2015)
CVPR
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
10
-
-
84906928552
-
Comparing automatic evaluation measures for image description
-
D. Elliott and F. Keller. Comparing automatic evaluation measures for image description. In ACL, 2014
-
(2014)
ACL
-
-
Elliott, D.1
Keller, F.2
-
11
-
-
77951298115
-
The PASCAL visual object classes (VOC) challenge
-
June
-
M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes (VOC) challenge. IJCV, 88(2):303-338, June 2010
-
(2010)
IJCV
, vol.88
, Issue.2
, pp. 303-338
-
-
Everingham, M.1
Van Gool, L.2
Williams, C.K.I.3
Winn, J.4
Zisserman, A.5
-
12
-
-
80052017343
-
Every picture tells a story: Generating sentences from images
-
A. Farhadi, M. Hejrati, M. A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences from images. In ECCV, 2010
-
(2010)
ECCV
-
-
Farhadi, A.1
Hejrati, M.2
Sadeghi, M.A.3
Young, P.4
Rashtchian, C.5
Hockenmaier, J.6
Forsyth, D.7
-
13
-
-
84911400494
-
Rich feature hierarchies for accurate object detection and semantic segmentation
-
R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014
-
(2014)
CVPR
-
-
Girshick, R.1
Donahue, J.2
Darrell, T.3
Malik, J.4
-
14
-
-
84911427286
-
Using k-poselets for detecting people and localizing their keypoints
-
G. Gkioxari, B. Hariharan, R. Girshick, and J. Malik. Using k-poselets for detecting people and localizing their keypoints. In CVPR, 2014
-
(2014)
CVPR
-
-
Gkioxari, G.1
Hariharan, B.2
Girshick, R.3
Malik, J.4
-
15
-
-
84883394520
-
Framing image description as a ranking task: Data, models and evaluation metrics
-
M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. JAIR, 47:853-899, 2013
-
(2013)
JAIR
, vol.47
, pp. 853-899
-
-
Hodosh, M.1
Young, P.2
Hockenmaier, J.3
-
16
-
-
84889566627
-
Learning deep structured semantic models for web search using clickthrough data
-
P. Huang, X. He, J. Gao, L. Deng, A. Acero, and L. Heck. Learning deep structured semantic models for web search using clickthrough data. In CIKM, 2013
-
(2013)
CIKM
-
-
Huang, P.1
He, X.2
Gao, J.3
Deng, L.4
Acero, A.5
Heck, L.6
-
17
-
-
84913555165
-
-
arXiv preprint arXiv:1408.5093
-
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014
-
(2014)
Caffe: Convolutional Architecture for Fast Feature Embedding
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.6
Guadarrama, S.7
Darrell, T.8
-
18
-
-
84946734827
-
Deep visual-semantic alignments for generating image descriptions
-
A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. CVPR, 2015
-
(2015)
CVPR
-
-
Karpathy, A.1
Fei-Fei, L.2
-
21
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012
-
(2012)
NIPS
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
22
-
-
80052901011
-
Baby talk: Understanding and generating simple image descriptions
-
G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Baby talk: Understanding and generating simple image descriptions. In CVPR, 2011
-
(2011)
CVPR
-
-
Kulkarni, G.1
Premraj, V.2
Dhar, S.3
Li, S.4
Choi, Y.5
Berg, A.C.6
Berg, T.L.7
-
23
-
-
84878189119
-
Collective generation of natural image descriptions
-
P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi. Collective generation of natural image descriptions. In ACL, 2012
-
(2012)
ACL
-
-
Kuznetsova, P.1
Ordonez, V.2
Berg, A.C.3
Berg, T.L.4
Choi, Y.5
-
24
-
-
0027252194
-
Trigger-based language models: A maximum entropy approach
-
R. Lau, R. Rosenfeld, and S. Roukos. Trigger-based language models: A maximum entropy approach. In ICASSP, 1993
-
(1993)
ICASSP
-
-
Lau, R.1
Rosenfeld, R.2
Roukos, S.3
-
26
-
-
84862279067
-
Composing simple image descriptions using web-scale n-grams
-
S. Li, G. Kulkarni, T. L. Berg, A. C. Berg, and Y. Choi. Composing simple image descriptions using web-scale n-grams. In CoNLL, 2011
-
(2011)
CoNLL
-
-
Li, S.1
Kulkarni, G.2
Berg, T.L.3
Berg, A.C.4
Choi, Y.5
-
27
-
-
85149140250
-
Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics
-
Stroudsburg, PA, USA. Association for Computational Linguistics
-
C.-Y. Lin and F. J. Och. Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL '04, Stroudsburg, PA, USA, 2004. Association for Computational Linguistics
-
(2004)
Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL '04
-
-
Lin, C.-Y.1
Och, F.J.2
-
28
-
-
84937834115
-
Microsoft COCO: Common objects in context
-
T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014
-
(2014)
ECCV
-
-
Lin, T.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollár, P.7
Zitnick, C.L.8
-
29
-
-
84951072975
-
-
arXiv preprint arXiv:1410.1090
-
J. Mao, W. Xu, Y. Yang, J. Wang, and A. L. Yuille. Explain images with multimodal recurrent neural networks. arXiv preprint arXiv:1410.1090, 2014
-
(2014)
Explain Images with Multimodal Recurrent Neural Networks
-
-
Mao, J.1
Xu, W.2
Yang, Y.3
Wang, J.4
Yuille, A.L.5
-
30
-
-
84898935332
-
A framework for multiple-instance learning
-
O. Maron and T. Lozano-Pérez. A framework for multiple-instance learning. NIPS, 1998
-
(1998)
NIPS
-
-
Maron, O.1
Lozano-Pérez, T.2
-
31
-
-
84858966958
-
Strategies for training large scale neural network language models
-
T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. Cernocky. Strategies for training large scale neural network language models. In ASRU, 2011
-
(2011)
ASRU
-
-
Mikolov, T.1
Deoras, A.2
Povey, D.3
Burget, L.4
Cernocky, J.5
-
32
-
-
85034832841
-
Midge: Generating image descriptions from computer vision detections
-
M. Mitchell, X. Han, J. Dodge, A. Mensch, A. Goyal, A. Berg, K. Yamaguchi, T. Berg, K. Stratos, and H. Daumé III. Midge: Generating image descriptions from computer vision detections. In EACL, 2012
-
(2012)
EACL
-
-
Mitchell, M.1
Han, X.2
Dodge, J.3
Mensch, A.4
Goyal, A.5
Berg, A.6
Yamaguchi, K.7
Berg, T.8
Stratos, K.9
Daumé, H.10
-
33
-
-
67650453038
-
Three new graphical models for statistical language modelling
-
A. Mnih and G. Hinton. Three new graphical models for statistical language modelling. In ICML, 2007
-
(2007)
ICML
-
-
Mnih, A.1
Hinton, G.2
-
34
-
-
84867118996
-
A fast and simple algorithm for training neural probabilistic language models
-
A. Mnih and Y. W. Teh. A fast and simple algorithm for training neural probabilistic language models. In ICML, 2012
-
(2012)
ICML
-
-
Mnih, A.1
Teh, Y.W.2
-
35
-
-
84944098666
-
Minimum error rate training in statistical machine translation
-
F. J. Och. Minimum error rate training in statistical machine translation. In ACL, 2003
-
(2003)
ACL
-
-
Och, F.J.1
-
36
-
-
85162522202
-
Im2text: Describing images using 1 million captioned photographs
-
V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011
-
(2011)
NIPS
-
-
Ordonez, V.1
Kulkarni, G.2
Berg, T.L.3
-
37
-
-
85133336275
-
Bleu: A method for automatic evaluation of machine translation
-
K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: A method for automatic evaluation of machine translation. In ACL, 2002
-
(2002)
ACL
-
-
Papineni, K.1
Roukos, S.2
Ward, T.3
Zhu, W.-J.4
-
39
-
-
84896359701
-
Trainable methods for surface natural language generation
-
A. Ratnaparkhi. Trainable methods for surface natural language generation. In NAACL, 2000
-
(2000)
NAACL
-
-
Ratnaparkhi, A.1
-
40
-
-
0036663624
-
Trainable approaches to surface natural language generation and their application to conversational dialog systems
-
A. Ratnaparkhi. Trainable approaches to surface natural language generation and their application to conversational dialog systems. Computer Speech & Language, 16(3):435-455, 2002
-
(2002)
Computer Speech & Language
, vol.16
, Issue.3
, pp. 435-455
-
-
Ratnaparkhi, A.1
-
41
-
-
84928315948
-
A latent semantic model with convolutional-pooling structure for information retrieval
-
Y. Shen, X. He, J. Gao, L. Deng, and G. Mesnil. A latent semantic model with convolutional-pooling structure for information retrieval. In CIKM, 2014
-
(2014)
CIKM
-
-
Shen, Y.1
He, X.2
Gao, J.3
Deng, L.4
Mesnil, G.5
-
43
-
-
84928030723
-
Grounded compositional semantics for finding and describing images with sentences
-
R. Socher, Q. Le, C. Manning, and A. Ng. Grounded compositional semantics for finding and describing images with sentences. In NIPS Deep Learning Workshop, 2013
-
(2013)
NIPS Deep Learning Workshop
-
-
Socher, R.1
Le, Q.2
Manning, C.3
Ng, A.4
-
46
-
-
84939821074
-
-
arXiv preprint arXiv:1502.03044
-
K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. arXiv preprint arXiv:1502.03044, 2015
-
(2015)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
-
-
Xu, K.1
Ba, J.2
Kiros, R.3
Courville, A.4
Salakhutdinov, R.5
Zemel, R.6
Bengio, Y.7
-
47
-
-
80053258778
-
Corpus-guided sentence generation of natural images
-
Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In EMNLP, 2011
-
(2011)
EMNLP
-
-
Yang, Y.1
Teo, C.L.2
Daumé, H.3
Aloimonos, Y.4
-
48
-
-
77954862144
-
I2T: Image parsing to text description
-
B. Z. Yao, X. Yang, L. Lin, M. W. Lee, and S.-C. Zhu. I2T: Image parsing to text description. Proceedings of the IEEE, 98(8):1485-1508, 2010
-
(2010)
Proceedings of the IEEE
, vol.98
, Issue.8
, pp. 1485-1508
-
-
Yao, B.Z.1
Yang, X.2
Lin, L.3
Lee, M.W.4
Zhu, S.-C.5
-
49
-
-
84864049528
-
Multiple instance boosting for object detection
-
C. Zhang, J. C. Platt, and P. A. Viola. Multiple instance boosting for object detection. In NIPS, 2005
-
(2005)
NIPS
-
-
Zhang, C.1
Platt, J.C.2
Viola, P.A.3
-
50
-
-
84952018709
-
Edge boxes: Locating object proposals from edges
-
C. L. Zitnick and P. Dollár. Edge boxes: Locating object proposals from edges. In ECCV, 2014
-
(2014)
ECCV
-
-
Zitnick, C.L.1
Dollár, P.2
-
51
-
-
84887338442
-
Bringing semantics into focus using visual abstraction
-
C. L. Zitnick and D. Parikh. Bringing semantics into focus using visual abstraction. In CVPR, 2013
-
(2013)
CVPR
-
-
Zitnick, C.L.1
Parikh, D.2