-
1
-
-
0011812771
-
Kernel independent component analysis
-
3
-
F. R. Bach and M. I. Jordan. Kernel independent component analysis. JMLR, 2002. 3
-
(2002)
JMLR
-
-
Bach, F.R.1
Jordan, M.I.2
-
2
-
-
84887369458
-
Watching unlabeled video helps learn new human actions from very few labeled snapshots
-
2
-
C.-Y. Chen and K. Grauman. Watching unlabeled video helps learn new human actions from very few labeled snapshots. In CVPR, 2013. 2
-
(2013)
CVPR
-
-
Chen, C.-Y.1
Grauman, K.2
-
3
-
-
84898803720
-
NEIL: Extracting visual knowledge from web data
-
2, 6, 7
-
X. Chen, A. Shrivastava, and A. Gupta. NEIL: Extracting visual knowledge from web data. In ICCV, 2013. 2, 6, 7
-
(2013)
ICCV
-
-
Chen, X.1
Shrivastava, A.2
Gupta, A.3
-
5
-
-
85037338954
-
Generating typed dependency parses from phrase structure parses
-
3
-
M.-C. de Marneffe, B. MacCartney, and C. D. Manning. Generating typed dependency parses from phrase structure parses. In LREC, 2006. 3
-
(2006)
LREC
-
-
De Marneffe, M.-C.1
MacCartney, B.2
Manning, C.D.3
-
6
-
-
85198028989
-
ImageNet: A large-scale hierarchical image database
-
1, 2
-
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR, 2009. 1, 2
-
(2009)
CVPR
-
-
Deng, J.1
Dong, W.2
Socher, R.3
Li, L.-J.4
Li, K.5
Fei-Fei, L.6
-
7
-
-
84866674680
-
Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition
-
2
-
J. Deng, J. Krause, A. Berg, and L. Fei-Fei. Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition. In CVPR, 2012. 2
-
(2012)
CVPR
-
-
Deng, J.1
Krause, J.2
Berg, A.3
Fei-Fei, L.4
-
8
-
-
84911368326
-
Learning everything about anything: Webly-supervised visual concept learning
-
2, 6, 7
-
S. K. Divvala, A. Farhadi, and C. Guestrin. Learning everything about anything: Webly-supervised visual concept learning. In CVPR, 2014. 2, 6, 7
-
(2014)
CVPR
-
-
Divvala, S.K.1
Farhadi, A.2
Guestrin, C.3
-
9
-
-
84959216468
-
ActivityNet: A large-scale video benchmark for human activity understanding
-
2
-
F. Caba Heilbron, V. Escorcia, B. Ghanem, and J. C. Niebles. ActivityNet: A large-scale video benchmark for human activity understanding. In CVPR, 2015. 2
-
(2015)
CVPR
-
-
Caba Heilbron, F.1
Escorcia, V.2
Ghanem, B.3
Niebles, J.C.4
-
10
-
-
50949133669
-
LIBLINEAR: A library for large linear classification
-
5
-
R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. JMLR, 2008. 5
-
(2008)
JMLR
-
-
Fan, R.-E.1
Chang, K.-W.2
Hsieh, C.-J.3
Wang, X.-R.4
Lin, C.-J.5
-
12
-
-
84898958665
-
DeViSE: A deep visual-semantic embedding model
-
3
-
A. Frome, G. Corrado, J. Shlens, S. Bengio, J. Dean, M. Ranzato, and T. Mikolov. DeViSE: A deep visual-semantic embedding model. In NIPS, 2013. 3
-
(2013)
NIPS
-
-
Frome, A.1
Corrado, G.2
Shlens, J.3
Bengio, S.4
Dean, J.5
Ranzato, M.6
Mikolov, T.7
-
13
-
-
84911400494
-
Rich feature hierarchies for accurate object detection and semantic segmentation
-
5
-
R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014. 5
-
(2014)
CVPR
-
-
Girshick, R.1
Donahue, J.2
Darrell, T.3
Malik, J.4
-
14
-
-
84951930934
-
Conceptmap: Mining noisy web data for concept learning
-
2
-
E. Golge and P. Duygulu. Conceptmap: Mining noisy web data for concept learning. In ECCV, 2014. 2
-
(2014)
ECCV
-
-
Golge, E.1
Duygulu, P.2
-
15
-
-
84959243872
-
Improving image-sentence embeddings using large weakly annotated photo collections
-
3, 4, 5
-
Y. Gong, L. Wang, M. Hodosh, J. Hockenmaier, and S. Lazebnik. Improving image-sentence embeddings using large weakly annotated photo collections. In ECCV, 2014. 3, 4, 5
-
(2014)
ECCV
-
-
Gong, Y.1
Wang, L.2
Hodosh, M.3
Hockenmaier, J.4
Lazebnik, S.5
-
16
-
-
84898773262
-
YouTube2Text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition
-
3
-
S. Guadarrama, N. Krishnamoorthy, G. Malkarnenkar, R. Mooney, T. Darrell, and K. Saenko. YouTube2Text: Recognizing and describing arbitrary activities using semantic hierarchies and zero-shot recognition. In ICCV, 2013. 3
-
(2013)
ICCV
-
-
Guadarrama, S.1
Krishnamoorthy, N.2
Malkarnenkar, G.3
Mooney, R.4
Darrell, T.5
Saenko, K.6
-
17
-
-
84883394520
-
Framing image description as a ranking task: Data, models and evaluation metrics
-
1, 3, 4, 5
-
M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. JAIR, 2013. 1, 3, 4, 5
-
(2013)
JAIR
-
-
Hodosh, M.1
Young, P.2
Hockenmaier, J.3
-
18
-
-
85009867858
-
Caffe: Convolutional architecture for fast feature embedding
-
5
-
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. In ACM MM, 2014. 5
-
(2014)
ACM MM
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.6
Guadarrama, S.7
Darrell, T.8
-
19
-
-
84946734827
-
Deep visual-semantic alignments for generating image descriptions
-
3, 5, 6
-
A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015. 3, 5, 6
-
(2015)
CVPR
-
-
Karpathy, A.1
Fei-Fei, L.2
-
20
-
-
84937843643
-
Deep fragment embeddings for bidirectional image sentence mapping
-
4, 5
-
A. Karpathy, A. Joulin, and L. Fei-Fei. Deep fragment embeddings for bidirectional image sentence mapping. In NIPS, 2014. 4, 5
-
(2014)
NIPS
-
-
Karpathy, A.1
Joulin, A.2
Fei-Fei, L.3
-
21
-
-
84952349298
-
Unifying visual-semantic embeddings with multimodal neural language models
-
3, 5, 6
-
R. Kiros, R. Salakhutdinov, and R. S. Zemel. Unifying visual-semantic embeddings with multimodal neural language models. TACL, 2015. 3, 5, 6
-
(2015)
TACL
-
-
Kiros, R.1
Salakhutdinov, R.2
Zemel, R.S.3
-
22
-
-
84876231242
-
Imagenet classification with deep convolutional neural networks
-
5
-
A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. 5
-
(2012)
NIPS
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
23
-
-
80052901011
-
Baby talk: Understanding and generating image descriptions
-
3
-
G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Baby talk: Understanding and generating image descriptions. In CVPR, 2011. 3
-
(2011)
CVPR
-
-
Kulkarni, G.1
Premraj, V.2
Dhar, S.3
Li, S.4
Choi, Y.5
Berg, A.C.6
Berg, T.L.7
-
24
-
-
84878189119
-
Collective generation of natural image descriptions
-
3
-
P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi. Collective generation of natural image descriptions. In ACL, 2012. 3
-
(2012)
ACL
-
-
Kuznetsova, P.1
Ordonez, V.2
Berg, A.C.3
Berg, T.L.4
Choi, Y.5
-
25
-
-
84907331257
-
Generalizing image captions for image-text parallel corpus
-
3
-
P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi. Generalizing image captions for image-text parallel corpus. In ACL, 2013. 3
-
(2013)
ACL
-
-
Kuznetsova, P.1
Ordonez, V.2
Berg, A.C.3
Berg, T.L.4
Choi, Y.5
-
26
-
-
85162513516
-
Object bank: A high-level image representation for scene classification & semantic feature sparsification
-
2
-
L.-J. Li, H. Su, E. P. Xing, and F.-F. Li. Object bank: A high-level image representation for scene classification & semantic feature sparsification. In NIPS, 2010. 2
-
(2010)
NIPS
-
-
Li, L.-J.1
Su, H.2
Xing, E.P.3
Li, F.-F.4
-
27
-
-
84937834115
-
Microsoft COCO: Common objects in context
-
5, 6
-
T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014. 5, 6
-
(2014)
ECCV
-
-
Lin, T.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollár, P.7
Zitnick, C.L.8
-
28
-
-
84951072975
-
-
CoRR, abs/1410.1090 3, 5, 6
-
J. Mao, W. Xu, Y. Yang, J. Wang, and A. L. Yuille. Explain images with multimodal recurrent neural networks. CoRR, abs/1410.1090, 2014. 3, 5, 6
-
(2014)
Explain Images with Multimodal Recurrent Neural Networks
-
-
Mao, J.1
Xu, W.2
Yang, Y.3
Wang, J.4
Yuille, A.L.5
-
29
-
-
84898956512
-
Distributed representations of words and phrases and their compositionality
-
4
-
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013. 4
-
(2013)
NIPS
-
-
Mikolov, T.1
Sutskever, I.2
Chen, K.3
Corrado, G.S.4
Dean, J.5
-
30
-
-
84976702763
-
WordNet: A lexical database for english
-
2
-
G. A. Miller. WordNet: A Lexical Database for English. CACM, 1995. 2
-
(1995)
CACM
-
-
Miller, G.A.1
-
31
-
-
84898828265
-
From large scale image categorization to entry-level categories
-
2
-
V. Ordonez, J. Deng, Y. Choi, A. C. Berg, and T. L. Berg. From large scale image categorization to entry-level categories. In ICCV, 2013. 2
-
(2013)
ICCV
-
-
Ordonez, V.1
Deng, J.2
Choi, Y.3
Berg, A.C.4
Berg, T.L.5
-
32
-
-
85162522202
-
Im2text: Describing images using 1 million captioned photographs
-
3
-
V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011. 3
-
(2011)
NIPS
-
-
Ordonez, V.1
Kulkarni, G.2
Berg, T.L.3
-
33
-
-
84898775239
-
Translating video content to natural language descriptions
-
3
-
M. Rohrbach, W. Qiu, I. Titov, S. Thater, M. Pinkal, and B. Schiele. Translating video content to natural language descriptions. In ICCV, 2013. 3
-
(2013)
ICCV
-
-
Rohrbach, M.1
Qiu, W.2
Titov, I.3
Thater, S.4
Pinkal, M.5
Schiele, B.6
-
35
-
-
84909978410
-
-
6, 7
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge, 2014. 6, 7
-
(2014)
ImageNet Large Scale Visual Recognition Challenge
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
Berg, A.C.11
Fei-Fei, L.12
-
36
-
-
84866718894
-
Action bank: A high-level representation of activity in video
-
2
-
S. Sadanand and J. Corso. Action bank: A high-level representation of activity in video. In CVPR, 2012. 2
-
(2012)
CVPR
-
-
Sadanand, S.1
Corso, J.2
-
37
-
-
80052889458
-
Recognition using visual phrases
-
2
-
M. A. Sadeghi and A. Farhadi. Recognition using visual phrases. In CVPR, 2011. 2
-
(2011)
CVPR
-
-
Sadeghi, M.A.1
Farhadi, A.2
-
38
-
-
84990069553
-
Very deep convolutional networks for large-scale image recognition
-
5
-
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In NIPS, 2014. 5
-
(2014)
NIPS
-
-
Simonyan, K.1
Zisserman, A.2
-
39
-
-
84964474107
-
Grounded compositional semantics for finding and describing images with sentences
-
3, 5
-
R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng. Grounded compositional semantics for finding and describing images with sentences. TACL, 2014. 3, 5
-
(2014)
TACL
-
-
Socher, R.1
Karpathy, A.2
Le, Q.V.3
Manning, C.D.4
Ng, A.Y.5
-
40
-
-
84911429593
-
DISCOVER: Discovering important segments for classification of video events and recounting
-
2
-
C. Sun and R. Nevatia. DISCOVER: Discovering important segments for classification of video events and recounting. In CVPR, 2014. 2
-
(2014)
CVPR
-
-
Sun, C.1
Nevatia, R.2
-
41
-
-
84973858597
-
Semantic aware video transcription using random forest classifiers
-
3
-
C. Sun and R. Nevatia. Semantic aware video transcription using random forest classifiers. In ECCV, 2014. 3
-
(2014)
ECCV
-
-
Sun, C.1
Nevatia, R.2
-
42
-
-
84955184649
-
Deep multiple instance learning for image classification and auto-annotation
-
2
-
J. Wu, Y. Yu, C. Huang, and K. Yu. Deep multiple instance learning for image classification and auto-annotation. In CVPR, 2015. 2
-
(2015)
CVPR
-
-
Wu, J.1
Yu, Y.2
Huang, C.3
Yu, K.4
-
43
-
-
84906494296
-
From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
-
3, 5
-
P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2014. 3, 5
-
(2014)
TACL
-
-
Young, P.1
Lai, A.2
Hodosh, M.3
Hockenmaier, J.4
-
44
-
-
84959187860
-
ConceptLearner: Discovering visual concepts from weakly labeled image collections
-
2
-
B. Zhou, V. Jagadeesh, and R. Piramuthu. ConceptLearner: Discovering Visual Concepts from Weakly Labeled Image Collections. In CVPR, 2015. 2
-
(2015)
CVPR
-
-
Zhou, B.1
Jagadeesh, V.2
Piramuthu, R.3
-
45
-
-
84937964578
-
Learning deep features for scene recognition using places database
-
2
-
B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. Learning Deep Features for Scene Recognition using Places Database. In NIPS, 2014. 2
-
(2014)
NIPS
-
-
Zhou, B.1
Lapedriza, A.2
Xiao, J.3
Torralba, A.4
Oliva, A.5