-
1
-
-
84986259594
-
Label-embedding for image classification
-
2, 3, 5, 7
-
Z. Akata, F. Perronnin, Z. Harchaoui, and C. Schmid. Label-embedding for image classification. IEEE TPAMI, 2015.
-
(2015)
IEEE TPAMI
-
-
Akata, Z.1
Perronnin, F.2
Harchaoui, Z.3
Schmid, C.4
-
2
-
-
84959243017
-
Evaluation of output embeddings for fine-grained image classification
-
1, 2, 3, 6, 7
-
Z. Akata, S. Reed, D. Walter, H. Lee, and B. Schiele. Evaluation of Output Embeddings for Fine-Grained Image Classification. In CVPR, 2015.
-
(2015)
CVPR
-
-
Akata, Z.1
Reed, S.2
Walter, D.3
Lee, H.4
Schiele, B.5
-
3
-
-
84973882857
-
Predicting deep zero-shot convolutional neural networks using textual descriptions
-
1, 2, 5, 8
-
J. Ba, K. Swersky, S. Fidler, and R. Salakhutdinov. Predicting deep zero-shot convolutional neural networks using textual descriptions. In ICCV, 2015.
-
(2015)
ICCV
-
-
Ba, J.1
Swersky, K.2
Fidler, S.3
Salakhutdinov, R.4
-
4
-
-
85162050606
-
Label embedding trees for large multi-class tasks
-
2
-
S. Bengio, J. Weston, and D. Grangier. Label embedding trees for large multi-class tasks. In NIPS, 2010.
-
(2010)
NIPS
-
-
Bengio, S.1
Weston, J.2
Grangier, D.3
-
5
-
-
84889607930
-
Zero-shot video retrieval using content and concepts
-
2
-
J. Dalton, J. Allan, and P. Mirajkar. Zero-shot video retrieval using content and concepts. In CIKM, 2013.
-
(2013)
CIKM
-
-
Dalton, J.1
Allan, J.2
Mirajkar, P.3
-
6
-
-
85198028989
-
ImageNet: A large-scale hierarchical image database
-
2
-
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
-
(2009)
CVPR
-
-
Deng, J.1
Dong, W.2
Socher, R.3
Li, L.-J.4
Li, K.5
Fei-Fei, L.6
-
7
-
-
84887325349
-
Fine-grained crowdsourcing for fine-grained recognition
-
1, 2
-
J. Deng, J. Krause, and L. Fei-Fei. Fine-grained crowdsourcing for fine-grained recognition. In CVPR, 2013.
-
(2013)
CVPR
-
-
Deng, J.1
Krause, J.2
Fei-Fei, L.3
-
8
-
-
84959236502
-
Long-term recurrent convolutional networks for visual recognition and description
-
1, 2
-
J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
-
(2015)
CVPR
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
9
-
-
84919881041
-
DeCAF: A deep convolutional activation feature for generic visual recognition
-
2
-
J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. In ICML, 2014.
-
(2014)
ICML
-
-
Donahue, J.1
Jia, Y.2
Vinyals, O.3
Hoffman, J.4
Zhang, N.5
Tzeng, E.6
Darrell, T.7
-
10
-
-
84866719272
-
Discovering localized attributes for fine-grained recognition
-
1, 2
-
K. Duan, D. Parikh, D. J. Crandall, and K. Grauman. Discovering localized attributes for fine-grained recognition. In CVPR, 2012.
-
(2012)
CVPR
-
-
Duan, K.1
Parikh, D.2
Crandall, D.J.3
Grauman, K.4
-
11
-
-
84898803425
-
Write a classifier: Zero-shot learning using purely textual descriptions
-
2, 8
-
M. Elhoseiny, B. Saleh, and A. Elgammal. Write a classifier: Zero-shot learning using purely textual descriptions. In ICCV, 2013.
-
(2013)
ICCV
-
-
Elhoseiny, M.1
Saleh, B.2
Elgammal, A.3
-
12
-
-
84898958665
-
DeViSE: A deep visual-semantic embedding model
-
1, 2
-
A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, and T. Mikolov. DeViSE: A deep visual-semantic embedding model. In NIPS, 2013.
-
(2013)
NIPS
-
-
Frome, A.1
Corrado, G.S.2
Shlens, J.3
Bengio, S.4
Dean, J.5
Mikolov, T.6
-
13
-
-
84906482165
-
Transductive multi-view embedding for zero-shot recognition and annotation
-
1
-
Y. Fu, T. M. Hospedales, T. Xiang, Z. Fu, and S. Gong. Transductive multi-view embedding for zero-shot recognition and annotation. In ECCV, 2014.
-
(2014)
ECCV
-
-
Fu, Y.1
Hospedales, T.M.2
Xiang, T.3
Fu, Z.4
Gong, S.5
-
14
-
-
84941001216
-
Transductive multi-view zero-shot learning
-
7
-
Y. Fu, T. M. Hospedales, T. Xiang, and S. Gong. Transductive multi-view zero-shot learning. IEEE TPAMI, 37 (11): 2332-2345, 2015.
-
(2015)
IEEE TPAMI
, vol.37
, Issue.11
, pp. 2332-2345
-
-
Fu, Y.1
Hospedales, T.M.2
Xiang, T.3
Gong, S.4
-
16
-
-
0000679216
-
Distributional structure
-
1
-
Z. Harris. Distributional structure. Word, 10 (2-3), 1954.
-
(1954)
Word
, vol.10
, Issue.2-3
-
-
Harris, Z.1
-
19
-
-
84969584486
-
Batch normalization: Accelerating deep network training by reducing internal covariate shift
-
5
-
S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
-
(2015)
ICML
-
-
Ioffe, S.1
Szegedy, C.2
-
20
-
-
84946734827
-
Deep visual-semantic alignments for generating image descriptions
-
1, 2
-
A. Karpathy and F. Li. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
-
(2015)
CVPR
-
-
Karpathy, A.1
Li, F.2
-
21
-
-
84959189488
-
Ranking and retrieval of image sequences from multiple paragraph queries
-
2
-
G. Kim, S. Moon, and L. Sigal. Ranking and retrieval of image sequences from multiple paragraph queries. In CVPR, 2015.
-
(2015)
CVPR
-
-
Kim, G.1
Moon, S.2
Sigal, L.3
-
22
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
2
-
A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
-
(2012)
NIPS
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
23
-
-
80052901011
-
Baby talk: Understanding and generating simple image descriptions
-
1
-
G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. Berg, and T. Berg. Baby talk: Understanding and generating simple image descriptions. In CVPR, 2011.
-
(2011)
CVPR
-
-
Kulkarni, G.1
Premraj, V.2
Dhar, S.3
Li, S.4
Choi, Y.5
Berg, A.6
Berg, T.7
-
24
-
-
84894522762
-
Attribute-based classification for zero-shot visual object categorization
-
1, 2
-
C. Lampert, H. Nickisch, and S. Harmeling. Attribute-based classification for zero-shot visual object categorization. IEEE TPAMI, 36 (3): 453-465, 2014.
-
(2014)
IEEE TPAMI
, vol.36
, Issue.3
, pp. 453-465
-
-
Lampert, C.1
Nickisch, H.2
Harmeling, S.3
-
25
-
-
85009931853
-
Microsoft COCO: Common objects in context
-
1
-
T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014.
-
(2014)
ECCV
-
-
Lin, T.-Y.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollár, P.7
Zitnick, C.L.8
-
26
-
-
85083950512
-
Deep captioning with multimodal recurrent neural networks (m-RNN)
-
2
-
J. Mao, W. Xu, Y. Yang, J. Wang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). In ICLR, 2015.
-
(2015)
ICLR
-
-
Mao, J.1
Xu, W.2
Yang, Y.3
Wang, J.4
Yuille, A.5
-
28
-
-
84898956512
-
Distributed representations of words and phrases and their compositionality
-
1, 4
-
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013.
-
(2013)
NIPS
-
-
Mikolov, T.1
Sutskever, I.2
Chen, K.3
Corrado, G.S.4
Dean, J.5
-
29
-
-
84976702763
-
WordNet: A lexical database for English
-
1
-
G. A. Miller. WordNet: A lexical database for English. CACM, 38 (11): 39-41, 1995.
-
(1995)
CACM
, vol.38
, Issue.11
, pp. 39-41
-
-
Miller, G.A.1
-
30
-
-
84959228762
-
Beyond short snippets: Deep networks for video classification
-
2
-
J. Y.-H. Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, and G. Toderici. Beyond short snippets: Deep networks for video classification. In CVPR, 2015.
-
(2015)
CVPR
-
-
Ng, J.Y.-H.1
Hausknecht, M.2
Vijayanarasimhan, S.3
Vinyals, O.4
Monga, R.5
Toderici, G.6
-
31
-
-
80053437179
-
Multimodal deep learning
-
2
-
J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Y. Ng. Multimodal deep learning. In ICML, 2011.
-
(2011)
ICML
-
-
Ngiam, J.1
Khosla, A.2
Kim, M.3
Nam, J.4
Lee, H.5
Ng, A.Y.6
-
32
-
-
65249121810
-
Automated flower classification over a large number of classes
-
2
-
M.-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In ICCVGIP, 2008.
-
(2008)
ICCVGIP
-
-
Nilsback, M.-E.1
Zisserman, A.2
-
33
-
-
84898979068
-
-
arXiv:1312.5650, 1, 2
-
M. Norouzi, T. Mikolov, S. Bengio, Y. Singer, J. Shlens, A. Frome, G. Corrado, and J. Dean. Zero-shot learning by convex combination of semantic embeddings. arXiv:1312.5650, 2013.
-
(2013)
Zero-shot Learning by Convex Combination of Semantic Embeddings
-
-
Norouzi, M.1
Mikolov, T.2
Bengio, S.3
Singer, Y.4
Shlens, J.5
Frome, A.6
Corrado, G.7
Dean, J.8
-
34
-
-
84908539410
-
Learning and transferring mid-level image representations using convolutional neural networks
-
M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Learning and transferring mid-level image representations using convolutional neural networks. In CVPR.
-
CVPR
, vol.2
-
-
Oquab, M.1
Bottou, L.2
Laptev, I.3
Sivic, J.4
-
35
-
-
85162522202
-
Im2Text: Describing images using 1 million captioned photographs
-
1
-
V. Ordonez, G. Kulkarni, and T. Berg. Im2Text: Describing images using 1 million captioned photographs. In NIPS, 2011.
-
(2011)
NIPS
-
-
Ordonez, V.1
Kulkarni, G.2
Berg, T.3
-
38
-
-
80052892795
-
Evaluating knowledge transfer and zero-shot learning in a large-scale setting
-
1, 2
-
M. Rohrbach, M. Stark, and B. Schiele. Evaluating knowledge transfer and zero-shot learning in a large-scale setting. In CVPR, 2011.
-
(2011)
CVPR
-
-
Rohrbach, M.1
Stark, M.2
Schiele, B.3
-
39
-
-
84947041871
-
ImageNet large scale visual recognition challenge
-
1
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. IJCV, 115 (3): 211-252, 2015.
-
(2015)
IJCV
, vol.115
, Issue.3
, pp. 211-252
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
-
40
-
-
85083953063
-
Very deep convolutional networks for large-scale image recognition
-
7
-
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
-
(2015)
ICLR
-
-
Simonyan, K.1
Zisserman, A.2
-
41
-
-
84898938559
-
Zero-shot learning through cross-modal transfer
-
1, 2
-
R. Socher, M. Ganjoo, H. Sridhar, O. Bastani, C. Manning, and A. Ng. Zero-shot learning through cross-modal transfer. In NIPS, 2013.
-
(2013)
NIPS
-
-
Socher, R.1
Ganjoo, M.2
Sridhar, H.3
Bastani, O.4
Manning, C.5
Ng, A.6
-
42
-
-
84937873395
-
Improved multimodal deep learning with variation of information
-
2
-
K. Sohn, W. Shang, and H. Lee. Improved multimodal deep learning with variation of information. In NIPS, 2014.
-
(2014)
NIPS
-
-
Sohn, K.1
Shang, W.2
Lee, H.3
-
43
-
-
84916911784
-
Multimodal learning with deep Boltzmann machines
-
2
-
N. Srivastava and R. Salakhutdinov. Multimodal learning with deep Boltzmann machines. JMLR, 15: 2949-2980, 2014.
-
(2014)
JMLR
, vol.15
, pp. 2949-2980
-
-
Srivastava, N.1
Salakhutdinov, R.2
-
44
-
-
84937522268
-
Going deeper with convolutions
-
2, 5, 7
-
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
-
(2015)
CVPR
-
-
Szegedy, C.1
Liu, W.2
Jia, Y.3
Sermanet, P.4
Reed, S.5
Anguelov, D.6
Erhan, D.7
Vanhoucke, V.8
Rabinovich, A.9
-
46
-
-
80052891795
-
-
Technical Report CNS-TR-2010-001, Caltech, 1, 2
-
P. Welinder, S. Branson, T. Mita, C. Wah, F. Schroff, S. Belongie, and P. Perona. Caltech-UCSD Birds 200. Technical Report CNS-TR-2010-001, Caltech, 2010.
-
(2010)
Caltech-UCSD Birds 200
-
-
Welinder, P.1
Branson, S.2
Mita, T.3
Wah, C.4
Schroff, F.5
Belongie, S.6
Perona, P.7
-
47
-
-
77955654853
-
Large scale image annotation: Learning to rank with joint word-image embeddings
-
2
-
J. Weston, S. Bengio, and N. Usunier. Large scale image annotation: Learning to rank with joint word-image embeddings. ECML, 2010.
-
(2010)
ECML
-
-
Weston, J.1
Bengio, S.2
Usunier, N.3
-
48
-
-
84911434661
-
Zero-shot event detection using multi-modal fusion of weakly supervised concepts
-
2
-
S. Wu, S. Bondugula, F. Luisier, X. Zhuang, and P. Natarajan. Zero-shot event detection using multi-modal fusion of weakly supervised concepts. In CVPR, 2014.
-
(2014)
CVPR
-
-
Wu, S.1
Bondugula, S.2
Luisier, F.3
Zhuang, X.4
Natarajan, P.5
-
49
-
-
84970002232
-
Show, attend and tell: Neural image caption generation with visual attention
-
2
-
K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, 2015.
-
(2015)
ICML
-
-
Xu, K.1
Ba, J.2
Kiros, R.3
Courville, A.4
Salakhutdinov, R.5
Zemel, R.6
Bengio, Y.7
-
50
-
-
84906494296
-
From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
-
1
-
P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2: 67-78, 2014.
-
(2014)
TACL
, vol.2
, pp. 67-78
-
-
Young, P.1
Lai, A.2
Hodosh, M.3
Hockenmaier, J.4
-
52
-
-
84965162393
-
Character-level convolutional networks for text classification
-
2, 3
-
X. Zhang, J. Zhao, and Y. LeCun. Character-level convolutional networks for text classification. In NIPS, 2015.
-
(2015)
NIPS
-
-
Zhang, X.1
Zhao, J.2
LeCun, Y.3
-
53
-
-
84973861983
-
Conditional random fields as recurrent neural networks
-
2
-
S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. H. Torr. Conditional random fields as recurrent neural networks. In ICCV, 2015.
-
(2015)
ICCV
-
-
Zheng, S.1
Jayasumana, S.2
Romera-Paredes, B.3
Vineet, V.4
Su, Z.5
Du, D.6
Huang, C.7
Torr, P.H.8