-
1
-
-
84959243017
-
Evaluation of Output Embeddings for Fine-Grained Image Classification
-
Akata, Z., Reed, S., Walter, D., Lee, H., and Schiele, B. Evaluation of Output Embeddings for Fine-Grained Image Classification. In CVPR, 2015.
-
(2015)
CVPR
-
-
Akata, Z.1
Reed, S.2
Walter, D.3
Lee, H.4
Schiele, B.5
-
2
-
-
85083951076
-
Adam: A method for stochastic optimization
-
Ba, J. and Kingma, D. Adam: A method for stochastic optimization. In ICLR, 2015.
-
(2015)
ICLR
-
-
Ba, J.1
Kingma, D.2
-
3
-
-
84882266451
-
Better mixing via deep representations
-
Bengio, Y., Mesnil, G., Dauphin, Y., and Rifai, S. Better mixing via deep representations. In ICML, 2013.
-
(2013)
ICML
-
-
Bengio, Y.1
Mesnil, G.2
Dauphin, Y.3
Rifai, S.4
-
4
-
-
84965143571
-
Deep generative image models using a Laplacian pyramid of adversarial networks
-
Denton, E. L., Chintala, S., Fergus, R., et al. Deep generative image models using a Laplacian pyramid of adversarial networks. In NIPS, 2015.
-
(2015)
NIPS
-
-
Denton, E.L.1
Chintala, S.2
Fergus, R.3
-
5
-
-
84959236502
-
Long-term recurrent convolutional networks for visual recognition and description
-
Donahue, J., Hendricks, L. A., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
-
(2015)
CVPR
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
6
-
-
84959184995
-
Learning to generate chairs with convolutional neural networks
-
Dosovitskiy, A., Springenberg, J. T., and Brox, T. Learning to generate chairs with convolutional neural networks. In CVPR, 2015.
-
(2015)
CVPR
-
-
Dosovitskiy, A.1
Springenberg, J.T.2
Brox, T.3
-
7
-
-
70450207704
-
Describing objects by their attributes
-
Farhadi, A., Endres, I., Hoiem, D., and Forsyth, D. Describing objects by their attributes. In CVPR, 2009.
-
(2009)
CVPR
-
-
Farhadi, A.1
Endres, I.2
Hoiem, D.3
Forsyth, D.4
-
8
-
-
84906482165
-
Transductive multi-view embedding for zero-shot recognition and annotation
-
Fu, Y., Hospedales, T. M., Xiang, T., Fu, Z., and Gong, S. Transductive multi-view embedding for zero-shot recognition and annotation. In ECCV, 2014.
-
(2014)
ECCV
-
-
Fu, Y.1
Hospedales, T.M.2
Xiang, T.3
Fu, Z.4
Gong, S.5
-
10
-
-
84937849144
-
Generative adversarial nets
-
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial nets. In NIPS, 2014.
-
(2014)
NIPS
-
-
Goodfellow, I.1
Pouget-Abadie, J.2
Mirza, M.3
Xu, B.4
Warde-Farley, D.5
Ozair, S.6
Courville, A.7
Bengio, Y.8
-
11
-
-
84983208884
-
DRAW: A recurrent neural network for image generation
-
Gregor, K., Danihelka, I., Graves, A., Rezende, D., and Wierstra, D. DRAW: A recurrent neural network for image generation. In ICML, 2015.
-
(2015)
ICML
-
-
Gregor, K.1
Danihelka, I.2
Graves, A.3
Rezende, D.4
Wierstra, D.5
-
13
-
-
84969584486
-
Batch normalization: Accelerating deep network training by reducing internal covariate shift
-
Ioffe, S. and Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
-
(2015)
ICML
-
-
Ioffe, S.1
Szegedy, C.2
-
14
-
-
84946734827
-
Deep visual-semantic alignments for generating image descriptions
-
Karpathy, A. and Li, F. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
-
(2015)
CVPR
-
-
Karpathy, A.1
Li, F.2
-
15
-
-
84952349298
-
Unifying visual-semantic embeddings with multimodal neural language models
-
Kiros, R., Salakhutdinov, R., and Zemel, R. S. Unifying visual-semantic embeddings with multimodal neural language models. In ACL, 2014.
-
(2014)
ACL
-
-
Kiros, R.1
Salakhutdinov, R.2
Zemel, R.S.3
-
16
-
-
77953185711
-
Attribute and simile classifiers for face verification
-
Kumar, N., Berg, A. C., Belhumeur, P. N., and Nayar, S. K. Attribute and simile classifiers for face verification. In ICCV, 2009.
-
(2009)
ICCV
-
-
Kumar, N.1
Berg, A.C.2
Belhumeur, P.N.3
Nayar, S.K.4
-
17
-
-
84894522762
-
Attribute-based classification for zero-shot visual object categorization
-
Lampert, C. H., Nickisch, H., and Harmeling, S. Attribute-based classification for zero-shot visual object categorization. TPAMI, 36(3):453-465, 2014.
-
(2014)
TPAMI
, vol.36
, Issue.3
, pp. 453-465
-
-
Lampert, C.H.1
Nickisch, H.2
Harmeling, S.3
-
18
-
-
84937834115
-
Microsoft COCO: Common objects in context
-
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. Microsoft COCO: Common objects in context. In ECCV, 2014.
-
(2014)
ECCV
-
-
Lin, T.-Y.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollár, P.7
Zitnick, C.L.8
-
19
-
-
85083950885
-
Generating images from captions with attention
-
Mansimov, E., Parisotto, E., Ba, J. L., and Salakhutdinov, R. Generating images from captions with attention. ICLR, 2016.
-
(2016)
ICLR
-
-
Mansimov, E.1
Parisotto, E.2
Ba, J.L.3
Salakhutdinov, R.4
-
20
-
-
85083950512
-
Deep captioning with multimodal recurrent neural networks (m-RNN)
-
Mao, J., Xu, W., Yang, Y., Wang, J., and Yuille, A. Deep captioning with multimodal recurrent neural networks (m-RNN). ICLR, 2015.
-
(2015)
ICLR
-
-
Mao, J.1
Xu, W.2
Yang, Y.3
Wang, J.4
Yuille, A.5
-
22
-
-
80053437179
-
Multimodal deep learning
-
Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., and Ng, A. Y. Multimodal deep learning. In ICML, 2011.
-
(2011)
ICML
-
-
Ngiam, J.1
Khosla, A.2
Kim, M.3
Nam, J.4
Lee, H.5
Ng, A.Y.6
-
25
-
-
84919832734
-
Learning to disentangle factors of variation with manifold interaction
-
Reed, S., Sohn, K., Zhang, Y., and Lee, H. Learning to disentangle factors of variation with manifold interaction. In ICML, 2014.
-
(2014)
ICML
-
-
Reed, S.1
Sohn, K.2
Zhang, Y.3
Lee, H.4
-
26
-
-
84965113821
-
Deep visual analogy-making
-
Reed, S., Zhang, Y., Zhang, Y., and Lee, H. Deep visual analogy-making. In NIPS, 2015.
-
(2015)
NIPS
-
-
Reed, S.1
Zhang, Y.2
Zhang, Y.3
Lee, H.4
-
27
-
-
84986250442
-
Learning deep representations for fine-grained visual descriptions
-
Reed, S., Akata, Z., Lee, H., and Schiele, B. Learning deep representations for fine-grained visual descriptions. In CVPR, 2016.
-
(2016)
CVPR
-
-
Reed, S.1
Akata, Z.2
Lee, H.3
Schiele, B.4
-
28
-
-
84965170394
-
Exploring models and data for image question answering
-
Ren, M., Kiros, R., and Zemel, R. Exploring models and data for image question answering. In NIPS, 2015.
-
(2015)
NIPS
-
-
Ren, M.1
Kiros, R.2
Zemel, R.3
-
29
-
-
84937873395
-
Improved multimodal deep learning with variation of information
-
Sohn, K., Shang, W., and Lee, H. Improved multimodal deep learning with variation of information. In NIPS, 2014.
-
(2014)
NIPS
-
-
Sohn, K.1
Shang, W.2
Lee, H.3
-
30
-
-
84877724347
-
Multimodal learning with deep Boltzmann machines
-
Srivastava, N. and Salakhutdinov, R. R. Multimodal learning with deep Boltzmann machines. In NIPS, 2012.
-
(2012)
NIPS
-
-
Srivastava, N.1
Salakhutdinov, R.R.2
-
31
-
-
84937522268
-
Going deeper with convolutions
-
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. Going deeper with convolutions. In CVPR, 2015.
-
(2015)
CVPR
-
-
Szegedy, C.1
Liu, W.2
Jia, Y.3
Sermanet, P.4
Reed, S.5
Anguelov, D.6
Erhan, D.7
Vanhoucke, V.8
Rabinovich, A.9
-
32
-
-
84946747440
-
Show and tell: A neural image caption generator
-
Vinyals, O., Toshev, A., Bengio, S., and Erhan, D. Show and tell: A neural image caption generator. In CVPR, 2015.
-
(2015)
CVPR
-
-
Vinyals, O.1
Toshev, A.2
Bengio, S.3
Erhan, D.4
-
33
-
-
84878084353
-
-
Wah, C., Branson, S., Welinder, P., Perona, P., and Belongie, S. The Caltech-UCSD Birds-200-2011 dataset. 2011.
-
(2011)
The Caltech-UCSD Birds-200-2011 Dataset
-
-
Wah, C.1
Branson, S.2
Welinder, P.3
Perona, P.4
Belongie, S.5
-
34
-
-
84998721476
-
-
arXiv preprint arXiv:1511.02570
-
Wang, P., Wu, Q., Shen, C., Hengel, A. v. d., and Dick, A. Explicit knowledge-based reasoning for visual question answering. arXiv preprint arXiv:1511.02570, 2015.
-
(2015)
Explicit Knowledge-based Reasoning for Visual Question Answering
-
-
Wang, P.1
Wu, Q.2
Shen, C.3
Hengel, A.V.D.4
Dick, A.5
-
35
-
-
84970002232
-
Show, attend and tell: Neural image caption generation with visual attention
-
Xu, K., Ba, J., Kiros, R., Courville, A., Salakhutdinov, R., Zemel, R., and Bengio, Y. Show, attend and tell: Neural image caption generation with visual attention. In ICML, 2015.
-
(2015)
ICML
-
-
Xu, K.1
Ba, J.2
Kiros, R.3
Courville, A.4
Salakhutdinov, R.5
Zemel, R.6
Bengio, Y.7
-
36
-
-
84988339664
-
-
arXiv preprint arXiv:1512.00570
-
Yan, X., Yang, J., Sohn, K., and Lee, H. Attribute2image: Conditional image generation from visual attributes. arXiv preprint arXiv:1512.00570, 2015.
-
(2015)
Attribute2image: Conditional Image Generation from Visual Attributes
-
-
Yan, X.1
Yang, J.2
Sohn, K.3
Lee, H.4
-
37
-
-
84965161391
-
Weakly-supervised disentangling with recurrent transformations for 3D view synthesis
-
Yang, J., Reed, S., Yang, M.-H., and Lee, H. Weakly-supervised disentangling with recurrent transformations for 3D view synthesis. In NIPS, 2015.
-
(2015)
NIPS
-
-
Yang, J.1
Reed, S.2
Yang, M.-H.3
Lee, H.4
-
38
-
-
84973911532
-
Aligning books and movies: Towards story-like visual explanations by watching movies and reading books
-
Zhu, Y., Kiros, R., Zemel, R., Salakhutdinov, R., Urtasun, R., Torralba, A., and Fidler, S. Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. In ICCV, 2015.
-
(2015)
ICCV
-
-
Zhu, Y.1
Kiros, R.2
Zemel, R.3
Salakhutdinov, R.4
Urtasun, R.5
Torralba, A.6
Fidler, S.7