-
2
-
-
85018920030
-
-
arXiv:1607.07086v1 [cs.LG]
-
D. Bahdanau, P. Brakel, K. Xu, A. Goyal, R. Lowe, J. Pineau, A. Courville, and Y. Bengio. An Actor-Critic Algorithm for Sequence Prediction. arXiv:1607.07086v1 [cs.LG], 2016.
-
(2016)
An Actor-Critic Algorithm for Sequence Prediction
-
-
Bahdanau, D.1
Brakel, P.2
Xu, K.3
Goyal, A.4
Lowe, R.5
Pineau, J.6
Courville, A.7
Bengio, Y.8
-
3
-
-
84965179228
-
Scheduled sampling for sequence prediction with recurrent neural networks
-
S. Bengio, O. Vinyals, N. Jaitly, and N. Shazeer. Scheduled sampling for sequence prediction with recurrent neural networks. In Advances in Neural Information Processing Systems, pages 1171-1179, 2015.
-
(2015)
Advances in Neural Information Processing Systems
, pp. 1171-1179
-
-
Bengio, S.1
Vinyals, O.2
Jaitly, N.3
Shazeer, N.4
-
5
-
-
84959933549
-
Neural Machine Translation by Jointly Learning to Align and Translate
-
D. Bahdanau, K. Cho, and Y. Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015, pages 1-15, 2014.
-
(2014)
ICLR 2015
, pp. 1-15
-
-
Bahdanau, D.1
Cho, K.2
Bengio, Y.3
-
6
-
-
77649188328
-
The segmented and annotated IAPR TC-12 benchmark
-
H. J. Escalante, C. A. Hernández, J. A. Gonzalez, A. López-López, M. Montes, E. F. Morales, L. Enrique Sucar, L. Villaseñor, and M. Grubinger. The segmented and annotated IAPR TC-12 benchmark. Computer Vision and Image Understanding, 114(4):419-428, 2010.
-
(2010)
Computer Vision and Image Understanding
, vol.114
, Issue.4
, pp. 419-428
-
-
Escalante, H.J.1
Hernández, C.A.2
Gonzalez, J.A.3
López-López, A.4
Montes, M.5
Morales, E.F.6
Enrique Sucar, L.7
Villaseñor, L.8
Grubinger, M.9
-
7
-
-
84898958665
-
DeViSE: A deep visual-semantic embedding model
-
A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, T. Mikolov, et al. DeViSE: A deep visual-semantic embedding model. In Advances in Neural Information Processing Systems, 2013.
-
(2013)
Advances in Neural Information Processing Systems
-
-
Frome, A.1
Corrado, G.S.2
Shlens, J.3
Bengio, S.4
Dean, J.5
Mikolov, T.6
-
8
-
-
84990060711
-
-
arXiv
-
A. Fukui, D. H. Park, D. Yang, A. Rohrbach, T. Darrell, and M. Rohrbach. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. arXiv, 2016.
-
(2016)
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
-
-
Fukui, A.1
Park, D.H.2
Yang, D.3
Rohrbach, A.4
Darrell, T.5
Rohrbach, M.6
-
10
-
-
84937849144
-
Generative adversarial nets
-
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672-2680, 2014.
-
(2014)
Advances in Neural Information Processing Systems
, pp. 2672-2680
-
-
Goodfellow, I.1
Pouget-Abadie, J.2
Mirza, M.3
Xu, B.4
Warde-Farley, D.5
Ozair, S.6
Courville, A.7
Bengio, Y.8
-
12
-
-
84890543083
-
Speech recognition with deep recurrent neural networks
-
IEEE
-
A. Graves, A.-r. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In 2013 IEEE international conference on acoustics, speech and signal processing, pages 6645-6649. IEEE, 2013.
-
(2013)
IEEE International Conference on Acoustics, Speech and Signal Processing
, pp. 6645-6649
-
-
Graves, A.1
Mohamed, A.-R.2
Hinton, G.3
-
13
-
-
38049183286
-
The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems
-
M. Grübinger, P. Clough, H. Müller, and T. Deselaers. The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems. LREC Workshop OntoImage Language Resources for Content-Based Image Retrieval, pages 13-23, 2006.
-
(2006)
LREC Workshop OntoImage Language Resources for Content-Based Image Retrieval
, pp. 13-23
-
-
Grübinger, M.1
Clough, P.2
Müller, H.3
Deselaers, T.4
-
15
-
-
84986305787
-
-
arXiv preprint
-
R. Hu, H. Xu, M. Rohrbach, J. Feng, K. Saenko, and T. Darrell. Natural Language Object Retrieval. arXiv preprint, pages 4555-4564, 2015.
-
(2015)
Natural Language Object Retrieval
, pp. 4555-4564
-
-
Hu, R.1
Xu, H.2
Rohrbach, M.3
Feng, J.4
Saenko, K.5
Darrell, T.6
-
18
-
-
84943540775
-
ReferItGame: Referring to Objects in Photographs of Natural Scenes
-
S. Kazemzadeh, V. Ordonez, M. Matten, and T. L. Berg. ReferItGame: Referring to Objects in Photographs of Natural Scenes. EMNLP, pages 787-798, 2014.
-
(2014)
EMNLP
, pp. 787-798
-
-
Kazemzadeh, S.1
Ordonez, V.2
Matten, M.3
Berg, T.L.4
-
20
-
-
84906493406
-
Microsoft COCO: Common objects in context
-
Springer
-
T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740-755. Springer, 2014.
-
(2014)
European Conference on Computer Vision
, pp. 740-755
-
-
Lin, T.-Y.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollár, P.7
Zitnick, C.L.8
-
22
-
-
84973864182
-
Multimodal convolutional neural networks for matching image and sentence
-
Dece
-
L. Ma, Z. Lu, L. Shang, and H. Li. Multimodal convolutional neural networks for matching image and sentence. Proceedings of the IEEE International Conference on Computer Vision, 11-18-Dece:2623-2631, 2016.
-
(2016)
Proceedings of the IEEE International Conference on Computer Vision
, vol.11-18
, pp. 2623-2631
-
-
Ma, L.1
Lu, Z.2
Shang, L.3
Li, H.4
-
23
-
-
84986260074
-
Generation and Comprehension of Unambiguous Object Descriptions
-
J. Mao, J. Huang, A. Toshev, O. Camburu, A. Yuille, and K. Murphy. Generation and Comprehension of Unambiguous Object Descriptions. CVPR, pages 11-20, 2016.
-
(2016)
CVPR
, pp. 11-20
-
-
Mao, J.1
Huang, J.2
Toshev, A.3
Camburu, O.4
Yuille, A.5
Murphy, K.6
-
25
-
-
85021826252
-
Modeling Context between Objects for Referring Expression Understanding
-
V. K. Nagaraja, V. I. Morariu, and L. S. Davis. Modeling Context Between Objects for Referring Expression Understanding. ECCV, 2016.
-
(2016)
ECCV
-
-
Nagaraja, V.K.1
Morariu, V.I.2
Davis, L.S.3
-
27
-
-
85083951479
-
Sequence Level Training with Recurrent Neural Networks
-
M. Ranzato, S. Chopra, M. Auli, and W. Zaremba. Sequence Level Training with Recurrent Neural Networks. ICLR, pages 1-15, 2016.
-
(2016)
ICLR
, pp. 1-15
-
-
Ranzato, M.1
Chopra, S.2
Auli, M.3
Zaremba, W.4
-
28
-
-
84986250442
-
Learning Deep Representations of Fine-Grained Visual Descriptions
-
S. Reed, Z. Akata, H. Lee, and B. Schiele. Learning Deep Representations of Fine-Grained Visual Descriptions. CVPR, pages 49-58, 2016.
-
(2016)
CVPR
, pp. 49-58
-
-
Reed, S.1
Akata, Z.2
Lee, H.3
Schiele, B.4
-
29
-
-
85044386408
-
-
1511.03745V1
-
A. Rohrbach, M. Rohrbach, R. Hu, T. Darrell, and B. Schiele. Grounding of Textual Phrases in Images by Reconstruction. 1511.03745V1, 1:1-10, 2015.
-
(2015)
Grounding of Textual Phrases in Images by Reconstruction
, vol.1
, pp. 1-10
-
-
Rohrbach, A.1
Rohrbach, M.2
Hu, R.3
Darrell, T.4
Schiele, B.5
-
31
-
-
84992670816
-
-
arXiv preprint, (2005)
-
I. Vendrov, R. Kiros, S. Fidler, and R. Urtasun. Order-Embeddings of Images and Language. arXiv preprint, (2005):1-13, 2015.
-
(2015)
Order-Embeddings of Images and Language
, pp. 1-13
-
-
Vendrov, I.1
Kiros, R.2
Fidler, S.3
Urtasun, R.4
-
32
-
-
84946747440
-
Show and tell: A neural image caption generator
-
O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3156-3164, 2015.
-
(2015)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, pp. 3156-3164
-
-
Vinyals, O.1
Toshev, A.2
Bengio, S.3
Erhan, D.4
-
33
-
-
84986271102
-
Learning Deep Structure-Preserving Image-Text Embeddings
-
L. Wang, Y. Li, and S. Lazebnik. Learning Deep Structure-Preserving Image-Text Embeddings. CVPR, pages 5005-5013, 2016.
-
(2016)
CVPR
, pp. 5005-5013
-
-
Wang, L.1
Li, Y.2
Lazebnik, S.3
-
35
-
-
84970002232
-
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
-
K. Xu, J. L. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML-2015, 2015.
-
(2015)
ICML-2015
-
-
Xu, K.1
Ba, J.L.2
Kiros, R.3
Cho, K.4
Courville, A.5
Salakhutdinov, R.6
Zemel, R.S.7
Bengio, Y.8
-
36
-
-
84990061297
-
Modeling Context in Referring Expressions
-
L. Yu, P. Poirson, S. Yang, A. C. Berg, and T. L. Berg. Modeling Context in Referring Expressions. In ECCV, 2016.
-
(2016)
ECCV
-
-
Yu, L.1
Poirson, P.2
Yang, S.3
Berg, A.C.4
Berg, T.L.5