-
1
-
-
84910073639
-
Pushdown automata in statistical machine translation
-
Cyril Allauzen, Bill Byrne, Adrià de Gispert, Gonzalo Iglesias, and Michael Riley. 2014. Pushdown automata in statistical machine translation. Computational Linguistics 40(3):687–723.
-
(2014)
Computational Linguistics
, vol.40
, Issue.3
, pp. 687-723
-
-
Allauzen, C.1
Byrne, B.2
de Gispert, A.3
Iglesias, G.4
Riley, M.5
-
2
-
-
85021678581
-
SPICE: Semantic propositional image caption evaluation
-
Peter Anderson, Basura Fernando, Mark Johnson, and Stephen Gould. 2016. SPICE: Semantic propositional image caption evaluation. In ECCV.
-
(2016)
ECCV
-
-
Anderson, P.1
Fernando, B.2
Johnson, M.3
Gould, S.4
-
4
-
-
84952349295
-
-
arXiv preprint
-
Xinlei Chen, Hao Fang, Tsung-Yi Lin, Ramakrishna Vedantam, Saurabh Gupta, Piotr Dollar, and C. Lawrence Zitnick. 2015. Microsoft COCO captions: Data collection and evaluation server. arXiv preprint arXiv:1504.00325.
-
(2015)
Microsoft COCO Captions: Data Collection and Evaluation Server
-
-
Chen, X.1
Fang, H.2
Lin, T.-Y.3
Vedantam, R.4
Gupta, S.5
Dollar, P.6
Zitnick, C.L.7
-
6
-
-
84944096380
-
Language models for image captioning: The quirks and what works
-
Jacob Devlin, Hao Cheng, Hao Fang, Saurabh Gupta, Li Deng, Xiaodong He, Geoffrey Zweig, and Margaret Mitchell. 2015. Language models for image captioning: The quirks and what works. In ACL.
-
(2015)
ACL
-
-
Devlin, J.1
Cheng, H.2
Fang, H.3
Gupta, S.4
Deng, L.5
He, X.6
Zweig, G.7
Mitchell, M.8
-
7
-
-
84959236502
-
Long-term recurrent convolutional networks for visual recognition and description
-
Jeffrey Donahue, Lisa A. Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell. 2015. Long-term recurrent convolutional networks for visual recognition and description. In CVPR.
-
(2015)
CVPR
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
8
-
-
84943812736
-
Describing images using inferred visual dependency representations
-
Desmond Elliott and Arjen P. de Vries. 2015. Describing images using inferred visual dependency representations. In ACL.
-
(2015)
ACL
-
-
Elliott, D.1
de Vries, A.P.2
-
9
-
-
84959250180
-
From captions to visual concepts and back
-
Hao Fang, Saurabh Gupta, Forrest N. Iandola, Rupesh Srivastava, Li Deng, Piotr Dollar, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, C. Lawrence Zitnick, and Geoffrey Zweig. 2015. From captions to visual concepts and back. In CVPR.
-
(2015)
CVPR
-
-
Fang, H.1
Gupta, S.2
Iandola, F.N.3
Srivastava, R.4
Deng, L.5
Dollar, P.6
Gao, J.7
He, X.8
Mitchell, M.9
Platt, J.C.10
Zitnick, C.L.11
Zweig, G.12
-
12
-
-
84986274465
-
Deep residual learning for image recognition
-
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR.
-
(2016)
CVPR
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
13
-
-
84986274522
-
Deep compositional captioning: Describing novel object categories without paired training data
-
Lisa Anne Hendricks, Subhashini Venugopalan, Marcus Rohrbach, Raymond Mooney, Kate Saenko, and Trevor Darrell. 2016. Deep compositional captioning: Describing novel object categories without paired training data. In CVPR.
-
(2016)
CVPR
-
-
Hendricks, L.A.1
Venugopalan, S.2
Rohrbach, M.3
Mooney, R.4
Saenko, K.5
Darrell, T.6
-
15
-
-
84913555165
-
-
arXiv preprint
-
Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093.
-
(2014)
Caffe: Convolutional Architecture for Fast Feature Embedding
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.6
Guadarrama, S.7
Darrell, T.8
-
16
-
-
84946734827
-
Deep visual-semantic alignments for generating image descriptions
-
Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In CVPR.
-
(2015)
CVPR
-
-
Karpathy, A.1
Fei-Fei, L.2
-
17
-
-
49449108990
-
-
Cambridge University Press, New York, NY, USA, 1st edition
-
Philipp Koehn. 2010. Statistical Machine Translation. Cambridge University Press, New York, NY, USA, 1st edition.
-
(2010)
Statistical Machine Translation
-
-
Koehn, P.1
-
18
-
-
85044305404
-
The unreasonable effectiveness of noisy data for fine-grained recognition
-
Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, and Li Fei-Fei. 2016. The unreasonable effectiveness of noisy data for fine-grained recognition. In ECCV.
-
(2016)
ECCV
-
-
Krause, J.1
Sapp, B.2
Howard, A.3
Zhou, H.4
Toshev, A.5
Duerig, T.6
Philbin, J.7
Fei-Fei, L.8
-
19
-
-
84937834115
-
Microsoft COCO: Common objects in context
-
T.Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, and C. L. Zitnick. 2014. Microsoft COCO: Common objects in context. In ECCV.
-
(2014)
ECCV
-
-
Lin, T.Y.1
Maire, M.2
Belongie, S.3
Hays, J.4
Perona, P.5
Ramanan, D.6
Dollar, P.7
Zitnick, C.L.8
-
20
-
-
85083950512
-
Deep captioning with multimodal recurrent neural networks (m-RNN)
-
Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, and Alan L. Yuille. 2015. Deep captioning with multimodal recurrent neural networks (m-RNN). In ICLR.
-
(2015)
ICLR
-
-
Mao, J.1
Xu, W.2
Yang, Y.3
Wang, J.4
Yuille, A.L.5
-
21
-
-
84961289992
-
GloVe: Global vectors for word representation
-
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In EMNLP.
-
(2014)
EMNLP
-
-
Pennington, J.1
Socher, R.2
Manning, C.D.3
-
22
-
-
84960980241
-
Faster R-CNN: Towards real-time object detection with region proposal networks
-
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS.
-
(2015)
NIPS
-
-
Ren, S.1
He, K.2
Girshick, R.3
Sun, J.4
-
23
-
-
84947041871
-
ImageNet large scale visual recognition challenge
-
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. 2015. ImageNet large scale visual recognition challenge. International Journal of Computer Vision (IJCV) 115(3):211–252.
-
(2015)
International Journal of Computer Vision (IJCV)
, vol.115
, Issue.3
, pp. 211-252
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
Berg, A.C.11
Fei-Fei, L.12
-
24
-
-
85083953063
-
Very deep convolutional networks for large-scale image recognition
-
Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In ICLR.
-
(2015)
ICLR
-
-
Simonyan, K.1
Zisserman, A.2
-
26
-
-
85010205139
-
Rich image captioning in the wild
-
Kenneth Tran, Xiaodong He, Lei Zhang, Jian Sun, Cornelia Carapcea, Chris Thrasher, Chris Buehler, and Chris Sienkiewicz. 2016. Rich image captioning in the wild. In CVPR Workshop.
-
(2016)
CVPR Workshop
-
-
Tran, K.1
He, X.2
Zhang, L.3
Sun, J.4
Carapcea, C.5
Thrasher, C.6
Buehler, C.7
Sienkiewicz, C.8
-
27
-
-
84956980995
-
CIDEr: Consensus-based image description evaluation
-
Ramakrishna Vedantam, C. Lawrence Zitnick, and Devi Parikh. 2015. CIDEr: Consensus-based image description evaluation. In CVPR.
-
(2015)
CVPR
-
-
Vedantam, R.1
Zitnick, C.L.2
Parikh, D.3
-
28
-
-
85034846838
-
-
arXiv preprint
-
Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond J. Mooney, Trevor Darrell, and Kate Saenko. 2016. Captioning images with diverse objects. arXiv preprint arXiv:1606.07770.
-
(2016)
Captioning Images with Diverse Objects
-
-
Venugopalan, S.1
Hendricks, L.A.2
Rohrbach, M.3
Mooney, R.J.4
Darrell, T.5
Saenko, K.6
-
29
-
-
84946747440
-
Show and tell: A neural image caption generator
-
Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In CVPR.
-
(2015)
CVPR
-
-
Vinyals, O.1
Toshev, A.2
Bengio, S.3
Erhan, D.4
-
30
-
-
84986301177
-
What value do explicit high level concepts have in vision to language problems?
-
Q. Wu, C. Shen, L. Liu, A. Dick, and A. van den Hengel. 2016. What Value Do Explicit High Level Concepts Have in Vision to Language Problems? In CVPR.
-
(2016)
CVPR
-
-
Wu, Q.1
Shen, C.2
Liu, L.3
Dick, A.4
van den Hengel, A.5
-
31
-
-
84906494296
-
From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
-
Peter Young, Alice Lai, Micah Hodosh, and Julia Hockenmaier. 2014. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL.
-
(2014)
TACL
-
-
Young, P.1
Lai, A.2
Hodosh, M.3
Hockenmaier, J.4
-
32
-
-
84986272569
-
Fast zero-shot image tagging
-
Yang Zhang, Boqing Gong, and Mubarak Shah. 2016. Fast zero-shot image tagging. In CVPR.
-
(2016)
CVPR
-
-
Zhang, Y.1
Gong, B.2
Shah, M.3