[2] J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
[3] H. Fang, S. Gupta, F. N. Iandola, R. Srivastava, L. Deng, P. Dollár, J. Gao, X. He, M. Mitchell, J. C. Platt, C. L. Zitnick, and G. Zweig. From captions to visual concepts and back. In CVPR, 2015.
[4] A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, T. Mikolov, et al. DeViSE: A deep visual-semantic embedding model. In Advances in Neural Information Processing Systems, pages 2121-2129, 2013.
[5] C. Gulcehre, O. Firat, K. Xu, K. Cho, L. Barrault, H. Lin, F. Bougares, H. Schwenk, and Y. Bengio. On using monolingual corpora in neural machine translation. arXiv preprint arXiv:1503.03535, 2015.
[6] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[7] L. A. Hendricks, S. Venugopalan, M. Rohrbach, R. Mooney, K. Saenko, and T. Darrell. Deep compositional captioning: Describing novel object categories without paired training data. In CVPR, 2016.
[8] A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
[10] R. Kiros, R. Salakhutdinov, and R. S. Zemel. Unifying visual-semantic embeddings with multimodal neural language models. TACL, 2015.
[11] P. Kuznetsova, V. Ordonez, T. L. Berg, and Y. Choi. TreeTalk: Composition and compression of trees for image descriptions. TACL, 2014.
[12] A. Lazaridou, E. Bruni, and M. Baroni. Is this a wampimuk? Cross-modal mapping between distributional semantics and the visual world. In ACL, 2014.
[13] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014.
[14] C. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky. The Stanford CoreNLP natural language processing toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 55-60, 2014.
[15] J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). In ICLR, 2015.
[16] J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. L. Yuille. Learning like a child: Fast novel visual concept learning from sentence descriptions of images. In ICCV, 2015.
[17] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013.
[18] M. Mitchell, J. Dodge, A. Goyal, K. Yamaguchi, K. Stratos, X. Han, A. Mensch, A. C. Berg, T. L. Berg, and H. Daumé III. Midge: Generating image descriptions from computer vision detections. In EACL, 2012.
[19] M. Norouzi, T. Mikolov, S. Bengio, Y. Singer, J. Shlens, A. Frome, G. S. Corrado, and J. Dean. Zero-shot learning by convex combination of semantic embeddings. arXiv preprint arXiv:1312.5650, 2013.
[21] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ILSVRC, 2014.
[22] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
[23] R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng. Grounded compositional semantics for finding and describing images with sentences. TACL, 2014.
[25] S. Venugopalan, L. A. Hendricks, R. Mooney, and K. Saenko. Improving LSTM-based video description with linguistic knowledge mined from text. In EMNLP, 2016.
[27] Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In EMNLP, 2011.