SCOPUS Information Retrieval Platform

32nd International Conference on Machine Learning, ICML 2015

Volume 3, 2015, Pages 2085-2094

Phrase-based image captioning

(3) Lebret, Rémi (a); Pinheiro, Pedro O. (b); Collobert, Ronan (c)

a IDIAP RESEARCH INSTITUTE (Switzerland)

b EPFL (Switzerland)

c FACEBOOK AI RESEARCH (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; COMPUTATIONAL LINGUISTICS; LEARNING ALGORITHMS; LEARNING SYSTEMS; NATURAL LANGUAGE PROCESSING SYSTEMS; NEURAL NETWORKS; SYNTACTICS;

BILINEAR MODELS; CONVOLUTIONAL NEURAL NETWORK; IMAGE CAPTIONING; IMAGE REPRESENTATIONS; NATURAL LANGUAGE PROCESSING; SIMPLE MODELING; STATE OF THE ART; TEXTUAL DESCRIPTION;

COMPUTER VISION;

EID: 84970028761; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited: 104

References (24)

1
- 84957029470
- Mind's eye: A recurrent visual representation for image caption generation
- Chen, X. and Zitnick, C. L. Mind's Eye: A Recurrent Visual Representation for Image Caption Generation. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
- (2015) IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)
- Chen, X.¹ Zitnick, C.L.²

2
- 84944046597
- arXiv preprint arXiv:1411.4389
- Donahue, J., Hendricks, L. A., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. Long-term Recurrent Convolutional Networks for Visual Recognition and Description. arXiv preprint arXiv:1411.4389, 2014.
- (2014) Long-term Recurrent Convolutional Networks for Visual Recognition and Description
- Donahue, J.¹ Hendricks, L.A.² Guadarrama, S.³ Rohrbach, M.⁴ Venugopalan, S.⁵ Saenko, K.⁶ Darrell, T.⁷

3
- 84959250180
- From captions to visual concepts and back
- Fang, H., Gupta, S., Iandola, F., Srivastava, R., Deng, L., Dollár, P., Gao, J., He, X., Mitchell, M., Platt, J., Zitnick, C. L., and Zweig, G. From captions to visual concepts and back. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
- (2015) IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)
- Fang, H.¹ Gupta, S.² Iandola, F.³ Srivastava, R.⁴ Deng, L.⁵ Dollár, P.⁶ Gao, J.⁷ He, X.⁸ Mitchell, M.⁹ Platt, J.¹⁰ Zitnick, C.L.¹¹ Zweig, G.¹²

4
- 84883394520
- Framing image description as a ranking task: Data, models and evaluation metrics
- Hodosh, M., Young, P., and Hockenmaier, J. Framing image description as a ranking task: data, models and evaluation metrics. Journal of Artificial Intelligence Research, 2013.
- (2013) Journal of Artificial Intelligence Research
- Hodosh, M.¹ Young, P.² Hockenmaier, J.³

5
- 84946734827
- Deep visual-semantic alignments for generating image descriptions
- Karpathy, A. and Fei-Fei, L. Deep Visual-Semantic Alignments for Generating Image Descriptions. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
- (2015) IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)
- Karpathy, A.¹ Fei-Fei, L.²

6
- 84944113729
- arXiv preprint arXiv:1411.2539
- Kiros, R., Salakhutdinov, R., and Zemel, R. S. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models. arXiv preprint arXiv:1411.2539, 2014.
- (2014) Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
- Kiros, R.¹ Salakhutdinov, R.² Zemel, R.S.³

7
- 84887601544
- Baby talk: Understanding and generating simple image descriptions
- Kulkarni, G., Premraj, V., Dhar, S., Li, Siming, Choi, Yejin, Berg, A. C., and Berg, T. L. Baby Talk: Understanding and Generating Simple Image Descriptions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (12):2891-2903, 2013.
- (2013) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.35 , Issue.12 , pp. 2891-2903
- Kulkarni, G.¹ Premraj, V.² Dhar, S.³ Li, S.⁴ Choi, Y.⁵ Berg, A.C.⁶ Berg, T.L.⁷

8
- 84878189119
- Collective generation of natural image descriptions
- Association for Computational Linguistics, July
- Kuznetsova, P., Ordonez, V., Berg, A. C., Berg, T. L., and Choi, Y. Collective Generation of Natural Image Descriptions. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 359-368. Association for Computational Linguistics, July 2012.
- (2012) Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pp. 359-368
- Kuznetsova, P.¹ Ordonez, V.² Berg, A.C.³ Berg, T.L.⁴ Choi, Y.⁵

9
- 84942673026
- Rehabilitation of count-based models for word vector representations
- Gelbukh, Alexander (ed.), of Lecture Notes in Computer Science, Springer International Publishing
- Lebret, R. and Collobert, R. Rehabilitation of count-based models for word vector representations. In Gelbukh, Alexander (ed.), Computational Linguistics and Intelligent Text Processing, volume 9041 of Lecture Notes in Computer Science, pp. 417-429. Springer International Publishing, 2015.
- (2015) Computational Linguistics and Intelligent Text Processing , vol.9041 , pp. 417-429
- Lebret, R.¹ Collobert, R.²

10
- 0032203257
- Gradient-based learning applied to document recognition
- Le Cun, Y., Bottou, L., Bengio, Y., and Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
- (1998) Proceedings of the IEEE
- Le Cun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

11
- 84906493406
- Microsoft COCO: Common objects in context
- Springer
- Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. Microsoft COCO: Common Objects in Context. In Computer Vision - ECCV 2014, pp. 740-755. Springer, 2014.
- (2014) Computer Vision - ECCV 2014 , pp. 740-755
- Lin, T.-Y.¹ Maire, M.² Belongie, S.³ Hays, J.⁴ Perona, P.⁵ Ramanan, D.⁶ Dollár, P.⁷ Zitnick, C.L.⁸

12
- 85083950512
- Deep captioning with multimodal recurrent neural networks (m-RNN)
- Mao, J., Xu, W., Yang, Y., Wang, J., Huang, Z., and Yuille, A. L. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN). In International Conference on Learning Representations (ICLR), 2015.
- (2015) International Conference on Learning Representations (ICLR)
- Mao, J.¹ Xu, W.² Yang, Y.³ Wang, J.⁴ Huang, Z.⁵ Yuille, A.L.⁶

13
- 85083951332
- arXiv preprint arXiv:1301.3781
- Mikolov, T., Chen, K., Corrado, G., and Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv preprint arXiv:1301.3781, 2013a.
- (2013) Efficient Estimation of Word Representations in Vector Space
- Mikolov, T.¹ Chen, K.² Corrado, G.³ Dean, J.⁴

14
- 84898956512
- Distributed representations of words and phrases and their compositionality
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. Distributed Representations of Words and Phrases and their Compositionality. In Advances in Neural Information Processing Systems, pp. 3111-3119. 2013b.
- (2013) Advances in Neural Information Processing Systems , pp. 3111-3119
- Mikolov, T.¹ Sutskever, I.² Chen, K.³ Corrado, G.⁴ Dean, J.⁵

15
- 85034832841
- Midge: Generating image descriptions from computer vision detections
- Association for Computational Linguistics
- Mitchell, M., Han, X., Dodge, J., Mensch, A., Goyal, A., Berg, A., Yamaguchi, K., Berg, T., Stratos, K., and Daume, III, H. Midge: Generating Image Descriptions from Computer Vision Detections. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 747-756. Association for Computational Linguistics, 2012.
- (2012) Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics , pp. 747-756
- Mitchell, M.¹ Han, X.² Dodge, J.³ Mensch, A.⁴ Goyal, A.⁵ Berg, A.⁶ Yamaguchi, K.⁷ Berg, T.⁸ Stratos, K.⁹ Daume, H.¹⁰

16
- 84898987069
- Learning word embeddings efficiently with noise-contrastive estimation
- Mnih, A. and Kavukcuoglu, Koray. Learning word embeddings efficiently with noise-contrastive estimation. In Advances in Neural Information Processing Systems, pp. 2265-2273. 2013.
- (2013) Advances in Neural Information Processing Systems , pp. 2265-2273
- Mnih, A.¹ Kavukcuoglu, K.²

17
- 85133336275
- BLEU: A method for automatic evaluation of machine translation
- Association for Computational Linguistics
- Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th annual meeting on association for computational linguistics, pp. 311-318. Association for Computational Linguistics, 2002.
- (2002) Proceedings of the 40th Annual Meeting on Association for Computational Linguistics , pp. 311-318
- Papineni, K.¹ Roukos, S.² Ward, T.³ Zhu, W.-J.⁴

18
- 84961289992
- GloVe: Global vectors for word representation
- Pennington, J., Socher, R., and Manning, C. D. GloVe: Global Vectors for Word Representation. In Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014), volume 12, 2014.
- (2014) Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014) , vol.12
- Pennington, J.¹ Socher, R.² Manning, C.D.³

19
- 84933585162
- Very deep convolutional networks for large-scale image recognition
- Simonyan, K. and Zisserman, A. Very deep convolutional networks for large-scale image recognition. CoRR, 2014.
- (2014) CoRR
- Simonyan, K.¹ Zisserman, A.²

20
- 84964474107
- Grounded compositional semantics for finding and describing images with sentences
- Socher, R., Karpathy, A., Le, Q. V., Manning, C. D., and Ng, A. Y. Grounded Compositional Semantics for Finding and Describing Images with Sentences. Transactions of the Association for Computational Linguistics, 2:207-218, 2014.
- (2014) Transactions of the Association for Computational Linguistics , vol.2 , pp. 207-218
- Socher, R.¹ Karpathy, A.² Le, Q.V.³ Manning, C.D.⁴ Ng, A.Y.⁵

21
- 84916911784
- Multimodal learning with deep boltzmann machines
- Srivastava, N. and Salakhutdinov, R. Multimodal Learning with Deep Boltzmann Machines. Journal of Machine Learning Research, 2014.
- (2014) Journal of Machine Learning Research
- Srivastava, N.¹ Salakhutdinov, R.²

22
- 84944069490
- arXiv preprint arXiv:1412.4729
- Venugopalan, S., Xu, H., Donahue, J., Rohrbach, M., Mooney, R. J., and Saenko, K. Translating Videos to Natural Language Using Deep Recurrent Neural Networks. arXiv preprint arXiv:1412.4729, 2014.
- (2014) Translating Videos to Natural Language Using Deep Recurrent Neural Networks
- Venugopalan, S.¹ Xu, H.² Donahue, J.³ Rohrbach, M.⁴ Mooney, R.J.⁵ Saenko, K.⁶

23
- 84939821075
- arXiv preprint arXiv:1411.4555
- Vinyals, O., Toshev, A., Bengio, S., and Erhan, D. Show and tell: A neural image caption generator. arXiv preprint arXiv:1411.4555, 2014.
- (2014) Show and Tell: A Neural Image Caption Generator
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

24
- 77954862144
- I2T: Image parsing to text description
- Yao, B. Z., Yang, X., Lin, L., Lee, M. W., and Zhu, S. C. I2T: Image Parsing to Text Description. Proceedings of the IEEE, 98(8): 1485-1508, 2010.
- (2010) Proceedings of the IEEE , vol.98 , Issue.8 , pp. 1485-1508
- Yao, B.Z.¹ Yang, X.² Lin, L.³ Lee, M.W.⁴ Zhu, S.C.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.