SCOPUS Information Retrieval Platform

Proceedings of the IEEE International Conference on Computer Vision

Volume 2015 International Conference on Computer Vision, ICCV 2015, 2015, Pages 2407-2415

Guiding the long-short term memory model for image caption generation

Authors (4): Jia, Xu (a); Gavves, Efstratios (b); Fernando, Basura (c); Tuytelaars, Tinne (a)

a UNIVERSITY OF LEUVEN (Belgium)

b UNIVERSITY OF AMSTERDAM (Netherlands)

c AUSTRALIAN NATIONAL UNIVERSITY (Australia)

Author keywords

[No Author keywords available]

Indexed keywords

BRAIN; SEMANTICS;

BENCHMARK DATASETS; IMAGE CAPTION; IMAGE CONTENT; LENGTH NORMALIZATION; LONG SHORT TERM MEMORY; SEMANTIC INFORMATION; STATE OF THE ART; TIGHTLY-COUPLED;

COMPUTER VISION;

EID: 84973917813 PISSN: 15505499 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICCV.2015.277 Document Type: Conference Paper

Times cited : (417)

References (40)

1
- 85083953689
- Neural machine translation by jointly learning to align and translate
- 1, 2, 3, 4
- D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. In ICLR, 2015. 1, 2, 3, 4
- (2015) ICLR
- Bahdanau, D.¹ Cho, K.² Bengio, Y.³

2
- 70349549313
- Natural Language Processing with Python
- 6
- S. Bird, E. Klein, and E. Loper. Natural Language Processing with Python. O'Reilly, 2009. 6
- (2009) Natural Language Processing with Python
- Bird, S.¹ Klein, E.² Loper, E.³

3
- 84957029470
- Mind's eye: A recurrent visual representation for image caption generation
- 6
- X. Chen and C. L. Zitnick. Mind's eye: A recurrent visual representation for image caption generation. In CVPR, 2015. 6
- (2015) CVPR
- Chen, X.¹ Zitnick, C.L.²

4
- 85097641926
- On the properties of neural machine translation: Encoder-decoder approaches
- 4, 5
- K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio. On the properties of neural machine translation: Encoder-decoder approaches. In Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8), 2014. 4, 5
- (2014) Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8)
- Cho, K.¹ Van Merrienboer, B.² Bahdanau, D.³ Bengio, Y.⁴

5
- 84961291190
- Learning phrase representations using RNN encoder-decoder for statistical machine translation
- 1, 2, 3
- K. Cho, B. van Merrienboer, C. Gülcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In EMNLP, 2014. 1, 2, 3
- (2014) EMNLP
- Cho, K.¹ Van Merrienboer, B.² Gülcehre, C.³ Bahdanau, D.⁴ Bougares, F.⁵ Schwenk, H.⁶ Bengio, Y.⁷

6
- 85107661995
- Meteor universal: Language specific translation evaluation for any target language
- 6
- M. Denkowski and A. Lavie. Meteor universal: Language specific translation evaluation for any target language. In EACL 2014 Workshop on Statistical Machine Translation, 2014. 6
- (2014) EACL 2014 Workshop on Statistical Machine Translation
- Denkowski, M.¹ Lavie, A.²

7
- 84959236502
- Long-term recurrent convolutional networks for visual recognition and description
- 1, 2, 3, 8
- J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015. 1, 2, 3, 8
- (2015) CVPR
- Donahue, J.¹ Hendricks, L.A.² Guadarrama, S.³ Rohrbach, M.⁴ Venugopalan, S.⁵ Saenko, K.⁶ Darrell, T.⁷

8
- 84959250180
- From captions to visual concepts and back
- 6
- H. Fang, S. Gupta, F. N. Iandola, R. Srivastava, L. Deng, P. Dollár, J. Gao, X. He, M. Mitchell, J. C. Platt, C. L. Zitnick, and G. Zweig. From captions to visual concepts and back. In CVPR, 2015. 6
- (2015) CVPR
- Fang, H.¹ Gupta, S.² Iandola, F.N.³ Srivastava, R.⁴ Deng, L.⁵ Dollár, P.⁶ Gao, J.⁷ He, X.⁸ Mitchell, M.⁹ Platt, J.C.¹⁰ Zitnick, C.L.¹¹ Zweig, G.¹²

9
- 80052017343
- Every picture tells a story: Generating sentences from images
- 1, 2
- A. Farhadi, S. M. M. Hejrati, M. A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. A. Forsyth. Every picture tells a story: Generating sentences from images. In ECCV (4), 2010. 1, 2
- (2010) ECCV , Issue.4
- Farhadi, A.¹ Hejrati, S.M.M.² Sadeghi, M.A.³ Young, P.⁴ Rashtchian, C.⁵ Hockenmaier, J.⁶ Forsyth, D.A.⁷

10
- 84894905366
- A multi-view embedding space for modeling internet images, tags, and their semantics
- 4, 6
- Y. Gong, Q. Ke, M. Isard, and S. Lazebnik. A multi-view embedding space for modeling internet images, tags, and their semantics. IJCV, 106 (2): 210-233, 2014. 4, 6
- (2014) IJCV , vol.106 , Issue.2 , pp. 210-233
- Gong, Y.¹ Ke, Q.² Isard, M.³ Lazebnik, S.⁴

11
- 84897549167
- Sequence transduction with recurrent neural networks
- 5
- A. Graves. Sequence transduction with recurrent neural networks. CoRR, abs/1211.3711, 2012. 5
- (2012) Sequence Transduction with Recurrent Neural Networks
- Graves, A.¹

12
- 84906979661
- Generating sequences with recurrent neural networks
- 5
- A. Graves. Generating sequences with recurrent neural networks. CoRR, abs/1308.0850, 2013. 5
- (2013) Generating Sequences with Recurrent Neural Networks
- Graves, A.¹

13
- 84943739264
- LSTM: A search space odyssey
- 3
- K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber. LSTM: A search space odyssey. CoRR, abs/1503.04069, 2015. 3
- (2015) LSTM: A Search Space Odyssey
- Greff, K.¹ Srivastava, R.K.² Koutník, J.³ Steunebrink, B.R.⁴ Schmidhuber, J.⁵

14
- 0031573117
- Long short-term memory
- 2, 3
- S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Comput., 9 (8): 1735-1780, 1997. 2, 3
- (1997) Neural Comput. , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

15
- 84883394520
- Framing image description as a ranking task: Data, models and evaluation metrics
- 6
- M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. JAIR, 47: 853-899, 2013. 6
- (2013) JAIR , vol.47 , pp. 853-899
- Hodosh, M.¹ Young, P.² Hockenmaier, J.³

16
- 0000107975
- Relations between two sets of variates
- 4
- H. Hotelling. Relations between two sets of variates. Biometrika, pages 321-377, 1936. 4
- (1936) Biometrika , pp. 321-377
- Hotelling, H.¹

17
- 84946734827
- Deep visual-semantic alignments for generating image descriptions
- 1, 2, 3, 6, 8
- A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015. 1, 2, 3, 6, 8
- (2015) CVPR
- Karpathy, A.¹ Fei-Fei, L.²

18
- 84937843643
- Deep fragment embeddings for bidirectional image sentence mapping
- 6
- A. Karpathy, A. Joulin, and F. Li. Deep fragment embeddings for bidirectional image sentence mapping. In NIPS, 2014. 6
- (2014) NIPS
- Karpathy, A.¹ Joulin, A.² Li, F.³

19
- 84919921461
- Multimodal neural language models
- 1, 2, 8
- R. Kiros, R. Salakhutdinov, and R. S. Zemel. Multimodal neural language models. In ICML, 2014. 1, 2, 8
- (2014) ICML
- Kiros, R.¹ Salakhutdinov, R.² Zemel, R.S.³

20
- 84887601544
- Babytalk: Understanding and generating simple image descriptions
- 1, 2
- G. Kulkarni, V. Premraj, V. Ordonez, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Babytalk: Understanding and generating simple image descriptions. TPAMI, 35 (12): 2891-2903, 2013. 1, 2
- (2013) TPAMI , vol.35 , Issue.12 , pp. 2891-2903
- Kulkarni, G.¹ Premraj, V.² Ordonez, V.³ Dhar, S.⁴ Li, S.⁵ Choi, Y.⁶ Berg, A.C.⁷ Berg, T.L.⁸

21
- 84878189119
- Collective generation of natural image descriptions
- 1, 2
- P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi. Collective generation of natural image descriptions. In ACL, 2012. 1, 2
- (2012) ACL
- Kuznetsova, P.¹ Ordonez, V.² Berg, A.C.³ Berg, T.L.⁴ Choi, Y.⁵

22
- 84907331257
- Generalizing image captions for image-text parallel corpus
- 1, 2
- P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi. Generalizing image captions for image-text parallel corpus. In ACL, 2013. 1, 2
- (2013) ACL
- Kuznetsova, P.¹ Ordonez, V.² Berg, A.C.³ Berg, T.L.⁴ Choi, Y.⁵

23
- 84934873221
- Treetalk: Composition and compression of trees for image descriptions
- 1, 2
- P. Kuznetsova, V. Ordonez, T. Berg, and Y. Choi. Treetalk: Composition and compression of trees for image descriptions. TACL, 2: 351-362, 2014. 1, 2
- (2014) TACL , vol.2 , pp. 351-362
- Kuznetsova, P.¹ Ordonez, V.² Berg, T.³ Choi, Y.⁴

24
- 52149112996
- Meteor: An automatic metric for MT evaluation with high levels of correlation with human judgments
- 6
- A. Lavie and A. Agarwal. Meteor: An automatic metric for MT evaluation with high levels of correlation with human judgments. In Second Workshop on Statistical Machine Translation, 2007. 6
- (2007) Second Workshop on Statistical Machine Translation
- Lavie, A.¹ Agarwal, A.²

25
- 84937834115
- Microsoft COCO: Common objects in context
- 6
- T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: common objects in context. In ECCV, 2014. 6
- (2014) ECCV
- Lin, T.¹ Maire, M.² Belongie, S.³ Hays, J.⁴ Perona, P.⁵ Ramanan, D.⁶ Dollár, P.⁷ Zitnick, C.L.⁸

26
- 85083950512
- Deep captioning with multimodal recurrent neural networks (m-RNN)
- 1, 2, 3, 8
- J. Mao, W. Xu, Y. Yang, J. Wang, and A. L. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). In ICLR, 2015. 1, 2, 3, 8
- (2015) ICLR
- Mao, J.¹ Xu, W.² Yang, Y.³ Wang, J.⁴ Yuille, A.L.⁵

27
- 84906925144
- Nonparametric method for data-driven image captioning
- 1, 2
- R. Mason and E. Charniak. Nonparametric method for data-driven image captioning. In ACL, 2014. 1, 2
- (2014) ACL
- Mason, R.¹ Charniak, E.²

28
- 85034832841
- Midge: Generating image descriptions from computer vision detections
- 1, 2
- M. Mitchell, J. Dodge, A. Goyal, K. Yamaguchi, K. Stratos, X. Han, A. Mensch, A. C. Berg, T. L. Berg, and H. Daumé III. Midge: Generating image descriptions from computer vision detections. In EACL, 2012. 1, 2
- (2012) EACL
- Mitchell, M.¹ Dodge, J.² Goyal, A.³ Yamaguchi, K.⁴ Stratos, K.⁵ Han, X.⁶ Mensch, A.⁷ Berg, A.C.⁸ Berg, T.L.⁹

29
- 85133336275
- Bleu: A method for automatic evaluation of machine translation
- 6
- K. Papineni, S. Roukos, T. Ward, and W. Zhu. Bleu: A method for automatic evaluation of machine translation. In ACL, 2002. 6
- (2002) ACL
- Papineni, K.¹ Roukos, S.² Ward, T.³ Zhu, W.⁴

30
- 85083953063
- Very deep convolutional networks for large-scale image recognition
- 6, 8
- K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015. 6, 8
- (2015) ICLR
- Simonyan, K.¹ Zisserman, A.²

31
- 80053459857
- Generating text with recurrent neural networks
- 2
- I. Sutskever, J. Martens, and G. Hinton. Generating text with recurrent neural networks. In ICML, 2011. 2
- (2011) ICML
- Sutskever, I.¹ Martens, J.² Hinton, G.³

32
- 84928547704
- Sequence to sequence learning with neural networks
- 1, 2, 3, 5
- I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014. 1, 2, 3, 5
- (2014) NIPS
- Sutskever, I.¹ Vinyals, O.² Le, Q.V.³

33
- 84973926705
- Going deeper with convolutions
- 8
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2014. 8
- (2014) CVPR
- Szegedy, C.¹ Liu, W.² Jia, Y.³ Sermanet, P.⁴ Reed, S.⁵ Anguelov, D.⁶ Erhan, D.⁷ Vanhoucke, V.⁸ Rabinovich, A.⁹

34
- 84973891116
- Lecture 6.5-rmsprop
- 6
- T. Tieleman and G. Hinton. Lecture 6.5-rmsprop. Technical Report MSU-CSE-00-2, 2000. 6
- (2000) Lecture 6.5-rmsprop
- Tieleman, T.¹ Hinton, G.²

35
- 84937504995
- MatConvNet - convolutional neural networks for MATLAB
- 6
- A. Vedaldi and K. Lenc. MatConvNet - convolutional neural networks for MATLAB. CoRR, abs/1412.4564, 2014. 6
- (2014) MatConvNet - Convolutional Neural Networks for MATLAB
- Vedaldi, A.¹ Lenc, K.²

36
- 84956980995
- Cider: Consensus-based image description evaluation
- 6
- R. Vedantam, C. L. Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. In CVPR, 2015. 6
- (2015) CVPR
- Vedantam, R.¹ Zitnick, C.L.² Parikh, D.³

37
- 84946747440
- Show and tell: A neural image caption generator
- 1, 2, 3, 4, 6, 8
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015. 1, 2, 3, 4, 6, 8
- (2015) CVPR
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

38
- 84970002232
- Show, attend and tell: Neural image caption generation with visual attention
- 1, 2, 3, 6, 8
- K. Xu, J. Ba, R. Kiros, K. Cho, A. C. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, 2015. 1, 2, 3, 6, 8
- (2015) ICML
- Xu, K.¹ Ba, J.² Kiros, R.³ Cho, K.⁴ Courville, A.C.⁵ Salakhutdinov, R.⁶ Zemel, R.S.⁷ Bengio, Y.⁸

39
- 80053258778
- Corpus-guided sentence generation of natural images
- 1, 2
- Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In EMNLP, 2011. 1, 2
- (2011) EMNLP
- Yang, Y.¹ Teo, C.L.² Aloimonos, Y.³

40
- 84906494296
- From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
- 6
- P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2: 67-78, 2014. 6
- (2014) TACL , vol.2 , pp. 67-78
- Young, P.¹ Lai, A.² Hodosh, M.³ Hockenmaier, J.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.