SCOPUS Information Search Platform

Advances in Neural Information Processing Systems

Volume 2015-January, 2015, Pages 73-81

Expressing an image stream with a sequence of natural sentences

(2) Park, Cesc Chunseong (a); Kim, Gunhee (a)

(a) Seoul National University (South Korea)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER HARDWARE DESCRIPTION LANGUAGES; CONVOLUTION; INFORMATION SCIENCE; NEURAL NETWORKS;

AMAZON MECHANICAL TURKS; BIDIRECTIONAL RECURRENT NEURAL NETWORKS; CONVOLUTIONAL NETWORKS; CONVOLUTIONAL NEURAL NETWORK; INPUT AND OUTPUTS; MULTIMODAL ARCHITECTURES; QUANTITATIVE MEASURES; SEQUENCE OF IMAGES;

RECURRENT NEURAL NETWORKS;

EID: 84965149840 PISSN: 10495258 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: (102)

References (32)

1
- 41549147844
- Modeling local coherence: An entity-based approach
- R. Barzilay and M. Lapata. Modeling Local Coherence: An Entity-Based Approach. In ACL, 2008.
- (2008) ACL
- Barzilay, R.¹ Lapata, M.²

2
- 70349549313
- Natural language processing with Python
- S. Bird, E. Loper, and E. Klein. Natural Language Processing with Python. O'Reilly Media Inc., 2009.
- (2009) O'Reilly Media Inc.
- Bird, S.¹ Loper, E.² Klein, E.³

3
- 84957029470
- Mind's eye: A recurrent visual representation for image caption generation
- X. Chen and C. L. Zitnick. Mind's Eye: A Recurrent Visual Representation for Image Caption Generation. In CVPR, 2015.
- (2015) CVPR
- Chen, X.¹ Zitnick, C.L.²

4
- 85126918482
- Latent semantic analysis for text segmentation
- F. Y. Y. Choi, P. Wiemer-Hastings, and J. Moore. Latent Semantic Analysis for Text Segmentation. In EMNLP, 2001.
- (2001) EMNLP
- Choi, F.Y.Y.¹ Wiemer-Hastings, P.² Moore, J.³

5
- 84959236502
- Long-term recurrent convolutional networks for visual recognition and description
- J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term Recurrent Convolutional Networks for Visual Recognition and Description. In CVPR, 2015.
- (2015) CVPR
- Donahue, J.¹ Hendricks, L.A.² Guadarrama, S.³ Rohrbach, M.⁴ Venugopalan, S.⁵ Saenko, K.⁶ Darrell, T.⁷

6
- 84959243872
- Improving image-sentence embeddings using large weakly annotated photo collections
- Y. Gong, L. Wang, M. Hodosh, J. Hockenmaier, and S. Lazebnik. Improving Image-Sentence Embeddings Using Large Weakly Annotated Photo Collections. In ECCV, 2014.
- (2014) ECCV
- Gong, Y.¹ Wang, L.² Hodosh, M.³ Hockenmaier, J.⁴ Lazebnik, S.⁵

7
- 84973911419
- Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification
- K. He, X. Zhang, S. Ren, and J. Sun. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv, 2015.
- (2015) arXiv
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

8
- 84883394520
- Framing image description as a ranking task: Data, models and evaluation metrics
- M. Hodosh, P. Young, and J. Hockenmaier. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics. JAIR, 47:853-899, 2013.
- (2013) JAIR , vol.47 , pp. 853-899
- Hodosh, M.¹ Young, P.² Hockenmaier, J.³

9
- 84946734827
- Deep visual-semantic alignments for generating image descriptions
- A. Karpathy and L. Fei-Fei. Deep Visual-Semantic Alignments for Generating Image Descriptions. In CVPR, 2015.
- (2015) CVPR
- Karpathy, A.¹ Fei-Fei, L.²

10
- 84959191227
- Joint photo stream and blog post summarization and exploration
- G. Kim, S. Moon, and L. Sigal. Joint Photo Stream and Blog Post Summarization and Exploration. In CVPR, 2015.
- (2015) CVPR
- Kim, G.¹ Moon, S.² Sigal, L.³

11
- 84959189488
- Ranking and retrieval of image sequences from multiple paragraph queries
- G. Kim, S. Moon, and L. Sigal. Ranking and Retrieval of Image Sequences from Multiple Paragraph Queries. In CVPR, 2015.
- (2015) CVPR
- Kim, G.¹ Moon, S.² Sigal, L.³

12
- 84919921461
- Multimodal neural language models
- R. Kiros, R. Salakhutdinov, and R. Zemel. Multimodal Neural Language Models. In ICML, 2014.
- (2014) ICML
- Kiros, R.¹ Salakhutdinov, R.² Zemel, R.³

13
- 84876231242
- ImageNet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In NIPS, 2012.
- (2012) NIPS
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

14
- 80052901011
- Baby talk: Understanding and generating image descriptions
- G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Baby Talk: Understanding and Generating Image Descriptions. In CVPR, 2011.
- (2011) CVPR
- Kulkarni, G.¹ Premraj, V.² Dhar, S.³ Li, S.⁴ Choi, Y.⁵ Berg, A.C.⁶ Berg, T.L.⁷

15
- 84934873221
- TreeTalk: Composition and compression of trees for image descriptions
- P. Kuznetsova, V. Ordonez, T. L. Berg, and Y. Choi. TreeTalk: Composition and Compression of Trees for Image Descriptions. In TACL, 2014.
- (2014) TACL
- Kuznetsova, P.¹ Ordonez, V.² Berg, T.L.³ Choi, Y.⁴

16
- 33847226906
- METEOR: An automatic metric for MT evaluation with improved correlation with human judgments
- S. Banerjee and A. Lavie. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In ACL, 2005.
- (2005) ACL
- Banerjee, S.¹ Lavie, A.²

17
- 84919829999
- Distributed representations of sentences and documents
- Q. Le and T. Mikolov. Distributed Representations of Sentences and Documents. In ICML, 2014.
- (2014) ICML
- Le, Q.¹ Mikolov, T.²

18
- 85117622017
- The Stanford CoreNLP natural language processing toolkit
- C. D. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky. The Stanford CoreNLP Natural Language Processing Toolkit. In ACL, 2014.
- (2014) ACL
- Manning, C.D.¹ Surdeanu, M.² Bauer, J.³ Finkel, J.⁴ Bethard, S.J.⁵ McClosky, D.⁶

19
- 85083950512
- Deep captioning with multimodal recurrent neural networks (m-RNN)
- J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. L. Yuille. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN). In ICLR, 2015.
- (2015) ICLR
- Mao, J.¹ Xu, W.² Yang, Y.³ Wang, J.⁴ Huang, Z.⁵ Yuille, A.L.⁶

20
- 84874250121
- Statistical language models based on neural networks
- T. Mikolov. Statistical Language Models Based on Neural Networks. Ph.D. thesis, Brno University of Technology, 2012.
- (2012) Ph.D. Thesis, Brno University of Technology
- Mikolov, T.¹

21
- 85162522202
- Im2Text: Describing images using 1 million captioned photographs
- V. Ordonez, G. Kulkarni, and T. L. Berg. Im2Text: Describing Images Using 1 Million Captioned Photographs. In NIPS, 2011.
- (2011) NIPS
- Ordonez, V.¹ Kulkarni, G.² Berg, T.L.³

22
- 85133336275
- BLEU: A method for automatic evaluation of machine translation
- K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: A Method for Automatic Evaluation of Machine Translation. In ACL, 2002.
- (2002) ACL
- Papineni, K.¹ Roukos, S.² Ward, T.³ Zhu, W.-J.⁴

23
- 84898775239
- Translating video content to natural language descriptions
- M. Rohrbach, W. Qiu, I. Titov, S. Thater, M. Pinkal, and B. Schiele. Translating Video Content to Natural Language Descriptions. In ICCV, 2013.
- (2013) ICCV
- Rohrbach, M.¹ Qiu, W.² Titov, I.³ Thater, S.⁴ Pinkal, M.⁵ Schiele, B.⁶

24
- 84965144884
- Bidirectional recurrent neural networks
- M. Schuster and K. K. Paliwal. Bidirectional Recurrent Neural Networks. In IEEE TSP, 1997.
- (1997) IEEE TSP
- Schuster, M.¹ Paliwal, K.K.²

25
- 85083953063
- Very deep convolutional networks for large-scale image recognition
- K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. In ICLR, 2015.
- (2015) ICLR
- Simonyan, K.¹ Zisserman, A.²

26
- 84906925854
- Grounded compositional semantics for finding and describing images with sentences
- R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng. Grounded Compositional Semantics for Finding and Describing Images with Sentences. In TACL, 2013.
- (2013) TACL
- Socher, R.¹ Karpathy, A.² Le, Q.V.³ Manning, C.D.⁴ Ng, A.Y.⁵

27
- 84877724347
- Multimodal learning with deep Boltzmann machines
- N. Srivastava and R. Salakhutdinov. Multimodal Learning with Deep Boltzmann Machines. In NIPS, 2012.
- (2012) NIPS
- Srivastava, N.¹ Salakhutdinov, R.²

28
- 84893343292
- Lecture 6.5-RMSProp
- T. Tieleman and G. E. Hinton. Lecture 6.5-RMSProp. In Coursera, 2012.
- (2012) Coursera
- Tieleman, T.¹ Hinton, G.E.²

29
- 84959197551
- CIDEr: Consensus-based image description evaluation
- R. Vedantam, C. L. Zitnick, and D. Parikh. CIDEr: Consensus-based Image Description Evaluation. arXiv:1411.5726, 2014.
- (2014) arXiv:1411.5726
- Vedantam, R.¹ Zitnick, C.L.² Parikh, D.³

30
- 84946747440
- Show and tell: A neural image caption generator
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and Tell: A Neural Image Caption Generator. In CVPR, 2015.
- (2015) CVPR
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

31
- 0000903748
- Generalization of backpropagation with application to a recurrent gas market model
- P. J. Werbos. Generalization of Backpropagation with Application to a Recurrent Gas Market Model. Neural Networks, 1:339-356, 1988.
- (1988) Neural Networks , vol.1 , pp. 339-356
- Werbos, P.J.¹

32
- 84952349307
- Jointly modeling deep video and compositional text to bridge vision and language in a unified framework
- R. Xu, C. Xiong, W. Chen, and J. J. Corso. Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework. In AAAI, 2015.
- (2015) AAAI
- Xu, R.¹ Xiong, C.² Chen, W.³ Corso, J.J.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.