Volume 07-12-June-2015, 2015, Pages 2422-2431

Mind's eye: A recurrent visual representation for image caption generation

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; PATTERN RECOGNITION; RECURRENT NEURAL NETWORKS

EID: 84957029470     PISSN: 10636919     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2015.7298856     Document Type: Conference Paper
Times cited: 490

References (47)
  • 3
    • 0028392483
    • Learning long-term dependencies with gradient descent is difficult
    • Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. Neural Networks, IEEE Transactions on, 5(2):157-166, 1994
    • (1994) Neural Networks, IEEE Transactions on , vol.5 , Issue.2 , pp. 157-166
    • Bengio, Y.1    Simard, P.2    Frasconi, P.3
  • 9
    • 26444565569
    • Finding structure in time
    • J. L. Elman. Finding structure in time. Cognitive science, 14(2):179-211, 1990
    • (1990) Cognitive Science , vol.14 , Issue.2 , pp. 179-211
    • Elman, J.L.1
  • 14
    • 84906484732
    • Improving image-sentence embeddings using large weakly annotated photo collections
    • Y. Gong, L. Wang, M. Hodosh, J. Hockenmaier, and S. Lazebnik. Improving image-sentence embeddings using large weakly annotated photo collections. In ECCV, pages 529-545, 2014
    • (2014) ECCV , pp. 529-545
    • Gong, Y.1    Wang, L.2    Hodosh, M.3    Hockenmaier, J.4    Lazebnik, S.5
  • 16
    • 84883394520
    • Framing image description as a ranking task: Data, models and evaluation metrics
    • M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. J. Artif. Intell. Res. (JAIR), 47:853-899, 2013
    • (2013) J. Artif. Intell. Res. (JAIR) , vol.47 , pp. 853-899
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 19
    • 84946734827
    • Deep visual-semantic alignments for generating image descriptions
    • A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. CVPR, 2015
    • (2015) CVPR
    • Karpathy, A.1    Fei-Fei, L.2
  • 24
    • 80052901011
    • Baby talk: Understanding and generating simple image descriptions
    • IEEE
    • G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Baby talk: Understanding and generating simple image descriptions. In CVPR, pages 1601-1608. IEEE, 2011
    • (2011) CVPR , pp. 1601-1608
    • Kulkarni, G.1    Premraj, V.2    Dhar, S.3    Li, S.4    Choi, Y.5    Berg, A.C.6    Berg, T.L.7
  • 26
    • 0013828836
    • Words versus objects: Comparison of free verbal recall
    • L. R. Lieberman and J. T. Culpepper. Words versus objects: Comparison of free verbal recall. Psychological Reports, 17(3):983-988, 1965
    • (1965) Psychological Reports , vol.17 , Issue.3 , pp. 983-988
    • Lieberman, L.R.1    Culpepper, J.T.2
  • 32
    • 84874235486
    • Context dependent recurrent neural network language model
    • T. Mikolov and G. Zweig. Context dependent recurrent neural network language model. In SLT, pages 234-239, 2012
    • (2012) SLT , pp. 234-239
    • Mikolov, T.1    Zweig, G.2
  • 33
    • 85034832841
    • Midge: Generating image descriptions from computer vision detections
    • Association for Computational Linguistics
    • M. Mitchell, X. Han, J. Dodge, A. Mensch, A. Goyal, A. Berg, K. Yamaguchi, T. Berg, K. Stratos, and H. Daumé III. Midge: Generating image descriptions from computer vision detections. In EACL, pages 747-756. Association for Computational Linguistics, 2012
    • (2012) EACL , pp. 747-756
    • Mitchell, M.1    Han, X.2    Dodge, J.3    Mensch, A.4    Goyal, A.5    Berg, A.6    Yamaguchi, K.7    Berg, T.8    Stratos, K.9    Daumé, H.10
  • 34
    • 80054092539
    • Why are pictures easier to recall than words?
    • A. Paivio, T. B. Rogers, and P. C. Smythe. Why are pictures easier to recall than words? Psychonomic Science, 11(4):137-138, 1968
    • (1968) Psychonomic Science , vol.11 , Issue.4 , pp. 137-138
    • Paivio, A.1    Rogers, T.B.2    Smythe, P.C.3
  • 38
    • 84928030723
    • Grounded compositional semantics for finding and describing images with sentences
    • R. Socher, Q. Le, C. Manning, and A. Ng. Grounded compositional semantics for finding and describing images with sentences. In NIPS Deep Learning Workshop, 2013
    • (2013) NIPS Deep Learning Workshop
    • Socher, R.1    Le, Q.2    Manning, C.3    Ng, A.4
  • 40
    • 84956980995
    • Cider: Consensus-based image description evaluation
    • R. Vedantam, C. L. Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. CVPR, 2015
    • (2015) CVPR
    • Vedantam, R.1    Zitnick, C.L.2    Parikh, D.3
  • 42
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. CVPR, 2015
    • (2015) CVPR
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 43
    • 0003066062
    • Experimental analysis of the real-time recurrent learning algorithm
    • R. J. Williams and D. Zipser. Experimental analysis of the real-time recurrent learning algorithm. Connection Science, 1(1):87-111, 1989
    • (1989) Connection Science , vol.1 , Issue.1 , pp. 87-111
    • Williams, R.J.1    Zipser, D.2
  • 45
    • 80053258778
    • Corpus-guided sentence generation of natural images
    • Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In EMNLP, 2011
    • (2011) EMNLP
    • Yang, Y.1    Teo, C.L.2    Daumé, H.3    Aloimonos, Y.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.