



Volume 10115 LNCS, 2017, Pages 101-117

Phi-LSTM: A phrase-based hierarchical LSTM model for image captioning

Author keywords

[No Author keywords available]

Indexed keywords

IMAGE ANALYSIS; LONG SHORT-TERM MEMORY; NEURAL NETWORKS; OBJECT DETECTION

EID: 85016280072     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-319-54193-8_7     Document Type: Conference Paper
Times cited: 32

References (39)
  • 1
    • 80052889458
    • Recognition using visual phrases
    • Sadeghi, M.A., Farhadi, A.: Recognition using visual phrases. In: CVPR, pp. 1745–1752 (2011)
    • (2011) CVPR
    • Sadeghi, M.A.1    Farhadi, A.2
  • 2
    • 84868289993
    • Choosing linguistics over vision to describe images
    • Gupta, A., Verma, Y., Jawahar, C.: Choosing linguistics over vision to describe images. In: AAAI, pp. 606–612 (2012)
    • (2012) AAAI , pp. 606-612
    • Gupta, A.1    Verma, Y.2    Jawahar, C.3
  • 6
    • 84946747440
    • Show and tell: A neural image caption generator
    • Vinyals, O., Toshev, A., Bengio, S., Erhan, D.: Show and tell: a neural image caption generator. In: CVPR, pp. 3156–3164 (2015)
    • (2015) CVPR , pp. 3156-3164
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 7
    • 84946734827
    • Deep visual-semantic alignments for generating image descriptions
    • Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: CVPR, pp. 3128–3137 (2015)
    • (2015) CVPR , pp. 3128-3137
    • Karpathy, A.1    Fei-Fei, L.2
  • 11
    • 84876231242
    • Imagenet classification with deep convolutional neural networks
    • Krizhevsky, A., Sutskever, I., Hinton, G.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1097–1105 (2012)
    • (2012) NIPS , pp. 1097-1105
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.3
  • 13
    • 0000754012
    • A model and an hypothesis for language structure
    • Yngve, V.: A model and an hypothesis for language structure. Proc. Am. Philos. Soc. 104, 444–466 (1960)
    • (1960) Proc. Am. Philos. Soc. , vol.104 , pp. 444-466
    • Yngve, V.1
  • 16
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • Young, P., Lai, A., Hodosh, M., Hockenmaier, J.: From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Trans. Assoc. Comput. Linguist. 2, 67–78 (2014)
    • (2014) Trans. Assoc. Comput. Linguist. , vol.2 , pp. 67-78
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4
  • 18
    • 84883394520
    • Framing image description as a ranking task: Data, models and evaluation metrics
    • Hodosh, M., Young, P., Hockenmaier, J.: Framing image description as a ranking task: data, models and evaluation metrics. J. Artif. Intell. Res. 47, 853–899 (2013)
    • (2013) J. Artif. Intell. Res. , vol.47 , pp. 853-899
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 20
    • 84906925854
    • Grounded compositional semantics for finding and describing images with sentences
    • Socher, R., Karpathy, A., Le, Q.V., Manning, C.D., Ng, A.: Grounded compositional semantics for finding and describing images with sentences. Trans. Assoc. Comput. Linguist. 2, 207–218 (2014)
    • (2014) Trans. Assoc. Comput. Linguist. , vol.2 , pp. 207-218
    • Socher, R.1    Karpathy, A.2    Le, Q.V.3    Manning, C.D.4    Ng, A.5
  • 21
    • 84937843643
    • Deep fragment embeddings for bidirectional image sentence mapping
    • Karpathy, A., Joulin, A., Fei-Fei, L.: Deep fragment embeddings for bidirectional image sentence mapping. In: NIPS, pp. 1889–1897 (2014)
    • (2014) NIPS , pp. 1889-1897
    • Karpathy, A.1    Joulin, A.2    Fei-Fei, L.3
  • 22
    • 84877724347
    • Multimodal learning with deep Boltzmann machines
    • Srivastava, N., Salakhutdinov, R.: Multimodal learning with deep Boltzmann machines. In: NIPS, pp. 2222–2230 (2012)
    • (2012) NIPS , pp. 2222-2230
    • Srivastava, N.1    Salakhutdinov, R.2
  • 23
    • 84856653718
    • Learning cross-modality similarity for multinomial data
    • Jia, Y., Salzmann, M., Darrell, T.: Learning cross-modality similarity for multinomial data. In: ICCV, pp. 2407–2414 (2011)
    • (2011) ICCV , pp. 2407-2414
    • Jia, Y.1    Salzmann, M.2    Darrell, T.3
  • 24
    • 84929363334
    • Multimodal neural language models
    • Kiros, R., Salakhutdinov, R., Zemel, R.: Multimodal neural language models. In: ICML, pp. 595–603 (2014)
    • (2014) ICML , pp. 595-603
    • Kiros, R.1    Salakhutdinov, R.2    Zemel, R.3
  • 26
    • 78149311145
    • Every picture tells a story: Generating sentences from images
    • Farhadi, A., Hejrati, M., Sadeghi, M.A., Young, P., Rashtchian, C., Hockenmaier, J., Forsyth, D.: Every picture tells a story: generating sentences from images. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 15–29. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15561-1_2
    • (2010) ECCV 2010. LNCS , vol.6314 , pp. 15-29
    • Farhadi, A.1    Hejrati, M.2    Sadeghi, M.A.3    Young, P.4    Rashtchian, C.5    Hockenmaier, J.6    Forsyth, D.7
  • 28
    • 80053258778
    • Corpus-guided sentence generation of natural images
    • Yang, Y., Teo, C.L., Daumé III, H., Aloimonos, Y.: Corpus-guided sentence generation of natural images. In: EMNLP, pp. 444–454 (2011)
    • (2011) EMNLP , pp. 444-454
    • Yang, Y.1    Teo, C.L.2    Daumé, H.3    Aloimonos, Y.4
  • 30
    • 84869018122
    • From image annotation to image description
    • Gupta, A., Mannem, P.: From image annotation to image description. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds.) ICONIP 2012. LNCS, vol. 7667, pp. 196–204. Springer, Heidelberg (2012). doi:10.1007/978-3-642-34500-5_24
    • (2012) ICONIP 2012. LNCS , vol.7667 , pp. 196-204
    • Gupta, A.1    Mannem, P.2
  • 31
    • 84862279067
    • Composing simple image descriptions using web-scale n-grams
    • Li, S., Kulkarni, G., Berg, T., Berg, A., Choi, Y.: Composing simple image descriptions using web-scale n-grams. In: CoNLL, pp. 220–228 (2011)
    • (2011) CoNLL , pp. 220-228
    • Li, S.1    Kulkarni, G.2    Berg, T.3    Berg, A.4    Choi, Y.5
  • 32
    • 84878189119
    • Collective generation of natural image descriptions
    • Kuznetsova, P., Ordonez, V., Berg, A., Berg, T., Choi, Y.: Collective generation of natural image descriptions. In: ACL, pp. 359–368 (2012)
    • (2012) ACL , pp. 359-368
    • Kuznetsova, P.1    Ordonez, V.2    Berg, A.3    Berg, T.4    Choi, Y.5
  • 36
    • 85198028989
    • Imagenet: A large-scale hierarchical image database
    • Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: CVPR, pp. 248–255 (2009)
    • (2009) CVPR , pp. 248-255
    • Deng, J.1    Dong, W.2    Socher, R.3    Li, L.J.4    Li, K.5    Fei-Fei, L.6
  • 37
    • 85016240817
    • Lecture 6a overview of mini-batch gradient descent
    • Hinton, G., Srivastava, N., Swersky, K.: Lecture 6a overview of mini-batch gradient descent (2012). Coursera Lecture slides https://class.coursera.org/neuralnets-2012-001/lecture
    • (2012) Coursera Lecture Slides
    • Hinton, G.1    Srivastava, N.2    Swersky, K.3
  • 39
    • 85133336275
    • BLEU: A method for automatic evaluation of machine translation
    • Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: ACL, pp. 311–318 (2002)
    • (2002) ACL , pp. 311-318
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.