Volume , Issue , 2015, Pages

Deep captioning with multimodal recurrent neural networks (m-RNN)

Author keywords

[No Author keywords available]

Indexed keywords

IMAGE ENHANCEMENT; PROBABILITY DISTRIBUTIONS;

EID: 85083950512     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (628)

References (48)
  • 8
    • 26444565569
    • Finding structure in time
    • Elman, Jeffrey L. Finding structure in time. Cognitive science, 14(2):179–211, 1990.
    • (1990) Cognitive Science , vol.14 , Issue.2 , pp. 179-211
    • Elman, J.L.1
  • 11
    • 84898958665
    • DeViSE: A deep visual-semantic embedding model
    • Frome, Andrea, Corrado, Greg S, Shlens, Jon, Bengio, Samy, Dean, Jeff, Mikolov, Tomas, et al. DeViSE: A deep visual-semantic embedding model. In NIPS, pp. 2121–2129, 2013.
    • (2013) NIPS , pp. 2121-2129
    • Frome, A.1    Corrado, G.S.2    Shlens, J.3    Bengio, S.4    Dean, J.5    Mikolov, T.6
  • 12
    • 84911400494
    • Rich feature hierarchies for accurate object detection and semantic segmentation
    • Girshick, R., Donahue, J., Darrell, T., and Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
    • (2014) CVPR
    • Girshick, R.1    Donahue, J.2    Darrell, T.3    Malik, J.4
  • 13
    • 38049183286
    • The IAPR TC-12 benchmark: A new evaluation resource for visual information systems
    • Grubinger, Michael, Clough, Paul, Müller, Henning, and Deselaers, Thomas. The IAPR TC-12 benchmark: A new evaluation resource for visual information systems. In International Workshop OntoImage, pp. 13–23, 2006.
    • (2006) International Workshop OntoImage , pp. 13-23
    • Grubinger, M.1    Clough, P.2    Müller, H.3    Deselaers, T.4
  • 14
    • 78149341381
    • Multiple instance metric learning from automatically labeled bags of faces
    • Guillaumin, Matthieu, Verbeek, Jakob, and Schmid, Cordelia. Multiple instance metric learning from automatically labeled bags of faces. In ECCV, pp. 634–647, 2010.
    • (2010) ECCV , pp. 634-647
    • Guillaumin, M.1    Verbeek, J.2    Schmid, C.3
  • 15
    • 84973931408
    • From image annotation to image description
    • Gupta, Ankush and Mannem, Prashanth. From image annotation to image description. In ICONIP, 2012.
    • (2012) ICONIP
    • Gupta, A.1    Mannem, P.2
  • 16
    • 85059866463
    • Choosing linguistics over vision to describe images
    • Gupta, Ankush, Verma, Yashaswi, and Jawahar, CV. Choosing linguistics over vision to describe images. In AAAI, 2012.
    • (2012) AAAI
    • Gupta, A.1    Verma, Y.2    Jawahar, C.V.3
  • 18
    • 84883394520
    • Framing image description as a ranking task: Data, models and evaluation metrics
    • Hodosh, Micah, Young, Peter, and Hockenmaier, Julia. Framing image description as a ranking task: Data, models and evaluation metrics. JAIR, 47:853–899, 2013.
    • (2013) JAIR , vol.47 , pp. 853-899
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 19
    • 84856653718
    • Learning cross-modality similarity for multinomial data
    • Jia, Yangqing, Salzmann, Mathieu, and Darrell, Trevor. Learning cross-modality similarity for multinomial data. In ICCV, pp. 2407–2414, 2011.
    • (2011) ICCV , pp. 2407-2414
    • Jia, Y.1    Salzmann, M.2    Darrell, T.3
  • 20
    • 84926283798
    • Recurrent continuous translation models
    • Kalchbrenner, Nal and Blunsom, Phil. Recurrent continuous translation models. In EMNLP, pp. 1700–1709, 2013.
    • (2013) EMNLP , pp. 1700-1709
    • Kalchbrenner, N.1    Blunsom, P.2
  • 24
  • 25
    • 84876231242
    • ImageNet classification with deep convolutional neural networks
    • Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. ImageNet classification with deep convolutional neural networks. In NIPS, pp. 1097–1105, 2012.
    • (2012) NIPS , pp. 1097-1105
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.E.3
  • 26
  • 33
    • 80051643236
    • Extensions of recurrent neural network language model
    • Mikolov, Tomas, Kombrink, Stefan, Burget, Lukas, Cernocky, JH, and Khudanpur, Sanjeev. Extensions of recurrent neural network language model. In ICASSP, pp. 5528–5531, 2011.
    • (2011) ICASSP , pp. 5528-5531
    • Mikolov, T.1    Kombrink, S.2    Burget, L.3    Cernocky, J.H.4    Khudanpur, S.5
  • 34
    • 84898956512
    • Distributed representations of words and phrases and their compositionality
    • Mikolov, Tomas, Sutskever, Ilya, Chen, Kai, Corrado, Greg S, and Dean, Jeff. Distributed representations of words and phrases and their compositionality. In NIPS, pp. 3111–3119, 2013.
    • (2013) NIPS , pp. 3111-3119
    • Mikolov, T.1    Sutskever, I.2    Chen, K.3    Corrado, G.S.4    Dean, J.5
  • 36
    • 34547970628
    • Three new graphical models for statistical language modelling
    • ACM
    • Mnih, Andriy and Hinton, Geoffrey. Three new graphical models for statistical language modelling. In ICML, pp. 641–648. ACM, 2007.
    • (2007) ICML , pp. 641-648
    • Mnih, A.1    Hinton, G.2
  • 37
    • 77956509090
    • Rectified linear units improve restricted Boltzmann machines
    • Nair, Vinod and Hinton, Geoffrey E. Rectified linear units improve restricted Boltzmann machines. In ICML, pp. 807–814, 2010.
    • (2010) ICML , pp. 807-814
    • Nair, V.1    Hinton, G.E.2
  • 38
    • 85133336275
    • BLEU: A method for automatic evaluation of machine translation
    • Papineni, Kishore, Roukos, Salim, Ward, Todd, and Zhu, Wei-Jing. BLEU: A method for automatic evaluation of machine translation. In ACL, pp. 311–318, 2002.
    • (2002) ACL , pp. 311-318
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.-J.4
  • 43
    • 84906925854
    • Grounded compositional semantics for finding and describing images with sentences
    • Socher, Richard, Le, Q, Manning, C, and Ng, A. Grounded compositional semantics for finding and describing images with sentences. In TACL, 2014.
    • (2014) TACL
    • Socher, R.1    Le, Q.2    Manning, C.3    Ng, A.4
  • 44
    • 84877724347
    • Multimodal learning with deep Boltzmann machines
    • Srivastava, Nitish and Salakhutdinov, Ruslan. Multimodal learning with deep Boltzmann machines. In NIPS, pp. 2222–2230, 2012.
    • (2012) NIPS , pp. 2222-2230
    • Srivastava, N.1    Salakhutdinov, R.2
  • 45
    • 84928547704
    • Sequence to sequence learning with neural networks
    • Sutskever, Ilya, Vinyals, Oriol, and Le, Quoc V. Sequence to sequence learning with neural networks. In NIPS, pp. 3104–3112, 2014.
    • (2014) NIPS , pp. 3104-3112
    • Sutskever, I.1    Vinyals, O.2    Le, Q.V.3
  • 48
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • Young, Peter, Lai, Alice, Hodosh, Micah, and Hockenmaier, Julia. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. In ACL, pp. 479–488, 2014.
    • (2014) ACL , pp. 479-488
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.