



Volume 2016-December, 2016, Pages 4565-4574

DenseCap: Fully convolutional localization networks for dense captioning

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; CONVOLUTION; PATTERN RECOGNITION; RECURRENT NEURAL NETWORKS

EID: 84986245786     PISSN: 10636919     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2016.494     Document Type: Conference Paper
Times cited: 1163

References (54)
  • 4
    • 84957029470
    • Mind's eye: A recurrent visual representation for image caption generation
    • X. Chen and C. L. Zitnick. Mind's eye: A recurrent visual representation for image caption generation. CVPR, 2015.
    • (2015) CVPR
    • Chen, X.1    Zitnick, C.L.2
  • 5
    • 85009929513
    • Describing multimedia content using attention-based encoder-decoder networks
    • abs/1507.01053
    • K. Cho, A. C. Courville, and Y. Bengio. Describing multimedia content using attention-based encoder-decoder networks. CoRR, abs/1507.01053, 2015.
    • (2015) CoRR
    • Cho, K.1    Courville, A.C.2    Bengio, Y.3
  • 9
    • 84911443425
    • Scalable object detection using deep neural networks
    • D. Erhan, C. Szegedy, A. Toshev, and D. Anguelov. Scalable object detection using deep neural networks. CVPR, 2014.
    • (2014) CVPR
    • Erhan, D.1    Szegedy, C.2    Toshev, A.3    Anguelov, D.4
  • 13
  • 14
    • 84911400494
    • Rich feature hierarchies for accurate object detection and semantic segmentation
    • R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR, 2014.
    • (2014) CVPR
    • Girshick, R.1    Donahue, J.2    Darrell, T.3    Malik, J.4
  • 16
  • 20
    • 84856653718
    • Learning cross-modality similarity for multinomial data
    • Y. Jia, M. Salzmann, and T. Darrell. Learning cross-modality similarity for multinomial data. ICCV, 2011.
    • (2011) ICCV
    • Jia, Y.1    Salzmann, M.2    Darrell, T.3
  • 21
    • 84946734827
    • Deep visual-semantic alignments for generating image descriptions
    • A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. CVPR, 2015.
    • (2015) CVPR
    • Karpathy, A.1    Fei-Fei, L.2
  • 23
    • 85083951076
    • Adam: A method for stochastic optimization
    • D. Kingma and J. Ba. Adam: A method for stochastic optimization. ICLR, 2015.
    • (2015) ICLR
    • Kingma, D.1    Ba, J.2
  • 24
    • 84952349298
    • Unifying visual-semantic embeddings with multimodal neural language models
    • R. Kiros, R. Salakhutdinov, and R. S. Zemel. Unifying visual-semantic embeddings with multimodal neural language models. TACL, 2015.
    • (2015) TACL
    • Kiros, R.1    Salakhutdinov, R.2    Zemel, R.S.3
  • 26
    • 84876231242
    • Imagenet classification with deep convolutional neural networks
    • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
    • (2012) NIPS
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.E.3
  • 28
    • 84907331257
    • Generalizing image captions for image-text parallel corpus
    • Citeseer
    • P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi. Generalizing image captions for image-text parallel corpus. In ACL (2), pages 790-796. Citeseer, 2013.
    • (2013) ACL , vol.2 , pp. 790-796
    • Kuznetsova, P.1    Ordonez, V.2    Berg, A.C.3    Berg, T.L.4    Choi, Y.5
  • 29
    • 0032203257
    • Gradient-based learning applied to document recognition
    • Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.
    • (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
    • LeCun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 31
    • 84959205572
    • Fully convolutional networks for semantic segmentation
    • J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. CVPR, 2015.
    • (2015) CVPR
    • Long, J.1    Shelhamer, E.2    Darrell, T.3
  • 35
    • 84973856017
    • Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models
    • B. A. Plummer, L. Wang, C. M. Cervantes, J. C. Caicedo, J. Hockenmaier, and S. Lazebnik. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. ICCV, 2015.
    • (2015) ICCV
    • Plummer, B.A.1    Wang, L.2    Cervantes, C.M.3    Caicedo, J.C.4    Hockenmaier, J.5    Lazebnik, S.6
  • 36
    • 85009891462
    • qassemoquab. stnbhwd
    • qassemoquab. stnbhwd. https://github.com/qassemoquab/stnbhwd, 2015.
    • (2015)
  • 38
    • 84960980241
    • Faster R-CNN: Towards real-time object detection with region proposal networks
    • S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS, 2015.
    • (2015) NIPS
    • Ren, S.1    He, K.2    Girshick, R.3    Sun, J.4
  • 40
    • 85083951635
    • OverFeat: Integrated recognition, localization and detection using convolutional networks
    • P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. ICLR, 2014.
    • (2014) ICLR
    • Sermanet, P.1    Eigen, D.2    Zhang, X.3    Mathieu, M.4    Fergus, R.5    LeCun, Y.6
  • 41
    • 85083953063
    • Very deep convolutional networks for large-scale image recognition
    • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. ICLR, 2015.
    • (2015) ICLR
    • Simonyan, K.1    Zisserman, A.2
  • 42
    • 77955998009
    • Connecting modalities: Semi-supervised segmentation and annotation of images using unaligned text corpora
    • R. Socher and L. Fei-Fei. Connecting modalities: Semi-supervised segmentation and annotation of images using unaligned text corpora. CVPR, 2010.
    • (2010) CVPR
    • Socher, R.1    Fei-Fei, L.2
  • 43
    • 84964474107
    • Grounded compositional semantics for finding and describing images with sentences
    • R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng. Grounded compositional semantics for finding and describing images with sentences. TACL, 2014.
    • (2014) TACL
    • Socher, R.1    Karpathy, A.2    Le, Q.V.3    Manning, C.D.4    Ng, A.Y.5
  • 44
    • 80053459857
    • Generating text with recurrent neural networks
    • I. Sutskever, J. Martens, and G. E. Hinton. Generating text with recurrent neural networks. ICML, 2011.
    • (2011) ICML
    • Sutskever, I.1    Martens, J.2    Hinton, G.E.3
  • 48
    • 84956980995
    • Cider: Consensus-based image description evaluation
    • R. Vedantam, C. Lawrence Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. CVPR, 2015.
    • (2015) CVPR
    • Vedantam, R.1    Lawrence Zitnick, C.2    Parikh, D.3
  • 49
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. CVPR, 2015.
    • (2015) CVPR
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 50
    • 0000903748
    • Generalization of backpropagation with application to a recurrent gas market model
    • P. J. Werbos. Generalization of backpropagation with application to a recurrent gas market model. Neural Networks, 1(4):339-356, 1988.
    • (1988) Neural Networks , vol.1 , Issue.4 , pp. 339-356
    • Werbos, P.J.1
  • 52
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2014.
    • (2014) TACL
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4
  • 53
    • 85009899017
    • Visualizing and understanding convolutional networks
    • M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV, 2014.
    • (2014) ECCV
    • Zeiler, M.D.1    Fergus, R.2
  • 54
    • 85009853104
    • Edge boxes: Locating object proposals from edges
    • C. L. Zitnick and P. Dollár. Edge boxes: Locating object proposals from edges. ECCV, 2014.
    • (2014) ECCV
    • Zitnick, C.L.1    Dollár, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.