Volume 2017-January, 2017, Pages 955-964

StyleNet: Generating attractive visual captions with styles

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION;

EID: 85044213495     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.108     Document Type: Conference Paper
Times cited: 330

References (56)
  • 1
    • 85083953689
    • Neural machine translation by jointly learning to align and translate
    • 3
    • D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. ICLR, 2015. 3
    • (2015) ICLR
    • Bahdanau, D.1    Cho, K.2    Bengio, Y.3
  • 2
    • 84965179228
    • Scheduled sampling for sequence prediction with recurrent neural networks
    • 1, 2
    • S. Bengio, O. Vinyals, N. Jaitly, and N. Shazeer. Scheduled sampling for sequence prediction with recurrent neural networks. In NIPS, pages 1171-1179, 2015. 1, 2
    • (2015) NIPS , pp. 1171-1179
    • Bengio, S.1    Vinyals, O.2    Jaitly, N.3    Shazeer, N.4
  • 3
    • 84859089502
    • Collecting highly parallel data for paraphrase evaluation
    • 8
    • D. L. Chen and W. B. Dolan. Collecting highly parallel data for paraphrase evaluation. In ACL, pages 190-200, 2011. 8
    • (2011) ACL , pp. 190-200
    • Chen, D.L.1    Dolan, W.B.2
  • 5
    • 84957029470
    • Mind's eye: A recurrent visual representation for image caption generation
    • 1, 2
    • X. Chen and C. Lawrence Zitnick. Mind's eye: A recurrent visual representation for image caption generation. In CVPR, pages 2422-2431, 2015. 1, 2
    • (2015) CVPR , pp. 2422-2431
    • Chen, X.1    Lawrence Zitnick, C.2
  • 7
    • 85198028989
    • Imagenet: A large-scale hierarchical image database
    • 6
    • J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A large-scale hierarchical image database. In CVPR, pages 248-255, 2009. 6
    • (2009) CVPR , pp. 248-255
    • Deng, J.1    Dong, W.2    Socher, R.3    Li, L.-J.4    Li, K.5    Fei-Fei, L.6
  • 8
    • 85107661995
    • Meteor universal: Language specific translation evaluation for any target language
    • 6
    • M. Denkowski and A. Lavie. Meteor universal: Language specific translation evaluation for any target language. In ACL, 2014. 6
    • (2014) ACL
    • Denkowski, M.1    Lavie, A.2
  • 12
    • 84986281512
    • Learning attributes equals multi-source domain generalization
    • 2
    • C. Gan, T. Yang, and B. Gong. Learning attributes equals multi-source domain generalization. In CVPR, pages 87-97, 2016. 2
    • (2016) CVPR , pp. 87-97
    • Gan, C.1    Yang, T.2    Gong, B.3
  • 14
    • 85044442374
    • Residual multiple instance learning for visually impaired image descriptions
    • 3
    • S. Gella and M. Mitchell. Residual multiple instance learning for visually impaired image descriptions. NIPS Women in Machine Learning Workshop, 2016. 3
    • (2016) NIPS Women in Machine Learning Workshop
    • Gella, S.1    Mitchell, M.2
  • 15
    • 84964588182
    • Fast r-cnn
    • 2
    • R. Girshick. Fast r-cnn. In ICCV, pages 1440-1448, 2015. 2
    • (2015) ICCV , pp. 1440-1448
    • Girshick, R.1
  • 16
    • 84911400494
    • Rich feature hierarchies for accurate object detection and semantic segmentation
    • 2
    • R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. Computer Science, pages 580-587, 2014. 2
    • (2014) Computer Science , pp. 580-587
    • Girshick, R.1    Donahue, J.2    Darrell, T.3    Malik, J.4
  • 17
    • 84986274465
    • Deep residual learning for image recognition
    • 2, 6
    • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. CVPR, 2016. 2, 6
    • (2016) CVPR
    • He, K.1    Zhang, X.2    Ren, S.3    Sun, J.4
  • 18
    • 84986274522
    • Deep compositional captioning: Describing novel object categories without paired training data
    • 3
    • L. A. Hendricks, S. Venugopalan, M. Rohrbach, R. Mooney, K. Saenko, and T. Darrell. Deep compositional captioning: Describing novel object categories without paired training data. CVPR, 2016. 3
    • (2016) CVPR
    • Hendricks, L.A.1    Venugopalan, S.2    Rohrbach, M.3    Mooney, R.4    Saenko, K.5    Darrell, T.6
  • 20
    • 84883394520
    • Framing image description as a ranking task: Data, models and evaluation metrics
    • 2, 5, 6
    • M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research, 47:853-899, 2013. 2, 5, 6
    • (2013) Journal of Artificial Intelligence Research , vol.47 , pp. 853-899
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 21
    • 84973917813
    • Guiding the long-short term memory model for image caption generation
    • 2
    • X. Jia, E. Gavves, B. Fernando, and T. Tuytelaars. Guiding the long-short term memory model for image caption generation. In ICCV, pages 2407-2415, 2015. 2
    • (2015) ICCV , pp. 2407-2415
    • Jia, X.1    Gavves, E.2    Fernando, B.3    Tuytelaars, T.4
  • 22
    • 84946734827
    • Deep visual-semantic alignments for generating image descriptions
    • 1, 2
    • A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, pages 3128-3137, 2015. 1, 2
    • (2015) CVPR , pp. 3128-3137
    • Karpathy, A.1    Fei-Fei, L.2
  • 24
    • 85083951076
    • Adam: A method for stochastic optimization
    • 6
    • D. Kingma and J. Ba. Adam: A method for stochastic optimization. ICLR, 2015. 6
    • (2015) ICLR
    • Kingma, D.1    Ba, J.2
  • 26
    • 84876231242
    • Imagenet classification with deep convolutional neural networks
    • 2
    • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. NIPS, 2012. 2
    • (2012) NIPS
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.E.3
  • 28
    • 84934873221
    • TREETALK: Composition and compression of trees for image descriptions
    • 2
    • P. Kuznetsova, V. Ordonez, T. L. Berg, and Y. Choi. TREETALK: composition and compression of trees for image descriptions. TACL, 2:351-362, 2014. 2
    • (2014) TACL , vol.2 , pp. 351-362
    • Kuznetsova, P.1    Ordonez, V.2    Berg, T.L.3    Choi, Y.4
  • 29
    • 85044442587
    • Composing simple image descriptions using web-scale n-grams
    • 2
    • S. Li, G. Kulkarni, T. L. Berg, A. C. Berg, and Y. Choi. Composing simple image descriptions using web-scale n-grams. In ACL, 2011. 2
    • (2011) ACL
    • Li, S.1    Kulkarni, G.2    Berg, T.L.3    Berg, A.C.4    Choi, Y.5
  • 32
    • 85083950512
    • Deep captioning with multimodal recurrent neural networks (m-RNN)
    • 1, 2, 4
    • J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). ICLR, 2015. 1, 2, 4
    • (2015) ICLR
    • Mao, J.1    Xu, W.2    Yang, Y.3    Wang, J.4    Huang, Z.5    Yuille, A.6
  • 33
    • 84973863256
    • Learning like a child: Fast novel visual concept learning from sentence descriptions of images
    • 3
    • J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Learning like a child: Fast novel visual concept learning from sentence descriptions of images. In ICCV, 2015. 3
    • (2015) ICCV
    • Mao, J.1    Xu, W.2    Yang, Y.3    Wang, J.4    Huang, Z.5    Yuille, A.6
  • 34
    • 85044481513
    • Senticap: Generating image descriptions with sentiments
    • 2, 3
    • A. Mathews, L. Xie, and X. He. Senticap: Generating image descriptions with sentiments. AAAI, 2015. 2, 3
    • (2015) AAAI
    • Mathews, A.1    Xie, L.2    He, X.3
  • 36
    • 85162522202
    • Im2text: Describing images using 1 million captioned photographs
    • 2
    • V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. NIPS, pages 1143-1151, 2011. 2
    • (2011) NIPS , pp. 1143-1151
    • Ordonez, V.1    Kulkarni, G.2    Berg, T.L.3
  • 37
    • 85133336275
    • BLEU: A method for automatic evaluation of machine translation
    • 6
    • K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: a method for automatic evaluation of machine translation. In ACL, pages 311-318, 2002. 6
    • (2002) ACL , pp. 311-318
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.-J.4
  • 38
    • 85018916536
    • Variational autoencoder for deep learning of images, labels and captions
    • 3
    • Y. Pu, Z. Gan, R. Henao, X. Yuan, C. Li, A. Stevens, and L. Carin. Variational autoencoder for deep learning of images, labels and captions. In NIPS, pages 2352-2360, 2016. 3
    • (2016) NIPS , pp. 2352-2360
    • Pu, Y.1    Gan, Z.2    Henao, R.3    Yuan, X.4    Li, C.5    Stevens, A.6    Carin, L.7
  • 40
    • 84990034009
    • Very deep convolutional networks for large-scale image recognition
    • 2
    • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. Computer Science, 2014. 2
    • (2014) Computer Science
    • Simonyan, K.1    Zisserman, A.2
  • 41
    • 84973888835
    • Automatic concept discovery from parallel text and visual corpora
    • 2
    • C. Sun, C. Gan, and R. Nevatia. Automatic concept discovery from parallel text and visual corpora. In ICCV, pages 2596-2604, 2015. 2
    • (2015) ICCV , pp. 2596-2604
    • Sun, C.1    Gan, C.2    Nevatia, R.3
  • 42
    • 84928547704
    • Sequence to sequence learning with neural networks
    • 2, 3
    • I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, pages 3104-3112, 2014. 2, 3
    • (2014) NIPS , pp. 3104-3112
    • Sutskever, I.1    Vinyals, O.2    Le, Q.V.3
  • 43
    • 84937522268
    • Going deeper with convolutions
    • 2
    • C. Szegedy, W. Liu, Y. Jia, and P. Sermanet. Going deeper with convolutions. CVPR, pages 1-9, 2015. 2
    • (2015) CVPR , pp. 1-9
    • Szegedy, C.1    Liu, W.2    Jia, Y.3    Sermanet, P.4
  • 45
    • 84973865953
    • Learning spatiotemporal features with 3D convolutional networks
    • 8
    • D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3D convolutional networks. In ICCV, pages 4489-4497, 2015. 8
    • (2015) ICCV , pp. 4489-4497
    • Tran, D.1    Bourdev, L.2    Fergus, R.3    Torresani, L.4    Paluri, M.5
  • 47
    • 84956980995
    • Cider: Consensus-based image description evaluation
    • 6
    • R. Vedantam, C. Lawrence Zitnick, and D. Parikh. Cider: Consensus-based image description evaluation. In CVPR, pages 4566-4575, 2015. 6
    • (2015) CVPR , pp. 4566-4575
    • Vedantam, R.1    Lawrence Zitnick, C.2    Parikh, D.3
  • 50
    • 84946747440
    • Show and tell: A neural image caption generator
    • 1, 2, 3, 4, 5, 6
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, pages 3156-3164, 2015. 1, 2, 3, 4, 5, 6
    • (2015) CVPR , pp. 3156-3164
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 51
    • 85044461924
    • Dense-cap: Fully convolutional localization networks for dense captioning
    • 3
    • L. Wei, Q. Huang, D. Ceylan, E. Vouga, and H. Li. Dense-cap: Fully convolutional localization networks for dense captioning. Computer Science, 2015. 3
    • (2015) Computer Science
    • Wei, L.1    Huang, Q.2    Ceylan, D.3    Vouga, E.4    Li, H.5
  • 52
    • 84970002232
    • Show, attend and tell: Neural image caption generation with visual attention
    • 1, 2, 4
    • K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhudinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, pages 2048-2057, 2015. 1, 2, 4
    • (2015) ICML , pp. 2048-2057
    • Xu, K.1    Ba, J.2    Kiros, R.3    Cho, K.4    Courville, A.5    Salakhudinov, R.6    Zemel, R.7    Bengio, Y.8
  • 53
    • 80053258778
    • Corpus-guided sentence generation of natural images
    • 2
    • Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In EMNLP, pages 444-454, 2011. 2
    • (2011) EMNLP , pp. 444-454
    • Yang, Y.1    Teo, C.L.2    Daumé III, H.3    Aloimonos, Y.4
  • 54
    • 85030211479
    • Encode, review, and decode: Reviewer module for caption generation
    • 1, 2
    • Z. Yang, Y. Yuan, Y. Wu, R. Salakhutdinov, and W. W. Cohen. Encode, review, and decode: Reviewer module for caption generation. NIPS, 2016. 1, 2
    • (2016) NIPS
    • Yang, Z.1    Yuan, Y.2    Wu, Y.3    Salakhutdinov, R.4    Cohen, W.W.5
  • 55
    • 84986317307
    • Image captioning with semantic attention
    • 1, 2
    • Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo. Image captioning with semantic attention. CVPR, 2016. 1, 2
    • (2016) CVPR
    • You, Q.1    Jin, H.2    Wang, Z.3    Fang, C.4    Luo, J.5
  • 56
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • 5
    • P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2014. 5
    • (2014) TACL
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.