Volume 2017-January, 2017, Pages 1141-1150

Semantic compositional networks for visual captioning

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; SEMANTIC WEB; SEMANTICS;

EID: 85021786108     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.127     Document Type: Conference Paper
Times cited: 405

References (57)
  • 2. D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. In ICLR, 2015.
  • 3. N. Ballas, L. Yao, C. Pal, and A. Courville. Delving deeper into convolutional networks for learning video representations. In ICLR, 2016.
  • 4. S. Banerjee and A. Lavie. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In ACL workshop, 2005.
  • 5. D. L. Chen and W. B. Dolan. Collecting highly parallel data for paraphrase evaluation. In ACL, 2011.
  • 7. X. Chen and C. Lawrence Zitnick. Mind's eye: A recurrent visual representation for image caption generation. In CVPR, 2015.
  • 11. J. Dong, X. Li, W. Lan, Y. Huo, and C. G. Snoek. Early embedding and late reranking for video captioning. In ACMMM, 2016.
  • 14. C. Gan, Z. Gan, X. He, J. Gao, and L. Deng. StyleNet: Generating attractive visual captions with styles. In CVPR, 2017.
  • 15. C. Gan, T. Yang, and B. Gong. Learning attributes equals multi-source domain generalization. In CVPR, 2016.
  • 16. K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
  • 18. X. Jia, E. Gavves, B. Fernando, and T. Tuytelaars. Guiding long-short term memory for image caption generation. In ICCV, 2015.
  • 20. A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
  • 22. D. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
  • 25. R. Kiros, R. Zemel, and R. R. Salakhutdinov. A multiplicative model for learning distributed text-based attribute representations. In NIPS, 2014.
  • 26. C.-Y. Lin. ROUGE: A package for automatic evaluation of summaries. In ACL workshop, 2004.
  • 29. J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). In ICLR, 2015.
  • 30. R. Memisevic and G. Hinton. Unsupervised learning of image transformations. In CVPR, 2007.
  • 31. T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013.
  • 32. Y. Pan, T. Mei, T. Yao, H. Li, and Y. Rui. Jointly modeling embedding and translation to bridge video and language. In CVPR, 2016.
  • 33. K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: A method for automatic evaluation of machine translation. In ACL, 2002.
  • 34. Y. Pu, Z. Gan, R. Henao, X. Yuan, C. Li, A. Stevens, and L. Carin. Variational autoencoder for deep learning of images, labels and captions. In NIPS, 2016.
  • 37. R. Socher, A. Karpathy, Q. V. Le, C. D. Manning, and A. Y. Ng. Grounded compositional semantics for finding and describing images with sentences. TACL, 2014.
  • 38. J. Song, Z. Gan, and L. Carin. Factored temporal sigmoid belief networks for sequence learning. In ICML, 2016.
  • 39. I. Sutskever, J. Martens, and G. E. Hinton. Generating text with recurrent neural networks. In ICML, 2011.
  • 40. I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014.
  • 41. G. W. Taylor and G. E. Hinton. Factored conditional restricted Boltzmann machines for modeling motion style. In ICML, 2009.
  • 43. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3D convolutional networks. In ICCV, 2015.
  • 45. R. Vedantam, C. Lawrence Zitnick, and D. Parikh. CIDEr: Consensus-based image description evaluation. In CVPR, 2015.
  • 48. O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
  • 49. Q. Wu, C. Shen, L. Liu, A. Dick, and A. van den Hengel. What value do explicit high level concepts have in vision to language problems? In CVPR, 2016.
  • 50. Y. Wu, S. Zhang, Y. Zhang, Y. Bengio, and R. Salakhutdinov. On multiplicative integration with recurrent neural networks. In NIPS, 2016.
  • 51. J. Xu, T. Mei, T. Yao, and Y. Rui. MSR-VTT: A large video description dataset for bridging video and language. In CVPR, 2016.
  • 54. Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo. Image captioning with semantic attention. In CVPR, 2016.
  • 55. P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2014.
  • 56. H. Yu, J. Wang, Z. Huang, Y. Yang, and W. Xu. Video paragraph captioning using hierarchical recurrent neural networks. In CVPR, 2016.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.