Volume 2016-December, 2016, Pages 11-20

Generation and comprehension of unambiguous object descriptions

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION

EID: 84986260074     PISSN: 10636919     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2016.9     Document Type: Conference Paper
Times cited: 1407

References (56)
  • 3
    • 0022890536
    • Maximum mutual information estimation of hidden Markov model parameters for speech recognition
    • L. Bahl, P. Brown, P. V. de Souza, and R. Mercer. Maximum mutual information estimation of hidden Markov model parameters for speech recognition. In ICASSP, volume 11, pages 49-52, Apr. 1986.
    • (1986) ICASSP , vol.11 , pp. 49-52
    • Bahl, L.1    Brown, P.2    De Souza, P.V.3    Mercer, R.4
  • 5
    • 84957029470
    • Mind's eye: A recurrent visual representation for image caption generation
    • X. Chen and C. L. Zitnick. Mind's eye: A recurrent visual representation for image caption generation. In CVPR, 2015.
    • (2015) CVPR
    • Chen, X.1    Zitnick, C.L.2
  • 7
    • 85198028989
    • Imagenet: A large-scale hierarchical image database
    • J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A large-scale hierarchical image database. In CVPR, pages 248-255, 2009.
    • (2009) CVPR , pp. 248-255
    • Deng, J.1    Dong, W.2    Socher, R.3    Li, L.-J.4    Li, K.5    Fei-Fei, L.6
  • 10
    • 84911443425
    • Scalable object detection using deep neural networks
    • D. Erhan, C. Szegedy, A. Toshev, and D. Anguelov. Scalable object detection using deep neural networks. In CVPR, pages 2155-2162, 2014.
    • (2014) CVPR , pp. 2155-2162
    • Erhan, D.1    Szegedy, C.2    Toshev, A.3    Anguelov, D.4
  • 14
    • 84908171707
    • Learning distributions over logical forms for referring expression generation
    • N. FitzGerald, Y. Artzi, and L. S. Zettlemoyer. Learning distributions over logical forms for referring expression generation. In EMNLP, pages 1914-1925, 2013.
    • (2013) EMNLP , pp. 1914-1925
    • FitzGerald, N.1    Artzi, Y.2    Zettlemoyer, L.S.3
  • 15
    • 84965148420
    • Are you talking to a machine? Dataset and methods for multilingual image question answering
    • H. Gao, J. Mao, J. Zhou, Z. Huang, L. Wang, and W. Xu. Are you talking to a machine? Dataset and methods for multilingual image question answering. In NIPS, 2015.
    • (2015) NIPS
    • Gao, H.1    Mao, J.2    Zhou, J.3    Huang, Z.4    Wang, L.5    Xu, W.6
  • 16
    • 84925422907
    • Visual turing test for computer vision systems
    • D. Geman, S. Geman, N. Hallonquist, and L. Younes. Visual turing test for computer vision systems. PNAS, 112 (12): 3618-3623, 2015.
    • (2015) PNAS , vol.112 , Issue.12 , pp. 3618-3623
    • Geman, D.1    Geman, S.2    Hallonquist, N.3    Younes, L.4
  • 17
    • 84911400494
    • Rich feature hierarchies for accurate object detection and semantic segmentation
    • R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
    • (2014) CVPR
    • Girshick, R.1    Donahue, J.2    Darrell, T.3    Malik, J.4
  • 18
    • 84954299658
    • From the virtual to the real world: Referring to objects in Real-World spatial scenes
    • D. Gkatzia, V. Rieser, P. Bartie, and W. Mackaness. From the virtual to the real world: Referring to objects in Real-World spatial scenes. In EMNLP, 2015.
    • (2015) EMNLP
    • Gkatzia, D.1    Rieser, V.2    Bartie, P.3    Mackaness, W.4
  • 19
    • 80053265931
    • A game-theoretic approach to generating spatial descriptions
    • D. Golland, P. Liang, and D. Klein. A game-theoretic approach to generating spatial descriptions. In EMNLP, pages 410-419, 2010.
    • (2010) EMNLP , pp. 410-419
    • Golland, D.1    Liang, P.2    Klein, D.3
  • 20
    • 84905579579
    • Probabilistic semantics and pragmatics: Uncertainty in language and thought
    • N. D. Goodman and D. Lassiter. Probabilistic semantics and pragmatics: Uncertainty in language and thought. Handbook of Contemporary Semantic Theory. Wiley-Blackwell, 2014.
    • (2014) Handbook of Contemporary Semantic Theory
    • Goodman, N.D.1    Lassiter, D.2
  • 22
    • 85009917737
    • Logic and conversation
    • H. P. Grice. Logic and conversation. na, 1970.
    • (1970) Na
    • Grice, H.P.1
  • 23
    • 84883394520
    • Framing image description as a ranking task: Data, models and evaluation metrics
    • M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. JAIR, 47: 853-899, 2013.
    • (2013) JAIR , vol.47 , pp. 853-899
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 27
    • 84943540775
    • Referitgame: Referring to objects in photographs of natural scenes
    • S. Kazemzadeh, V. Ordonez, M. Matten, and T. L. Berg. Referitgame: Referring to objects in photographs of natural scenes. In EMNLP, pages 787-798, 2014.
    • (2014) EMNLP , pp. 787-798
    • Kazemzadeh, S.1    Ordonez, V.2    Matten, M.3    Berg, T.L.4
  • 31
    • 84856184938
    • Computational generation of referring expressions: A survey
    • E. Krahmer and K. van Deemter. Computational generation of referring expressions: A survey. Comp. Linguistics, 38, 2012.
    • (2012) Comp. Linguistics , vol.38
    • Krahmer, E.1    Van Deemter, K.2
  • 33
    • 84876231242
    • Imagenet classification with deep convolutional neural networks
    • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, pages 1097-1105, 2012.
    • (2012) NIPS , pp. 1097-1105
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.E.3
  • 35
    • 85120046073
    • Meteor: An automatic metric for mt evaluation with high levels of correlation with human judgements
    • A. Lavie and A. Agarwal. Meteor: An automatic metric for mt evaluation with high levels of correlation with human judgements. In Workshop on Statistical Machine Translation, pages 228-231, 2007.
    • (2007) Workshop on Statistical Machine Translation , pp. 228-231
    • Lavie, A.1    Agarwal, A.2
  • 36
    • 84862279067
    • Composing simple image descriptions using web-scale n-grams
    • S. Li, G. Kulkarni, T. L. Berg, A. C. Berg, and Y. Choi. Composing simple image descriptions using web-scale n-grams. In CoNLL, pages 220-228, 2011.
    • (2011) CoNLL , pp. 220-228
    • Li, S.1    Kulkarni, G.2    Berg, T.L.3    Berg, A.C.4    Choi, Y.5
  • 38
    • 84937822746
    • A multi-world approach to question answering about real-world scenes based on uncertain input
    • M. Malinowski and M. Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In NIPS, pages 1682-1690, 2014.
    • (2014) NIPS , pp. 1682-1690
    • Malinowski, M.1    Fritz, M.2
  • 39
    • 84986313218
    • Ask your neurons: A neural-based approach to answering questions about images
    • M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In NIPS, 2015.
    • (2015) NIPS
    • Malinowski, M.1    Rohrbach, M.2    Fritz, M.3
  • 40
    • 85083950512
    • Deep captioning with multimodal recurrent neural networks (m-rnn)
    • J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-rnn). In ICLR, 2015.
    • (2015) ICLR
    • Mao, J.1    Xu, W.2    Yang, Y.3    Wang, J.4    Huang, Z.5    Yuille, A.6
  • 41
    • 84858142989
    • Natural reference to objects in a visual domain
    • M. Mitchell, K. van Deemter, and E. Reiter. Natural reference to objects in a visual domain. In INLG, pages 95-104, 2010.
    • (2010) INLG , pp. 95-104
    • Mitchell, M.1    Van Deemter, K.2    Reiter, E.3
  • 42
    • 84908171705
    • Generating expressions that refer to visible objects
    • M. Mitchell, K. van Deemter, and E. Reiter. Generating expressions that refer to visible objects. In HLT-NAACL, pages 1174-1184, 2013.
    • (2013) HLT-NAACL , pp. 1174-1184
    • Mitchell, M.1    Van Deemter, K.2    Reiter, E.3
  • 43
    • 85162522202
    • Im2text: Describing images using 1 million captioned photographs
    • V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011.
    • (2011) NIPS
    • Ordonez, V.1    Kulkarni, G.2    Berg, T.L.3
  • 44
    • 85133336275
    • Bleu: A method for automatic evaluation of machine translation
    • K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: A method for automatic evaluation of machine translation. In ACL, pages 311-318, 2002.
    • (2002) ACL , pp. 311-318
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.-J.4
  • 45
    • 84973856017
    • Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models
    • B. A. Plummer, L. Wang, C. M. Cervantes, J. C. Caicedo, J. Hockenmaier, and S. Lazebnik. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. In ICCV, 2015.
    • (2015) ICCV
    • Plummer, B.A.1    Wang, L.2    Cervantes, C.M.3    Caicedo, J.C.4    Hockenmaier, J.5    Lazebnik, S.6
  • 46
    • 80052889458
    • Recognition using visual phrases
    • M. A. Sadeghi and A. Farhadi. Recognition using visual phrases. In CVPR, 2011.
    • (2011) CVPR
    • Sadeghi, M.A.1    Farhadi, A.2
  • 47
    • 84866654828
    • Image description with a goal: Building efficient discriminating expressions for images
    • A. Sadovnik, Y.-I. Chiu, N. Snavely, S. Edelman, and T. Chen. Image description with a goal: Building efficient discriminating expressions for images. In CVPR, 2012.
    • (2012) CVPR
    • Sadovnik, A.1    Chiu, Y.-I.2    Snavely, N.3    Edelman, S.4    Chen, T.5
  • 48
    • 85083953063
    • Very deep convolutional networks for large-scale image recognition
    • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
    • (2015) ICLR
    • Simonyan, K.1    Zisserman, A.2
  • 49
    • 84964474107
    • Grounded compositional semantics for finding and describing images with sentences
    • R. Socher, Q. Le, C. Manning, and A. Ng. Grounded compositional semantics for finding and describing images with sentences. In TACL, 2014.
    • (2014) TACL
    • Socher, R.1    Le, Q.2    Manning, C.3    Ng, A.4
  • 50
    • 84858111046
    • Building a semantically transparent corpus for the generation of referring expressions
    • K. van Deemter, I. van der Sluis, and A. Gatt. Building a semantically transparent corpus for the generation of referring expressions. In INLG, pages 130-132, 2006.
    • (2006) INLG , pp. 130-132
    • Van Deemter, K.1    Van der Sluis, I.2    Gatt, A.3
  • 51
    • 84956980995
    • CIDEr: Consensus-based image description evaluation
    • R. Vedantam, C. Lawrence Zitnick, and D. Parikh. CIDEr: Consensus-based image description evaluation. In CVPR, 2015.
    • (2015) CVPR
    • Vedantam, R.1    Zitnick, C.L.2    Parikh, D.3
  • 52
    • 84858111888
    • The use of spatial relations in referring expression generation
    • J. Viethen and R. Dale. The use of spatial relations in referring expression generation. In INLG, pages 59-67. Association for Computational Linguistics, 2008.
    • (2008) INLG , pp. 59-67
    • Viethen, J.1    Dale, R.2
  • 53
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
    • (2015) CVPR
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 54
    • 0013359151
    • Understanding natural language
    • T. Winograd. Understanding natural language. Cognitive psychology, 3 (1): 1-191, 1972.
    • (1972) Cognitive Psychology , vol.3 , Issue.1 , pp. 1-191
    • Winograd, T.1
  • 56
    • 80053258778
    • Corpus-guided sentence generation of natural images
    • Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In EMNLP, pages 444-454, 2011.
    • (2011) EMNLP , pp. 444-454
    • Yang, Y.1    Teo, C.L.2    Daumé, H.3    Aloimonos, Y.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.