Volume 1, 2015, Pages 42-52

Describing images using inferred visual dependency representations

Author keywords

[No Author keywords available]

Indexed keywords

ARTS COMPUTING; COMPUTATIONAL LINGUISTICS; DEEP NEURAL NETWORKS; OBJECT DETECTION

EID: 84943812736     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.3115/v1/p15-1005     Document Type: Conference Paper
Times cited: 54

References (49)
  • 1
    • 84911448580
    • 2d human pose estimation: New benchmark and State of the Art Analysis
    • Columbus, OH, US
    • Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, and Bernt Schiele. 2014. 2D Human Pose Estimation: New Benchmark and State of the Art Analysis. In CVPR 14, pages 3686-3693, Columbus, OH, US.
    • (2014) CVPR , vol.14 , pp. 3686-3693
    • Andriluka, M.1    Pishchulin, L.2    Gehler, P.3    Schiele, B.4
  • 2
    • 0029681342
    • Spatial context in recognition
    • Moshe Bar and Shimon Ullman. 1996. Spatial Context in Recognition. Perception, 25(3):343-52.
    • (1996) Perception , vol.25 , Issue.3 , pp. 343-352
    • Bar, M.1    Ullman, S.2
  • 3
    • 0020120019
    • Scene perception: Detecting and judging objects undergoing relational violations
    • Irving Biederman, Robert J Mezzanotte, and Jan C Rabinowitz. 1982. Scene perception: Detecting and judging objects undergoing relational violations. Cognitive Psychology, 14(2):143-177.
    • (1982) Cognitive Psychology , vol.14 , Issue.2 , pp. 143-177
    • Biederman, I.1    Mezzanotte, R.J.2    Rabinowitz, J.C.3
  • 4
    • 84859020282
    • Better hypothesis testing for statistical machine translation: Controlling for optimizer instability
    • Portland, OR, U.S.A
    • JH Clark, Chris Dyer, Alon Lavie, and NA Smith. 2011. Better hypothesis testing for statistical machine translation: Controlling for optimizer instability. In ACL-HLT 11, pages 176-181, Portland, OR, U.S.A.
    • (2011) ACL-HLT , vol.11 , pp. 176-181
    • Clark, J.H.1    Dyer, C.2    Lavie, A.3    Smith, N.A.4
  • 5
    • 85120305515
    • Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems
    • Edinburgh, Scotland, U.K
    • Michael Denkowski and Alon Lavie. 2011. Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems. In SMT at EMNLP 11, Edinburgh, Scotland, U.K.
    • (2011) SMT at EMNLP , vol.11
    • Denkowski, M.1    Lavie, A.2
  • 6
    • 84959236502
    • Long-term recurrent convolutional networks for Visual Recognition and Description
    • Boston, MA, U.S.A
    • Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell. 2015. Long-term Recurrent Convolutional Networks for Visual Recognition and Description. In CVPR 15, Boston, MA, U.S.A.
    • (2015) CVPR , vol.15
    • Donahue, J.1    Anne Hendricks, L.2    Guadarrama, S.3    Rohrbach, M.4    Venugopalan, S.5    Saenko, K.6    Darrell, T.7
  • 7
    • 84937943470
    • Depth map prediction from a single image using a Multi-Scale Deep Network
    • Lake Tahoe, CA, U.S.A, June
    • David Eigen, Christian Puhrsch, and Rob Fergus. 2014. Depth Map Prediction from a Single Image using a Multi-Scale Deep Network. In NIPS 27, Lake Tahoe, CA, U.S.A, June.
    • (2014) NIPS , vol.27
    • Eigen, D.1    Puhrsch, C.2    Fergus, R.3
  • 8
    • 84906929591
    • Image description using visual dependency Representations
    • Seattle, WA, U.S.A
    • Desmond Elliott and Frank Keller. 2013. Image Description using Visual Dependency Representations. In EMNLP 13, pages 1292-1302, Seattle, WA, U.S.A.
    • (2013) EMNLP , vol.13 , pp. 1292-1302
    • Elliott, D.1    Keller, F.2
  • 9
    • 84906928552
    • Comparing automatic evaluation measures for Image Description
    • Baltimore MD, U.S.A
    • Desmond Elliott and Frank Keller. 2014. Comparing Automatic Evaluation Measures for Image Description. In ACL 14, pages 452-457, Baltimore, MD, U.S.A.
    • (2014) ACL , vol.14 , pp. 452-457
    • Elliott, D.1    Keller, F.2
  • 10
    • 84943810574
    • Query-by-example image retrieval using visual Dependency Representations
    • Dublin, Ireland
    • Desmond Elliott, Victor Lavrenko, and Frank Keller. 2014. Query-by-Example Image Retrieval using Visual Dependency Representations. In COLING 14, pages 109-120, Dublin, Ireland.
    • (2014) COLING , vol.14 , pp. 109-120
    • Elliott, D.1    Lavrenko, V.2    Keller, F.3
  • 11
    • 77951298115
    • The pascal visual object classes challenge
    • Mark Everingham, Luc Van Gool, Christopher Williams, John Winn, and Andrew Zisserman. 2010. The PASCAL Visual Object Classes Challenge. IJCV, 88(2):303-338.
    • (2011) IJCV , vol.88 , Issue.2 , pp. 303-338
    • Everingham, M.1    Van Gool, L.2    Williams, C.3    Winn, J.4    Zisserman, A.5
  • 13
    • 80052017343
    • Every picture tells a story: Generating sentences from images
    • Heraklion, Crete, Greece
    • Ali Farhadi, Mohsen Hejrati, Mohammad Amin Sadeghi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier, and David Forsyth. 2010. Every picture tells a story: generating sentences from images. In ECCV 10, pages 15-29, Heraklion, Crete, Greece.
    • (2011) ECCV , vol.10 , pp. 15-29
    • Farhadi, A.1    Hejrati, M.2    Amin Sadeghi, M.3    Young, P.4    Rashtchian, C.5    Hockenmaier, J.6    Forsyth, D.7
  • 14
    • 84859924694
    • Automatic image annotation using auxiliary text Information
    • Columbus, Ohio
    • Yansong Feng and Mirella Lapata. 2008. Automatic Image Annotation Using Auxiliary Text Information. In ACL 08, pages 272-280, Columbus, Ohio.
    • (2008) ACL , vol.8 , pp. 272-280
    • Feng, Y.1    Lapata, M.2
  • 15
    • 84913561844
    • Rich feature hierarchies for accurate object detection and semantic segmentation
    • abs/1311.2
    • Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. CoRR, abs/1311.2.
    • (2014) CoRR
    • Girshick, R.1    Donahue, J.2    Darrell, T.3    Malik, J.4
  • 16
    • 84883394520
    • Framing image description as a ranking task: Data, Models and Evaluation Metrics
    • Micah Hodosh, Peter Young, and Julia Hockenmaier. 2013. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics. JAIR, 47:853-899.
    • (2013) JAIR , vol.47 , pp. 853-899
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 17
    • 84913580146
    • Caffe: Convolutional architecture for fast feature Embedding
    • Orlando, FL, U.S.A
    • Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross B. Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional Architecture for Fast Feature Embedding. In MM 14, pages 675-678, Orlando, FL, U.S.A.
    • (2014) MM , vol.14 , pp. 675-678
    • Jia, Y.1    Shelhamer, E.2    Donahue, J.3    Karayev, S.4    Long, J.5    Girshick, R.B.6    Guadarrama, S.7    Darrell, T.8
  • 18
    • 84952902559
    • Deep visual-semantic alignments for generating Image Descriptions
    • Boston, MA, U.S.A
    • Andrej Karpathy and Li Fei-Fei. 2015. Deep Visual-Semantic Alignments for Generating Image Descriptions. In CVPR 15, Boston, MA, U.S.A.
    • (2015) CVPR , vol.15
    • Karpathy, A.1    Fei-Fei, L.2
  • 19
    • 84959252592
    • Deep fragment embeddings for bidirectional image Sentence Mapping
    • Montreal, Quebec, Canada
    • Andrej Karpathy, Armand Joulin, and Li Fei-Fei. 2014. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping. In NIPS 28, Montreal, Quebec, Canada.
    • (2014) NIPS , vol.28
    • Karpathy, A.1    Joulin, A.2    Fei-Fei, L.3
  • 20
    • 80052901011
    • Baby talk: Understanding and generating simple image descriptions
    • Colorado Springs, CO, U.S.A
    • Girish Kulkarni, Visruth Premraj, Sagnik Dhar, Siming Li, Yejin Choi, Alexander C. Berg, and Tamara L. Berg. 2011. Baby talk: Understanding and generating simple image descriptions. In CVPR 11, pages 1601-1608, Colorado Springs, CO, U.S.A.
    • (2011) CVPR , vol.11 , pp. 1601-1608
    • Kulkarni, G.1    Premraj, V.2    Dhar, S.3    Li, S.4    Choi, Y.5    Berg, A.C.6    Berg, T.L.7
  • 21
    • 84878189119
    • Collective generation of natural image Descriptions
    • Jeju Island, South Korea
    • Polina Kuznetsova, Vicente Ordonez, Alexander C. Berg, Tamara L. Berg, and Yejin Choi. 2012. Collective Generation of Natural Image Descriptions. In ACL 12, pages 359-368, Jeju Island, South Korea.
    • (2012) ACL , vol.12 , pp. 359-368
    • Kuznetsova, P.1    Ordonez, V.2    Berg, A.C.3    Berg, T.L.4    Choi, Y.5
  • 22
    • 85062874978
    • TUHOI: Trento universal human object Interaction Dataset
    • Dublin, Ireland
    • Dieu-Thu Le, Jasper Uijlings, and Raffaella Bernardi. 2014. TUHOI: Trento Universal Human Object Interaction Dataset. In WVL at COLING 14, pages 17-24, Dublin, Ireland.
    • (2014) WVL at COLING , vol.14 , pp. 17-24
    • Le, D.1    Uijlings, J.2    Bernardi, R.3
  • 23
    • 84970028761
    • Phrase-based image captioning
    • Lille, France, February
    • Remi Lebret, Pedro O. Pinheiro, and Ronan Collobert. 2015. Phrase-based Image Captioning. In ICML 15, Lille, France, February.
    • (2015) ICML , vol.15
    • Lebret, R.1    Pinheiro, P.O.2    Collobert, R.3
  • 24
    • 84862279067
    • Composing simple image descriptions using web-scale n-grams
    • Portland, OR, U.S.A
    • Siming Li, Girish Kulkarni, Tamara L. Berg, Alexander C. Berg, and Yejin Choi. 2011. Composing simple image descriptions using web-scale n-grams. In CoNLL 11, pages 220-228, Portland, OR, U.S.A.
    • (2011) CoNLL , vol.11 , pp. 220-228
    • Li, S.1    Kulkarni, G.2    Berg, T.L.3    Berg, A.C.4    Choi, Y.5
  • 25
    • 85149140250
    • Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics
    • Barcelona, Spain
    • Chin-Yew Lin and Franz Josef Och. 2004. Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics. In ACL 04, pages 605-612, Barcelona, Spain.
    • (2004) ACL , vol.4 , pp. 605-612
    • Lin, C.1    Josef Och, F.2
  • 26
    • 85083953135
    • Network in network
    • volume abs/1312.4, Banff, Canada
    • Min Lin, Qiang Chen, and Shuicheng Yan. 2014a. Network In Network. In ICLR 14, volume abs/1312.4, Banff, Canada.
    • (2014) ICLR , vol.14
    • Lin, M.1    Chen, Q.2    Yan, S.3
  • 28
    • 0001964555
    • In Paul Bloom, Mary A. Peterson, Lynn Nadel, and Merrill F. Garrett, editors, Language and Space MIT Press
    • GD Logan and DD Sadler. 1996. A computational analysis of the apprehension of spatial relations. In Paul Bloom, Mary A. Peterson, Lynn Nadel, and Merrill F. Garrett, editors, Language and Space, pages 492-592. MIT Press.
    • (1996) A Computational Analysis of the Apprehension of Spatial Relations , pp. 492-592
    • Logan, G.D.1    Sadler, D.D.2
  • 29
    • 85083950512
    • Deep captioning with multimodal recurrent neural networks (m-rnn)
    • volume abs/1412.6632, San Diego, CA, U.S.A
    • Junhua Mao, Wei Xu, Yi Yang, Yiang Wang, and Alan L. Yuille. 2015. Deep captioning with multimodal recurrent neural networks (m-rnn). In ICLR 15, volume abs/1412.6632, San Diego, CA, U.S.A.
    • (2015) ICLR , vol.15
    • Mao, J.1    Xu, W.2    Yang, Y.3    Wang, Y.4    Yuille, A.L.5
  • 30
    • 0242626599
    • A taxonomy of relationships between images and text
    • Emily E. Marsh and Marilyn Domas White. 2003. A taxonomy of relationships between images and text. Journal of Documentation, 59(6):647-672.
    • (2003) Journal of Documentation , vol.59 , Issue.6 , pp. 647-672
    • Marsh, E.E.1    Domas White, M.2
  • 31
    • 85023117841
    • Applied morphological processing of English
    • Guido Minnen, John Carroll, and Darren Pearce. 2001. Applied morphological processing of English. Natural Language Engineering, 7(3):207-223.
    • (2001) Natural Language Engineering , vol.7 , Issue.3 , pp. 207-223
    • Minnen, G.1    Carroll, J.2    Pearce, D.3
  • 32
    • 85034832841
    • Midge: Generating image descriptions from Computer Vision Detections
    • Avignon, France
    • Margaret Mitchell, Jesse Dodge, Amit Goyal, Kota Yamaguchi, Karl Stratos, Alyssa Mensch, Alex Berg, Tamara Berg, and Hal Daumé III. 2012. Midge: Generating Image Descriptions From Computer Vision Detections. In EACL 12, pages 747-756, Avignon, France.
    • (2012) EACL , vol.12 , pp. 747-756
    • Mitchell, M.1    Dodge, J.2    Goyal, A.3    Yamaguchi, K.4    Stratos, K.5    Mensch, A.6    Berg, A.7    Berg, T.8    Daumé, H.9
  • 34
    • 84911449395
    • Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks
    • Columbus, OH, US
    • Maxime Oquab, Leon Bottou, Ivan Laptev, and Josef Sivic. 2014. Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks. In CVPR 14, pages 1717-1724, Columbus, OH, US.
    • (2014) CVPR , vol.14 , pp. 1717-1724
    • Oquab, M.1    Bottou, L.2    Laptev, I.3    Sivic, J.4
  • 35
    • 84960194772
    • Learning to interpret and describe abstract scenes
    • Denver, CO, U.S.A
    • Luis M. G. Ortiz, Clemens Wolff, and Mirella Lapata. 2015. Learning to Interpret and Describe Abstract Scenes. In NAACL 15, Denver, CO, U.S.A.
    • (2015) NAACL , vol.15
    • Luis, M.1    Ortiz, G.2    Wolff, C.3    Lapata, M.4
  • 37
    • 85133336275
    • BLEU: A method for automatic evaluation of machine translation
    • Philadelphia, PA, U.S.A
    • Kishore Papineni, Salim Roukos, Todd Ward, and WJ Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In ACL 02, pages 311-318, Philadelphia, PA, U.S.A.
    • (2002) ACL , vol.2 , pp. 311-318
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.J.4
  • 39
    • 85090348677
    • Collecting image annotations using Amazon's Mechanical Turk
    • Los Angeles, CA, U.S.A
    • Cyrus Rashtchian, Peter Young, Micah Hodosh, and Julia Hockenmaier. 2010. Collecting image annotations using Amazon's Mechanical Turk. In AMT at NAACL 10, pages 139-147, Los Angeles, CA, U.S.A.
    • (2011) AMT at NAACL , vol.10 , pp. 139-147
    • Rashtchian, C.1    Young, P.2    Hodosh, M.3    Hockenmaier, J.4
  • 41
    • 80052889458
    • Recognition using visual phrases
    • Colorado Springs, CO, U.S.A
    • Mohammad A Sadeghi and Ali Farhadi. 2011. Recognition Using Visual Phrases. In CVPR 11, pages 1745-1752, Colorado Springs, CO, U.S.A.
    • (2011) CVPR , vol.11 , pp. 1745-1752
    • Sadeghi, M.A.1    Farhadi, A.2
  • 42
    • 84952235015
    • Analysing the subject of a picture: A theoretical Approach
    • Sara Shatford. 1986. Analysing the Subject of a Picture: A Theoretical Approach. Cataloging & Classification Quarterly, 6(3):39-62.
    • (1986) Cataloging & Classification Quarterly , vol.6 , Issue.3 , pp. 39-62
    • Shatford, S.1
  • 43
    • 85083953063
    • Very deep convolutional networks for large-scale Image Recognition
    • volume abs/1409.1, San Diego, CA, U.S.A
    • Karen Simonyan and Andrew Zisserman. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. In ICLR 15, volume abs/1409.1, San Diego, CA, U.S.A.
    • (2015) ICLR , vol.15
    • Simonyan, K.1    Zisserman, A.2
  • 44
    • 84906925854
    • Grounded compositional semantics for finding and Describing Images with Sentences
    • Richard Socher, Andrej Karpathy, Q Le, C Manning, and A Ng. 2014. Grounded Compositional Semantics for Finding and Describing Images with Sentences. TACL, 2:207-218.
    • (2014) TACL , vol.2 , pp. 207-218
    • Socher, R.1    Karpathy, A.2    Le, Q.3    Manning, C.4    Ng, A.5
  • 45
    • 84983470508
    • Feature-rich part-of-speech tagging with a cyclic Dependency Network
    • Edmonton, Canada
    • Kristina Toutanova, Dan Klein, and Christopher D Manning. 2003. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network. In HLT-NAACL 03, pages 173-180, Edmonton, Canada.
    • (2003) HLT-NAACL , vol.3 , pp. 173-180
    • Toutanova, K.1    Klein, D.2    Manning, C.D.3
  • 46
    • 84946747440
    • Show and tell: A neural image caption generator
    • Boston, MA, U.S.A
    • Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In CVPR 15, Boston, MA, U.S.A.
    • (2015) CVPR , vol.15
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 47
    • 80053258778
    • Corpus-guided sentence generation of natural images
    • Edinburgh, Scotland, UK
    • Yezhou Yang, Ching Lik Teo, Hal Daumé, and Yiannis Aloimonos. 2011. Corpus-Guided Sentence Generation of Natural Images. In EMNLP 11, pages 444-454, Edinburgh, Scotland, UK.
    • (2011) EMNLP , vol.11 , pp. 444-454
    • Yang, Y.1    Lik Teo, C.2    Daumé, H.3    Aloimonos, Y.4
  • 48
    • 85026937926
    • See no evil, say no evil: Description generation from Densely Labeled Images
    • Dublin, Ireland
    • Mark Yatskar, Michel Galley, L Vanderwende, and L Zettlemoyer. 2014. See No Evil, Say No Evil: Description Generation from Densely Labeled Images. In SEM, pages 110-120, Dublin, Ireland.
    • (2014) SEM , pp. 110-120
    • Yatskar, M.1    Galley, M.2    Vanderwende, L.3    Zettlemoyer, L.4
  • 49
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • Peter Young, Alice Lai, Micah Hodosh, and Julia Hockenmaier. 2014. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2:67-78
    • (2014) TACL , vol.2 , pp. 67-78
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.