SCOPUS Information Retrieval Platform

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Volume , Issue , 2018, Pages 5831-5840

Neural Motifs: Scene Graph Parsing with Global Context

(4) Zellers, Rowan (a); Yatskar, Mark (a, b); Thomson, Sam (c); Choi, Yejin (a, b)

a University of Washington (United States)

b Allen Institute for Artificial Intelligence (United States)

c Carnegie Mellon University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

OBJECT DETECTION;

GLOBAL CONTEXT; HIGHER-ORDER; SCENE GRAPH; STATE OF THE ART; STRUCTURED GRAPHS; SUBGRAPHS; TRAINING SETS; VISUAL SCENE;

COMPUTER VISION;

EID: 85055127632 PISSN: 10636919 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CVPR.2018.00611 Document Type: Conference Paper

Times cited : (1137)

References (59)

1
- 84973890960
- VQA: Visual question answering
- Santiago, Chile, December 7-13, 2015
- S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. VQA: Visual question answering. In 2015 IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, December 7-13, 2015, pages 2425-2433, 2015.
- (2015) 2015 IEEE International Conference on Computer Vision, ICCV 2015 , pp. 2425-2433
- Antol, S.¹ Agrawal, A.² Lu, J.³ Mitchell, M.⁴ Batra, D.⁵ Zitnick, C.L.⁶ Parikh, D.⁷

2
- 85041914730
- Annotating object instances with a polygon-rnn
- Honolulu, HI, USA, July 21-26, 2017
- L. Castrejon, K. Kundu, R. Urtasun, and S. Fidler. Annotating object instances with a polygon-rnn. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21-26, 2017, pages 4485-4493, 2017.
- (2017) 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 , pp. 4485-4493
- Castrejon, L.¹ Kundu, K.² Urtasun, R.³ Fidler, S.⁴

3
- 84973868179
- Hico: A benchmark for recognizing human-object interactions in images
- Y.-W. Chao, Z. Wang, Y. He, J. Wang, and J. Deng. Hico: A benchmark for recognizing human-object interactions in images. In Proceedings of the IEEE International Conference on Computer Vision, 2015.
- (2015) Proceedings of the IEEE International Conference on Computer Vision
- Chao, Y.-W.¹ Wang, Z.² He, Y.³ Wang, J.⁴ Deng, J.⁵

4
- 84957029470
- Mind's eye: A recurrent visual representation for image caption generation
- X. Chen and C. Lawrence Zitnick. Mind's eye: A recurrent visual representation for image caption generation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2422-2431, 2015.
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pp. 2422-2431
- Chen, X.¹ Lawrence Zitnick, C.²

5
- 84898803720
- Neil: Extracting visual knowledge from web data
- IEEE
- X. Chen, A. Shrivastava, and A. Gupta. Neil: Extracting visual knowledge from web data. In International Conference on Computer Vision (ICCV), pages 1409-1416. IEEE, 2013.
- (2013) International Conference on Computer Vision (ICCV), pp. 1409-1416
- Chen, X.¹ Shrivastava, A.² Gupta, A.³

6
- 85064822086
- B. Dai, Y. Zhang, and D. Lin. Detecting visual relationships with deep relational networks. 2017.
- (2017) Detecting Visual Relationships with Deep Relational Networks.
- Dai, B.¹ Zhang, Y.² Lin, D.³

7
- 70450161428
- An empirical study of context in object detection
- IEEE
- S. K. Divvala, D. Hoiem, J. H. Hays, A. A. Efros, and M. Hebert. An empirical study of context in object detection. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 1271-1278. IEEE, 2009.
- (2009) Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on , pp. 1271-1278
- Divvala, S.K.¹ Hoiem, D.² Hays, J.H.³ Efros, A.A.⁴ Hebert, M.⁵

8
- 84986248327
- Z. et al. ArXiv preprint arXiv: 1507.05670
- Z. et al. Building a large-scale multimodal knowledge base for visual question answering. ArXiv preprint arXiv: 1507.05670, 2015.
- (2015) Building A Large-scale Multimodal Knowledge Base for Visual Question Answering.

9
- 84944115860
- June
- H. Fang, S. Gupta, F. Iandola, R. K. Srivastava, L. Deng, P. Dollar, J. Gao, X. He, M. Mitchell, J. C. Platt, C. Lawrence Zitnick, and G. Zweig. From captions to visual concepts and back. June 2015.
- (2015) From Captions to Visual Concepts and Back.
- Fang, H.¹ Gupta, S.² Iandola, F.³ Srivastava, R.K.⁴ Deng, L.⁵ Dollar, P.⁶ Gao, J.⁷ He, X.⁸ Mitchell, M.⁹ Platt, J.C.¹⁰ Lawrence Zitnick, C.¹¹ Zweig, G.¹²

10
- 78149311145
- Every picture tells a story: Generating sentences from images
- Springer
- A. Farhadi, M. Hejrati, M. A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences from images. In European Conference on Computer Vision, pages 15-29. Springer, 2010.
- (2010) European Conference on Computer Vision , pp. 15-29
- Farhadi, A.¹ Hejrati, M.² Sadeghi, M.A.³ Young, P.⁴ Rashtchian, C.⁵ Hockenmaier, J.⁶ Forsyth, D.⁷

11
- 78651403274
- Context based object categorization: A critical survey
- C. Galleguillos and S. Belongie. Context based object categorization: A critical survey. Computer vision and image understanding, 114(6): 712-722, 2010.
- (2010) Computer Vision and Image Understanding , vol.114 , Issue.6 , pp. 712-722
- Galleguillos, C.¹ Belongie, S.²

12
- 84957033954
- arXiv preprint arXiv: 1505.05612
- H. Gao et al. Are you talking to a machine? Dataset and methods for multilingual image question answering. ArXiv preprint arXiv: 1505.05612, 2015.
- (2015) Are You Talking to A Machine? Dataset and Methods for Multilingual Image Question Answering.
- Gao, H.¹

13
- 84902318725
- A survey on still image based human action recognition
- G. Guo et al. A survey on still image based human action recognition. Pattern Recognition, 2014.
- (2014) Pattern Recognition
- Guo, G.¹

14
- 85040937906
- Deep semantic role labeling: What works and what's next
- L. He, K. Lee, M. Lewis, and L. Zettlemoyer. Deep semantic role labeling: What works and what's next. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2017.
- (2017) Proceedings of the Annual Meeting of the Association for Computational Linguistics
- He, L.¹ Lee, K.² Lewis, M.³ Zettlemoyer, L.⁴

15
- 0031573117
- Long short-term memory
- Nov
- S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Comput., 9(8): 1735-1780, Nov. 1997.
- (1997) Neural Comput. , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

16
- 85040312808
- arXiv preprint arXiv: 1704.05526
- R. Hu, J. Andreas, M. Rohrbach, T. Darrell, and K. Saenko. Learning to reason: End-to-end module networks for visual question answering. ArXiv preprint arXiv: 1704.05526, 2017.
- (2017) Learning to Reason: End-to-end Module Networks for Visual Question Answering.
- Hu, R.¹ Andreas, J.² Rohrbach, M.³ Darrell, T.⁴ Saenko, K.⁵

17
- 84959233256
- Image retrieval using scene graphs
- J. Johnson, R. Krishna, M. Stark, L.-J. Li, D. Shamma, M. Bernstein, and L. Fei-Fei. Image retrieval using scene graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3668-3678, 2015.
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pp. 3668-3678
- Johnson, J.¹ Krishna, R.² Stark, M.³ Li, L.-J.⁴ Shamma, D.⁵ Bernstein, M.⁶ Fei-Fei, L.⁷

18
- 84946734827
- Deep visual-semantic alignments for generating image descriptions
- A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3128-3137, 2015.
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pp. 3128-3137
- Karpathy, A.¹ Fei-Fei, L.²

19
- 85087529518
- Hadamard product for low-rank bilinear pooling
- J.-H. Kim, K. W. On, W. Lim, J. Kim, J.-W. Ha, and B.-T. Zhang. Hadamard Product for Low-rank Bilinear Pooling. In The 5th International Conference on Learning Representations, 2017.
- (2017) The 5th International Conference on Learning Representations
- Kim, J.-H.¹ On, K.W.² Lim, W.³ Kim, J.⁴ Ha, J.-W.⁵ Zhang, B.-T.⁶

20
- 84911370987
- What are you talking about? Text-to-image coreference
- C. Kong, D. Lin, M. Bansal, R. Urtasun, and S. Fidler. What are you talking about? Text-to-image coreference. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3558-3565, 2014.
- (2014) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pp. 3558-3565
- Kong, C.¹ Lin, D.² Bansal, M.³ Urtasun, R.⁴ Fidler, S.⁵

21
- 85040907831
- Neural amr: Sequence-to-sequence models for parsing and generation
- Long Papers, volume 1
- I. Konstas, S. Iyer, M. Yatskar, Y. Choi, and L. Zettlemoyer. Neural amr: Sequence-to-sequence models for parsing and generation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 146-157, 2017.
- (2017) Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics , vol.1 , pp. 146-157
- Konstas, I.¹ Iyer, S.² Yatskar, M.³ Choi, Y.⁴ Zettlemoyer, L.⁵

22
- 85011596790
- Visual genome: Connecting language and vision using crowdsourced dense image annotations
- R. Krishna, Y. Zhu, O. Groth, J. Johnson, K. Hata, J. Kravitz, S. Chen, Y. Kalantidis, L.-J. Li, D. A. Shamma, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 123(1): 32-73, 2017.
- (2017) International Journal of Computer Vision , vol.123 , Issue.1 , pp. 32-73
- Krishna, R.¹ Zhu, Y.² Groth, O.³ Johnson, J.⁴ Hata, K.⁵ Kravitz, J.⁶ Chen, S.⁷ Kalantidis, Y.⁸ Li, L.-J.⁹ Shamma, D.A.¹⁰

23
- 85041906062
- Vip-cnn: Visual phrase guided convolutional neural network
- IEEE
- Y. Li, W. Ouyang, X. Wang, et al. Vip-cnn: Visual phrase guided convolutional neural network. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 7244-7253. IEEE, 2017.
- (2017) 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7244-7253
- Li, Y.¹ Ouyang, W.² Wang, X.³

24
- 85044466079
- arXiv: 1702.07191 [cs], Feb. ArXiv: 1702.07191
- Y. Li, W. Ouyang, X. Wang, and X. Tang. ViPCNN: Visual Phrase Guided Convolutional Neural Network. ArXiv: 1702.07191 [cs], Feb. 2017. ArXiv: 1702.07191.
- (2017) ViPCNN: Visual Phrase Guided Convolutional Neural Network.
- Li, Y.¹ Ouyang, W.² Wang, X.³ Tang, X.⁴

25
- 85041915815
- Scene graph generation from objects, phrases and region captions
- Y. Li, W. Ouyang, B. Zhou, K. Wang, and X. Wang. Scene graph generation from objects, phrases and region captions. In Proceedings of the IEEE International Conference on Computer Vision, 2017.
- (2017) Proceedings of the IEEE International Conference on Computer Vision
- Li, Y.¹ Ouyang, W.² Zhou, B.³ Wang, K.⁴ Wang, X.⁵

26
- 78149310629
- What, where and who? Classifying events by scene and object recognition
- L.-J. Li et al. What, where and who? classifying events by scene and object recognition. In CVPR, 2007.
- (2007) CVPR
- Li, L.-J.¹

27
- 85018475248
- arXiv: 1703.03054 [cs], Mar. ArXiv: 1703.03054
- X. Liang, L. Lee, and E. P. Xing. Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection. ArXiv: 1703.03054 [cs], Mar. 2017. ArXiv: 1703.03054.
- (2017) Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection.
- Liang, X.¹ Lee, L.² Xing, E.P.³

28
- 84937834115
- Microsoft coco: Common objects in context
- T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft coco: Common objects in context. In European Conference on Computer Vision (ECCV), Zürich, 2014.
- (2014) European Conference on Computer Vision (ECCV), Zürich
- Lin, T.-Y.¹ Maire, M.² Belongie, S.³ Hays, J.⁴ Perona, P.⁵ Ramanan, D.⁶ Dollár, P.⁷ Zitnick, C.L.⁸

29
- 85035234967
- Visual relationship detection with language priors
- C. Lu, R. Krishna, M. Bernstein, and L. Fei-Fei. Visual relationship detection with language priors. In European Conference on Computer Vision, 2016.
- (2016) European Conference on Computer Vision
- Lu, C.¹ Krishna, R.² Bernstein, M.³ Fei-Fei, L.⁴

30
- 70450177757
- Actions in context
- IEEE
- M. Marszalek, I. Laptev, and C. Schmid. Actions in context. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 2929-2936. IEEE, 2009.
- (2009) Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on , pp. 2929-2936
- Marszalek, M.¹ Laptev, I.² Schmid, C.³

31
- 85047015627
- Pixels to graphs by associative embedding
- A. Newell and J. Deng. Pixels to graphs by associative embedding. In Advances in neural information processing systems, 2017.
- (2017) Advances in Neural Information Processing Systems
- Newell, A.¹ Deng, J.²

32
- 84990043973
- arXiv preprint arXiv: 1505.04870
- B. Plummer, L. Wang, C. Cervantes, J. Caicedo, J. Hockenmaier, and S. Lazebnik. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. ArXiv preprint arXiv: 1505.04870, 2015.
- (2015) Flickr30k Entities: Collecting Region-to-phrase Correspondences for Richer Image-to-sentence Models.
- Plummer, B.¹ Wang, L.² Cervantes, C.³ Caicedo, J.⁴ Hockenmaier, J.⁵ Lazebnik, S.⁶

33
- 50649096757
- Objects in context
- IEEE
- A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. In Computer vision, 2007. ICCV 2007. IEEE 11th international conference on, pages 1-8. IEEE, 2007.
- (2007) Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on , pp. 1-8
- Rabinovich, A.¹ Vedaldi, A.² Galleguillos, C.³ Wiewiora, E.⁴ Belongie, S.⁵

34
- 85021846258
- arXiv: 1612.08242 [cs], Dec. ArXiv: 1612.08242
- J. Redmon and A. Farhadi. YOLO9000: Better, Faster, Stronger. ArXiv: 1612.08242 [cs], Dec. 2016. ArXiv: 1612.08242.
- (2016) YOLO9000: Better, Faster, Stronger.
- Redmon, J.¹ Farhadi, A.²

35
- 84990028830
- arXiv preprint arXiv: 1605.09410
- M. Ren and R. S. Zemel. End-to-end instance segmentation and counting with recurrent attention. ArXiv preprint arXiv: 1605.09410, 2016.
- (2016) End-to-end Instance Segmentation and Counting with Recurrent Attention.
- Ren, M.¹ Zemel, R.S.²

36
- 84955283951
- arXiv: 1506.01497 [cs], June. ArXiv: 1506.01497
- S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. ArXiv: 1506.01497 [cs], June 2015. ArXiv: 1506.01497.
- (2015) Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
- Ren, S.¹ He, K.² Girshick, R.³ Sun, J.⁴

37
- 84962816362
- arXiv preprint arXiv: 1505.02074
- M. Ren et al. Image question answering: A visual semantic embedding model and a new dataset. ArXiv preprint arXiv: 1505.02074, 2015.
- (2015) Image Question Answering: A Visual Semantic Embedding Model and A New Dataset.
- Ren, M.¹

38
- 0015064542
- Edge and curve detection for visual scene analysis
- May
- A. Rosenfeld and M. Thurston. Edge and curve detection for visual scene analysis. IEEE Trans. Comput., 20(5): 562-569, May 1971.
- (1971) IEEE Trans. Comput. , vol.20 , Issue.5 , pp. 562-569
- Rosenfeld, A.¹ Thurston, M.²

39
- 84959184467
- Viske: Visual knowledge extraction and question answering by visual verification of relation phrases
- F. Sadeghi, S. K. Divvala, and A. Farhadi. Viske: Visual knowledge extraction and question answering by visual verification of relation phrases. In Conference on Computer Vision and Pattern Recognition, pages 1456-1464, 2015.
- (2015) Conference on Computer Vision and Pattern Recognition , pp. 1456-1464
- Sadeghi, F.¹ Divvala, S.K.² Farhadi, A.³

40
- 84933585162
- Very deep convolutional networks for large-scale image recognition
- abs/1409.1556
- K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
- (2014) CoRR
- Simonyan, K.¹ Zisserman, A.²

41
- 84965164720
- Training very deep networks
- NIPS'15, Cambridge, MA, USA. MIT Press
- R. K. Srivastava, K. Greff, and J. Schmidhuber. Training very deep networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems-Volume 2, NIPS'15, pages 2377-2385, Cambridge, MA, USA, 2015. MIT Press.
- (2015) Proceedings of the 28th International Conference on Neural Information Processing Systems , vol.2 , pp. 2377-2385
- Srivastava, R.K.¹ Greff, K.² Schmidhuber, J.³

42
- 84858436215
- Approaching the symbol grounding problem with probabilistic graphical models
- S. Tellex, T. Kollar, S. Dickerson, M. R. Walter, A. G. Banerjee, S. Teller, and N. Roy. Approaching the symbol grounding problem with probabilistic graphical models. AI magazine, 32(4): 64-76, 2011.
- (2011) AI Magazine , vol.32 , Issue.4 , pp. 64-76
- Tellex, S.¹ Kollar, T.² Dickerson, S.³ Walter, M.R.⁴ Banerjee, A.G.⁵ Teller, S.⁶ Roy, N.⁷

43
- 85040312182
- Graph-structured representations for visual question answering
- D. Teney, L. Liu, and A. Van den Hengel. Graph-structured representations for visual question answering. CVPR, 2017.
- (2017) CVPR
- Teney, D.¹ Liu, L.² Van Den Hengel, A.³

44
- 84965136196
- Grammar as a foreign language
- O. Vinyals, L. Kaiser, T. Koo, S. Petrov, I. Sutskever, and G. Hinton. Grammar as a foreign language. In Advances in Neural Information Processing Systems, pages 2773-2781, 2015.
- (2015) Advances in Neural Information Processing Systems , pp. 2773-2781
- Vinyals, O.¹ Kaiser, L.² Koo, T.³ Petrov, S.⁴ Sutskever, I.⁵ Hinton, G.⁶

45
- 84946747440
- Show and tell: A neural image caption generator
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3156-3164, 2015.
- (2015) 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3156-3164
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

46
- 84986313796
- Cnn-rnn: A unified framework for multi-label image classification
- J. Wang, Y. Yang, J. Mao, Z. Huang, C. Huang, and W. Xu. Cnn-rnn: A unified framework for multi-label image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2285-2294, 2016.
- (2016) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pp. 2285-2294
- Wang, J.¹ Yang, Y.² Mao, J.³ Huang, Z.⁴ Huang, C.⁵ Xu, W.⁶

47
- 85041900712
- arXiv: 1701.02426 [cs], Jan. ArXiv: 1701.02426
- D. Xu, Y. Zhu, C. B. Choy, and L. Fei-Fei. Scene Graph Generation by Iterative Message Passing. ArXiv: 1701.02426 [cs], Jan. 2017. ArXiv: 1701.02426.
- (2017) Scene Graph Generation by Iterative Message Passing.
- Xu, D.¹ Zhu, Y.² Choy, C.B.³ Fei-Fei, L.⁴

48
- 77955988492
- Modeling mutual context of object and human pose in human-object interaction activities
- B. Yao et al. Modeling mutual context of object and human pose in human-object interaction activities. In CVPR, 2010.
- (2010) CVPR
- Yao, B.¹

49
- 84994129838
- Stating the obvious: Extracting visual common sense knowledge
- M. Yatskar, V. Ordonez, and A. Farhadi. Stating the obvious: Extracting visual common sense knowledge. In Proceedings of NAACL, 2016.
- (2016) Proceedings of NAACL
- Yatskar, M.¹ Ordonez, V.² Farhadi, A.³

50
- 84986247420
- Situation recognition: Visual semantic role labeling for image understanding
- June
- M. Yatskar, L. Zettlemoyer, and A. Farhadi. Situation Recognition: Visual Semantic Role Labeling for Image Understanding. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
- (2016) The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Yatskar, M.¹ Zettlemoyer, L.² Farhadi, A.³

51
- 85045896626
- Obj2text: Generating visually descriptive language from object layouts
- X. Yin and V. Ordonez. Obj2text: Generating visually descriptive language from object layouts. In EMNLP, 2017.
- (2017) EMNLP
- Yin, X.¹ Ordonez, V.²

52
- 85045896626
- Obj2text: Generating visually descriptive language from object layouts
- X. Yin and V. Ordonez. Obj2text: Generating visually descriptive language from object layouts. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017.
- (2017) Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Yin, X.¹ Ordonez, V.²

53
- 84959862697
- arXiv preprint arXiv: 1506.00278
- L. Yu et al. Visual madlibs: Fill in the blank image generation and question answering. ArXiv preprint arXiv: 1506.00278, 2015.
- (2015) Visual Madlibs: Fill in the Blank Image Generation and Question Answering.
- Yu, L.¹

54
- 85041910221
- Visual relationship detection with internal and external linguistic knowledge distillation
- Oct
- R. Yu, A. Li, V. I. Morariu, and L. S. Davis. Visual relationship detection with internal and external linguistic knowledge distillation. In The IEEE International Conference on Computer Vision (ICCV), Oct 2017.
- (2017) The IEEE International Conference on Computer Vision (ICCV)
- Yu, R.¹ Li, A.² Morariu, V.I.³ Davis, L.S.⁴

55
- 85073149778
- Zero-shot activity recognition with verb attribute induction
- R. Zellers and Y. Choi. Zero-shot activity recognition with verb attribute induction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017.
- (2017) Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Zellers, R.¹ Choi, Y.²

56
- 85030471090
- arXiv: 1606.09239 [cs], June. ArXiv: 1606.09239
- H. Zhang, Z. Hu, Y. Deng, M. Sachan, Z. Yan, and E. P. Xing. Learning Concept Taxonomies from Multimodal Data. ArXiv: 1606.09239 [cs], June 2016. ArXiv: 1606.09239.
- (2016) Learning Concept Taxonomies from Multimodal Data.
- Zhang, H.¹ Hu, Z.² Deng, Y.³ Sachan, M.⁴ Yan, Z.⁵ Xing, E.P.⁶

57
- 85029388674
- Visual translation embedding network for visual relation detection
- H. Zhang, Z. Kyaw, S.-F. Chang, and T.-S. Chua. Visual translation embedding network for visual relation detection. CVPR, 2017.
- (2017) CVPR
- Zhang, H.¹ Kyaw, Z.² Chang, S.-F.³ Chua, T.-S.⁴

58
- 84973358602
- Highway long short-term memory rnns for distant speech recognition
- March
- Y. Zhang, G. Chen, D. Yu, K. Yao, S. Khudanpur, and J. Glass. Highway long short-term memory rnns for distant speech recognition. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5755-5759, March 2016.
- (2016) 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5755-5759
- Zhang, Y.¹ Chen, G.² Yu, D.³ Yao, K.⁴ Khudanpur, S.⁵ Glass, J.⁶

59
- 84906493890
- Reasoning about object affordances in a knowledge base representation
- Springer
- Y. Zhu, A. Fathi, and L. Fei-Fei. Reasoning about object affordances in a knowledge base representation. In European conference on computer vision, pages 408-424. Springer, 2014.
- (2014) European Conference on Computer Vision , pp. 408-424
- Zhu, Y.¹ Fathi, A.² Fei-Fei, L.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.