SCOPUS Information Search Platform

Proceedings of the IEEE International Conference on Computer Vision

Volume 2017-October, 2017, Pages 4243-4251

PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN

(4) Zhang, Hanwang (a); Kyaw, Zawlin (b); Yu, Jinyang (a); Chang, Shih Fu (a)

a Howard Hughes Medical Institute (United States)

b NATIONAL UNIVERSITY OF SINGAPORE (Singapore)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION;

CONVOLUTIONAL NETWORKS; LOCAL OPTIMAL SOLUTION; OBJECT-RELATIONS; REGION-BASED; SPATIAL CONTEXT;

OBJECT DETECTION;

EID: 85035223616 PISSN: 15505499 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICCV.2017.454 Document Type: Conference Paper

Times cited: 159

References (50)

1
- 84943794851
- Leveraging linguistic structure for open domain information extraction
- G. Angeli, M. J. Premkumar, and C. D. Manning. Leveraging linguistic structure for open domain information extraction. In ACL, 2015
- (2015) ACL
- Angeli, G.¹ Premkumar, M.J.² Manning, C.D.³

2
- 85041922388
- Learning to generalize to new compositions in image understanding
- Y. Atzmon, J. Berant, V. Kezami, A. Globerson, and G. Chechik. Learning to generalize to new compositions in image understanding. In EMNLP, 2016
- (2016) EMNLP
- Atzmon, Y.¹ Berant, J.² Kezami, V.³ Globerson, A.⁴ Chechik, G.⁵

3
- 84986269551
- Weakly supervised deep detection networks
- H. Bilen and A. Vedaldi. Weakly supervised deep detection networks. In CVPR, 2016
- (2016) CVPR
- Bilen, H.¹ Vedaldi, A.²

4
- 84973868179
- Hico: A benchmark for recognizing human-object interactions in images
- Y.-W. Chao, Z. Wang, Y. He, J. Wang, and J. Deng. Hico: A benchmark for recognizing human-object interactions in images. In ICCV, 2015
- (2015) ICCV
- Chao, Y.-W.¹ Wang, Z.² He, Y.³ Wang, J.⁴ Deng, J.⁵

5
- 85029348551
- Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning
- L. Chen, H. Zhang, J. Xiao, L. Nie, J. Shao, W. Liu, and T.-S. Chua. Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning. In CVPR, 2017
- (2017) CVPR
- Chen, L.¹ Zhang, H.² Xiao, J.³ Nie, L.⁴ Shao, J.⁵ Liu, W.⁶ Chua, T.-S.⁷

6
- 85003782026
- Weakly supervised object localization with multi-fold multiple instance learning
- R. G. Cinbis, J. Verbeek, and C. Schmid. Weakly supervised object localization with multi-fold multiple instance learning. TPAMI, 2017
- (2017) TPAMI
- Cinbis, R.G.¹ Verbeek, J.² Schmid, C.³

7
- 85041892861
- Detecting visual relationships with deep relational networks
- B. Dai, Y. Zhang, and D. Lin. Detecting visual relationships with deep relational networks. In CVPR, 2017
- (2017) CVPR
- Dai, B.¹ Zhang, Y.² Lin, D.³

8
- 84877748784
- Detecting actions, poses, and objects with relational phraselets
- C. Desai and D. Ramanan. Detecting actions, poses, and objects with relational phraselets. In ECCV, 2012
- (2012) ECCV
- Desai, C.¹ Ramanan, D.²

9
- 84898798806
- Restoring an image taken through a window covered with dirt or rain
- D. Eigen, D. Krishnan, and R. Fergus. Restoring an image taken through a window covered with dirt or rain. In ICCV, 2013
- (2013) ICCV
- Eigen, D.¹ Krishnan, D.² Fergus, R.³

10
- 85029359197
- Fast r-cnn
- R. Girshick. Fast r-cnn. In ICCV, 2015
- (2015) ICCV
- Girshick, R.¹

11
- 70450155469
- Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers
- A. Gupta and L. S. Davis. Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers. In ECCV, 2008
- (2008) ECCV
- Gupta, A.¹ Davis, L.S.²

12
- 69549121743
- Observing human-object interactions: Using spatial and functional compatibility for recognition
- A. Gupta, A. Kembhavi, and L. S. Davis. Observing human-object interactions: Using spatial and functional compatibility for recognition. TPAMI, 2009
- (2009) TPAMI
- Gupta, A.¹ Kembhavi, A.² Davis, L.S.³

13
- 84978717864
- Deep residual learning for image recognition
- K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. 2016
- (2016) Deep Residual Learning for Image Recognition
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

14
- 85041917330
- Modeling relationships in referential expressions with compositional modular networks
- R. Hu, M. Rohrbach, J. Andreas, T. Darrell, and K. Saenko. Modeling relationships in referential expressions with compositional modular networks. arXiv preprint arXiv:1611.09978, 2016
- (2016) arXiv preprint arXiv:1611.09978
- Hu, R.¹ Rohrbach, M.² Andreas, J.³ Darrell, T.⁴ Saenko, K.⁵

15
- 85041929043
- Modeling relationships in referential expressions with compositional modular networks
- R. Hu, M. Rohrbach, J. Andreas, T. Darrell, and K. Saenko. Modeling relationships in referential expressions with compositional modular networks. In CVPR, 2017
- (2017) CVPR
- Hu, R.¹ Rohrbach, M.² Andreas, J.³ Darrell, T.⁴ Saenko, K.⁵

16
- 85040949959
- Deep self-taught learning for weakly supervised object localization
- Z. Jie, Y. Wei, X. Jin, J. Feng, and W. Liu. Deep self-taught learning for weakly supervised object localization. In CVPR, 2017
- (2017) CVPR
- Jie, Z.¹ Wei, Y.² Jin, X.³ Feng, J.⁴ Liu, W.⁵

17
- 84959233256
- Image retrieval using scene graphs
- J. Johnson, R. Krishna, M. Stark, L.-J. Li, D. A. Shamma, M. S. Bernstein, and L. Fei-Fei. Image retrieval using scene graphs. In CVPR, 2015
- (2015) CVPR
- Johnson, J.¹ Krishna, R.² Stark, M.³ Li, L.-J.⁴ Shamma, D.A.⁵ Bernstein, M.S.⁶ Fei-Fei, L.⁷

18
- 85021823117
- Contextlocnet: Context-aware deep network models for weakly supervised localization
- V. Kantorov, M. Oquab, M. Cho, and I. Laptev. Contextlocnet: Context-aware deep network models for weakly supervised localization. In ECCV, 2016
- (2016) ECCV
- Kantorov, V.¹ Oquab, M.² Cho, M.³ Laptev, I.⁴

19
- 84941620184
- Adam: A method for stochastic optimization
- D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014
- (2014) arXiv preprint arXiv:1412.6980
- Kingma, D.¹ Ba, J.²

20
- 84990070438
- Visual genome: Connecting language and vision using crowdsourced dense image annotations
- R. Krishna, Y. Zhu, O. Groth, J. Johnson, K. Hata, J. Kravitz, S. Chen, Y. Kalantidis, L.-J. Li, D. A. Shamma, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations. IJCV, 2016
- (2016) IJCV
- Krishna, R.¹ Zhu, Y.² Groth, O.³ Johnson, J.⁴ Hata, K.⁵ Kravitz, J.⁶ Chen, S.⁷ Kalantidis, Y.⁸ Li, L.-J.⁹ Shamma, D.A.¹⁰

21
- 85161967298
- Self-paced learning for latent variable models
- M. P. Kumar, B. Packer, and D. Koller. Self-paced learning for latent variable models. In NIPS, 2010
- (2010) NIPS
- Kumar, M.P.¹ Packer, B.² Koller, D.³

22
- 84986317248
- Weakly supervised object localization with progressive domain adaptation
- D. Li, J.-B. Huang, Y. Li, S. Wang, and M.-H. Yang. Weakly supervised object localization with progressive domain adaptation. In CVPR, 2016
- (2016) CVPR
- Li, D.¹ Huang, J.-B.² Li, Y.³ Wang, S.⁴ Yang, M.-H.⁵

23
- 85018938177
- R-fcn: Object detection via region-based fully convolutional networks
- Y. Li, K. He, J. Sun, et al. R-fcn: Object detection via region-based fully convolutional networks. In NIPS, 2016
- (2016) NIPS
- Li, Y.¹ He, K.² Sun, J.³

24
- 85041906062
- Vip-cnn: Visual phrase guided convolutional neural network
- Y. Li, W. Ouyang, and X. Wang. Vip-cnn: Visual phrase guided convolutional neural network. In CVPR, 2017
- (2017) CVPR
- Li, Y.¹ Ouyang, W.² Wang, X.³

25
- 85041915815
- Scene graph generation from objects, phrases and region captions
- Y. Li, W. Ouyang, B. Zhou, K. Wang, and X. Wang. Scene graph generation from objects, phrases and region captions. In ICCV, 2017
- (2017) ICCV
- Li, Y.¹ Ouyang, W.² Zhou, B.³ Wang, K.⁴ Wang, X.⁵

26
- 84937834115
- Microsoft coco: Common objects in context
- T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, and C. L. Zitnick. Microsoft coco: Common objects in context. In ECCV, 2014
- (2014) ECCV
- Lin, T.-Y.¹ Maire, M.² Belongie, S.³ Hays, J.⁴ Perona, P.⁵ Ramanan, D.⁶ Dollar, P.⁷ Zitnick, C.L.⁸

27
- 85035233030
- Surveillance video parsing with single frame supervision
- S. Liu, C. Wang, R. Qian, H. Yu, R. Bao, and Y. Sun. Surveillance video parsing with single frame supervision. In CVPR, 2017
- (2017) CVPR
- Liu, S.¹ Wang, C.² Qian, R.³ Yu, H.⁴ Bao, R.⁵ Sun, Y.⁶

28
- 84959205572
- Fully convolutional networks for semantic segmentation
- J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015
- (2015) CVPR
- Long, J.¹ Shelhamer, E.² Darrell, T.³

29
- 85035234967
- Visual relationship detection with language priors
- C. Lu, R. Krishna, M. Bernstein, and L. Fei-Fei. Visual relationship detection with language priors. In ECCV, 2016
- (2016) ECCV
- Lu, C.¹ Krishna, R.² Bernstein, M.³ Fei-Fei, L.⁴

30
- 84898935332
- A framework for multiple-instance learning
- O. Maron and T. Lozano-Perez. A framework for multiple-instance learning. In NIPS, 1998
- (1998) NIPS
- Maron, O.¹ Lozano-Perez, T.²

31
- 85021826252
- Modeling context between objects for referring expression understanding
- V. K. Nagaraja, V. I. Morariu, and L. S. Davis. Modeling context between objects for referring expression understanding. In ECCV, 2016
- (2016) ECCV
- Nagaraja, V.K.¹ Morariu, V.I.² Davis, L.S.³

32
- 84856142160
- Weakly supervised learning of interactions between humans and objects
- A. Prest, C. Schmid, and V. Ferrari. Weakly supervised learning of interactions between humans and objects. TPAMI, 2012
- (2012) TPAMI
- Prest, A.¹ Schmid, C.² Ferrari, V.³

33
- 84959233994
- Learning semantic relationships for better action retrieval in images
- V. Ramanathan, C. Li, J. Deng, W. Han, Z. Li, K. Gu, Y. Song, S. Bengio, C. Rosenberg, and L. Fei-Fei. Learning semantic relationships for better action retrieval in images. In CVPR, 2015
- (2015) CVPR
- Ramanathan, V.¹ Li, C.² Deng, J.³ Han, W.⁴ Li, Z.⁵ Gu, K.⁶ Song, Y.⁷ Bengio, S.⁸ Rosenberg, C.⁹ Fei-Fei, L.¹⁰

34
- 84960980241
- Faster r-cnn: Towards realtime object detection with region proposal networks
- S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards realtime object detection with region proposal networks. In NIPS, 2015
- (2015) NIPS
- Ren, S.¹ He, K.² Girshick, R.³ Sun, J.⁴

35
- 84990024294
- Grounding of textual phrases in images by reconstruction
- A. Rohrbach, M. Rohrbach, R. Hu, T. Darrell, and B. Schiele. Grounding of textual phrases in images by reconstruction. In ECCV, 2016
- (2016) ECCV
- Rohrbach, A.¹ Rohrbach, M.² Hu, R.³ Darrell, T.⁴ Schiele, B.⁵

36
- 80052889458
- Recognition using visual phrases
- M. A. Sadeghi and A. Farhadi. Recognition using visual phrases. In CVPR, 2011
- (2011) CVPR
- Sadeghi, M.A.¹ Farhadi, A.²

37
- 85123605149
- Generating semantically precise scene graphs from textual descriptions for improved image retrieval
- S. Schuster, R. Krishna, A. Chang, L. Fei-Fei, and C. D. Manning. Generating semantically precise scene graphs from textual descriptions for improved image retrieval. In Workshop on Vision and Language, 2015
- (2015) Workshop on Vision and Language
- Schuster, S.¹ Krishna, R.² Chang, A.³ Fei-Fei, L.⁴ Manning, C.D.⁵

38
- 84919792468
- On learning to localize objects with minimal supervision
- H. O. Song, R. B. Girshick, S. Jegelka, J. Mairal, Z. Harchaoui, T. Darrell, et al. On learning to localize objects with minimal supervision. In ICML, pages 1611-1619, 2014
- (2014) ICML , pp. 1611-1619
- Song, H.O.¹ Girshick, R.B.² Jegelka, S.³ Mairal, J.⁴ Harchaoui, Z.⁵ Darrell, T.⁶

39
- 84937522268
- Going deeper with convolutions
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015
- (2015) CVPR
- Szegedy, C.¹ Liu, W.² Jia, Y.³ Sermanet, P.⁴ Reed, S.⁵ Anguelov, D.⁶ Erhan, D.⁷ Vanhoucke, V.⁸ Rabinovich, A.⁹

40
- 84957922397
- Yfcc100m: The new data in multimedia research
- B. Thomee, D. A. Shamma, G. Friedland, B. Elizalde, K. Ni, D. Poland, D. Borth, and L.-J. Li. Yfcc100m: The new data in multimedia research. Communications of the ACM, 2016
- (2016) Communications of the ACM
- Thomee, B.¹ Shamma, D.A.² Friedland, G.³ Elizalde, B.⁴ Ni, K.⁵ Poland, D.⁶ Borth, D.⁷ Li, L.-J.⁸

41
- 84881160857
- Selective search for object recognition
- J. R. Uijlings, K. E. Van Sande, T. Gevers, and A. W. Smeulders. Selective search for object recognition. IJCV, 2013
- (2013) IJCV
- Uijlings, J.R.¹ Van Sande, K.E.² Gevers, T.³ Smeulders, A.W.⁴

42
- 84946747440
- Show and tell: A neural image caption generator
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015
- (2015) CVPR
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

43
- 84956604127
- Weakly supervised object localization with latent category learning
- C. Wang, W. Ren, K. Huang, and T. Tan. Weakly supervised object localization with latent category learning. In ECCV, 2014
- (2014) ECCV
- Wang, C.¹ Ren, W.² Huang, K.³ Tan, T.⁴

44
- 84986320870
- Ask me anything: Free-form visual question answering based on knowledge from external sources
- Q. Wu, P. Wang, C. Shen, A. Dick, and A. van den Hengel. Ask me anything: Free-form visual question answering based on knowledge from external sources. In CVPR, 2016
- (2016) CVPR
- Wu, Q.¹ Wang, P.² Shen, C.³ Dick, A.⁴ van den Hengel, A.⁵

45
- 77955988492
- Modeling mutual context of object and human pose in human-object interaction activities
- B. Yao and L. Fei-Fei. Modeling mutual context of object and human pose in human-object interaction activities. In CVPR, 2010
- (2010) CVPR
- Yao, B.¹ Fei-Fei, L.²

46
- 84986247420
- Situation recognition: Visual semantic role labeling for image understanding
- M. Yatskar, L. Zettlemoyer, and A. Farhadi. Situation recognition: Visual semantic role labeling for image understanding. In CVPR, 2016
- (2016) CVPR
- Yatskar, M.¹ Zettlemoyer, L.² Farhadi, A.³

47
- 84990061297
- Modeling context in referring expressions
- L. Yu, P. Poirson, S. Yang, A. C. Berg, and T. L. Berg. Modeling context in referring expressions. In ECCV, 2016
- (2016) ECCV
- Yu, L.¹ Poirson, P.² Yang, S.³ Berg, A.C.⁴ Berg, T.L.⁵

48
- 85029388674
- Visual translation embedding network for visual relation detection
- H. Zhang, Z. Kyaw, S.-F. Chang, and T.-S. Chua. Visual translation embedding network for visual relation detection. In CVPR, 2017
- (2017) CVPR
- Zhang, H.¹ Kyaw, Z.² Chang, S.-F.³ Chua, T.-S.⁴

49
- 85041918005
- Relationship proposal networks
- J. Zhang, M. Elhoseiny, S. Cohen, W. Chang, and A. Elgammal. Relationship proposal networks. In CVPR, 2017
- (2017) CVPR
- Zhang, J.¹ Elhoseiny, M.² Cohen, S.³ Chang, W.⁴ Elgammal, A.⁵

50
- 84952018709
- Edge boxes: Locating object proposals from edges
- C. L. Zitnick and P. Dollar. Edge boxes: Locating object proposals from edges. In ECCV, 2014
- (2014) ECCV
- Zitnick, C.L.¹ Dollar, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.