SCOPUS Information Search Platform

Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017

Volume 2017-January, 2017, Pages 3107-3115

Visual translation embedding network for visual relation detection

(4) Zhang, Hanwang a; Kyaw, Zawlin b; Chang, Shih Fu a; Chua, Tat Seng b

a Columbia University (United States)

b National University of Singapore (Singapore)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; CONVOLUTION; KNOWLEDGE MANAGEMENT; OBJECT DETECTION; TRANSLATION (LANGUAGES); VECTOR SPACES;

COMBINATORIAL COMPLEXITY; DETECTION NETWORKS; KNOWLEDGE TRANSFER; LARGE-SCALE DATASETS; RELATIONAL REPRESENTATIONS; SCENE UNDERSTANDING; STATE-OF-THE-ART METHODS; VISUAL TRANSLATION;

VISUAL LANGUAGES;

EID: 85029388674 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CVPR.2017.331 Document Type: Conference Paper

Times cited : (506)

References (44)

1
- 84985013144
- Deep compositional question answering with neural module networks
- J. Andreas, M. Rohrbach, T. Darrell, and D. Klein. Deep compositional question answering with neural module networks. In CVPR, 2016.
- (2016) CVPR
- Andreas, J.¹ Rohrbach, M.² Darrell, T.³ Klein, D.⁴

2
- 84973890960
- Vqa: Visual question answering
- S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. Lawrence Zitnick, and D. Parikh. Vqa: Visual question answering. In ICCV, 2015.
- (2015) ICCV
- Antol, S.¹ Agrawal, A.² Lu, J.³ Mitchell, M.⁴ Batra, D.⁵ Lawrence Zitnick, C.⁶ Parikh, D.⁷

3
- 85041922388
- Learning to generalize to new compositions in image understanding
- Y. Atzmon, J. Berant, V. Kezami, A. Globerson, and G. Chechik. Learning to generalize to new compositions in image understanding. In EMNLP, 2016.
- (2016) EMNLP
- Atzmon, Y.¹ Berant, J.² Kezami, V.³ Globerson, A.⁴ Chechik, G.⁵

4
- 84960130911
- Automatic description generation from images: A survey of models, datasets, and evaluation measures
- R. Bernardi, R. Cakici, D. Elliott, A. Erdem, E. Erdem, N. Ikizler-Cinbis, F. Keller, A. Muscat, and B. Plank. Automatic description generation from images: A survey of models, datasets, and evaluation measures. JAIR, 2016.
- (2016) JAIR
- Bernardi, R.¹ Cakici, R.² Elliott, D.³ Erdem, A.⁴ Erdem, E.⁵ Ikizler-Cinbis, N.⁶ Keller, F.⁷ Muscat, A.⁸ Plank, B.⁹

5
- 84899013802
- Translating embeddings for modeling multirelational data
- A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, and O. Yakhnenko. Translating embeddings for modeling multirelational data. In NIPS, 2013.
- (2013) NIPS
- Bordes, A.¹ Usunier, N.² Garcia-Duran, A.³ Weston, J.⁴ Yakhnenko, O.⁵

6
- 0033778373
- Representation of manipulable man-made objects in the dorsal stream
- L. L. Chao and A. Martin. Representation of manipulable man-made objects in the dorsal stream. Neuroimage, 2000.
- (2000) Neuroimage
- Chao, L.L.¹ Martin, A.²

7
- 79953187637
- Discriminative models for multi-class object layout
- C. Desai, D. Ramanan, and C. C. Fowlkes. Discriminative models for multi-class object layout. IJCV, 2011.
- (2011) IJCV
- Desai, C.¹ Ramanan, D.² Fowlkes, C.C.³

8
- 84943769848
- Question answering over freebase with multi-column convolutional neural networks
- L. Dong, F. Wei, M. Zhou, and K. Xu. Question answering over freebase with multi-column convolutional neural networks. In ACL, 2015.
- (2015) ACL
- Dong, L.¹ Wei, F.² Zhou, M.³ Xu, K.⁴

9
- 80052017343
- Every picture tells a story: Generating sentences from images
- A. Farhadi, M. Hejrati, M. A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences from images. In ECCV, 2010.
- (2010) ECCV
- Farhadi, A.¹ Hejrati, M.² Sadeghi, M.A.³ Young, P.⁴ Rashtchian, C.⁵ Hockenmaier, J.⁶ Forsyth, D.⁷

10
- 77955422240
- Object detection with discriminatively trained part-based models
- P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 2010.
- (2010) TPAMI
- Felzenszwalb, P.F.¹ Girshick, R.B.² McAllester, D.³ Ramanan, D.⁴

11
- 85029359197
- Fast r-cnn
- R. Girshick. Fast r-cnn. In ICCV, 2015.
- (2015) ICCV
- Girshick, R.¹

12
- 84911400494
- Rich feature hierarchies for accurate object detection and semantic segmentation
- R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
- (2014) CVPR
- Girshick, R.¹ Donahue, J.² Darrell, T.³ Malik, J.⁴

13
- 84965100881
- Draw: A recurrent neural network for image generation
- K. Gregor, I. Danihelka, A. Graves, D. J. Rezende, and D. Wierstra. Draw: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623, 2015.
- (2015) arXiv preprint arXiv:1502.04623
- Gregor, K.¹ Danihelka, I.² Graves, A.³ Rezende, D.J.⁴ Wierstra, D.⁵

14
- 70450155469
- Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers
- A. Gupta and L. S. Davis. Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers. In ECCV, 2008.
- (2008) ECCV
- Gupta, A.¹ Davis, L.S.²

15
- 69549121743
- Observing human-object interactions: Using spatial and functional compatibility for recognition
- A. Gupta, A. Kembhavi, and L. S. Davis. Observing human-object interactions: Using spatial and functional compatibility for recognition. TPAMI, 2009.
- (2009) TPAMI
- Gupta, A.¹ Kembhavi, A.² Davis, L.S.³

16
- 84978717864
- Deep residual learning for image recognition
- K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. 2016.
- (2016)
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

17
- 85041926703
- Revisiting visual question answering baselines
- A. Jabri, A. Joulin, and L. van der Maaten. Revisiting visual question answering baselines. In ECCV, 2016.
- (2016) ECCV
- Jabri, A.¹ Joulin, A.² Van der Maaten, L.³

18
- 84965096967
- Spatial transformer networks
- M. Jaderberg, K. Simonyan, A. Zisserman, et al. Spatial transformer networks. In NIPS, 2015.
- (2015) NIPS
- Jaderberg, M.¹ Simonyan, K.² Zisserman, A.³

19
- 84986245786
- Densecap: Fully convolutional localization networks for dense captioning
- J. Johnson, A. Karpathy, and L. Fei-Fei. Densecap: Fully convolutional localization networks for dense captioning. In CVPR, 2016.
- (2016) CVPR
- Johnson, J.¹ Karpathy, A.² Fei-Fei, L.³

20
- 84959233256
- Image retrieval using scene graphs
- J. Johnson, R. Krishna, M. Stark, L.-J. Li, D. A. Shamma, M. S. Bernstein, and L. Fei-Fei. Image retrieval using scene graphs. In CVPR, 2015.
- (2015) CVPR
- Johnson, J.¹ Krishna, R.² Stark, M.³ Li, L.-J.⁴ Shamma, D.A.⁵ Bernstein, M.S.⁶ Fei-Fei, L.⁷

21
- 84946734827
- Deep visual-semantic alignments for generating image descriptions
- A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
- (2015) CVPR
- Karpathy, A.¹ Fei-Fei, L.²

22
- 84941620184
- Adam: A method for stochastic optimization
- D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- (2014) arXiv preprint arXiv:1412.6980
- Kingma, D.¹ Ba, J.²

23
- 84990070438
- Visual genome: Connecting language and vision using crowdsourced dense image annotations
- R. Krishna, Y. Zhu, O. Groth, J. Johnson, K. Hata, J. Kravitz, S. Chen, Y. Kalantidis, L.-J. Li, D. A. Shamma, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations. IJCV, 2016.
- (2016) IJCV
- Krishna, R.¹ Zhu, Y.² Groth, O.³ Johnson, J.⁴ Hata, K.⁵ Kravitz, J.⁶ Chen, S.⁷ Kalantidis, Y.⁸ Li, L.-J.⁹ Shamma, D.A.¹⁰

24
- 85041926899
- Deep variation-structured reinforcement learning for visual relationship and attribute detection
- X. Liang, L. Lee, and E. P. Xing. Deep variation-structured reinforcement learning for visual relationship and attribute detection. In CVPR, 2017.
- (2017) CVPR
- Liang, X.¹ Lee, L.² Xing, E.P.³

25
- 84952316342
- Learning entity and relation embeddings for knowledge graph completion
- Y. Lin, Z. Liu, M. Sun, Y. Liu, and X. Zhu. Learning entity and relation embeddings for knowledge graph completion. In AAAI, 2015.
- (2015) AAAI
- Lin, Y.¹ Liu, Z.² Sun, M.³ Liu, Y.⁴ Zhu, X.⁵

26
- 85011302702
- Ssd: Single shot multibox detector
- W. Liu, D. Anguelov, D. Erhan, C. Szegedy, and S. Reed. Ssd: Single shot multibox detector. In ECCV, 2016.
- (2016) ECCV
- Liu, W.¹ Anguelov, D.² Erhan, D.³ Szegedy, C.⁴ Reed, S.⁵

27
- 85035234967
- Visual relationship detection with language priors
- C. Lu, R. Krishna, M. Bernstein, and L. Fei-Fei. Visual relationship detection with language priors. In ECCV, 2016.
- (2016) ECCV
- Lu, C.¹ Krishna, R.² Bernstein, M.³ Fei-Fei, L.⁴

28
- 57249084011
- Visualizing data using t-sne
- L. v. d. Maaten and G. Hinton. Visualizing data using t-sne. JMLR, 2008.
- (2008) JMLR
- Maaten, L.V.D.¹ Hinton, G.²

29
- 85041909637
- Learning models for actions and person-object interactions with transfer to question answering
- A. Mallya and S. Lazebnik. Learning models for actions and person-object interactions with transfer to question answering. In ECCV, 2016.
- (2016) ECCV
- Mallya, A.¹ Lazebnik, S.²

30
- 84898956512
- Distributed representations of words and phrases and their compositionality
- T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013.
- (2013) NIPS
- Mikolov, T.¹ Sutskever, I.² Chen, K.³ Corrado, G.S.⁴ Dean, J.⁵

31
- 84943791201
- A review of relational machine learning for knowledge graphs
- M. Nickel, K. Murphy, V. Tresp, and E. Gabrilovich. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE, 2016.
- (2016) Proceedings of the IEEE
- Nickel, M.¹ Murphy, K.² Tresp, V.³ Gabrilovich, E.⁴

32
- 84973856017
- Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models
- B. A. Plummer, L. Wang, C. M. Cervantes, J. C. Caicedo, J. Hockenmaier, and S. Lazebnik. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. In ICCV, 2015.
- (2015) ICCV
- Plummer, B.A.¹ Wang, L.² Cervantes, C.M.³ Caicedo, J.C.⁴ Hockenmaier, J.⁵ Lazebnik, S.⁶

33
- 84959233994
- Learning semantic relationships for better action retrieval in images
- V. Ramanathan, C. Li, J. Deng, W. Han, Z. Li, K. Gu, Y. Song, S. Bengio, C. Rosenberg, and L. Fei-Fei. Learning semantic relationships for better action retrieval in images. In CVPR, 2015.
- (2015) CVPR
- Ramanathan, V.¹ Li, C.² Deng, J.³ Han, W.⁴ Li, Z.⁵ Gu, K.⁶ Song, Y.⁷ Bengio, S.⁸ Rosenberg, C.⁹ Fei-Fei, L.¹⁰

34
- 84986308404
- You only look once: Unified, real-time object detection
- J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. You only look once: Unified, real-time object detection. In CVPR, 2016.
- (2016) CVPR
- Redmon, J.¹ Divvala, S.² Girshick, R.³ Farhadi, A.⁴

35
- 84960980241
- Faster r-cnn: Towards real-time object detection with region proposal networks
- S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In NIPS, 2015.
- (2015) NIPS
- Ren, S.¹ He, K.² Girshick, R.³ Sun, J.⁴

36
- 84959184467
- Viske: Visual knowledge extraction and question answering by visual verification of relation phrases
- F. Sadeghi, S. K. Divvala, and A. Farhadi. Viske: Visual knowledge extraction and question answering by visual verification of relation phrases. In CVPR, 2015.
- (2015) CVPR
- Sadeghi, F.¹ Divvala, S.K.² Farhadi, A.³

37
- 80052889458
- Recognition using visual phrases
- M. A. Sadeghi and A. Farhadi. Recognition using visual phrases. In CVPR, 2011.
- (2011) CVPR
- Sadeghi, M.A.¹ Farhadi, A.²

38
- 84925410541
- Very deep convolutional networks for large-scale image recognition
- K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
- (2014) arXiv preprint arXiv:1409.1556
- Simonyan, K.¹ Zisserman, A.²

39
- 80052896768
- Efficient object category recognition using classemes
- L. Torresani, M. Szummer, and A. Fitzgibbon. Efficient object category recognition using classemes. In ECCV, 2010.
- (2010) ECCV
- Torresani, L.¹ Szummer, M.² Fitzgibbon, A.³

40
- 85044362471
- Show and tell: Lessons learned from the 2015 mscoco image captioning challenge
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: Lessons learned from the 2015 mscoco image captioning challenge. TPAMI, 2016.
- (2016) TPAMI
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

41
- 85032356206
- Fvqa: Fact-based visual question answering
- P. Wang, Q. Wu, C. Shen, A. v. d. Hengel, and A. Dick. Fvqa: Fact-based visual question answering. arXiv preprint arXiv:1606.05433, 2016.
- (2016) arXiv preprint arXiv:1606.05433
- Wang, P.¹ Wu, Q.² Shen, C.³ Hengel, A.V.D.⁴ Dick, A.⁵

42
- 77955988492
- Modeling mutual context of object and human pose in human-object interaction activities
- B. Yao and L. Fei-Fei. Modeling mutual context of object and human pose in human-object interaction activities. In CVPR, 2010.
- (2010) CVPR
- Yao, B.¹ Fei-Fei, L.²

43
- 85035206689
- Learning from collective intelligence: Feature learning using social images and tags
- H. Zhang, X. Shang, H. Luan, M. Wang, and T.-S. Chua. Learning from collective intelligence: Feature learning using social images and tags. TOMM, 2016.
- (2016) TOMM
- Zhang, H.¹ Shang, X.² Luan, H.³ Wang, M.⁴ Chua, T.-S.⁵

44
- 84986325880
- Online collaborative learning for open-vocabulary visual classifiers
- H. Zhang, X. Shang, W. Yang, H. Xu, H. Luan, and T.-S. Chua. Online collaborative learning for open-vocabulary visual classifiers. In CVPR, 2016.
- (2016) CVPR
- Zhang, H.¹ Shang, X.² Yang, W.³ Xu, H.⁴ Luan, H.⁵ Chua, T.-S.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.