SCOPUS Information Search Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 9905 LNCS, 2016, Pages 817-834

Grounding of textual phrases in images by reconstruction

(5) Rohrbach, Anna a Rohrbach, Marcus b,c Hu, Ronghang b Darrell, Trevor b Schiele, Bernt a

a MAX PLANCK INSTITUTE FOR INFORMATICS (Germany)

b UNIVERSITY OF CALIFORNIA (United States)

c INTERNATIONAL COMPUTER SCIENCE INSTITUTE (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; HUMAN COMPUTER INTERACTION; LARGE DATASET; RECURRENT NEURAL NETWORKS;

ATTENTION MECHANISMS; IMAGE REGIONS; LARGE MARGINS; RECURRENT NETWORKS; REFERENCE RESOLUTION; SPATIAL LOCALIZATION; STATE OF THE ART; VISUAL CONTENT;

IMAGE RECONSTRUCTION;

EID: 84990068682 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-319-46448-0_49 Document Type: Conference Paper

Times cited : (387)

References (53)

1
- 84937844873
- Conditional random field autoencoders for unsupervised structured prediction
- Ammar, W., Dyer, C., Smith, N.A.: Conditional random field autoencoders for unsupervised structured prediction. In: Advances in Neural Information Processing Systems (NIPS) (2014)
- (2014) Advances in Neural Information Processing Systems (NIPS)
- Ammar, W.¹ Dyer, C.² Smith, N.A.³

2
- 85083953689
- Neural machine translation by jointly learning to align and translate
- Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. In: International Conference on Learning Representations (ICLR) (2015)
- (2015) International Conference on Learning Representations (ICLR)
- Bahdanau, D.¹ Cho, K.² Bengio, Y.³

3
- 85161984064
- Simultaneous object detection and ranking with weak supervision
- Blaschko, M., Vedaldi, A., Zisserman, A.: Simultaneous object detection and ranking with weak supervision. In: Advances in Neural Information Processing Systems (NIPS), pp. 235–243 (2010)
- (2010) Advances in Neural Information Processing Systems (NIPS) , pp. 235-243
- Blaschko, M.¹ Vedaldi, A.² Zisserman, A.³

4
- 84973865248
- Webly supervised learning of convolutional networks
- Chen, X., Gupta, A.: Webly supervised learning of convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
- (2015) Proceedings of the IEEE International Conference on Computer Vision (ICCV)
- Chen, X.¹ Gupta, A.²

5
- 84957029470
- Mind’s eye: A recurrent visual representation for image caption generation
- Chen, X., Zitnick, C.L.: Mind’s eye: a recurrent visual representation for image caption generation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Chen, X.¹ Zitnick, C.L.²

6
- 84911376072
- Multi-fold MIL training for weakly supervised object localization
- Cinbis, R.G., Verbeek, J., Schmid, C.: Multi-fold MIL training for weakly supervised object localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
- (2014) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Cinbis, R.G.¹ Verbeek, J.² Schmid, C.³

7
- 72449136144
- Imagenet: A large-scale hierarchical image database
- Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
- (2009) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Deng, J.¹ Dong, W.² Socher, R.³ Li, L.J.⁴ Li, K.⁵ Fei-Fei, L.⁶

8
- 84911368326
- Learning everything about anything: Webly-supervised visual concept learning
- Divvala, S., Farhadi, A., Guestrin, C.: Learning everything about anything: Webly-supervised visual concept learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
- (2014) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Divvala, S.¹ Farhadi, A.² Guestrin, C.³

9
- 84959236502
- Long-term recurrent convolutional networks for visual recognition and description
- Donahue, J., Hendricks, L.A., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Donahue, J.¹ Hendricks, L.A.² Guadarrama, S.³ Rohrbach, M.⁴ Venugopalan, S.⁵ Saenko, K.⁶ Darrell, T.⁷

10
- 77951298115
- The Pascal Visual Object Classes (VOC) challenge
- Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The Pascal Visual Object Classes (VOC) challenge. Int. J. Comput. Vis. (IJCV) 88(2), 303–338 (2010)
- (2010) Int. J. Comput. Vis. (IJCV) , vol.88 , Issue.2 , pp. 303-338
- Everingham, M.¹ Van Gool, L.² Williams, C.K.³ Winn, J.⁴ Zisserman, A.⁵

11
- 0004289791
- The MIT Press, Cambridge
- Fellbaum, C.: WordNet: An Electronic Lexical Database. The MIT Press, Cambridge (1998)
- (1998) WordNet: An Electronic Lexical Database
- Fellbaum, C.¹

12
- 84986264311
- Fast R-CNN
- Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
- (2015) Proceedings of the IEEE International Conference on Computer Vision (ICCV)
- Girshick, R.¹

13
- 84862277874
- Understanding the difficulty of training deep feedforward neural networks
- Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
- (2010) International Conference on Artificial Intelligence and Statistics , pp. 249-256
- Glorot, X.¹ Bengio, Y.²

14
- 84906484732
- Improving image-sentence embeddings using large weakly annotated photo collections
- Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.), Springer, Switzerland
- Gong, Y., Wang, L., Hodosh, M., Hockenmaier, J., Lazebnik, S.: Improving image-sentence embeddings using large weakly annotated photo collections. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part IV. LNCS, vol. 8692, pp. 529–545. Springer, Switzerland (2014)
- (2014) ECCV 2014, Part IV. LNCS , vol.8692 , pp. 529-545
- Gong, Y.¹ Wang, L.² Hodosh, M.³ Hockenmaier, J.⁴ Lazebnik, S.⁵

15
- 85131224768
- Open-vocabulary object retrieval
- Guadarrama, S., Rodner, E., Saenko, K., Zhang, N., Farrell, R., Donahue, J., Darrell, T.: Open-vocabulary object retrieval. In: Robotics: Science and Systems (2014)
- (2014) Robotics: Science and Systems
- Guadarrama, S.¹ Rodner, E.² Saenko, K.³ Zhang, N.⁴ Farrell, R.⁵ Donahue, J.⁶ Darrell, T.⁷

16
- 84973911419
- Delving deep into rectifiers: Surpassing human-level performance on imagenet classification
- He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

17
- 0031573117
- Long short-term memory
- Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
- (1997) Neural Comput , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

18
- 84986305787
- Natural language object retrieval
- Hu, R., Xu, H., Rohrbach, M., Feng, J., Saenko, K., Darrell, T.: Natural language object retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
- (2016) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Hu, R.¹ Xu, H.² Rohrbach, M.³ Feng, J.⁴ Saenko, K.⁵ Darrell, T.⁶

19
- 84990038050
- arXiv:1502.03167
- Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167 (2015)
- (2015) Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Ioffe, S.¹ Szegedy, C.²

20
- 84913580146
- Caffe: Convolutional architecture for fast feature embedding
- ACM
- Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia, pp. 675–678. ACM (2014)
- (2014) Proceedings of the ACM International Conference on Multimedia , pp. 675-678
- Jia, Y.¹ Shelhamer, E.² Donahue, J.³ Karayev, S.⁴ Long, J.⁵ Girshick, R.⁶ Guadarrama, S.⁷ Darrell, T.⁸

21
- 84986312327
- arXiv:1506.06272
- Jin, J., Fu, K., Cui, R., Sha, F., Zhang, C.: Aligning where to see and what to tell: image caption with region-based attention and scene factorization. arXiv:1506.06272 (2015)
- (2015) Aligning Where to See and What to Tell: Image Caption with Region-Based Attention and Scene Factorization
- Jin, J.¹ Fu, K.² Cui, R.³ Sha, F.⁴ Zhang, C.⁵

22
- 84959233256
- Image retrieval using scene graphs
- Johnson, J., Krishna, R., Stark, M., Li, L.J., Shamma, D., Bernstein, M., Fei-Fei, L.: Image retrieval using scene graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3668–3678 (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pp. 3668-3678
- Johnson, J.¹ Krishna, R.² Stark, M.³ Li, L.J.⁴ Shamma, D.⁵ Bernstein, M.⁶ Fei-Fei, L.⁷

23
- 84906344543
- Efficient image and video co-localization with Frank-Wolfe algorithm
- Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.), Springer, Heidelberg
- Joulin, A., Tang, K., Fei-Fei, L.: Efficient image and video co-localization with Frank-Wolfe algorithm. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part VI. LNCS, vol. 8694, pp. 253–268. Springer, Heidelberg (2014)
- (2014) ECCV 2014, Part VI. LNCS , vol.8694 , pp. 253-268
- Joulin, A.¹ Tang, K.² Fei-Fei, L.³

24
- 84946734827
- Deep visual-semantic alignments for generating image descriptions
- Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Karpathy, A.¹ Fei-Fei, L.²

25
- 84937843643
- Deep fragment embeddings for bidirectional image sentence mapping
- Karpathy, A., Joulin, A., Fei-Fei, L.: Deep fragment embeddings for bidirectional image sentence mapping. In: Advances in Neural Information Processing Systems (NIPS) (2014)
- (2014) Advances in Neural Information Processing Systems (NIPS)
- Karpathy, A.¹ Joulin, A.² Fei-Fei, L.³

26
- 84943540775
- Referit game: Referring to objects in photographs of natural scenes
- Kazemzadeh, S., Ordonez, V., Matten, M., Berg, T.L.: Referit game: referring to objects in photographs of natural scenes. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014)
- (2014) Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Kazemzadeh, S.¹ Ordonez, V.² Matten, M.³ Berg, T.L.⁴

27
- 84941620184
- arXiv:1412.6980
- Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv:1412.6980 (2014)
- (2014) Adam: A Method for Stochastic Optimization
- Kingma, D.¹ Ba, J.²

28
- 84911370987
- What are you talking about? Text-to-image coreference
- IEEE
- Kong, C., Lin, D., Bansal, M., Urtasun, R., Fidler, S.: What are you talking about? Text-to-image coreference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3558–3565. IEEE (2014)
- (2014) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pp. 3558-3565
- Kong, C.¹ Lin, D.² Bansal, M.³ Urtasun, R.⁴ Fidler, S.⁵

29
- 84906923690
- Jointly learning to parse and perceive: Connecting natural language to the physical world
- Krishnamurthy, J., Kollar, T.: Jointly learning to parse and perceive: connecting natural language to the physical world. Trans. Assoc. Comput. Linguist. (TACL) 1, 193–206 (2013)
- (2013) Trans. Assoc. Comput. Linguist. (TACL) , vol.1 , pp. 193-206
- Krishnamurthy, J.¹ Kollar, T.²

30
- 84973884868
- Unsupervised object discovery and tracking in video collections
- Kwak, S., Cho, M., Laptev, I., Ponce, J., Schmid, C.: Unsupervised object discovery and tracking in video collections. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
- (2015) Proceedings of the IEEE International Conference on Computer Vision (ICCV)
- Kwak, S.¹ Cho, M.² Laptev, I.³ Ponce, J.⁴ Schmid, C.⁵

31
- 84911442106
- Visual semantic search: Retrieving videos via complex textual queries
- IEEE
- Lin, D., Fidler, S., Kong, C., Urtasun, R.: Visual semantic search: retrieving videos via complex textual queries. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2657–2664. IEEE (2014)
- (2014) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pp. 2657-2664
- Lin, D.¹ Fidler, S.² Kong, C.³ Urtasun, R.⁴

32
- 84906493406
- Microsoft COCO: Common objects in context
- Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.), Springer, Switzerland
- Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part V. LNCS, vol. 8693, pp. 740–755. Springer, Switzerland (2014)
- (2014) ECCV 2014, Part V. LNCS , vol.8693 , pp. 740-755
- Lin, T.-Y.¹

33
- 84986260074
- Generation and comprehension of unambiguous object descriptions
- Mao, J., Huang, J., Toshev, A., Camburu, O., Yuille, A., Murphy, K.: Generation and comprehension of unambiguous object descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
- (2016) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Mao, J.¹ Huang, J.² Toshev, A.³ Camburu, O.⁴ Yuille, A.⁵ Murphy, K.⁶

34
- 84867118595
- A joint model of language and perception for grounded attribute learning
- Matuszek, C., Fitzgerald, N., Zettlemoyer, L., Bo, L., Fox, D.: A joint model of language and perception for grounded attribute learning. In: Proceedings of the International Conference on Machine Learning (ICML) (2012)
- (2012) Proceedings of the International Conference on Machine Learning (ICML)
- Matuszek, C.¹ Fitzgerald, N.² Zettlemoyer, L.³ Bo, L.⁴ Fox, D.⁵

35
- 84973856017
- Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models
- Plummer, B., Wang, L., Cervantes, C., Caicedo, J., Hockenmaier, J., Lazebnik, S.: Flickr30k entities: collecting region-to-phrase correspondences for richer image-to-sentence models. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
- (2015) Proceedings of the IEEE International Conference on Computer Vision (ICCV)
- Plummer, B.¹ Wang, L.² Cervantes, C.³ Caicedo, J.⁴ Hockenmaier, J.⁵ Lazebnik, S.⁶

36
- 84959184467
- Viske: Visual knowledge extraction and question answering by visual verification of relation phrases
- Sadeghi, F., Divvala, S.K., Farhadi, A.: Viske: visual knowledge extraction and question answering by visual verification of relation phrases. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Sadeghi, F.¹ Divvala, S.K.² Farhadi, A.³

37
- 85083953063
- Very deep convolutional networks for large-scale image recognition
- Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (ICLR) (2015)
- (2015) International Conference on Learning Representations (ICLR)
- Simonyan, K.¹ Zisserman, A.²

38
- 84990066399
- arXiv:1403.1024
- Song, H.O., Girshick, R., Jegelka, S., Mairal, J., Harchaoui, Z., Darrell, T.: On learning to localize objects with minimal supervision. arXiv:1403.1024 (2014)
- (2014) On Learning to Localize Objects with Minimal Supervision
- Song, H.O.¹ Girshick, R.² Jegelka, S.³ Mairal, J.⁴ Harchaoui, Z.⁵ Darrell, T.⁶

39
- 84928547704
- Sequence to sequence learning with neural networks
- Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems (NIPS), pp. 3104–3112 (2014)
- (2014) Advances in Neural Information Processing Systems (NIPS) , pp. 3104-3112
- Sutskever, I.¹ Vinyals, O.² Le, Q.V.³

40
- 84911407409
- Co-localization in real-world images
- IEEE
- Tang, K., Joulin, A., Li, L.J., Fei-Fei, L.: Co-localization in real-world images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2014)
- (2014) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Tang, K.¹ Joulin, A.² Li, L.J.³ Fei-Fei, L.⁴

41
- 84959255361
- Book2movie: Aligning video scenes with book chapters
- Tapaswi, M., Bäuml, M., Stiefelhagen, R.: Book2movie: aligning video scenes with book chapters. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1827–1835 (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pp. 1827-1835
- Tapaswi, M.¹ Bäuml, M.² Stiefelhagen, R.³

42
- 84881160857
- Selective search for object recognition
- Uijlings, J.R., van de Sande, K.E., Gevers, T., Smeulders, A.W.: Selective search for object recognition. Int. J. Comput. Vis. (IJCV) 104(2), 154–171 (2013)
- (2013) Int. J. Comput. Vis. (IJCV) , vol.104 , Issue.2 , pp. 154-171
- Uijlings, J.R.¹ Van De Sande, K.E.² Gevers, T.³ Smeulders, A.W.⁴

43
- 56449089103
- Extracting and composing robust features with denoising autoencoders
- Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: Proceedings of the International Conference on Machine Learning (ICML) (2008)
- (2008) Proceedings of the International Conference on Machine Learning (ICML)
- Vincent, P.¹ Larochelle, H.² Bengio, Y.³ Manzagol, P.A.⁴

44
- 84946747440
- Show and tell: A neural image caption generator
- Vinyals, O., Toshev, A., Bengio, S., Erhan, D.: Show and tell: a neural image caption generator. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
- (2015) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

45
- 84986271102
- Learning deep structure-preserving image-text embeddings
- Wang, L., Li, Y., Lazebnik, S.: Learning deep structure-preserving image-text embeddings. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
- (2016) Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Wang, L.¹ Li, Y.² Lazebnik, S.³

46
- 84970002232
- Show, attend and tell: Neural image caption generation with visual attention
- Xu, K., Ba, J., Kiros, R., Courville, A., Salakhutdinov, R., Zemel, R., Bengio, Y.: Show, attend and tell: neural image caption generation with visual attention. In: Proceedings of the International Conference on Machine Learning (ICML) (2015)
- (2015) Proceedings of the International Conference on Machine Learning (ICML)
- Xu, K.¹ Ba, J.² Kiros, R.³ Courville, A.⁴ Salakhutdinov, R.⁵ Zemel, R.⁶ Bengio, Y.⁷

47
- 84973884896
- Describing videos by exploiting temporal structure
- Yao, L., Torabi, A., Cho, K., Ballas, N., Pal, C., Larochelle, H., Courville, A.: Describing videos by exploiting temporal structure. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
- (2015) Proceedings of the IEEE International Conference on Computer Vision (ICCV)
- Yao, L.¹ Torabi, A.² Cho, K.³ Ballas, N.⁴ Pal, C.⁵ Larochelle, H.⁶ Courville, A.⁷

48
- 84986240394
- arXiv:1507.05738
- Yeung, S., Russakovsky, O., Jin, N., Andriluka, M., Mori, G., Fei-Fei, L.: Every moment counts: dense detailed labeling of actions in complex videos. arXiv:1507.05738 (2015)
- (2015) Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
- Yeung, S.¹ Russakovsky, O.² Jin, N.³ Andriluka, M.⁴ Mori, G.⁵ Fei-Fei, L.⁶

49
- 84906494296
- From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
- Young, P., Lai, A., Hodosh, M., Hockenmaier, J.: From image descriptions to visual denotations: new similarity metrics for semantic inference over event descriptions. Trans. Assoc. Comput. Linguist. 2, 67–78 (2014)
- (2014) Trans. Assoc. Comput. Linguist , vol.2 , pp. 67-78
- Young, P.¹ Lai, A.² Hodosh, M.³ Hockenmaier, J.⁴

50
- 84897743886
- Grounded language learning from video described with sentences
- Yu, H., Siskind, J.M.: Grounded language learning from video described with sentences. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 53–63 (2013)
- (2013) Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL) , pp. 53-63
- Yu, H.¹ Siskind, J.M.²

51
- 84990034898
- arXiv:1506.02059
- Yu, H., Siskind, J.M.: Sentence directed video object codetection. arXiv:1506.02059 (2015)
- (2015) Sentence Directed Video Object Codetection
- Yu, H.¹ Siskind, J.M.²

52
- 84973911532
- Aligning books and movies: Towards story-like visual explanations by watching movies and reading books
- Zhu, Y., Kiros, R., Zemel, R., Salakhutdinov, R., Urtasun, R., Torralba, A., Fidler, S.: Aligning books and movies: towards story-like visual explanations by watching movies and reading books. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
- (2015) Proceedings of the IEEE International Conference on Computer Vision (ICCV)
- Zhu, Y.¹ Kiros, R.² Zemel, R.³ Salakhutdinov, R.⁴ Urtasun, R.⁵ Torralba, A.⁶ Fidler, S.⁷

53
- 84906489617
- Edge boxes: Locating object proposals from edges
- Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.), Springer, Switzerland
- Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part V. LNCS, vol. 8693, pp. 391–405. Springer, Switzerland (2014)
- (2014) ECCV 2014, Part V. LNCS , vol.8693 , pp. 391-405
- Zitnick, C.L.¹ Dollár, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.