[2] S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. VQA: Visual question answering. In ICCV, 2015.
[3] H. Azizpour, A. S. Razavian, J. Sullivan, A. Maki, and S. Carlsson. From generic to specific deep representation for visual recognition. In CVPR Workshops, 2015.
[4] A. C. Berg, T. L. Berg, H. Daumé III, J. Dodge, A. Goyal, X. Han, A. Mensch, M. Mitchell, A. Sood, K. Stratos, and K. Yamaguchi. Understanding and predicting importance in images. In CVPR, 2012.
[5] R. Bernardi, R. Cakici, D. Elliott, A. Erdem, E. Erdem, N. Ikizler-Cinbis, F. Keller, A. Muscat, and B. Plank. Automatic description generation from images: A survey. J. Artif. Intell. Res., 55(1), 2016.
[6] K. Bock, D. Irwin, D. Davidson, and W. Levelt. Minding the clock. J. Mem. Lang., 48, 2003.
[7] A. Borji and J. Tanner. Reconciling saliency and object center-bias hypotheses in explaining free-viewing fixations. IEEE Trans. Neural Netw. Learn. Syst., 27(6), 2016.
[8] A. Borji, H. R. Tavakoli, D. N. Sihite, and L. Itti. Analysis of scores, datasets, and models in visual saliency prediction. In ICCV, 2013.
[9] X. Chen, H. Fang, T.-Y. Lin, R. Vedantam, S. Gupta, P. Dollár, and C. L. Zitnick. Microsoft COCO captions: Data collection and evaluation server. CoRR, abs/1504.00325, 2015.
[10] M. Cimpoi, S. Maji, and A. Vedaldi. Deep filter banks for texture recognition and segmentation. In CVPR, 2015.
[11] A. D. F. Clarke, M. Elsner, and H. Rohde. Giving good directions: Order of mention reflects visual salience. Front. Psychol., 6, 2015.
[12] M. Denkowski and A. Lavie. Meteor universal: Language specific translation evaluation for any target language. In EACL, 2014.
[13] J. Devlin, H. Cheng, H. Fang, S. Gupta, L. Deng, X. He, G. Zweig, and M. Mitchell. Language models for image captioning: The quirks and what works. In ACL, 2015.
[14] J. Donahue, L. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
[15] D. Elliott and A. P. de Vries. Describing images using inferred visual dependency representations. In ACL, 2015.
[16] D. Elliott and F. Keller. Image description using visual dependency representations. In EMNLP, 2013.
[17] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes challenge: A retrospective. IJCV, 111(1), 2015.
[18] H. Fang, S. Gupta, F. Iandola, R. Srivastava, L. Deng, P. Dollár, J. Gao, X. He, M. Mitchell, J. Platt, L. Zitnick, and G. Zweig. From captions to visual concepts and back. In CVPR, 2015.
[19] A. Farhadi, M. Hejrati, M. A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences from images. In ECCV, 2010.
[20] M. R. Greene. Statistics of high-level scene context. Front. Psychol., 4, 2013.
[21] Z. Griffin and K. Bock. What the eyes say about speaking. Psychol. Sci., 11(4), 2000.
[22] Z. M. Griffin and D. H. Spieler. Observing the what and when of language production for different age groups by monitoring speakers' eye movements. Brain and Language, 99(3):272-288, 2006. Language Comprehension across the Life Span.
[24] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[25] M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. J. Artif. Intell. Res., 47, 2013.
[27] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew. Extreme learning machine: Theory and applications. Neurocomput., 70, 2006.
[31] M. K. Tanenhaus, C. Chambers, and J. E. Hanna. Referential domains in spoken language comprehension: Using eye movements to bridge the product and action traditions. In The Interface of Language, Vision, and Action: Eye Movements and the Visual World. Psychology Press, 2004.
[32] A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
[33] A. Karpathy, A. Joulin, and L. Fei-Fei. Deep fragment embeddings for bidirectional image sentence mapping. In NIPS, 2014.
[34] S. Li, G. Kulkarni, T. L. Berg, A. C. Berg, and Y. Choi. Composing simple image descriptions using web-scale n-grams. In CoNLL, 2011.
[36] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, 2014.
[37] M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In ICCV, 2015.
[38] C. D. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky. The Stanford CoreNLP natural language processing toolkit. In ACL, 2014.
[39] S. Mathe and C. Sminchisescu. Actions in the eye: Dynamic gaze datasets and learnt saliency models for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell., 2015.
[41] A. S. Meyer, A. M. Sleiderink, and W. J. Levelt. Viewing and naming objects: Eye movements during noun phrase production. Cognition, 66(2), 1998.
[42] T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. ICLR, 2013.
[43] R. Mottaghi, X. Chen, X. Liu, N. G. Cho, S. W. Lee, S. Fidler, R. Urtasun, and A. Yuille. The role of context for object detection and semantic segmentation in the wild. In CVPR, 2014.
[44] V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011.
[45] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: A method for automatic evaluation of machine translation. In ACL, 2002.
[47] R. J. Peters, A. Iyer, L. Itti, and C. Koch. Components of bottom-up gaze allocation in natural images. Vision Research, 45, 2005.
[48] F. Pulvermüller, M. Härle, and F. Hummel. Walking or talking: Behavioral and neurophysiological correlates of action verb processing. Brain and Language, 78(2), 2001.
[52] R. Shetty and J. Laaksonen. Video captioning with recurrent networks based on frame- and video-level features and visual content classification. In CVPR Workshops, 2015.
[58] H. R. Tavakoli, F. Ahmad, A. Borji, and J. Laaksonen. Saliency revisited: Analysis of mouse movements versus fixations. In CVPR, 2017.
[59] H. R. Tavakoli, A. Borji, J. Laaksonen, and E. Rahtu. Exploiting inter-image similarity and ensemble of extreme learners for fixation prediction using deep features. Neurocomput., 244, 2017.
[60] R. Vedantam, C. L. Zitnick, and D. Parikh. CIDEr: Consensus-based image description evaluation. In CVPR, 2015.
[62] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhudinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In F. Bach and D. Blei, editors, Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 2048-2057, Lille, France, 2015. PMLR.
[63] P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2, 2014.
[64] K. Yun, Y. Peng, D. Samaras, G. Zelinsky, and T. Berg. Exploring the role of gaze behavior and object detection in scene understanding. Front. Psychol., 4, 2013.
[65] K. Yun, Y. Peng, D. Samaras, G. J. Zelinsky, and T. L. Berg. Studying relationships between human gaze, description, and computer vision. In CVPR, 2013.
[66] Q. Zhao and C. Koch. Learning saliency-based visual attention: A review. Signal Processing, 93, 2013.
[68] B. M. 't Hart, H. C. E. F. Schmidt, C. Roth, and W. Einhäuser. Fixations on objects in natural scenes: Dissociating importance from salience. Front. Psychol., 4, 2013.