Volume 2017-October, 2017, Pages 2506-2515

Paying Attention to Descriptions Generated by Image Captioning Models

Author keywords

[No Author keywords available]

Indexed keywords

BEHAVIORAL RESEARCH;

EID: 85041928364     PISSN: 15505499     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICCV.2017.272     Document Type: Conference Paper
Times cited : (70)

References (68)
  • 7
    • 84943411248
    • Reconciling saliency and object center-bias hypotheses in explaining free-viewing fixations
    • A. Borji and J. Tanner. Reconciling saliency and object center-bias hypotheses in explaining free-viewing fixations. IEEE Trans Neural Netw Learn Syst., 27 (6), 2016.
    • (2016) IEEE Trans Neural Netw Learn Syst. , vol.27 , Issue.6
    • Borji, A.1    Tanner, J.2
  • 8
    • 84897056830
    • Analysis of scores, datasets, and models in visual saliency prediction
    • A. Borji, H. R. Tavakoli, D. N. Sihite, and L. Itti. Analysis of scores, datasets, and models in visual saliency prediction. In ICCV, 2013.
    • (2013) ICCV
    • Borji, A.1    Tavakoli, H.R.2    Sihite, D.N.3    Itti, L.4
  • 10
    • 84950120533
    • Deep filter banks for texture recognition and segmentation
    • M. Cimpoi, S. Maji, and A. Vedaldi. Deep filter banks for texture recognition and segmentation. In CVPR, 2015.
    • (2015) CVPR
    • Cimpoi, M.1    Maji, S.2    Vedaldi, A.3
  • 11
    • 84954214926
    • Giving good directions: Order of mention reflects visual salience
    • A. D. F. Clarke, M. Elsner, and H. Rohde. Giving good directions: order of mention reflects visual salience. Front. Psychol., 6, 2015.
    • (2015) Front. Psychol. , vol.6
    • Clarke, A.D.F.1    Elsner, M.2    Rohde, H.3
  • 12
    • 85107661995
    • Meteor universal: Language specific translation evaluation for any target language
    • M. Denkowski and A. Lavie. Meteor universal: Language specific translation evaluation for any target language. In EACL, 2014.
    • (2014) EACL
    • Denkowski, M.1    Lavie, A.2
  • 15
    • 84943812736
    • Describing images using inferred visual dependency representations
    • D. Elliott and A. P. de Vries. Describing images using inferred visual dependency representations. In ACL, 2015.
    • (2015) ACL
    • Elliott, D.1    De Vries, A.P.2
  • 16
    • 84906929591
    • Image description using visual dependency representations
    • D. Elliott and F. Keller. Image description using visual dependency representations. In EMNLP, 2013.
    • (2013) EMNLP
    • Elliott, D.1    Keller, F.2
  • 20
    • 84930639891
    • Statistics of high-level scene context
    • M. R. Greene. Statistics of high-level scene context. Front. Psychol., 4, 2013.
    • (2013) Front. Psychol. , vol.4
    • Greene, M.R.1
  • 21
    • 0034232298
    • What the eyes say about speaking
    • Z. Griffin and K. Bock. What the eyes say about speaking. Psychol Sci., 11 (4), 2000.
    • (2000) Psychol Sci. , vol.11 , Issue.4
    • Griffin, Z.1    Bock, K.2
  • 22
    • 33750499478
    • Observing the what and when of language production for different age groups by monitoring speakers' eye movements
    • Language Comprehension across the Life Span
    • Z. M. Griffin and D. H. Spieler. Observing the what and when of language production for different age groups by monitoring speakers' eye movements. Brain and Language, 99 (3):272-288, 2006. Language Comprehension across the Life Span.
    • (2006) Brain and Language , vol.99 , Issue.3 , pp. 272-288
    • Griffin, Z.M.1    Spieler, D.H.2
  • 23
    • 85156217966
    • Graph-based visual saliency
    • J. Harel, C. Koch, and P. Perona. Graph-based visual saliency. In NIPS, 2006.
    • (2006) NIPS
    • Harel, J.1    Koch, C.2    Perona, P.3
  • 24
    • 84986274465
    • Deep residual learning for image recognition
    • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
    • (2016) CVPR
    • He, K.1    Zhang, X.2    Ren, S.3    Sun, J.4
  • 25
    • 84883394520
    • Framing image description as a ranking task: Data, models and evaluation metrics
    • M. Hodosh, P. Young, and J. Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. J. Artif. Intell. Res., 47, 2013.
    • (2013) J. Artif. Intell. Res. , vol.47
    • Hodosh, M.1    Young, P.2    Hockenmaier, J.3
  • 27
    • 33745903481
    • Extreme learning machine: Theory and applications
    • G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew. Extreme learning machine: Theory and applications. Neurocomput., 70, 2006.
    • (2006) Neurocomput. , vol.70
    • Huang, G.-B.1    Zhu, Q.-Y.2    Siew, C.-K.3
  • 32
    • 84946734827
    • Deep visual-semantic alignments for generating image descriptions
    • A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
    • (2015) CVPR
    • Karpathy, A.1    Fei-Fei, L.2
  • 33
    • 84937843643
    • Deep fragment embeddings for bidirectional image sentence mapping
    • A. Karpathy, A. Joulin, and L. Fei-Fei. Deep fragment embeddings for bidirectional image sentence mapping. In NIPS, 2014.
    • (2014) NIPS
    • Karpathy, A.1    Joulin, A.2    Fei-Fei, L.3
  • 34
    • 84862279067
    • Composing simple image descriptions using web-scale n-grams
    • S. Li, G. Kulkarni, T. L. Berg, A. C. Berg, and Y. Choi. Composing simple image descriptions using web-scale n-grams. In CoNLL, 2011.
    • (2011) CoNLL
    • Li, S.1    Kulkarni, G.2    Berg, T.L.3    Berg, A.C.4    Choi, Y.5
  • 37
    • 84973896625
    • Ask your neurons: A neural-based approach to answering questions about images
    • M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In ICCV, 2015.
    • (2015) ICCV
    • Malinowski, M.1    Rohrbach, M.2    Fritz, M.3
  • 39
    • 84961654805
    • Actions in the eye: Dynamic gaze datasets and learnt saliency models for visual recognition
    • S. Mathe and C. Sminchisescu. Actions in the eye: Dynamic gaze datasets and learnt saliency models for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell., 2015.
    • (2015) IEEE Trans. Pattern Anal. Mach. Intell.
    • Mathe, S.1    Sminchisescu, C.2
  • 41
    • 0032060018
    • Viewing and naming objects: Eye movements during noun phrase production
    • A. S. Meyer, A. M. Sleiderink, and W. J. Levelt. Viewing and naming objects: eye movements during noun phrase production. Cognition, 66 (2), 1998.
    • (1998) Cognition , vol.66 , Issue.2
    • Meyer, A.S.1    Sleiderink, A.M.2    Levelt, W.J.3
  • 42
    • 85083951332
    • Efficient estimation of word representations in vector space
    • T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. ICLR, 2013.
    • (2013) ICLR
    • Mikolov, T.1    Chen, K.2    Corrado, G.3    Dean, J.4
  • 44
    • 85162522202
    • Im2text: Describing images using 1 million captioned photographs
    • V. Ordonez, G. Kulkarni, and T. L. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011.
    • (2011) NIPS
    • Ordonez, V.1    Kulkarni, G.2    Berg, T.L.3
  • 45
    • 85133336275
    • Bleu: A method for automatic evaluation of machine translation
    • K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: a method for automatic evaluation of machine translation. In ACL, 2002.
    • (2002) ACL
    • Papineni, K.1    Roukos, S.2    Ward, T.3    Zhu, W.-J.4
  • 47
    • 20544446875
    • Components of bottom-up gaze allocation in natural images
    • R. J. Peters, A. Iyer, L. Itti, and C. Koch. Components of bottom-up gaze allocation in natural images. Vision Research, 45, 2005.
    • (2005) Vision Research , vol.45
    • Peters, R.J.1    Iyer, A.2    Itti, L.3    Koch, C.4
  • 48
    • 0034886959
    • Walking or talking: Behavioral and neurophysiological correlates of action verb processing
    • F. Pulvermüller, M. Härle, and F. Hummel. Walking or talking: Behavioral and neurophysiological correlates of action verb processing. Brain and Language, 78 (2), 2001.
    • (2001) Brain and Language , vol.78 , Issue.2
    • Pulvermüller, F.1    Härle, M.2    Hummel, F.3
  • 51
    • 84973887740
    • The long-short story of movie description
    • A. Rohrbach, M. Rohrbach, and B. Schiele. The Long-Short Story of Movie Description. In GCPR, 2015.
    • (2015) GCPR
    • Rohrbach, A.1    Rohrbach, M.2    Schiele, B.3
  • 52
    • 84977650097
    • Video captioning with recurrent networks based on frame- and video-level features and visual content classification
    • R. Shetty and J. Laaksonen. Video captioning with recurrent networks based on frame- and video-level features and visual content classification. In CVPR Workshops, 2015.
    • (2015) CVPR Workshops
    • Shetty, R.1    Laaksonen, J.2
  • 58
    • 85041901191
    • Saliency revisited: Analysis of mouse movements versus fixations
    • H. R. Tavakoli, F. Ahmad, A. Borji, and J. Laaksonen. Saliency revisited: Analysis of mouse movements versus fixations. In CVPR, 2017.
    • (2017) CVPR
    • Tavakoli, H.R.1    Ahmad, F.2    Borji, A.3    Laaksonen, J.4
  • 59
    • 85017026850
    • Exploiting inter-image similarity and ensemble of extreme learners for fixation prediction using deep features
    • H. R. Tavakoli, A. Borji, J. Laaksonen, and E. Rahtu. Exploiting inter-image similarity and ensemble of extreme learners for fixation prediction using deep features. Neurocomput., 244, 2017.
    • (2017) Neurocomput. , vol.244
    • Tavakoli, H.R.1    Borji, A.2    Laaksonen, J.3    Rahtu, E.4
  • 60
    • 84956980995
    • CIDEr: Consensus-based image description evaluation
    • R. Vedantam, C. L. Zitnick, and D. Parikh. CIDEr: Consensus-based image description evaluation. In CVPR, 2015.
    • (2015) CVPR
    • Vedantam, R.1    Zitnick, C.L.2    Parikh, D.3
  • 61
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
    • (2015) CVPR
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 63
    • 84906494296
    • From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
    • P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions. TACL, 2, 2014.
    • (2014) TACL , vol.2
    • Young, P.1    Lai, A.2    Hodosh, M.3    Hockenmaier, J.4
  • 64
    • 84891713152
    • Exploring the role of gaze behavior and object detection in scene understanding
    • K. Yun, Y. Peng, D. Samaras, G. Zelinsky, and T. Berg. Exploring the role of gaze behavior and object detection in scene understanding. Front. Psychol., 4, 2013.
    • (2013) Front. Psychol. , vol.4
    • Yun, K.1    Peng, Y.2    Samaras, D.3    Zelinsky, G.4    Berg, T.5
  • 65
    • 84887396648
    • Studying relationships between human gaze, description, and computer vision
    • K. Yun, Y. Peng, D. Samaras, G. J. Zelinsky, and T. L. Berg. Studying relationships between human gaze, description, and computer vision. In CVPR, 2013.
    • (2013) CVPR
    • Yun, K.1    Peng, Y.2    Samaras, D.3    Zelinsky, G.J.4    Berg, T.L.5
  • 66
    • 84875275420
    • Learning saliency-based visual attention: A review
    • Q. Zhao and C. Koch. Learning saliency-based visual attention: A review. Signal Processing, 93, 2013.
    • (2013) Signal Processing , vol.93
    • Zhao, Q.1    Koch, C.2
  • 68
    • 84885340415
    • Fixations on objects in natural scenes: Dissociating importance from salience
    • B. M. 't Hart, H. C. E. F. Schmidt, C. Roth, and W. Einhauser. Fixations on objects in natural scenes: dissociating importance from salience. Front. Psychol., 4, 2013.
    • (2013) Front. Psychol. , vol.4
    • Hart, B.M.T.1    Schmidt, H.C.E.F.2    Roth, C.3    Einhauser, W.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.