SCOPUS Information Retrieval Platform

EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Volume , Issue , 2016, Pages 932-937

Human attention in visual question answering: Do humans and deep networks look at the same regions?

(5) Das, Abhishek (d); Agrawal, Harsh (d); Lawrence Zitnick, C. (b); Parikh, Devi (a,c); Batra, Dhruv (a,c)

a VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY (United States)

b FACEBOOK AI RESEARCH (United States)

c GEORGIA INSTITUTE OF TECHNOLOGY (United States)

d NONE

Author keywords

[No Author keywords available]

Indexed keywords

ATTENTION MODEL; BLURRED IMAGE; DESIGN AND TESTS; HUMAN ATTENTION; LARGE-SCALE STUDIES; QUESTION ANSWERING; RANK ORDER; STATE OF THE ART;

NATURAL LANGUAGE PROCESSING SYSTEMS;

EID: 85072846928 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.18653/v1/d16-1092 Document Type: Conference Paper

Times cited : (155)

References (25)

1
- 84993660571
- Learning to compose neural networks for question answering
- Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein. 2016. Learning to compose neural networks for question answering. In NAACL HLT.
- (2016) NAACL HLT
- Andreas, J.¹ Rohrbach, M.² Darrell, T.³ Klein, D.⁴

2
- 84973890960
- VQA: Visual question answering
- Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh. 2015. VQA: Visual Question Answering. In ICCV.
- (2015) ICCV
- Antol, S.¹ Agrawal, A.² Lu, J.³ Mitchell, M.⁴ Batra, D.⁵ Lawrence Zitnick, C.⁶ Parikh, D.⁷

3
- 85083951423
- Multiple object recognition with visual attention
- Jimmy Lei Ba, Volodymyr Mnih, and Koray Kavukcuoglu. 2015. Multiple Object Recognition With Visual Attention. In ICLR.
- (2015) ICLR
- Ba, J.L.¹ Mnih, V.² Kavukcuoglu, K.³

4
- 85083953689
- Neural machine translation by jointly learning to align and translate
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In ICLR.
- (2015) ICLR
- Bahdanau, D.¹ Cho, K.² Bengio, Y.³

5
- 84986313593
- Describing multimedia content using attention-based encoder-decoder networks
- KyungHyun Cho, Aaron C. Courville, and Yoshua Bengio. 2015. Describing Multimedia Content using Attention-based Encoder-Decoder Networks. volume abs/1507.01053.
- (2015) Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks
- Cho, K.¹ Courville, A.C.² Bengio, Y.³

6
- 85006764733
- Leveraging the wisdom of the crowd for fine-grained recognition
- Jia Deng, Jonathan Krause, Michael Stark, and Li Fei-Fei. 2015. Leveraging the Wisdom of the Crowd for Fine-Grained Recognition. PAMI.
- (2015) PAMI
- Deng, J.¹ Krause, J.² Stark, M.³ Fei-Fei, L.⁴

7
- 84965102873
- Exploring nearest neighbor approaches for image captioning
- Jacob Devlin, Saurabh Gupta, Ross B. Girshick, Margaret Mitchell, and C. Lawrence Zitnick. 2015. Exploring nearest neighbor approaches for image captioning. volume abs/1505.04467.
- (2015) Exploring Nearest Neighbor Approaches for Image Captioning
- Devlin, J.¹ Gupta, S.² Girshick, R.B.³ Mitchell, M.⁴ Lawrence Zitnick, C.⁵

8
- 33846980853
- What do we perceive in a glance of a real-world scene?
- Li Fei-Fei, Asha Iyer, Christof Koch, and Pietro Perona. 2007. What do we perceive in a glance of a real-world scene? Journal of Vision, 7(1):10.
- (2007) Journal of Vision , vol.7 , Issue.1 , pp. 10
- Fei-Fei, L.¹ Iyer, A.² Koch, C.³ Perona, P.⁴

9
- 85004173151
- Multi-way, multilingual neural machine translation with a shared attention mechanism
- Orhan Firat, KyungHyun Cho, and Yoshua Bengio. 2016. Multi-way, multilingual neural machine translation with a shared attention mechanism. volume abs/1601.01073.
- (2016) Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
- Firat, O.¹ Cho, K.² Bengio, Y.³

10
- 84887325349
- Fine-grained crowdsourcing for fine-grained recognition
- Jia Deng, Jonathan Krause, and Li Fei-Fei. 2013. Fine-Grained Crowdsourcing for Fine-Grained Recognition. In CVPR.
- (2013) CVPR
- Deng, J.¹ Krause, J.² Fei-Fei, L.³

11
- 84959238667
- Saliency in crowd
- Ming Jiang, Juan Xu, and Qi Zhao. 2014. Saliency in Crowd. In ECCV.
- (2014) ECCV
- Jiang, M.¹ Xu, J.² Zhao, Q.³

12
- 85072837424
- SALICON: Saliency in context
- Ming Jiang, Shengsheng Huang, Juanyong Duan, and Qi Zhao. 2015. SALICON: Saliency in context. In CVPR.
- (2015) CVPR
- Jiang, M.¹ Huang, S.² Duan, J.³ Zhao, Q.⁴

13
- 77956309403
- Learning to predict where humans look
- Tilke Judd, Krista Ehinger, Frédo Durand, and Antonio Torralba. 2009. Learning to predict where humans look. In ICCV.
- (2009) ICCV
- Judd, T.¹ Ehinger, K.² Durand, F.³ Torralba, A.⁴

14
- 84937834115
- Microsoft COCO: Common objects in context
- Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. 2014. Microsoft COCO: Common Objects in Context. In ECCV.
- (2014) ECCV
- Lin, T.-Y.¹ Maire, M.² Belongie, S.³ Hays, J.⁴ Perona, P.⁵ Ramanan, D.⁶ Dollár, P.⁷ Lawrence Zitnick, C.⁸

15
- 85018917850
- Hierarchical question-image co-attention for visual question answering
- Jiasen Lu, Jianwei Yang, Dhruv Batra, and Devi Parikh. 2016. Hierarchical Question-Image Co-Attention for Visual Question Answering. In NIPS.
- (2016) NIPS
- Lu, J.¹ Yang, J.² Batra, D.³ Parikh, D.⁴

16
- 84937959846
- Recurrent models of visual attention
- Volodymyr Mnih, Nicolas Heess, Alex Graves, and Koray Kavukcuoglu. 2014. Recurrent Models of Visual Attention. In NIPS.
- (2014) NIPS
- Mnih, V.¹ Heess, N.² Graves, A.³ Kavukcuoglu, K.⁴

17
- 0034096923
- The dynamic representation of scenes
- Ronald A. Rensink. 2000. The dynamic representation of scenes. Visual Cognition, 7(1-3):17-42.
- (2000) Visual Cognition , vol.7 , Issue.1-3 , pp. 17-42
- Rensink, R.A.¹

18
- 84973884051
- Attention for fine-grained categorization
- Pierre Sermanet, Andrea Frome, and Esteban Real. 2014. Attention for Fine-Grained Categorization. volume abs/1412.7054.
- (2014) Attention for Fine-Grained Categorization
- Sermanet, P.¹ Frome, A.² Real, E.³

19
- 36448979181
- The central fixation bias in scene viewing: Selecting an optimal viewing position independently of motor biases and image feature distributions
- Benjamin W. Tatler. 2007. The central fixation bias in scene viewing: Selecting an optimal viewing position independently of motor biases and image feature distributions. Journal of Vision, 7(14):4.
- (2007) Journal of Vision , vol.7 , Issue.14 , pp. 4
- Tatler, B.W.¹

20
- 4544353199
- Labeling images with a computer game
- Luis von Ahn and Laura Dabbish. 2004. Labeling images with a computer game. In CHI.
- (2004) CHI
- Von Ahn, L.¹ Dabbish, L.²

21
- 84999008900
- Dynamic memory networks for visual and textual question answering
- Caiming Xiong, Stephen Merity, and Richard Socher. 2016. Dynamic memory networks for visual and textual question answering. In ICML.
- (2016) ICML
- Xiong, C.¹ Merity, S.² Socher, R.³

22
- 84990044633
- Ask, attend and answer: Exploring question-guided spatial attention for visual question answering
- Huijuan Xu and Kate Saenko. 2015. Ask, attend and answer: Exploring question-guided spatial attention for visual question answering. volume abs/1511.05234.
- (2015) Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
- Xu, H.¹ Saenko, K.²

23
- 84970002232
- Show, attend and tell: Neural image caption generation with visual attention
- Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio. 2015. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. In ICML.
- (2015) ICML
- Xu, K.¹ Ba, J.² Kiros, R.³ Cho, K.⁴ Courville, A.C.⁵ Salakhutdinov, R.⁶ Zemel, R.S.⁷ Bengio, Y.⁸

24
- 85067831524
- Stacked attention networks for image question answering
- Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng, and Alexander J. Smola. 2016. Stacked Attention Networks for Image Question Answering. In CVPR.
- (2016) CVPR
- Yang, Z.¹ He, X.² Gao, J.³ Deng, L.⁴ Smola, A.J.⁵

25
- 0004070070
- Eye movements and vision
- A. L. Yarbus. 1967. Eye Movements and Vision. Plenum. New York.
- (1967) Eye Movements and Vision
- Yarbus, A.L.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.