SCOPUS Information Search Platform - Paper View

Advances in Neural Information Processing Systems

Volume 2017-December, 2017, Pages 6595-6605

Modulating early visual processing by language

(6) De Vries, Harm (a); Strub, Florian (b); Mary, Jérémie (b, e); Larochelle, Hugo (c); Pietquin, Olivier (d); Courville, Aaron (a)

a UNIVERSITÉ DE MONTRÉAL (Canada)

b UNIV LILLE (France)

c GOOGLE INC (United States)

d DEEPMIND (United Kingdom)

e Criteo (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ABLATION; COMPUTATIONAL LINGUISTICS; PIPELINE PROCESSING SYSTEMS;

COMPUTATIONAL MODEL; FEATURE MAP; QUESTION ANSWERING TASK; VISUAL CONCEPT; VISUAL-PROCESSING;

VISUAL LANGUAGES;

EID: 85043992858 PISSN: 10495258 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (517)

References (28)

1
- 84973890960
- Vqa: Visual question answering
- S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. Vqa: Visual question answering. In Proc. of ICCV, 2015.
- (2015) Proc. of ICCV
- Antol, S.¹ Agrawal, A.² Lu, J.³ Mitchell, M.⁴ Batra, D.⁵ Zitnick, C.L.⁶ Parikh, D.⁷

2
- 85040307226
- H. Ben-Younes, R. Cadène, N. Thome, and M. Cord. MUTAN: Multimodal Tucker Fusion for Visual Question Answering. arXiv preprint arXiv: 1705.06676, 2017.
- (2017) MUTAN: Multimodal Tucker Fusion for Visual Question Answering
- Ben-Younes, H.¹ Cadène, R.² Thome, N.³ Cord, M.⁴

3
- 84933576022
- Words jump-start vision: A label advantage in object recognition
- B. Boutonnet and G. Lupyan. Words jump-start vision: A label advantage in object recognition. Journal of Neuroscience, 35(25): 9329-9335, 2015.
- (2015) Journal of Neuroscience , vol.35 , Issue.25 , pp. 9329-9335
- Boutonnet, B.¹ Lupyan, G.²

4
- 84961291190
- Learning phrase representations using RNN encoder-decoder for statistical machine translation
- K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. of EMNLP, 2014.
- (2014) Proc. of EMNLP
- Cho, K.¹ Van Merriënboer, B.² Gulcehre, C.³ Bahdanau, D.⁴ Bougares, F.⁵ Schwenk, H.⁶ Bengio, Y.⁷

5
- 85041927710
- Visual dialog
- A. Das, S. Kottur, K. Gupta, A. Singh, D. Yadav, J. Moura, D. Parikh, and D. Batra. Visual Dialog. In Proc. of CVPR, 2017.
- (2017) Proc. of CVPR
- Das, A.¹ Kottur, S.² Gupta, K.³ Singh, A.⁴ Yadav, D.⁵ Moura, J.⁶ Parikh, D.⁷ Batra, D.⁸

6
- 85041919303
- GuessWhat?! Visual object discovery through multi-modal dialogue
- H. de Vries, F. Strub, S. Chandar, O. Pietquin, H. Larochelle, and A. Courville. GuessWhat?! Visual object discovery through multi-modal dialogue. In Proc. of CVPR, 2017.
- (2017) Proc. of CVPR
- De Vries, H.¹ Strub, F.² Chandar, S.³ Pietquin, O.⁴ Larochelle, H.⁵ Courville, A.⁶

7
- 85088228106
- A learned representation for artistic style
- V. Dumoulin, J. Shlens, and M. Kudlur. A Learned Representation For Artistic Style. In Proc. of ICLR, 2017.
- (2017) Proc. of ICLR
- Dumoulin, V.¹ Shlens, J.² Kudlur, M.³

8
- 34848886912
- Introduction to the special issue on language-vision interactions
- F. Ferreira and M. Tanenhaus. Introduction to the special issue on language-vision interactions. Journal of Memory and Language, 57(4): 455-459, 2007.
- (2007) Journal of Memory and Language , vol.57 , Issue.4 , pp. 455-459
- Ferreira, F.¹ Tanenhaus, M.²

9
- 85044506279
- Multimodal compact bilinear pooling for visual question answering and visual grounding
- A. Fukui, D. Huk Park, D. Yang, A. Rohrbach, T. Darrell, and M. Rohrbach. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. In Proc. of EMNLP, 2016.
- (2016) Proc. of EMNLP
- Fukui, A.¹ Huk Park, D.² Yang, D.³ Rohrbach, A.⁴ Darrell, T.⁵ Rohrbach, M.⁶

10
- 0031573117
- Long short-term memory
- MIT Press
- S. Hochreiter and J. Schmidhuber. Long short-term memory. In Neural computation, Volume 9, pages 1735-1780. MIT Press, 1997.
- (1997) Neural Computation , vol.9 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

11
- 85018917850
- Hierarchical question-image co-attention for visual question answering
- J. Lu, J. Yang, D. Batra, and D. Parikh. Hierarchical question-image co-attention for visual question answering. In Proc. of NIPS, 2016.
- (2016) Proc. of NIPS
- Lu, J.¹ Yang, J.² Batra, D.³ Parikh, D.⁴

12
- 84986274465
- Deep residual learning for image recognition
- K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proc. of CVPR, 2016.
- (2016) Proc. of CVPR
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

13
- 85018868398
- Multimodal residual learning for visual qa
- J. Kim, S. Lee, D. Kwak, M. Heo, J. Kim, J. Ha, and B. Zhang. Multimodal residual learning for visual qa. In Proc. of NIPS, 2016.
- (2016) Proc. of NIPS
- Kim, J.¹ Lee, S.² Kwak, D.³ Heo, M.⁴ Kim, J.⁵ Ha, J.⁶ Zhang, B.⁷

14
- 85087529518
- Hadamard product for low-rank bilinear pooling
- J. Kim, K. On, J. Kim, J. Ha, and B. Zhang. Hadamard product for low-rank bilinear pooling. In Proc. of ICLR, 2017.
- (2017) Proc. of ICLR
- Kim, J.¹ On, K.² Kim, J.³ Ha, J.⁴ Zhang, B.⁵

15
- 84901593868
- Prior expectations evoke stimulus templates in the primary visual cortex
- P. Kok, M. Failing, and F. de Lange. Prior expectations evoke stimulus templates in the primary visual cortex. Journal of Cognitive Neuroscience, 26(7): 1546-1554, 2014.
- (2014) Journal of Cognitive Neuroscience , vol.26 , Issue.7 , pp. 1546-1554
- Kok, P.¹ Failing, M.² De Lange, F.³

16
- 84937834115
- Microsoft coco: Common objects in context
- T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and L. Zitnick. Microsoft coco: Common objects in context. In Proc. of ECCV, 2014.
- (2014) Proc. of ECCV
- Lin, T.¹ Maire, M.² Belongie, S.³ Hays, J.⁴ Perona, P.⁵ Ramanan, D.⁶ Dollár, P.⁷ Zitnick, L.⁸

17
- 84973896625
- Ask your neurons: A neural-based approach to answering questions about images
- M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In Proc. of ICCV, 2015.
- (2015) Proc. of ICCV
- Malinowski, M.¹ Rohrbach, M.² Fritz, M.³

18
- 85030255039
- M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A deep learning approach to visual question answering. arXiv preprint arXiv: 1605.02697, 2016.
- (2016) Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
- Malinowski, M.¹ Rohrbach, M.² Fritz, M.³

19
- 84961289992
- Glove: Global vectors for word representation
- J. Pennington, R. Socher, and C. Manning. Glove: Global Vectors for Word Representation. In Proc. of EMNLP, 2014.
- (2014) Proc. of EMNLP
- Pennington, J.¹ Socher, R.² Manning, C.³

20
- 84965170394
- Exploring models and data for image question answering
- M. Ren, R. Kiros, and R. Zemel. Exploring models and data for image question answering. In Proc. of NIPS, 2015.
- (2015) Proc. of NIPS
- Ren, M.¹ Kiros, R.² Zemel, R.³

21
- 84964923476
- Batch normalization: Accelerating deep network training by reducing internal covariate shift
- S. Ioffe and C. Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proc. of ICML, 2015.
- (2015) Proc. of ICML
- Ioffe, S.¹ Szegedy, C.²

22
- 85083953063
- K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. 2015.
- (2015) Very Deep Convolutional Networks for Large-scale Image Recognition
- Simonyan, K.¹ Zisserman, A.²

23
- 85041900002
- Making the V in VQA matter: Elevating the role of image understanding in visual question answering
- Y. Goyal, T. Khot, D. Summers-Stay, D. Batra, and D. Parikh. Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering. In Proc. of CVPR, 2017.
- (2017) Proc. of CVPR
- Goyal, Y.¹ Khot, T.² Summers-Stay, D.³ Batra, D.⁴ Parikh, D.⁵

24
- 63149129198
- Unconscious effects of language-specific terminology on preattentive color perception
- G. Thierry, P. Athanasopoulos, A. Wiggett, B. Dering, and JR. Kuipers. Unconscious effects of language-specific terminology on preattentive color perception. PNAS, 106(11): 4567-4570, 2009.
- (2009) PNAS , vol.106 , Issue.11 , pp. 4567-4570
- Thierry, G.¹ Athanasopoulos, P.² Wiggett, A.³ Dering, B.⁴ Kuipers, J.R.⁵

25
- 57249084011
- Visualizing data using t-sne
- L. van der Maaten and G. Hinton. Visualizing data using t-sne. JMLR, 9(Nov): 2579-2605, 2008.
- (2008) JMLR , vol.9 , Issue.NOV , pp. 2579-2605
- Van der Maaten, L.¹ Hinton, G.²

26
- 84990044633
- Ask, attend and answer: Exploring question-guided spatial attention for visual question answering
- H. Xu and K. Saenko. Ask, attend and answer: Exploring question-guided spatial attention for visual question answering. In Proc. of ECCV, 2015.
- (2015) Proc. of ECCV
- Xu, H.¹ Saenko, K.²

27
- 84970002232
- Show, attend and tell: Neural image caption generation with visual attention
- K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In Proc. of ICML, 2015.
- (2015) Proc. of ICML
- Xu, K.¹ Ba, J.² Kiros, R.³ Cho, K.⁴ Courville, A.⁵ Salakhutdinov, R.⁶ Zemel, R.⁷ Bengio, Y.⁸

28
- 84986334021
- Stacked attention networks for image question answering
- Z. Yang, X. He, J. Gao, L. Deng, and A. Smola. Stacked attention networks for image question answering. In Proc. of CVPR, 2016.
- (2016) Proc. of CVPR
- Yang, Z.¹ He, X.² Gao, J.³ Deng, L.⁴ Smola, A.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.