SCOPUS Information Retrieval Platform

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Volume 2016-December, 2016, Pages 4613-4621

Where to look: Focus regions for visual question answering

(3) Shih, Kevin J.ᵃ; Singh, Saurabhᵃ; Hoiem, Derekᵃ

ᵃ UNIVERSITY OF ILLINOIS AT URBANA CHAMPAIGN (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; QUERY PROCESSING;

IMAGE REGIONS; INNER PRODUCT; QUESTION ANSWERING; SHARED SPACES; SPECIFIC LOCATION; TEXT-BASED QUERIES; TEXTUAL QUERY; VISUAL FEATURE;

PATTERN RECOGNITION;

EID: 84986327457 PISSN: 10636919 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CVPR.2016.499 Document Type: Conference Paper

Times cited : (525)

References (25)

1
- 84973890960
- Vqa: Visual question answering
- S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. Vqa: Visual question answering. In International Conference on Computer Vision (ICCV), 2015.
- (2015) International Conference on Computer Vision (ICCV)
- Antol, S.¹ Agrawal, A.² Lu, J.³ Mitchell, M.⁴ Batra, D.⁵ Zitnick, C.L.⁶ Parikh, D.⁷

2
- 84973882857
- Predicting deep zero-shot convolutional neural networks using textual descriptions
- J. Ba, K. Swersky, S. Fidler, and R. Salakhutdinov. Predicting deep zero-shot convolutional neural networks using textual descriptions. In International Conference on Computer Vision (ICCV), 2015.
- (2015) International Conference on Computer Vision (ICCV)
- Ba, J.¹ Swersky, K.² Fidler, S.³ Salakhutdinov, R.⁴

3
- 85072028231
- Return of the devil in the details: Delving deep into convolutional nets
- K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. In British Machine Vision Conference, 2014.
- (2014) British Machine Vision Conference
- Chatfield, K.¹ Simonyan, K.² Vedaldi, A.³ Zisserman, A.⁴

4
- 84957029470
- Mind's eye: A recurrent visual representation for image caption generation
- X. Chen and C. Lawrence Zitnick. Mind's eye: A recurrent visual representation for image caption generation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
- (2015) The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June
- Chen, X.¹ Lawrence Zitnick, C.²

5
- 85037338954
- Generating typed dependency parses from phrase structure parses
- M.-C. De Marneffe, B. MacCartney, C. D. Manning, et al. Generating typed dependency parses from phrase structure parses. In Proceedings of LREC, volume 6, pages 449-454, 2006.
- (2006) Proceedings of LREC, Volume 6 , pp. 449-454
- De Marneffe, M.-C.¹ MacCartney, B.² Manning, C.D.³

6
- 85009912425
- Long-term recurrent convolutional networks for visual recognition and description
- J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. arXiv preprint arXiv:1411.4389, 2014.
- (2014) Long-term Recurrent Convolutional Networks for Visual Recognition and Description
- Donahue, J.¹ Hendricks, L.A.² Guadarrama, S.³ Rohrbach, M.⁴ Venugopalan, S.⁵ Saenko, K.⁶ Darrell, T.⁷

7
- 84944115860
- From captions to visual concepts and back
- H. Fang, S. Gupta, F. Iandola, R. Srivastava, L. Deng, P. Dollár, J. Gao, X. He, M. Mitchell, J. Platt, et al. From captions to visual concepts and back. arXiv preprint arXiv:1411.4952, 2014.
- (2014) From Captions to Visual Concepts and Back
- Fang, H.¹ Gupta, S.² Iandola, F.³ Srivastava, R.⁴ Deng, L.⁵ Dollár, P.⁶ Gao, J.⁷ He, X.⁸ Mitchell, M.⁹ Platt, J.¹⁰

8
- 84862277874
- Understanding the difficulty of training deep feedforward neural networks
- X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International conference on artificial intelligence and statistics, pages 249-256, 2010.
- (2010) International Conference on Artificial Intelligence and Statistics , pp. 249-256
- Glorot, X.¹ Bengio, Y.²

9
- 84862294866
- Deep sparse rectifier neural networks
- X. Glorot, A. Bordes, and Y. Bengio. Deep sparse rectifier neural networks. In International Conference on Artificial Intelligence and Statistics, pages 315-323, 2011.
- (2011) International Conference on Artificial Intelligence and Statistics , pp. 315-323
- Glorot, X.¹ Bordes, A.² Bengio, Y.³

10
- 84906484732
- Improving image-sentence embeddings using large weakly annotated photo collections
- Y. Gong, L. Wang, M. Hodosh, J. Hockenmaier, and S. Lazebnik. Improving image-sentence embeddings using large weakly annotated photo collections. In Computer Vision-ECCV 2014, pages 529-545. Springer, 2014.
- (2014) Computer Vision-ECCV 2014 , pp. 529-545
- Gong, Y.¹ Wang, L.² Hodosh, M.³ Hockenmaier, J.⁴ Lazebnik, S.⁵

11
- 85009931336
- Batch normalization: Accelerating deep network training by reducing internal covariate shift
- S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.
- (2015) Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Ioffe, S.¹ Szegedy, C.²

12
- 84946734827
- Deep visual-semantic alignments for generating image descriptions
- A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
- (2015) CVPR
- Karpathy, A.¹ Fei-Fei, L.²

13
- 85083950512
- Deep captioning with multimodal recurrent neural networks (m-rnn)
- J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-rnn). ICLR, 2015.
- (2015) ICLR
- Mao, J.¹ Xu, W.² Yang, Y.³ Wang, J.⁴ Huang, Z.⁵ Yuille, A.⁶

14
- 84951072975
- Explain images with multimodal recurrent neural networks
- J. Mao, W. Xu, Y. Yang, J. Wang, and A. L. Yuille. Explain images with multimodal recurrent neural networks. NIPS Deep Learning Workshop, 2014.
- (2014) NIPS Deep Learning Workshop
- Mao, J.¹ Xu, W.² Yang, Y.³ Wang, J.⁴ Yuille, A.L.⁵

15
- 84973896625
- Ask your neurons: A neural-based approach to answering questions about images
- M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In ICCV, 2015.
- (2015) ICCV
- Malinowski, M.¹ Rohrbach, M.² Fritz, M.³

16
- 85083951332
- Efficient estimation of word representations in vector space
- T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
- (2013) ArXiv Preprint arXiv:1301.3781
- Mikolov, T.¹ Chen, K.² Corrado, G.³ Dean, J.⁴

17
- 85009914553
- Exploring models and data for image question answering
- M. Ren, R. Kiros, and R. Zemel. Exploring models and data for image question answering. arXiv preprint arXiv:1505.02074v3, 2015.
- (2015) ArXiv Preprint arXiv:1505.02074v3
- Ren, M.¹ Kiros, R.² Zemel, R.³

18
- 84947041871
- ImageNet large scale visual recognition challenge
- O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), pages 1-42, April 2015.
- (2015) International Journal of Computer Vision (IJCV) , pp. 1-42
- Russakovsky, O.¹ Deng, J.² Su, H.³ Krause, J.⁴ Satheesh, S.⁵ Ma, S.⁶ Huang, Z.⁷ Karpathy, A.⁸ Khosla, A.⁹ Bernstein, M.¹⁰ Berg, A.C.¹¹ Fei-Fei, L.¹²

19
- 84946802551
- Weakly supervised memory networks
- S. Sukhbaatar, A. Szlam, J. Weston, and R. Fergus. Weakly supervised memory networks. CoRR, abs/1503.08895, 2015.
- (2015) CoRR
- Sukhbaatar, S.¹ Szlam, A.² Weston, J.³ Fergus, R.⁴

20
- 85009872204
- A. Vedaldi and K. Lenc. Matconvnet-convolutional neural networks for matlab. 2015.
- (2015) Matconvnet-convolutional Neural Networks for Matlab
- Vedaldi, A.¹ Lenc, K.²

21
- 84946747440
- Show and tell: A neural image caption generator
- O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
- (2015) The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Vinyals, O.¹ Toshev, A.² Bengio, S.³ Erhan, D.⁴

22
- 85009857480
- Show, attend and tell: Neural image caption generation with visual attention
- K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. arXiv preprint arXiv:1502.03044, 2015.
- (2015) Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
- Xu, K.¹ Ba, J.² Kiros, R.³ Courville, A.⁴ Salakhutdinov, R.⁵ Zemel, R.⁶ Bengio, Y.⁷

23
- 84959862697
- Visual madlibs: Fill in the blank image generation and question answering
- L. Yu, E. Park, A. C. Berg, and T. L. Berg. Visual madlibs: Fill in the blank image generation and question answering. arXiv preprint arXiv:1506.00278, 2015.
- (2015) Visual Madlibs: Fill in the Blank Image Generation and Question Answering
- Yu, L.¹ Park, E.² Berg, A.C.³ Berg, T.L.⁴

24
- 84986301525
- Simple baseline for visual question answering
- B. Zhou, Y. Tian, S. Sukhbaatar, A. Szlam, and R. Fergus. Simple baseline for visual question answering. arXiv preprint arXiv:1512.02167, 2015.
- (2015) Simple Baseline for Visual Question Answering
- Zhou, B.¹ Tian, Y.² Sukhbaatar, S.³ Szlam, A.⁴ Fergus, R.⁵

25
- 84906489617
- Edge boxes: Locating object proposals from edges
- C. L. Zitnick and P. Dollár. Edge boxes: Locating object proposals from edges. In Computer Vision-ECCV 2014, pages 391-405. Springer, 2014.
- (2014) Computer Vision-ECCV 2014 , pp. 391-405
- Zitnick, C.L.¹ Dollár, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.