Volume 2017-January, 2017, Pages 4187-4195

Multi-level attention networks for visual question answering

Author keywords

[No Author keywords available]

Indexed keywords

ABSTRACTING; BEHAVIORAL RESEARCH; COMPUTER VISION; NATURAL LANGUAGE PROCESSING SYSTEMS; NEURAL NETWORKS; PATTERN RECOGNITION; RECURRENT NEURAL NETWORKS; SEMANTICS

EID: 85041906381     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.446     Document Type: Conference Paper
Times cited: 247

References (38)
  • 1
    • 85072842417
    • Analyzing the behavior of visual question answering models
    • A. Agrawal, D. Batra, and D. Parikh. Analyzing the behavior of visual question answering models. In EMNLP, 2016.
    • (2016) EMNLP
    • Agrawal, A.1    Batra, D.2    Parikh, D.3
  • 3
    • 85044308314
    • Visual genome: Connecting language and vision using crowdsourced dense image annotations
    • A. Das, H. Agrawal, et al. Visual genome: connecting language and vision using crowdsourced dense image annotations. In IJCV, 2016.
    • (2016) IJCV
    • Das, A.1    Agrawal, H.2
  • 4
    • 85072846928
    • Human attention in visual question answering: Do humans and deep networks look at the same regions?
    • A. Das, H. Agrawal, C. L. Zitnick, D. Parikh, and D. Batra. Human attention in visual question answering: Do humans and deep networks look at the same regions? In EMNLP, 2016.
    • (2016) EMNLP
    • Das, A.1    Agrawal, H.2    Zitnick, C.L.3    Parikh, D.4    Batra, D.5
  • 7
    • 84938942041
    • Image tag refinement with view-dependent concept representations
    • J. Fu, J. Wang, Y. Rui, X.-J. Wang, T. Mei, and H. Lu. Image tag refinement with view-dependent concept representations. IEEE T-CSVT, 25(28):1409-1422, 2015.
    • (2015) IEEE T-CSVT , vol.25 , Issue.28 , pp. 1409-1422
    • Fu, J.1    Wang, J.2    Rui, Y.3    Wang, X.-J.4    Mei, T.5    Lu, H.6
  • 8
    • 84973896917
    • Relaxing from vocabulary: Robust weakly-supervised deep learning for vocabulary-free image tagging
    • J. Fu, Y. Wu, T. Mei, J. Wang, H. Lu, and Y. Rui. Relaxing from vocabulary: Robust weakly-supervised deep learning for vocabulary-free image tagging. In ICCV, 2015.
    • (2015) ICCV
    • Fu, J.1    Wu, Y.2    Mei, T.3    Wang, J.4    Lu, H.5    Rui, Y.6
  • 9
    • 85044506279
    • Multimodal compact bilinear pooling for visual question answering and visual grounding
    • A. Fukui, D. H. Park, D. Yang, A. Rohrbach, T. Darrell, and M. Rohrbach. Multimodal compact bilinear pooling for visual question answering and visual grounding. In EMNLP, 2016.
    • (2016) EMNLP
    • Fukui, A.1    Park, D.H.2    Yang, D.3    Rohrbach, A.4    Darrell, T.5    Rohrbach, M.6
  • 10
    • 84965148420
    • Are you talking to a machine? Dataset and methods for multilingual image question answering
    • H. Gao, J. Mao, J. Zhou, Z. Huang, L. Wang, and W. Xu. Are you talking to a machine? dataset and methods for multilingual image question answering. In NIPS, 2015.
    • (2015) NIPS
    • Gao, H.1    Mao, J.2    Zhou, J.3    Huang, Z.4    Wang, L.5    Xu, W.6
  • 11
    • 84986274465
    • Deep residual learning for image recognition
    • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
    • (2016) CVPR
    • He, K.1    Zhang, X.2    Ren, S.3    Sun, J.4
  • 13
    • 85044292278
    • A focused dynamic attention model for visual question answering
    • I. Ilievski, S. Yan, and J. Feng. A focused dynamic attention model for visual question answering. In ECCV, 2016.
    • (2016) ECCV
    • Ilievski, I.1    Yan, S.2    Feng, J.3
  • 14
    • 0032203257
    • Gradient-based learning applied to document recognition
    • Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.
    • (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
    • LeCun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 16
    • 85030451435
    • Let your photos talk: Generating narrative paragraph for photo stream via bidirectional attention recurrent neural networks
    • Y. Liu, J. Fu, T. Mei, and C. W. Chen. Let your photos talk: Generating narrative paragraph for photo stream via bidirectional attention recurrent neural networks. In AAAI, pages 1445-1452, 2017.
    • (2017) AAAI , pp. 1445-1452
    • Liu, Y.1    Fu, J.2    Mei, T.3    Chen, C.W.4
  • 17
    • 85018917850
    • Hierarchical question-image co-attention for visual question answering
    • J. Lu, J. Yang, D. Batra, and D. Parikh. Hierarchical question-image co-attention for visual question answering. In NIPS, 2016.
    • (2016) NIPS
    • Lu, J.1    Yang, J.2    Batra, D.3    Parikh, D.4
  • 18
    • 84973896625
    • Ask your neurons: A neural-based approach to answering questions about images
    • M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In ICCV, 2015.
    • (2015) ICCV
    • Malinowski, M.1    Rohrbach, M.2    Fritz, M.3
  • 19
    • 85083950512
    • Deep captioning with multimodal recurrent neural networks (m-RNN)
    • J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). In ICLR, 2015.
    • (2015) ICLR
    • Mao, J.1    Xu, W.2    Yang, Y.3    Wang, J.4    Huang, Z.5    Yuille, A.6
  • 21
    • 84986261711
    • Image question answering using convolutional neural network with dynamic parameter prediction
    • H. Noh, P. H. Seo, and B. Han. Image question answering using convolutional neural network with dynamic parameter prediction. In CVPR, 2016.
    • (2016) CVPR
    • Noh, H.1    Seo, P.H.2    Han, B.3
  • 22
    • 84986332702
    • Jointly modeling embedding and translation to bridge video and language
    • Y. Pan, T. Mei, T. Yao, H. Li, and Y. Rui. Jointly modeling embedding and translation to bridge video and language. In CVPR, 2016.
    • (2016) CVPR
    • Pan, Y.1    Mei, T.2    Yao, T.3    Li, H.4    Rui, Y.5
  • 24
    • 84965170394
    • Exploring models and data for image question answering
    • M. Ren, R. Kiros, and R. S. Zemel. Exploring models and data for image question answering. In NIPS, 2015.
    • (2015) NIPS
    • Ren, M.1    Kiros, R.2    Zemel, R.S.3
  • 26
    • 84986327457
    • Where to look: Focus regions for visual question answering
    • K. J. Shih, S. Singh, and D. Hoiem. Where to look: Focus regions for visual question answering. In CVPR, 2016.
    • (2016) CVPR
    • Shih, K.J.1    Singh, S.2    Hoiem, D.3
  • 27
    • 85083953063
    • Very deep convolutional networks for large-scale image recognition
    • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
    • (2015) ICLR
    • Simonyan, K.1    Zisserman, A.2
  • 28
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
    • (2015) CVPR
    • Vinyals, O.1    Toshev, A.2    Bengio, S.3    Erhan, D.4
  • 29
    • 85006134827
    • Beyond object recognition: Visual sentiment analysis with deep coupled adjective and noun neural networks
    • J. Wang, J. Fu, T. Mei, and Y. Xu. Beyond object recognition: Visual sentiment analysis with deep coupled adjective and noun neural networks. In IJCAI, 2016.
    • (2016) IJCAI
    • Wang, J.1    Fu, J.2    Mei, T.3    Xu, Y.4
  • 30
    • 84986301177
    • What value do explicit high level concepts have in vision to language problems?
    • Q. Wu, C. Shen, L. Liu, A. Dick, and A. Hengel. What value do explicit high level concepts have in vision to language problems? In CVPR, 2016.
    • (2016) CVPR
    • Wu, Q.1    Shen, C.2    Liu, L.3    Dick, A.4    Hengel, A.5
  • 31
    • 84986320870
    • Ask me anything: Free-form visual question answering based on knowledge from external sources
    • Q. Wu, P. Wang, C. Shen, A. Dick, and A. Hengel. Ask me anything: Free-form visual question answering based on knowledge from external sources. In CVPR, 2016.
    • (2016) CVPR
    • Wu, Q.1    Wang, P.2    Shen, C.3    Dick, A.4    Hengel, A.5
  • 32
    • 84999008900
    • Dynamic memory networks for visual and textual question answering
    • C. Xiong, S. Merity, and R. Socher. Dynamic memory networks for visual and textual question answering. In ICML, 2016.
    • (2016) ICML
    • Xiong, C.1    Merity, S.2    Socher, R.3
  • 33
    • 85035008367
    • Ask, attend and answer: Exploring question-guided spatial attention for visual question answering
    • H. Xu and K. Saenko. Ask, attend and answer: Exploring question-guided spatial attention for visual question answering. In ECCV, 2016.
    • (2016) ECCV
    • Xu, H.1    Saenko, K.2
  • 34
    • 84986334021
    • Stacked attention networks for image question answering
    • Z. Yang, X. He, J. Gao, L. Deng, and A. Smola. Stacked attention networks for image question answering. In CVPR, 2016.
    • (2016) CVPR
    • Yang, Z.1    He, X.2    Gao, J.3    Deng, L.4    Smola, A.5
  • 36
    • 84986317307
    • Image captioning with semantic attention
    • Q. You, H. Jin, Z. Wang, C. Fang, and J. Luo. Image captioning with semantic attention. In CVPR, 2016.
    • (2016) CVPR
    • You, Q.1    Jin, H.2    Wang, Z.3    Fang, C.4    Luo, J.5
  • 37
    • 84986275767
    • Visual7w: Grounded question answering in images
    • Y. Zhu, O. Groth, M. Bernstein, and L. Fei-Fei. Visual7w: Grounded question answering in images. In CVPR, 2016.
    • (2016) CVPR
    • Zhu, Y.1    Groth, O.2    Bernstein, M.3    Fei-Fei, L.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.