Volume 2017-January, 2017, Pages 1080-1089

Visual dialog

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER VISION; DECODING; PATTERN RECOGNITION; SIGNAL ENCODING

EID: 85041927710     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CVPR.2017.121     Document Type: Conference Paper
Times cited: 888

References (64)
  • 1
    • 85044300992
    • Torch. http://torch.ch/.
  • 3
    • 85044317065
    • Amazon. Alexa. http://alexa.amazon.com/.
  • 8
    • 85072845519
    • Resolving language and vision ambiguities together: Joint segmentation and prepositional attachment resolution in captioned scenes
    • G. Christie, A. Laddha, A. Agrawal, S. Antol, Y. Goyal, K. Kochersberger, and D. Batra. Resolving language and vision ambiguities together: Joint segmentation and prepositional attachment resolution in captioned scenes. In EMNLP, 2016.
    • (2016) EMNLP
    • Christie, G.; Laddha, A.; Agrawal, A.; Antol, S.; Goyal, Y.; Kochersberger, K.; Batra, D.
  • 9
    • 85072846928
    • Human attention in visual question answering: Do humans and deep networks look at the same regions?
    • A. Das, H. Agrawal, C. L. Zitnick, D. Parikh, and D. Batra. Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? In EMNLP, 2016.
    • (2016) EMNLP
    • Das, A.; Agrawal, H.; Zitnick, C.L.; Parikh, D.; Batra, D.
  • 14
    • 84965148420
    • Are you talking to a machine? Dataset and methods for multilingual image question answering
    • H. Gao, J. Mao, J. Zhou, Z. Huang, L. Wang, and W. Xu. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering. In NIPS, 2015.
    • (2015) NIPS
    • Gao, H.; Mao, J.; Zhou, J.; Huang, Z.; Wang, L.; Xu, W.
  • 15
  • 16
    • 85041900002
    • Making the V in VQA matter: Elevating the role of image understanding in visual question answering
    • Y. Goyal, T. Khot, D. Summers-Stay, D. Batra, and D. Parikh. Making the V in VQA matter: Elevating the role of image understanding in visual question answering. In CVPR, 2017.
    • (2017) CVPR
    • Goyal, Y.; Khot, T.; Summers-Stay, D.; Batra, D.; Parikh, D.
  • 17
    • 84986274465
    • Deep residual learning for image recognition
    • K. He, X. Zhang, S. Ren, and J. Sun. Deep Residual Learning for Image Recognition. In CVPR, 2016.
    • (2016) CVPR
    • He, K.; Zhang, X.; Ren, S.; Sun, J.
  • 19
    • 85030448950
    • Segmentation from natural language expressions
    • R. Hu, M. Rohrbach, and T. Darrell. Segmentation from natural language expressions. In ECCV, 2016.
    • (2016) ECCV
    • Hu, R.; Rohrbach, M.; Darrell, T.
  • 21
    • 85041926703
    • Revisiting visual question answering baselines
    • A. Jabri, A. Joulin, and L. van der Maaten. Revisiting visual question answering baselines. In ECCV, 2016.
    • (2016) ECCV
    • Jabri, A.; Joulin, A.; Van Der Maaten, L.
  • 23
    • 84946734827
    • Deep visual-semantic alignments for generating image descriptions
    • A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
    • (2015) CVPR
    • Karpathy, A.; Fei-Fei, L.
  • 24
    • 84911370987
    • What are you talking about? Text-to-image coreference
    • C. Kong, D. Lin, M. Bansal, R. Urtasun, and S. Fidler. What are you talking about? text-to-image coreference. In CVPR, 2014.
    • (2014) CVPR
    • Kong, C.; Lin, D.; Bansal, M.; Urtasun, R.; Fidler, S.
  • 25
    • 84893350028
    • An ISU dialogue system exhibiting reinforcement learning of dialogue policies: Generic slot-filling in the TALK in-car system
    • O. Lemon, K. Georgila, J. Henderson, and M. Stuttle. An ISU dialogue system exhibiting reinforcement learning of dialogue policies: generic slot-filling in the TALK in-car system. In EACL, 2006.
    • (2006) EACL
    • Lemon, O.; Georgila, K.; Henderson, J.; Stuttle, M.
  • 28
    • 85072827450
    • How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation
    • C.-W. Liu, R. Lowe, I. V. Serban, M. Noseworthy, L. Charlin, and J. Pineau. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation. In EMNLP, 2016.
    • (2016) EMNLP
    • Liu, C.-W.; Lowe, R.; Serban, I.V.; Noseworthy, M.; Charlin, L.; Pineau, J.
  • 30
    • 84988430909
    • The Ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems
    • R. Lowe, N. Pow, I. Serban, and J. Pineau. The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems. In SIGDIAL, 2015.
    • (2015) SIGDIAL
    • Lowe, R.; Pow, N.; Serban, I.; Pineau, J.
  • 32
    • 85018917850
    • Hierarchical question-image co-attention for visual question answering
    • J. Lu, J. Yang, D. Batra, and D. Parikh. Hierarchical Question-Image Co-Attention for Visual Question Answering. In NIPS, 2016.
    • (2016) NIPS
    • Lu, J.; Yang, J.; Batra, D.; Parikh, D.
  • 33
    • 84937822746
    • A multi-world approach to question answering about real-world scenes based on uncertain input
    • M. Malinowski and M. Fritz. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input. In NIPS, 2014.
    • (2014) NIPS
    • Malinowski, M.; Fritz, M.
  • 34
    • 84973896625
    • Ask your neurons: A neural-based approach to answering questions about images
    • M. Malinowski, M. Rohrbach, and M. Fritz. Ask your neurons: A neural-based approach to answering questions about images. In ICCV, 2015.
    • (2015) ICCV
    • Malinowski, M.; Rohrbach, M.; Fritz, M.
  • 35
    • 85007207124
    • Listen, attend, and walk: Neural mapping of navigational instructions to action sequences
    • H. Mei, M. Bansal, and M. R. Walter. Listen, attend, and walk: Neural mapping of navigational instructions to action sequences. In AAAI, 2016.
    • (2016) AAAI
    • Mei, H.; Bansal, M.; Walter, M.R.
  • 39
    • 84973856017
    • Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models
    • B. A. Plummer, L. Wang, C. M. Cervantes, J. C. Caicedo, J. Hockenmaier, and S. Lazebnik. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. In ICCV, 2015.
    • (2015) ICCV
    • Plummer, B.A.; Wang, L.; Cervantes, C.M.; Caicedo, J.C.; Hockenmaier, J.; Lazebnik, S.
  • 40
    • 85071396128
    • SQuAD: 100,000+ questions for machine comprehension of text
    • P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In EMNLP, 2016.
    • (2016) EMNLP
    • Rajpurkar, P.; Zhang, J.; Lopyrev, K.; Liang, P.
  • 41
    • 84943782750
    • Linking people with "their" names using coreference resolution
    • V. Ramanathan, A. Joulin, P. Liang, and L. Fei-Fei. Linking people with "their" names using coreference resolution. In ECCV, 2014.
    • (2014) ECCV
    • Ramanathan, V.; Joulin, A.; Liang, P.; Fei-Fei, L.
  • 42
    • 85072826753
    • Question relevance in VQA: Identifying non-visual and false-premise questions
    • A. Ray, G. Christie, M. Bansal, D. Batra, and D. Parikh. Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions. In EMNLP, 2016.
    • (2016) EMNLP
    • Ray, A.; Christie, G.; Bansal, M.; Batra, D.; Parikh, D.
  • 43
    • 84965170394
    • Exploring models and data for image question answering
    • M. Ren, R. Kiros, and R. Zemel. Exploring Models and Data for Image Question Answering. In NIPS, 2015.
    • (2015) NIPS
    • Ren, M.; Kiros, R.; Zemel, R.
  • 47
    • 84980367197
    • Building end-to-end dialogue systems using generative hierarchical neural network models
    • I. V. Serban, A. Sordoni, Y. Bengio, A. Courville, and J. Pineau. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models. In AAAI, 2016.
    • (2016) AAAI
    • Serban, I.V.; Sordoni, A.; Bengio, Y.; Courville, A.; Pineau, J.
  • 50
    • 85083953063
    • Very deep convolutional networks for large-scale image recognition
    • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
    • (2015) ICLR
    • Simonyan, K.; Zisserman, A.
  • 52
    • 84901405262
    • Joint video and text parsing for understanding events and answering queries
    • K. Tu, M. Meng, M. W. Lee, T. E. Choe, and S. C. Zhu. Joint Video and Text Parsing for Understanding Events and Answering Queries. IEEE MultiMedia, 2014.
    • (2014) IEEE MultiMedia
    • Tu, K.; Meng, M.; Lee, M.W.; Choe, T.E.; Zhu, S.C.
  • 56
    • 84946747440
    • Show and tell: A neural image caption generator
    • O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In CVPR, 2015.
    • (2015) CVPR
    • Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.
  • 58
    • 85044310149
    • J. Weizenbaum. ELIZA. http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm.
    • ELIZA
    • Weizenbaum, J.
  • 59
    • 85083951707
    • Towards AI-complete question answering: A set of prerequisite toy tasks
    • J. Weston, A. Bordes, S. Chopra, and T. Mikolov. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks. In ICLR, 2016.
    • (2016) ICLR
    • Weston, J.; Bordes, A.; Chopra, S.; Mikolov, T.
  • 61
    • 84986334021
    • Stacked attention networks for image question answering
    • Z. Yang, X. He, J. Gao, L. Deng, and A. J. Smola. Stacked Attention Networks for Image Question Answering. In CVPR, 2016.
    • (2016) CVPR
    • Yang, Z.; He, X.; Gao, J.; Deng, L.; Smola, A.J.
  • 63
    • 84986275767
    • Visual7W: Grounded question answering in images
    • Y. Zhu, O. Groth, M. Bernstein, and L. Fei-Fei. Visual7W: Grounded Question Answering in Images. In CVPR, 2016.
    • (2016) CVPR
    • Zhu, Y.; Groth, O.; Bernstein, M.; Fei-Fei, L.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.