-
2
-
-
84973890960
-
VQA: Visual question answering
-
Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh. VQA: Visual Question Answering. IEEE International Conference on Computer Vision, 2015.
-
(2015)
IEEE International Conference on Computer Vision
-
-
Antol, S.1
Agrawal, A.2
Lu, J.3
Mitchell, M.4
Batra, D.5
Zitnick, C.L.6
Parikh, D.7
-
3
-
-
84869158135
-
Finding frequent items in data streams
-
Springer
-
Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In International Colloquium on Automata, Languages, and Programming, pp. 693-703. Springer, 2002.
-
(2002)
International Colloquium on Automata, Languages, and Programming
, pp. 693-703
-
-
Charikar, M.1
Chen, K.2
Farach-Colton, M.3
-
4
-
-
84969930652
-
Compressing neural networks with the hashing trick
-
Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, and Yixin Chen. Compressing Neural Networks with the Hashing Trick. In 32nd International Conference on Machine Learning, pp. 2285-2294, 2015.
-
(2015)
32nd International Conference on Machine Learning
, pp. 2285-2294
-
-
Chen, W.1
Wilson, J.T.2
Tyree, S.3
Weinberger, K.Q.4
Chen, Y.5
-
5
-
-
84961291190
-
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
-
Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1724-1734, 2014.
-
(2014)
2014 Conference on Empirical Methods in Natural Language Processing
, pp. 1724-1734
-
-
Cho, K.1
Van Merriënboer, B.2
Gulcehre, C.3
Bahdanau, D.4
Bougares, F.5
Schwenk, H.6
Bengio, Y.7
-
6
-
-
84990060711
-
-
arXiv preprint
-
Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, and Marcus Rohrbach. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. arXiv preprint arXiv:1606.01847, 2016.
-
(2016)
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
-
-
Fukui, A.1
Park, D.H.2
Yang, D.3
Rohrbach, A.4
Darrell, T.5
Rohrbach, M.6
-
12
-
-
84965096967
-
Spatial transformer networks
-
Max Jaderberg, Karen Simonyan, Andrew Zisserman, and Koray Kavukcuoglu. Spatial Transformer Networks. In Advances in Neural Information Processing Systems 28, pp. 2008-2016, 2015.
-
(2015)
Advances in Neural Information Processing Systems
, vol.28
, pp. 2008-2016
-
-
Jaderberg, M.1
Simonyan, K.2
Zisserman, A.3
Kavukcuoglu, K.4
-
15
-
-
85018919661
-
TrimZero: A torch recurrent module for efficient natural language processing
-
Jin-Hwa Kim, Jeonghee Kim, Jung-Woo Ha, and Byoung-Tak Zhang. TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing. In KIIS Spring Conference, volume 26, pp. 165-166, 2016a.
-
(2016)
KIIS Spring Conference
, vol.26
, pp. 165-166
-
-
Kim, J.-H.1
Kim, J.2
Ha, J.-W.3
Zhang, B.-T.4
-
16
-
-
85020657410
-
-
arXiv preprint
-
Jin-Hwa Kim, Sang-Woo Lee, Dong-Hyun Kwak, Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha, and Byoung-Tak Zhang. Multimodal Residual Learning for Visual QA. arXiv preprint arXiv:1606.01455, 2016b.
-
(2016)
Multimodal Residual Learning for Visual QA
-
-
Kim, J.-H.1
Lee, S.-W.2
Kwak, D.-H.3
Heo, M.-O.4
Kim, J.5
Ha, J.-W.6
Zhang, B.-T.7
-
17
-
-
84965153327
-
Skip-thought vectors
-
Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, and Sanja Fidler. Skip-Thought Vectors. In Advances in Neural Information Processing Systems 28, pp. 3294-3302, 2015.
-
(2015)
Advances in Neural Information Processing Systems
, vol.28
, pp. 3294-3302
-
-
Kiros, R.1
Zhu, Y.2
Salakhutdinov, R.3
Zemel, R.S.4
Torralba, A.5
Urtasun, R.6
Fidler, S.7
-
18
-
-
84978730111
-
-
arXiv preprint
-
Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A Shamma, Michael Bernstein, and Li Fei-Fei. Visual genome: Connecting language and vision using crowdsourced dense image annotations. arXiv preprint arXiv:1602.07332, 2016.
-
(2016)
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
-
-
Krishna, R.1
Zhu, Y.2
Groth, O.3
Johnson, J.4
Hata, K.5
Kravitz, J.6
Chen, S.7
Kalantidis, Y.8
Li, L.-J.9
Shamma, D.A.10
Bernstein, M.11
Fei-Fei, L.12
-
24
-
-
77953520240
-
Learning to represent spatial transformations with factored higher-order Boltzmann machines
-
Roland Memisevic and Geoffrey E Hinton. Learning to represent spatial transformations with factored higher-order Boltzmann machines. Neural computation, 22(6):1473-1492, 2010.
-
(2010)
Neural Computation
, vol.22
, Issue.6
, pp. 1473-1492
-
-
Memisevic, R.1
Hinton, G.E.2
-
29
-
-
0034202338
-
Separating style and content with bilinear models
-
Joshua B Tenenbaum and William T Freeman. Separating style and content with bilinear models. Neural computation, 12(6):1247-1283, 2000.
-
(2000)
Neural Computation
, vol.12
, Issue.6
, pp. 1247-1283
-
-
Tenenbaum, J.B.1
Freeman, W.T.2
-
30
-
-
84893343292
-
Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
-
Tijmen Tieleman and Geoffrey Hinton. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 2012.
-
(2012)
COURSERA: Neural Networks for Machine Learning
, vol.4
-
-
Tieleman, T.1
Hinton, G.2
-
31
-
-
70049083823
-
Feature hashing for large scale multitask learning
-
Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. Feature hashing for large scale multitask learning. In 26th International Conference on Machine Learning, pp. 1113-1120, 2009.
-
(2009)
26th International Conference on Machine Learning
, pp. 1113-1120
-
-
Weinberger, K.1
Dasgupta, A.2
Langford, J.3
Smola, A.4
Attenberg, J.5
-
32
-
-
85035003427
-
-
arXiv preprint
-
Qi Wu, Damien Teney, Peng Wang, Chunhua Shen, Anthony Dick, and Anton van den Hengel. Visual Question Answering: A Survey of Methods and Datasets. arXiv preprint arXiv:1607.05910, 2016a.
-
(2016)
Visual Question Answering: A Survey of Methods and Datasets
-
-
Wu, Q.1
Teney, D.2
Wang, P.3
Shen, C.4
Dick, A.5
Van Den Hengel, A.6
-
33
-
-
84986320870
-
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources
-
Qi Wu, Peng Wang, Chunhua Shen, Anthony Dick, and Anton van den Hengel. Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources. In IEEE Conference on Computer Vision and Pattern Recognition, 2016b.
-
(2016)
IEEE Conference on Computer Vision and Pattern Recognition
-
-
Wu, Q.1
Wang, P.2
Shen, C.3
Dick, A.4
Van Den Hengel, A.5
-
34
-
-
85030982311
-
-
arXiv preprint
-
Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, and Ruslan Salakhutdinov. On Multiplicative Integration with Recurrent Neural Networks. arXiv preprint arXiv:1606.06630, 2016c.
-
(2016)
On Multiplicative Integration with Recurrent Neural Networks
-
-
Wu, Y.1
Zhang, S.2
Zhang, Y.3
Bengio, Y.4
Salakhutdinov, R.5
-
36
-
-
85035008367
-
Ask, attend and answer: Exploring question-guided spatial attention for visual question answering
-
Huijuan Xu and Kate Saenko. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering. In European Conference on Computer Vision, 2016.
-
(2016)
European Conference on Computer Vision
-
-
Xu, H.1
Saenko, K.2
-
37
-
-
84986334021
-
Stacked attention networks for image question answering
-
Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng, and Alex Smola. Stacked Attention Networks for Image Question Answering. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
-
(2016)
IEEE Conference on Computer Vision and Pattern Recognition
-
-
Yang, Z.1
He, X.2
Gao, J.3
Deng, L.4
Smola, A.5
-
38
-
-
84986301525
-
-
arXiv preprint
-
Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, and Rob Fergus. Simple Baseline for Visual Question Answering. arXiv preprint arXiv:1512.02167, 2015.
-
(2015)
Simple Baseline for Visual Question Answering
-
-
Zhou, B.1
Tian, Y.2
Sukhbaatar, S.3
Szlam, A.4
Fergus, R.5