SCOPUS Information Search Platform

5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings

Volume , Issue , 2017, Pages

Hadamard product for low-rank bilinear pooling

(6) Kim, Jin Hwa (a); On, Kyoung Woon (a); Lim, Woosang (b); Kim, Jeonghee (c); Ha, Jung Woo (c); Zhang, Byoung Tak (a)

a Seoul National University (South Korea)

b Korea Advanced Institute of Science and Technology (KAIST) (South Korea)

c NAVER LABS EUROPE (France)

Author keywords

[No Author keywords available]

Indexed keywords

OBJECT RECOGNITION;

ATTENTION MECHANISMS; HADAMARD PRODUCTS; HIGH-DIMENSIONAL; MULTI-MODAL LEARNING; QUESTION ANSWERING; QUESTION ANSWERING TASK; STATE OF THE ART; STATE-OF-THE-ART PERFORMANCE;

VISION;

EID: 85087529518
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (455)

References (38)

1
- 84993660571
- arXiv preprint
- Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein. Learning to Compose Neural Networks for Question Answering. arXiv preprint arXiv:1601.01705, 2016.
- (2016) Learning to Compose Neural Networks for Question Answering
- Andreas, J.¹ Rohrbach, M.² Darrell, T.³ Klein, D.⁴

2
- 84973890960
- VQA: Visual question answering
- Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh. VQA: Visual Question Answering. IEEE International Conference on Computer Vision, 2015.
- (2015) IEEE International Conference on Computer Vision
- Antol, S.¹ Agrawal, A.² Lu, J.³ Mitchell, M.⁴ Batra, D.⁵ Zitnick, C.L.⁶ Parikh, D.⁷

3
- 84869158135
- Finding frequent items in data streams
- Springer
- Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In International Colloquium on Automata, Languages, and Programming, pp. 693-703. Springer, 2002.
- (2002) International Colloquium on Automata, Languages, and Programming , pp. 693-703
- Charikar, M.¹ Chen, K.² Farach-Colton, M.³

4
- 84969930652
- Compressing neural networks with the hashing trick
- Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, and Yixin Chen. Compressing Neural Networks with the Hashing Trick. In 32nd International Conference on Machine Learning, pp. 2285-2294, 2015.
- (2015) 32nd International Conference on Machine Learning , pp. 2285-2294
- Chen, W.¹ Wilson, J.T.² Tyree, S.³ Weinberger, K.Q.⁴ Chen, Y.⁵

5
- 84961291190
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
- Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1724-1734, 2014.
- (2014) 2014 Conference on Empirical Methods in Natural Language Processing , pp. 1724-1734
- Cho, K.¹ Van Merriënboer, B.² Gulcehre, C.³ Bahdanau, D.⁴ Bougares, F.⁵ Schwenk, H.⁶ Bengio, Y.⁷

6
- 84990060711
- arXiv preprint
- Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, and Marcus Rohrbach. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. arXiv preprint arXiv:1606.01847, 2016.
- (2016) Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
- Fukui, A.¹ Park, D.H.² Yang, D.³ Rohrbach, A.⁴ Darrell, T.⁵ Rohrbach, M.⁶

7
- 84994531050
- arXiv preprint
- Yarin Gal. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. arXiv preprint arXiv:1512.05287, 2015.
- (2015) A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
- Gal, Y.¹

8
- 84986266770
- Compact bilinear pooling
- Yang Gao, Oscar Beijbom, Ning Zhang, and Trevor Darrell. Compact Bilinear Pooling. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
- (2016) IEEE Conference on Computer Vision and Pattern Recognition
- Gao, Y.¹ Beijbom, O.² Zhang, N.³ Darrell, T.⁴

9
- 84986274465
- Deep residual learning for image recognition
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
- (2016) IEEE Conference on Computer Vision and Pattern Recognition
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

10
- 0031573117
- Long short-term memory
- Sepp Hochreiter and Jürgen Schmidhuber. Long Short-Term Memory. Neural computation, 9(8):1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

11
- 85018925213
- arXiv preprint
- Ilija Ilievski, Shuicheng Yan, and Jiashi Feng. A Focused Dynamic Attention Model for Visual Question Answering. arXiv preprint arXiv:1604.01485, 2016.
- (2016) A Focused Dynamic Attention Model for Visual Question Answering
- Ilievski, I.¹ Yan, S.² Feng, J.³

12
- 84965096967
- Spatial transformer networks
- Max Jaderberg, Karen Simonyan, Andrew Zisserman, and Koray Kavukcuoglu. Spatial Transformer Networks. In Advances in Neural Information Processing Systems 28, pp. 2008-2016, 2015.
- (2015) Advances in Neural Information Processing Systems , vol.28 , pp. 2008-2016
- Jaderberg, M.¹ Simonyan, K.² Zisserman, A.³ Kavukcuoglu, K.⁴

13
- 85040923687
- arXiv preprint
- Kushal Kafle and Christopher Kanan. Visual Question Answering: Datasets, Algorithms, and Future Challenges. arXiv preprint arXiv:1610.01465, 2016a.
- (2016) Visual Question Answering: Datasets, Algorithms, and Future Challenges
- Kafle, K.¹ Kanan, C.²

14
- 84986300506
- Answer-type prediction for visual question answering
- Kushal Kafle and Christopher Kanan. Answer-Type Prediction for Visual Question Answering. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4976-4984, 2016b.
- (2016) IEEE Conference on Computer Vision and Pattern Recognition , pp. 4976-4984
- Kafle, K.¹ Kanan, C.²

15
- 85018919661
- TrimZero: A Torch recurrent module for efficient natural language processing
- Jin-Hwa Kim, Jeonghee Kim, Jung-Woo Ha, and Byoung-Tak Zhang. TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing. In KIIS Spring Conference, volume 26, pp. 165-166, 2016a.
- (2016) KIIS Spring Conference , vol.26 , pp. 165-166
- Kim, J.-H.¹ Kim, J.² Ha, J.-W.³ Zhang, B.-T.⁴

16
- 85020657410
- arXiv preprint
- Jin-Hwa Kim, Sang-Woo Lee, Dong-Hyun Kwak, Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha, and Byoung-Tak Zhang. Multimodal Residual Learning for Visual QA. arXiv preprint arXiv:1606.01455, 2016b.
- (2016) Multimodal Residual Learning for Visual QA
- Kim, J.-H.¹ Lee, S.-W.² Kwak, D.-H.³ Heo, M.-O.⁴ Kim, J.⁵ Ha, J.-W.⁶ Zhang, B.-T.⁷

17
- 84965153327
- Skip-thought vectors
- Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, and Sanja Fidler. Skip-Thought Vectors. In Advances in Neural Information Processing Systems 28, pp. 3294-3302, 2015.
- (2015) Advances in Neural Information Processing Systems , vol.28 , pp. 3294-3302
- Kiros, R.¹ Zhu, Y.² Salakhutdinov, R.³ Zemel, R.S.⁴ Torralba, A.⁵ Urtasun, R.⁶ Fidler, S.⁷

18
- 84978730111
- arXiv preprint
- Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A Shamma, Michael Bernstein, and Li Fei-Fei. Visual genome: Connecting language and vision using crowdsourced dense image annotations. arXiv preprint arXiv:1602.07332, 2016.
- (2016) Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
- Krishna, R.¹ Zhu, Y.² Groth, O.³ Johnson, J.⁴ Hata, K.⁵ Kravitz, J.⁶ Chen, S.⁷ Kalantidis, Y.⁸ Li, L.-J.⁹ Shamma, D.A.¹⁰ Bernstein, M.¹¹ Fei-Fei, L.¹²

19
- 84991466123
- arXiv preprint
- Nicholas Léonard, Sagar Waghmare, Yang Wang, and Jin-Hwa Kim. rnn: Recurrent Library for Torch. arXiv preprint arXiv:1511.07889, 2015.
- (2015) rnn: Recurrent Library for Torch
- Léonard, N.¹ Waghmare, S.² Wang, Y.³ Kim, J.-H.⁴

20
- 84973863234
- Bilinear CNN Models for Fine-grained Visual Recognition
- Tsung-Yu Lin, Aruni RoyChowdhury, and Subhransu Maji. Bilinear CNN Models for Fine-grained Visual Recognition. In IEEE International Conference on Computer Vision, pp. 1449-1457, 2015.
- (2015) IEEE International Conference on Computer Vision , pp. 1449-1457
- Lin, T.-Y.¹ RoyChowdhury, A.² Maji, S.³

21
- 84990020800
- arXiv preprint
- Jiasen Lu, Jianwei Yang, Dhruv Batra, and Devi Parikh. Hierarchical Question-Image Co-Attention for Visual Question Answering. arXiv preprint arXiv:1606.00061, 2016.
- (2016) Hierarchical Question-Image Co-Attention for Visual Question Answering
- Lu, J.¹ Yang, J.² Batra, D.³ Parikh, D.⁴

22
- 85030255039
- arXiv preprint
- Mateusz Malinowski, Marcus Rohrbach, and Mario Fritz. Ask Your Neurons: A Deep Learning Approach to Visual Question Answering. arXiv preprint arXiv:1605.02697, 2016.
- (2016) Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
- Malinowski, M.¹ Rohrbach, M.² Fritz, M.³

23
- 34948828582
- Unsupervised learning of image transformations
- Roland Memisevic and Geoffrey E Hinton. Unsupervised learning of image transformations. In IEEE Conference on Computer Vision and Pattern Recognition, 2007.
- (2007) IEEE Conference on Computer Vision and Pattern Recognition
- Memisevic, R.¹ Hinton, G.E.²

24
- 77953520240
- Learning to represent spatial transformations with factored higher-order Boltzmann machines
- Roland Memisevic and Geoffrey E Hinton. Learning to represent spatial transformations with factored higher-order Boltzmann machines. Neural computation, 22(6):1473-1492, 2010.
- (2010) Neural Computation , vol.22 , Issue.6 , pp. 1473-1492
- Memisevic, R.¹ Hinton, G.E.²

25
- 85030462424
- arXiv preprint
- Hyeonwoo Noh and Bohyung Han. Training Recurrent Answering Units with Joint Loss Minimization for VQA. arXiv preprint arXiv:1606.03647, 2016.
- (2016) Training Recurrent Answering Units with Joint Loss Minimization for VQA
- Noh, H.¹ Han, B.²

26
- 84986261711
- Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction
- Hyeonwoo Noh, Paul Hongsuck Seo, and Bohyung Han. Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
- (2016) IEEE Conference on Computer Vision and Pattern Recognition
- Noh, H.¹ Seo, P.H.² Han, B.³

27
- 85023199520
- Fast and scalable polynomial kernels via explicit feature maps
- Ninh Pham and Rasmus Pagh. Fast and scalable polynomial kernels via explicit feature maps. In 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 239-247. ACM, 2013.
- (2013) 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , pp. 239-247
- Pham, N.¹ Pagh, R.²

28
- 78149347664
- Bilinear classifiers for visual recognition
- Hamed Pirsiavash, Deva Ramanan, and Charless C. Fowlkes. Bilinear classifiers for visual recognition. In Advances in Neural Information Processing Systems 22, pp. 1482-1490, 2009.
- (2009) Advances in Neural Information Processing Systems , vol.22 , pp. 1482-1490
- Pirsiavash, H.¹ Ramanan, D.² Fowlkes, C.C.³

29
- 0034202338
- Separating style and content with bilinear models
- Joshua B Tenenbaum and William T Freeman. Separating style and content with bilinear models. Neural computation, 12(6):1247-1283, 2000.
- (2000) Neural Computation , vol.12 , Issue.6 , pp. 1247-1283
- Tenenbaum, J.B.¹ Freeman, W.T.²

30
- 84893343292
- Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
- Tijmen Tieleman and Geoffrey Hinton. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 2012.
- (2012) COURSERA: Neural Networks for Machine Learning , vol.4
- Tieleman, T.¹ Hinton, G.²

31
- 70049083823
- Feature hashing for large scale multitask learning
- Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. Feature hashing for large scale multitask learning. In 26th International Conference on Machine Learning, pp. 1113-1120, 2009.
- (2009) 26th International Conference on Machine Learning , pp. 1113-1120
- Weinberger, K.¹ Dasgupta, A.² Langford, J.³ Smola, A.⁴ Attenberg, J.⁵

32
- 85035003427
- arXiv preprint
- Qi Wu, Damien Teney, Peng Wang, Chunhua Shen, Anthony Dick, and Anton van den Hengel. Visual Question Answering: A Survey of Methods and Datasets. arXiv preprint arXiv:1607.05910, 2016a.
- (2016) Visual Question Answering: A Survey of Methods and Datasets
- Wu, Q.¹ Teney, D.² Wang, P.³ Shen, C.⁴ Dick, A.⁵ Van Den Hengel, A.⁶

33
- 84986320870
- Ask me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources
- Qi Wu, Peng Wang, Chunhua Shen, Anthony Dick, and Anton van den Hengel. Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources. In IEEE Conference on Computer Vision and Pattern Recognition, 2016b.
- (2016) IEEE Conference on Computer Vision and Pattern Recognition
- Wu, Q.¹ Wang, P.² Shen, C.³ Dick, A.⁴ Van Den Hengel, A.⁵

34
- 85030982311
- arXiv preprint
- Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, and Ruslan Salakhutdinov. On Multiplicative Integration with Recurrent Neural Networks. arXiv preprint arXiv:1606.06630, 2016c.
- (2016) On Multiplicative Integration with Recurrent Neural Networks
- Wu, Y.¹ Zhang, S.² Zhang, Y.³ Bengio, Y.⁴ Salakhutdinov, R.⁵

35
- 84999008900
- Dynamic memory networks for visual and textual question answering
- Caiming Xiong, Stephen Merity, and Richard Socher. Dynamic Memory Networks for Visual and Textual Question Answering. In 33rd International Conference on Machine Learning, 2016.
- (2016) 33rd International Conference on Machine Learning
- Xiong, C.¹ Merity, S.² Socher, R.³

36
- 85035008367
- Ask, attend and answer: Exploring question-guided spatial attention for visual question answering
- Huijuan Xu and Kate Saenko. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering. In European Conference on Computer Vision, 2016.
- (2016) European Conference on Computer Vision
- Xu, H.¹ Saenko, K.²

37
- 84986334021
- Stacked attention networks for image question answering
- Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng, and Alex Smola. Stacked Attention Networks for Image Question Answering. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
- (2016) IEEE Conference on Computer Vision and Pattern Recognition
- Yang, Z.¹ He, X.² Gao, J.³ Deng, L.⁴ Smola, A.⁵

38
- 84986301525
- arXiv preprint
- Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, and Rob Fergus. Simple Baseline for Visual Question Answering. arXiv preprint arXiv:1512.02167, 2015.
- (2015) Simple Baseline for Visual Question Answering
- Zhou, B.¹ Tian, Y.² Sukhbaatar, S.³ Szlam, A.⁴ Fergus, R.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.