-
1
-
-
85019172761
-
Learning to learn by gradient descent by gradient descent
-
Andrychowicz, Marcin, Denil, Misha, Gomez, Sergio, Hoffman, Matthew W, Pfau, David, Schaul, Tom, Shillingford, Brendan, and de Freitas, Nando. Learning to learn by gradient descent by gradient descent. In Advances in Neural Information Processing Systems, 2016.
-
(2016)
Advances in Neural Information Processing Systems
-
-
Andrychowicz, M.1
Denil, M.2
Gomez, S.3
Hoffman, M.W.4
Pfau, D.5
Schaul, T.6
Shillingford, B.7
De Freitas, N.8
-
2
-
-
85083953689
-
Neural machine translation by jointly learning to align and translate
-
Bahdanau, Dzmitry, Cho, Kyunghyun, and Bengio, Yoshua. Neural machine translation by jointly learning to align and translate. In ICLR, 2015.
-
(2015)
ICLR
-
-
Bahdanau, D.1
Cho, K.2
Bengio, Y.3
-
3
-
-
34249757641
-
On the search for new learning rules for ANNs
-
Bengio, S., Bengio, Y., and Cloutier, J. On the search for new learning rules for ANNs. Neural Processing Letters, 2(4): 26-30, 1995.
-
(1995)
Neural Processing Letters
, vol.2
, Issue.4
, pp. 26-30
-
-
Bengio, S.1
Bengio, Y.2
Cloutier, J.3
-
4
-
-
84921824478
-
-
Université de Montréal, Département d'informatique et de recherche opérationnelle
-
Bengio, Yoshua, Bengio, Samy, and Cloutier, Jocelyn. Learning a synaptic learning rule. Université de Montréal, Département d'informatique et de recherche opérationnelle, 1990.
-
(1990)
Learning a Synaptic Learning Rule
-
-
Bengio, Y.1
Bengio, S.2
Cloutier, J.3
-
5
-
-
85047008902
-
On the optimization of a synaptic learning rule
-
Bengio, Yoshua, Bengio, Samy, Cloutier, Jocelyn, and Gecsei, Jan. On the optimization of a synaptic learning rule. In Conference on Optimality in Biological and Artificial Networks, 1992.
-
(1992)
Conference on Optimality in Biological and Artificial Networks
-
-
Bengio, Y.1
Bengio, S.2
Cloutier, J.3
Gecsei, J.4
-
6
-
-
85047008315
-
-
Chen, Yutian, Hoffman, Matthew W., Colmenarejo, Sergio Gomez, Denil, Misha, Lillicrap, Timothy P., and de Freitas, Nando. Learning to learn for global optimization of black box functions. arXiv Report 1611.03824, 2016.
-
(2016)
Learning to Learn for Global Optimization of Black Box Functions
-
-
Chen, Y.1
Hoffman, M.W.2
Colmenarejo, S.G.3
Denil, M.4
Lillicrap, T.P.5
De Freitas, N.6
-
7
-
-
84943799837
-
-
arXiv preprint
-
Cho, Kyunghyun, Van Merriënboer, Bart, Bahdanau, Dzmitry, and Bengio, Yoshua. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259, 2014.
-
(2014)
On the Properties of Neural Machine Translation: Encoder-decoder Approaches
-
-
Cho, K.1
Van Merriënboer, B.2
Bahdanau, D.3
Bengio, Y.4
-
10
-
-
0002833950
-
The formation of learning sets
-
Harlow, Harry F. The formation of learning sets. Psychological Review, 56(1): 51, 1949.
-
(1949)
Psychological Review
, vol.56
, Issue.1
, pp. 51
-
-
Harlow, H.F.1
-
11
-
-
84990050094
-
Identity mappings in deep residual networks
-
Springer
-
He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Identity mappings in deep residual networks. In European Conference on Computer Vision, pp. 630-645. Springer, 2016.
-
(2016)
European Conference on Computer Vision
, pp. 630-645
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
12
-
-
84958985283
-
Learning to learn using gradient descent
-
Springer
-
Hochreiter, Sepp, Younger, A Steven, and Conwell, Peter R. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks, pp. 87-94. Springer, 2001.
-
(2001)
International Conference on Artificial Neural Networks
, pp. 87-94
-
-
Hochreiter, S.1
Younger, A.S.2
Conwell, P.R.3
-
13
-
-
0024099853
-
A layered network model of associative learning: Learning to learn and configuration
-
Kehoe, E James. A layered network model of associative learning: Learning to learn and configuration. Psychological Review, 95(4): 411, 1988.
-
(1988)
Psychological Review
, vol.95
, Issue.4
, pp. 411
-
-
Kehoe, E.J.1
-
14
-
-
85083951076
-
Adam: A method for stochastic optimization
-
Kingma, Diederik and Ba, Jimmy. Adam: A method for stochastic optimization. In ICLR, 2015.
-
(2015)
ICLR
-
-
Kingma, D.1
Ba, J.2
-
15
-
-
84876231242
-
Imagenet classification with deep convolutional neural networks
-
Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097-1105, 2012.
-
(2012)
Advances in Neural Information Processing Systems
, pp. 1097-1105
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
16
-
-
84989320630
-
-
Lake, Brenden M, Ullman, Tomer D, Tenenbaum, Joshua B, and Gershman, Samuel J. Building machines that learn and think like people. arXiv Report 1604.00289, 2016.
-
(2016)
Building Machines that Learn and Think Like People
-
-
Lake, B.M.1
Ullman, T.D.2
Tenenbaum, J.B.3
Gershman, S.J.4
-
19
-
-
34548480020
-
A method of solving a convex programming problem with convergence rate O(1/k^2)
-
Nesterov, Yurii. A method of solving a convex programming problem with convergence rate O(1/k^2). In Soviet Mathematics Doklady, Volume 27, pp. 372-376, 1983a.
-
(1983)
Soviet Mathematics Doklady
, vol.27
, pp. 372-376
-
-
Nesterov, Y.1
-
20
-
-
34548480020
-
A method of solving a convex programming problem with convergence rate O(1/k^2)
-
Nesterov, Yurii. A method of solving a convex programming problem with convergence rate O(1/k^2). In Soviet Mathematics Doklady, Volume 27, pp. 372-376, 1983b.
-
(1983)
Soviet Mathematics Doklady
, vol.27
, pp. 372-376
-
-
Nesterov, Y.1
-
23
-
-
84998717754
-
Meta-learning with memory-augmented neural networks
-
Santoro, Adam, Bartunov, Sergey, Botvinick, Matthew, Wierstra, Daan, and Lillicrap, Timothy. Meta-learning with memory-augmented neural networks. In International Conference on Machine Learning, 2016.
-
(2016)
International Conference on Machine Learning
-
-
Santoro, A.1
Bartunov, S.2
Botvinick, M.3
Wierstra, D.4
Lillicrap, T.5
-
25
-
-
84963949906
-
Mastering the game of Go with deep neural networks and tree search
-
Silver, David, Huang, Aja, Maddison, Chris J, Guez, Arthur, Sifre, Laurent, Van Den Driessche, George, Schrittwieser, Julian, Antonoglou, Ioannis, Panneershelvam, Veda, Lanctot, Marc, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587): 484-489, 2016.
-
(2016)
Nature
, vol.529
, Issue.7587
, pp. 484-489
-
-
Silver, D.1
Huang, A.2
Maddison, C.J.3
Guez, A.4
Sifre, L.5
Van Den Driessche, G.6
Schrittwieser, J.7
Antonoglou, I.8
Panneershelvam, V.9
Lanctot, M.10
-
27
-
-
0026971570
-
Adapting bias by gradient descent: An incremental version of delta-bar-delta
-
Sutton, Richard S. Adapting bias by gradient descent: An incremental version of delta-bar-delta. In Association for the Advancement of Artificial Intelligence, pp. 171-176, 1992.
-
(1992)
Association for the Advancement of Artificial Intelligence
, pp. 171-176
-
-
Sutton, R.S.1
-
28
-
-
84986296808
-
Rethinking the inception architecture for computer vision
-
Szegedy, Christian, Vanhoucke, Vincent, Ioffe, Sergey, Shlens, Jon, and Wojna, Zbigniew. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818-2826, 2016.
-
(2016)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, pp. 2818-2826
-
-
Szegedy, C.1
Vanhoucke, V.2
Ioffe, S.3
Shlens, J.4
Wojna, Z.5
-
29
-
-
0003901612
-
-
Springer Science and Business Media
-
Thrun, Sebastian and Pratt, Lorien. Learning to learn. Springer Science and Business Media, 1998.
-
(1998)
Learning to Learn
-
-
Thrun, S.1
Pratt, L.2
-
30
-
-
84893343292
-
Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
-
Tieleman, Tijmen and Hinton, Geoffrey. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4: 2, 2012.
-
(2012)
COURSERA: Neural Networks for Machine Learning
, vol.4
, pp. 2
-
-
Tieleman, T.1
Hinton, G.2
-
31
-
-
0032222083
-
An incremental gradient (-projection) method with momentum term and adaptive stepsize rule
-
Tseng, Paul. An incremental gradient (-projection) method with momentum term and adaptive stepsize rule. SIAM Journal on Optimization, 8(2): 506-531, 1998.
-
(1998)
SIAM Journal on Optimization
, vol.8
, Issue.2
, pp. 506-531
-
-
Tseng, P.1
-
32
-
-
85028474927
-
-
Wang, Jane X., Kurth-Nelson, Zeb, Tirumala, Dhruva, Soyer, Hubert, Leibo, Joel Z., Munos, Rémi, Blundell, Charles, Kumaran, Dharshan, and Botvinick, Matt. Learning to reinforcement learn. arXiv Report 1611.05763, 2016.
-
(2016)
Learning to Reinforcement Learn
-
-
Wang, J.X.1
Kurth-Nelson, Z.2
Tirumala, D.3
Soyer, H.4
Leibo, J.Z.5
Munos, R.6
Blundell, C.7
Kumaran, D.8
Botvinick, M.9
-
33
-
-
0347559154
-
Reminiscence and rote learning
-
Ward, Lewis B. Reminiscence and rote learning. Psychological Monographs, 49(4):i, 1937.
-
(1937)
Psychological Monographs
, vol.49
, Issue.4
, pp. i
-
-
Ward, L.B.1