-
1
-
-
85037743325
-
Learning to learn by gradient descent by gradient descent
-
Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, and Nando de Freitas. Learning to learn by gradient descent by gradient descent. CoRR, abs/1606.04474, 2016. URL http://arxiv.org/abs/1606.04474.
-
(2016)
CoRR
-
-
Andrychowicz, M.1
Denil, M.2
Gomez, S.3
Hoffman, M.W.4
Pfau, D.5
Schaul, T.6
De Freitas, N.7
-
3
-
-
34249757641
-
On the search for new learning rules for ANNs
-
Samy Bengio, Yoshua Bengio, and Jocelyn Cloutier. On the search for new learning rules for ANNs. Neural Processing Letters, 2(4):26-30, 1995.
-
(1995)
Neural Processing Letters
, vol.2
, Issue.4
, pp. 26-30
-
-
Bengio, S.1
Bengio, Y.2
Cloutier, J.3
-
4
-
-
84921824478
-
-
Université de Montréal, Département d'informatique et de recherche opérationnelle
-
Yoshua Bengio, Samy Bengio, and Jocelyn Cloutier. Learning a synaptic learning rule. Université de Montréal, Département d'informatique et de recherche opérationnelle, 1990.
-
(1990)
Learning a Synaptic Learning Rule
-
-
Bengio, Y.1
Bengio, S.2
Cloutier, J.3
-
5
-
-
84904548965
-
Deep learning of representations for unsupervised and transfer learning
-
Yoshua Bengio et al. Deep learning of representations for unsupervised and transfer learning. ICML Unsupervised and Transfer Learning, 27:17-36, 2012.
-
(2012)
ICML Unsupervised and Transfer Learning
, vol.27
, pp. 17-36
-
-
Bengio, Y.1
-
6
-
-
85018918773
-
Learning feed-forward one-shot learners
-
Luca Bertinetto, João F. Henriques, Jack Valmadre, Philip H. S. Torr, and Andrea Vedaldi. Learning feed-forward one-shot learners. CoRR, abs/1606.05233, 2016. URL http://arxiv.org/abs/1606.05233.
-
(2016)
CoRR
-
-
Bertinetto, L.1
Henriques, J.F.2
Valmadre, J.3
Torr, P.H.S.4
Vedaldi, A.5
-
8
-
-
85153936556
-
Learning many related tasks at the same time with backpropagation
-
Rich Caruana. Learning many related tasks at the same time with backpropagation. Advances in neural information processing systems, pp. 657-664, 1995.
-
(1995)
Advances in Neural Information Processing Systems
, pp. 657-664
-
-
Caruana, R.1
-
9
-
-
84921940378
-
Learning phrase representations using RNN encoder-decoder for statistical machine translation
-
Kyunghyun Cho, Bart van Merrienboer, Çaglar Gülçehre, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. CoRR, abs/1406.1078, 2014. URL http://arxiv.org/abs/1406.1078.
-
(2014)
CoRR
-
-
Cho, K.1
Van Merrienboer, B.2
Gülçehre, Ç.3
Bougares, F.4
Schwenk, H.5
Bengio, Y.6
-
10
-
-
84904482223
-
DeCAF: A deep convolutional activation feature for generic visual recognition
-
-
Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. CoRR, abs/1310.1531, 2013. URL http://arxiv.org/abs/1310.1531.
-
(2013)
CoRR
-
-
Donahue, J.1
Jia, Y.2
Vinyals, O.3
Hoffman, J.4
Zhang, N.5
Tzeng, E.6
Darrell, T.7
-
11
-
-
80052250414
-
Adaptive subgradient methods for online learning and stochastic optimization
-
John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res., 12:2121-2159, July 2011. ISSN 1532-4435. URL http://dl.acm.org/citation.cfm?id=1953048.2021068.
-
(2011)
J. Mach. Learn. Res.
, vol.12
, pp. 2121-2159
-
-
Duchi, J.1
Hazan, E.2
Singer, Y.3
-
12
-
-
84958589374
-
Deep residual learning for image recognition
-
abs/1512.03385
-
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. CoRR, abs/1512.03385, 2015. URL http://arxiv.org/abs/1512.03385.
-
(2015)
CoRR
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
14
-
-
84958985283
-
Learning to learn using gradient descent
-
Springer
-
Sepp Hochreiter, A. Steven Younger, and Peter R. Conwell. Learning to learn using gradient descent. In Lecture Notes on Comp. Sci. 2130, Proc. Intl. Conf. on Artificial Neural Networks (ICANN-2001), pp. 87-94. Springer, 2001.
-
(2001)
Lecture Notes on Comp. Sci. 2130, Proc. Intl. Conf. on Artificial Neural Networks (ICANN-2001)
, pp. 87-94
-
-
Hochreiter, S.1
Younger, A.S.2
Conwell, P.R.3
-
15
-
-
84946590546
-
Batch normalization: Accelerating deep network training by reducing internal covariate shift
-
Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. CoRR, abs/1502.03167, 2015. URL http://arxiv.org/abs/1502.03167.
-
(2015)
CoRR
-
-
Ioffe, S.1
Szegedy, C.2
-
16
-
-
85083951076
-
Adam: A method for stochastic optimization
-
Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014. URL http://arxiv.org/abs/1412.6980.
-
(2014)
CoRR
-
-
Kingma, D.P.1
Ba, J.2
-
18
-
-
85019169098
-
Building machines that learn and think like people
-
Brenden M. Lake, Tomer D. Ullman, Joshua B. Tenenbaum, and Samuel J. Gershman. Building machines that learn and think like people. CoRR, abs/1604.00289, 2016. URL http://arxiv.org/abs/1604.00289.
-
(2016)
CoRR
-
-
Lake, B.M.1
Ullman, T.D.2
Tenenbaum, J.B.3
Gershman, S.J.4
-
21
-
-
85011070895
-
-
arXiv preprint
-
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499, 2016.
-
(2016)
Wavenet: A Generative Model for Raw Audio
-
-
Van Den Oord, A.1
Dieleman, S.2
Zen, H.3
Simonyan, K.4
Vinyals, O.5
Graves, A.6
Kalchbrenner, N.7
Senior, A.8
Kavukcuoglu, K.9
-
22
-
-
85040946180
-
One-shot learning with memory-augmented neural networks
-
Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, and Timothy P. Lillicrap. One-shot learning with memory-augmented neural networks. CoRR, abs/1605.06065, 2016. URL http://arxiv.org/abs/1605.06065.
-
(2016)
CoRR
-
-
Santoro, A.1
Bartunov, S.2
Botvinick, M.3
Wierstra, D.4
Lillicrap, T.P.5
-
23
-
-
0346377064
-
Learning to control fast-weight memories: An alternative to dynamic recurrent networks
-
Jürgen Schmidhuber. Learning to control fast-weight memories: An alternative to dynamic recurrent networks. Neural Computation, 4(1):131-139, 1992.
-
(1992)
Neural Computation
, vol.4
, Issue.1
, pp. 131-139
-
-
Schmidhuber, J.1
-
25
-
-
0031186687
-
Shifting inductive bias with success-story algorithm, adaptive levin search, and incremental self-improvement
-
Jürgen Schmidhuber, Jieyu Zhao, and Marco Wiering. Shifting inductive bias with success-story algorithm, adaptive levin search, and incremental self-improvement. Machine Learning, 28(1):105-130, 1997.
-
(1997)
Machine Learning
, vol.28
, Issue.1
, pp. 105-130
-
-
Schmidhuber, J.1
Zhao, J.2
Wiering, M.3
-
26
-
-
0010687621
-
Lifelong learning algorithms
-
Springer
-
Sebastian Thrun. Lifelong learning algorithms. In Learning to learn, pp. 181-209. Springer, 1998.
-
(1998)
Learning to Learn
, pp. 181-209
-
-
Thrun, S.1
-
27
-
-
85030218957
-
Matching networks for one shot learning
-
Oriol Vinyals, Charles Blundell, Timothy P. Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. Matching networks for one shot learning. CoRR, abs/1606.04080, 2016. URL http://arxiv.org/abs/1606.04080.
-
(2016)
CoRR
-
-
Vinyals, O.1
Blundell, C.2
Lillicrap, T.P.3
Kavukcuoglu, K.4
Wierstra, D.5
-
28
-
-
85018271332
-
-
arXiv preprint
-
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
-
(2016)
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
-
-
Wu, Y.1
Schuster, M.2
Chen, Z.3
Le, Q.V.4
Norouzi, M.5
Macherey, W.6
Krikun, M.7
Cao, Y.8
Gao, Q.9
Macherey, K.10
-
29
-
-
84952032150
-
How transferable are features in deep neural networks?
-
Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. How transferable are features in deep neural networks? CoRR, abs/1411.1792, 2014. URL http://arxiv.org/abs/1411.1792.
-
(2014)
CoRR
-
-
Yosinski, J.1
Clune, J.2
Bengio, Y.3
Lipson, H.4
-
31
-
-
84905272120
-
Adadelta: An adaptive learning rate method
-
Matthew D. Zeiler. ADADELTA: an adaptive learning rate method. CoRR, abs/1212.5701, 2012. URL http://arxiv.org/abs/1212.5701.
-
(2012)
CoRR
-
-
Zeiler, M.D.1