[1] Shun-ichi Amari. Natural gradient works efficiently in learning. Neural Computation, 1998.
[2] Jimmy Ba and Rich Caruana. Do deep nets really need to be deep? In NIPS, 2014.
[3] Amir Beck and Marc Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett., 2003.
[5] John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. JMLR, 2011.
[6] Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward neural networks. In AISTATS, May 2010.
[7] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[8] James Martens and Roger Grosse. Optimizing neural networks with Kronecker-factored approximate curvature. In ICML, June 2015.
[10] Yann LeCun, Léon Bottou, Genevieve B. Orr, and Klaus-Robert Müller. Efficient backprop. In Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science LNCS 1524. Springer Verlag, 1998.
[11] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, pages 2278-2324, 1998.
[12] James Martens. Deep learning via Hessian-free optimization. In ICML, June 2010.
[13] G. Montavon and K.-R. Müller. Deep Boltzmann machines and the centering trick. In K.-R. Müller, G. Montavon, and G. B. Orr, editors, Neural Networks: Tricks of the Trade. Springer, 2013.
[15] Razvan Pascanu and Yoshua Bengio. Revisiting natural gradient for deep networks. In ICLR, 2014.
[16] Daniel Povey, Xiaohui Zhang, and Sanjeev Khudanpur. Parallel training of deep neural networks with natural gradient and parameter averaging. In ICLR Workshop, 2015.
[17] Tapani Raiko, Harri Valpola, and Yann LeCun. Deep learning made easier by linear transformations in perceptrons. In AISTATS, 2012.
[19] Roger B. Grosse and Ruslan Salakhutdinov. Scaling up natural gradient by sparsely factorizing the inverse Fisher matrix. In ICML, June 2015.
[20] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet large scale visual recognition challenge. International Journal of Computer Vision (IJCV), 2015.
[24] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 2014.
[25] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. arXiv, 2014.
[28] Tommi Vatanen, Tapani Raiko, Harri Valpola, and Yann LeCun. Pushing stochastic gradient towards second-order methods - backpropagation learning with transformations in nonlinearities. In ICONIP, 2013.