1. Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467, 2016.
2. Shun-ichi Amari. Natural gradient works efficiently in learning. Neural Computation, 10(2):251-276, 1998.
3. James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio. Theano: A CPU and GPU math compiler in Python. In Proc. 9th Python in Science Conf., pages 1-7, 2010.
5. Richard H. Byrd, S. L. Hansen, Jorge Nocedal, and Yoram Singer. A stochastic quasi-Newton method for large-scale optimization. SIAM Journal on Optimization, 26(2):1008-1031, 2016.
6. Minhyung Cho, Chandra Dhir, and Jaehyung Lee. Hessian-free optimization for learning deep multidimensional recurrent neural networks. In Advances in Neural Information Processing Systems, pages 883-891, 2015.
8. Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Andrew Senior, Paul Tucker, Ke Yang, Quoc V. Le, et al. Large scale distributed deep networks. In Advances in Neural Information Processing Systems, pages 1223-1231, 2012.
9. Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, and Koray Kavukcuoglu. Natural neural networks. In Advances in Neural Information Processing Systems, pages 2071-2079, 2015.
10. John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul):2121-2159, 2011.
15. Tom Heskes. On "natural" learning and pruning in multilayered perceptrons. Neural Computation, 12(4):881-901, 2000.
23. Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.
27. James Martens and Ilya Sutskever. Training deep and recurrent networks with Hessian-free optimization. In Neural Networks: Tricks of the Trade, pages 479-535. Springer, 2012.
33. Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211-252, 2015.
34. Nicol N. Schraudolph. Centering neural network gradient factors. In Genevieve B. Orr and Klaus-Robert Müller, editors, Neural Networks: Tricks of the Trade, volume 1524 of Lecture Notes in Computer Science, pages 207-226. Springer Verlag, Berlin, 1998.
35. Nicol N. Schraudolph. Fast curvature matrix-vector products for second-order gradient descent. Neural Computation, 14(7), 2002.
36. Nicol N. Schraudolph, Jin Yu, Simon Günter, et al. A stochastic quasi-Newton method for online convex optimization. In AISTATS, volume 7, pages 436-443, 2007.
37. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. arXiv preprint arXiv:1409.4842, 2014.
38. Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567, 2015.
39. Oriol Vinyals and Daniel Povey. Krylov subspace descent for deep learning. In AISTATS, pages 1261-1268, 2012.