-
1
-
-
0000396062
-
Natural gradient works efficiently in learning
-
Amari, S.-I. Natural gradient works efficiently in learning. Neural Computation, 10(2):251–276, 1998.
-
(1998)
Neural Computation
, vol.10
, Issue.2
, pp. 251-276
-
-
Amari, S.-I.1
-
2
-
-
84965180108
-
Rectified factor networks
-
Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R. (eds), Curran Associates, Inc
-
Clevert, D.-A., Unterthiner, T., Mayr, A., and Hochreiter, S. Rectified factor networks. In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R. (eds.), Advances in Neural Information Processing Systems 28. Curran Associates, Inc., 2015.
-
(2015)
Advances in Neural Information Processing Systems
, vol.28
-
-
Clevert, D.-A.1
Unterthiner, T.2
Mayr, A.3
Hochreiter, S.4
-
3
-
-
84965130201
-
Natural neural networks
-
Desjardins, G., Simonyan, K., Pascanu, R., and Kavukcuoglu, K. Natural neural networks. CoRR, abs/1507.00210, 2015. URL http://arxiv.org/abs/1507.00210.
-
(2015)
CoRR
-
-
Desjardins, G.1
Simonyan, K.2
Pascanu, R.3
Kavukcuoglu, K.4
-
4
-
-
84862294866
-
Deep sparse rectifier neural networks
-
Gordon, G., Dunson, D., and Dudík, M. (eds)
-
Glorot, X., Bordes, A., and Bengio, Y. Deep sparse rectifier neural networks. In Gordon, G., Dunson, D., and Dudík, M. (eds.), JMLR W&CP: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011), volume 15, pp. 315–323, 2011.
-
(2011)
JMLR W&CP: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011)
, vol.15
, pp. 315-323
-
-
Glorot, X.1
Bordes, A.2
Bengio, Y.3
-
5
-
-
84892421248
-
-
arXiv e-prints
-
Goodfellow, I. J., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. Maxout networks. ArXiv e-prints, 2013.
-
(2013)
Maxout Networks
-
-
Goodfellow, I.J.1
Warde-Farley, D.2
Mirza, M.3
Courville, A.4
Bengio, Y.5
-
6
-
-
84978059147
-
Fractional max-pooling
-
Graham, Benjamin. Fractional max-pooling. CoRR, abs/1412.6071, 2014. URL http://arxiv.org/abs/1412.6071.
-
(2014)
CoRR
-
-
Graham, B.1
-
7
-
-
84994894307
-
Scaling up natural gradient by sparsely factorizing the inverse Fisher matrix
-
Proceedings of the 32nd International Conference on Machine Learning (ICML15)
-
Grosse, R. and Salakhutdinov, R. Scaling up natural gradient by sparsely factorizing the inverse Fisher matrix. Journal of Machine Learning Research, 37:2304–2313, 2015. URL http://jmlr.org/proceedings/papers/v37/grosse15.pdf. Proceedings of the 32nd International Conference on Machine Learning (ICML15).
-
(2015)
Journal of Machine Learning Research
, vol.37
, pp. 2304-2313
-
-
Grosse, R.1
Salakhutdinov, R.2
-
8
-
-
84973911419
-
Delving deep into rectifiers: Surpassing human-level performance on imagenet classification
-
He, K., Zhang, X., Ren, S., and Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In IEEE International Conference on Computer Vision (ICCV), 2015.
-
(2015)
IEEE International Conference on Computer Vision (ICCV)
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
10
-
-
0033114102
-
Feature extraction through LOCOCODE
-
Hochreiter, S. and Schmidhuber, J. Feature extraction through LOCOCODE. Neural Computation, 11(3):679–714, 1999.
-
(1999)
Neural Computation
, vol.11
, Issue.3
, pp. 679-714
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
11
-
-
0041914606
-
Gradient flow in recurrent nets: The difficulty of learning long-term dependencies
-
Kremer and Kolen (eds), IEEE Press
-
Hochreiter, S., Bengio, Y., Frasconi, P., and Schmidhuber, J. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In Kremer and Kolen (eds.), A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press, 2001.
-
(2001)
A Field Guide to Dynamical Recurrent Neural Networks
-
-
Hochreiter, S.1
Bengio, Y.2
Frasconi, P.3
Schmidhuber, J.4
-
12
-
-
84969584486
-
Batch normalization: Accelerating deep network training by reducing internal covariate shift
-
Proceedings of the 32nd International Conference on Machine Learning (ICML15)
-
Ioffe, S. and Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. Journal of Machine Learning Research, 37:448–456, 2015. URL http://jmlr.org/proceedings/papers/v37/ioffe15.pdf. Proceedings of the 32nd International Conference on Machine Learning (ICML15).
-
(2015)
Journal of Machine Learning Research
, vol.37
, pp. 448-456
-
-
Ioffe, S.1
Szegedy, C.2
-
13
-
-
85021667706
-
-
PhD thesis, EECS Department, University of California, Berkeley, May
-
Jia, Yangqing. Learning Semantic Image Representations at a Large Scale. PhD thesis, EECS Department, University of California, Berkeley, May 2014. URL http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-93.html.
-
(2014)
Learning Semantic Image Representations at a Large Scale
-
-
Jia, Y.1
-
14
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
Pereira, F., Burges, C. J. C., Bottou, L., and Weinberger, K. Q. (eds), Curran Associates, Inc
-
Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Pereira, F., Burges, C. J. C., Bottou, L., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, Inc., 2012.
-
(2012)
Advances in Neural Information Processing Systems
, vol.25
, pp. 1097-1105
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
16
-
-
0000044667
-
Eigenvalues of covariance matrices: Application to neural-network learning
-
LeCun, Y., Kanter, I., and Solla, S. A. Eigenvalues of covariance matrices: Application to neural-network learning. Physical Review Letters, 66(18):2396–2399, 1991.
-
(1991)
Physical Review Letters
, vol.66
, Issue.18
, pp. 2396-2399
-
-
LeCun, Y.1
Kanter, I.2
Solla, S.A.3
-
17
-
-
84872543023
-
Efficient backprop
-
Orr, G. B. and Müller, K.-R. (eds), Springer
-
LeCun, Y., Bottou, L., Orr, G. B., and Müller, K.-R. Efficient backprop. In Orr, G. B. and Müller, K.-R. (eds.), Neural Networks: Tricks of the Trade, volume 1524 of Lecture Notes in Computer Science, pp. 9–50. Springer, 1998.
-
(1998)
Neural Networks: Tricks of the Trade, Volume 1524 of Lecture Notes in Computer Science
, pp. 9-50
-
-
LeCun, Y.1
Bottou, L.2
Orr, G.B.3
Müller, K.-R.4
-
18
-
-
85009928594
-
Deeply-supervised nets
-
Lee, Chen-Yu, Xie, Saining, Gallagher, Patrick W., Zhang, Zhengyou, and Tu, Zhuowen. Deeply-supervised nets. In AISTATS, 2015.
-
(2015)
AISTATS
-
-
Lee, C.-Y.1
Xie, S.2
Gallagher, P.W.3
Zhang, Z.4
Tu, Z.5
-
19
-
-
85162000799
-
Topmoumoute online natural gradient algorithm
-
Platt, J. C., Koller, D., Singer, Y., and Roweis, S. T. (eds)
-
LeRoux, N., Manzagol, P.-A., and Bengio, Y. Topmoumoute online natural gradient algorithm. In Platt, J. C., Koller, D., Singer, Y., and Roweis, S. T. (eds.), Advances in Neural Information Processing Systems 20 (NIPS), pp. 849–856, 2008.
-
(2008)
Advances in Neural Information Processing Systems 20 (NIPS)
, pp. 849-856
-
-
LeRoux, N.1
Manzagol, P.-A.2
Bengio, Y.3
-
20
-
-
84908678178
-
Network in network
-
Lin, Min, Chen, Qiang, and Yan, Shuicheng. Network in network. CoRR, abs/1312.4400, 2013. URL http://arxiv.org/abs/1312.4400.
-
(2013)
CoRR
-
-
Lin, M.1
Chen, Q.2
Yan, S.3
-
22
-
-
77956541496
-
Deep learning via Hessian-free optimization
-
Fürnkranz, J. and Joachims, T. (eds)
-
Martens, J. Deep learning via Hessian-free optimization. In Fürnkranz, J. and Joachims, T. (eds.), Proceedings of the 27th International Conference on Machine Learning (ICML10), pp. 735–742, 2010.
-
(2010)
Proceedings of the 27th International Conference on Machine Learning (ICML10)
, pp. 735-742
-
-
Martens, J.1
-
23
-
-
84987943069
-
DeepTox: Toxicity prediction using deep learning
-
Mayr, A., Klambauer, G., Unterthiner, T., and Hochreiter, S. DeepTox: Toxicity prediction using deep learning. Front. Environ. Sci., 3(80), 2015. doi: 10.3389/fenvs.2015.00080. URL http://journal.frontiersin.org/article/10.3389/fenvs.2015.00080.
-
(2015)
Front. Environ. Sci.
, vol.3
, Issue.80
-
-
Mayr, A.1
Klambauer, G.2
Unterthiner, T.3
Hochreiter, S.4
-
24
-
-
77956509090
-
Rectified linear units improve restricted Boltzmann machines
-
Fürnkranz, J. and Joachims, T. (eds)
-
Nair, V. and Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. In Fürnkranz, J. and Joachims, T. (eds.), Proceedings of the 27th International Conference on Machine Learning (ICML10), pp. 807–814, 2010.
-
(2010)
Proceedings of the 27th International Conference on Machine Learning (ICML10)
, pp. 807-814
-
-
Nair, V.1
Hinton, G.E.2
-
25
-
-
85070998699
-
Riemannian metrics for neural networks I: Feedforward networks
-
Ollivier, Y. Riemannian metrics for neural networks I: Feedforward networks. CoRR, abs/1303.0818, 2013. URL http://arxiv.org/abs/1303.0818.
-
(2013)
CoRR
-
-
Ollivier, Y.1
-
27
-
-
84893409634
-
Deep learning made easier by linear transformations in perceptrons
-
Lawrence, N. D. and Girolami, M. A. (eds)
-
Raiko, T., Valpola, H., and LeCun, Y. Deep learning made easier by linear transformations in perceptrons. In Lawrence, N. D. and Girolami, M. A. (eds.), Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS12), volume 22, pp. 924–932, 2012.
-
(2012)
Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS12)
, vol.22
, pp. 924-932
-
-
Raiko, T.1
Valpola, H.2
LeCun, Y.3
-
28
-
-
0038231917
-
Centering neural network gradient factors
-
Orr, G. B. and Müller, K.-R. (eds), Springer
-
Schraudolph, N. N. Centering neural network gradient factors. In Orr, G. B. and Müller, K.-R. (eds.), Neural Networks: Tricks of the Trade, volume 1524 of Lecture Notes in Computer Science, pp. 207–226. Springer, 1998.
-
(1998)
Neural Networks: Tricks of the Trade, Volume 1524 of Lecture Notes in Computer Science
, pp. 207-226
-
-
Schraudolph, N.N.1
-
29
-
-
0033561855
-
A fast, compact approximation of the exponential function
-
Schraudolph, N. N. A fast, compact approximation of the exponential function. Neural Computation, 11:853–862, 1999.
-
(1999)
Neural Computation
, vol.11
, pp. 853-862
-
-
Schraudolph, N.N.1
-
30
-
-
84962006941
-
Striving for simplicity: The all convolutional net
-
Springenberg, Jost Tobias, Dosovitskiy, Alexey, Brox, Thomas, and Riedmiller, Martin A. Striving for simplicity: The all convolutional net. CoRR, abs/1412.6806, 2014. URL http://arxiv.org/abs/1412.6806.
-
(2014)
CoRR
-
-
Springenberg, J.T.1
Dosovitskiy, A.2
Brox, T.3
Riedmiller, M.A.4
-
31
-
-
84973388607
-
Training very deep networks
-
Srivastava, Rupesh Kumar, Greff, Klaus, and Schmidhuber, Jürgen. Training very deep networks. CoRR, abs/1507.06228, 2015. URL http://arxiv.org/abs/1507.06228.
-
(2015)
CoRR
-
-
Srivastava, R.K.1
Greff, K.2
Schmidhuber, J.3
-
32
-
-
85070976738
-
Toxicity prediction using deep learning
-
Unterthiner, T., Mayr, A., Klambauer, G., and Hochreiter, S. Toxicity prediction using deep learning. CoRR, abs/1503.01445, 2015. URL http://arxiv.org/abs/1503.01445.
-
(2015)
CoRR
-
-
Unterthiner, T.1
Mayr, A.2
Klambauer, G.3
Hochreiter, S.4
-
33
-
-
84867614640
-
Krylov subspace descent for deep learning
-
Vinyals, O. and Povey, D. Krylov subspace descent for deep learning. In AISTATS, 2012. URL http://arxiv.org/pdf/1111.4259v1. arXiv:1111.4259.
-
(2012)
AISTATS
-
-
Vinyals, O.1
Povey, D.2
-
34
-
-
85013858782
-
Empirical evaluation of rectified activations in convolutional network
-
Xu, B., Wang, N., Chen, T., and Li, M. Empirical evaluation of rectified activations in convolutional network. CoRR, abs/1505.00853, 2015. URL http://arxiv.org/abs/1505.00853.
-
(2015)
CoRR
-
-
Xu, B.1
Wang, N.2
Chen, T.3
Li, M.4
-
35
-
-
0032533046
-
Complexity issues in natural gradient descent method for training multilayer perceptrons
-
Yang, H. H. and Amari, S.-I. Complexity issues in natural gradient descent method for training multilayer perceptrons. Neural Computation, 10(8), 1998.
-
(1998)
Neural Computation
, vol.10
, Issue.8
-
-
Yang, H.H.1
Amari, S.-I.2