-
1
-
-
0013141758
-
Reversible jump MCMC simulated annealing for neural networks
-
C. Andrieu, N. de Freitas, and A. Doucet. Reversible jump MCMC simulated annealing for neural networks. In UAI, 2000.
-
(2000)
UAI
-
-
Andrieu, C.1
de Freitas, N.2
Doucet, A.3
-
3
-
-
84904136037
-
Large-scale machine learning with stochastic gradient descent
-
L. Bottou. Large-scale machine learning with stochastic gradient descent. In Proc. COMPSTAT, 2010.
-
(2010)
Proc. COMPSTAT
-
-
Bottou, L.1
-
4
-
-
84867129058
-
Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription
-
N. Boulanger-Lewandowski, Y. Bengio, and P. Vincent. Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription. In ICML, 2012.
-
(2012)
ICML
-
-
Boulanger-Lewandowski, N.1
Bengio, Y.2
Vincent, P.3
-
5
-
-
84965148019
-
Preconditioned spectral descent for deep learning
-
D. E. Carlson, E. Collins, Y.-P. Hsieh, L. Carin, and V. Cevher. Preconditioned spectral descent for deep learning. In Advances in Neural Information Processing Systems, pages 2953–2961, 2015.
-
(2015)
Advances in Neural Information Processing Systems
, pp. 2953-2961
-
-
Carlson, D.E.1
Collins, E.2
Hsieh, Y.-P.3
Carin, L.4
Cevher, V.5
-
6
-
-
84965095225
-
On the convergence of stochastic gradient MCMC algorithms with high-order integrators
-
C. Chen, N. Ding, and L. Carin. On the convergence of stochastic gradient MCMC algorithms with high-order integrators. In NIPS, 2015.
-
(2015)
NIPS
-
-
Chen, C.1
Ding, N.2
Carin, L.3
-
7
-
-
84919787787
-
Stochastic gradient Hamiltonian Monte Carlo
-
T. Chen, E. B. Fox, and C. Guestrin. Stochastic gradient Hamiltonian Monte Carlo. In ICML, 2014.
-
(2014)
ICML
-
-
Chen, T.1
Fox, E.B.2
Guestrin, C.3
-
8
-
-
84961291190
-
-
K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In arXiv:1406.1078, 2014.
-
(2014)
Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation
-
-
Cho, K.1
Van Merriënboer, B.2
Gulcehre, C.3
Bahdanau, D.4
Bougares, F.5
Schwenk, H.6
Bengio, Y.7
-
9
-
-
84965117097
-
Equilibrated adaptive learning rates for non-convex optimization
-
Y. N. Dauphin, H. de Vries, and Y. Bengio. Equilibrated adaptive learning rates for non-convex optimization. In NIPS, 2015.
-
(2015)
NIPS
-
-
Dauphin, Y.N.1
de Vries, H.2
Bengio, Y.3
-
10
-
-
84937959155
-
Bayesian sampling using stochastic gradient thermostats
-
N. Ding, Y. Fang, R. Babbush, C. Chen, R. D. Skeel, and H. Neven. Bayesian sampling using stochastic gradient thermostats. In NIPS, 2014.
-
(2014)
NIPS
-
-
Ding, N.1
Fang, Y.2
Babbush, R.3
Chen, C.4
Skeel, R.D.5
Neven, H.6
-
11
-
-
80052250414
-
Adaptive sub-gradient methods for online learning and stochastic optimization
-
J. Duchi, E. Hazan, and Y. Singer. Adaptive sub-gradient methods for online learning and stochastic optimization. In JMLR, 2011.
-
(2011)
JMLR
-
-
Duchi, J.1
Hazan, E.2
Singer, Y.3
-
12
-
-
84970024465
-
Scalable deep Poisson factor analysis for topic modeling
-
Z. Gan, C. Chen, R. Henao, D. Carlson, and L. Carin. Scalable deep Poisson factor analysis for topic modeling. In ICML, 2015.
-
(2015)
ICML
-
-
Gan, Z.1
Chen, C.2
Henao, R.3
Carlson, D.4
Carin, L.5
-
13
-
-
0021518209
-
Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
-
S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. In PAMI, 1984.
-
(1984)
PAMI
-
-
Geman, S.1
Geman, D.2
-
14
-
-
79952295497
-
Riemann manifold Langevin and Hamiltonian Monte Carlo methods
-
M. Girolami and B. Calderhead. Riemann manifold Langevin and Hamiltonian Monte Carlo methods. In JRSS, 2011.
-
(2011)
JRSS
-
-
Girolami, M.1
Calderhead, B.2
-
15
-
-
84897543523
-
Maxout networks
-
I. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. Maxout networks. In ICML, 2013.
-
(2013)
ICML
-
-
Goodfellow, I.1
Warde-Farley, D.2
Mirza, M.3
Courville, A.4
Bengio, Y.5
-
16
-
-
77953183471
-
What is the best multi-stage architecture for object recognition?
-
K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. What is the best multi-stage architecture for object recognition? In ICCV, 2009.
-
(2009)
ICCV
-
-
Jarrett, K.1
Kavukcuoglu, K.2
Ranzato, M.3
LeCun, Y.4
-
17
-
-
85083951076
-
Adam: A method for stochastic optimization
-
D. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
-
(2015)
ICLR
-
-
Kingma, D.1
Ba, J.2
-
18
-
-
26444479778
-
Optimization by simulated annealing
-
S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi. Optimization by simulated annealing. In Science, 1983.
-
(1983)
Science
-
-
Kirkpatrick, S.1
Gelatt, C.D., Jr.2
Vecchi, M.P.3
-
19
-
-
85007196088
-
Preconditioned stochastic gradient Langevin dynamics for deep neural networks
-
C. Li, C. Chen, D. Carlson, and L. Carin. Preconditioned stochastic gradient Langevin dynamics for deep neural networks. In AAAI, 2016a.
-
(2016)
AAAI
-
-
Li, C.1
Chen, C.2
Carlson, D.3
Carin, L.4
-
20
-
-
85007273869
-
High-order stochastic gradient thermostats for Bayesian learning of deep models
-
C. Li, C. Chen, K. Fan, and L. Carin. High-order stochastic gradient thermostats for Bayesian learning of deep models. In AAAI, 2016b.
-
(2016)
AAAI
-
-
Li, C.1
Chen, C.2
Fan, K.3
Carin, L.4
-
21
-
-
67349202839
-
Hybrid parallel tempering and simulated annealing method
-
Y. Li, V. A. Protopopescu, N. Arnold, X. Zhang, and A. Gorin. Hybrid parallel tempering and simulated annealing method. In Applied Mathematics and Computation, 2009.
-
(2009)
Applied Mathematics and Computation
-
-
Li, Y.1
Protopopescu, V.A.2
Arnold, N.3
Zhang, X.4
Gorin, A.5
-
23
-
-
77950857322
-
Construction of numerical time-average and stationary measures via Poisson equations
-
J. C. Mattingly, A. M. Stuart, and M. V. Tretyakov. Construction of numerical time-average and stationary measures via Poisson equations. In SIAM J. Numer. Anal., 2010.
-
(2010)
SIAM J. Numer. Anal.
-
-
Mattingly, J.C.1
Stuart, A.M.2
Tretyakov, M.V.3
-
26
-
-
85067545570
-
Scaling nonparametric Bayesian inference via subsample-annealing
-
F. Obermeyer, J. Glidden, and E. Jonas. Scaling nonparametric Bayesian inference via subsample-annealing. In AISTATS, 2014.
-
(2014)
AISTATS
-
-
Obermeyer, F.1
Glidden, J.2
Jonas, E.3
-
27
-
-
84898939739
-
Stochastic gradient Riemannian Langevin dynamics on the probability simplex
-
S. Patterson and Y. W. Teh. Stochastic gradient Riemannian Langevin dynamics on the probability simplex. In NIPS, 2013.
-
(2013)
NIPS
-
-
Patterson, S.1
Teh, Y.W.2
-
31
-
-
84897510162
-
On the importance of initialization and momentum in deep learning
-
I. Sutskever, J. Martens, G. Dahl, and G. E. Hinton. On the importance of initialization and momentum in deep learning. In ICML, 2013.
-
(2013)
ICML
-
-
Sutskever, I.1
Martens, J.2
Dahl, G.3
Hinton, G.E.4
-
35
-
-
0021819411
-
Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm
-
V. Černý. Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm. In J. Optimization Theory and Applications, 1985.
-
(1985)
J. Optimization Theory and Applications
-
-
Černý, V.1
-
37
-
-
80053452150
-
Bayesian learning via stochastic gradient Langevin dynamics
-
M. Welling and Y. W. Teh. Bayesian learning via stochastic gradient Langevin dynamics. In ICML, 2011.
-
(2011)
ICML
-
-
Welling, M.1
Teh, Y.W.2
-
38
-
-
85083954484
-
Stochastic pooling for regularization of deep convolutional neural networks
-
M. Zeiler and R. Fergus. Stochastic pooling for regularization of deep convolutional neural networks. In ICLR, 2013.
-
(2013)
ICLR
-
-
Zeiler, M.1
Fergus, R.2