1. András Antos, Rémi Munos, and Csaba Szepesvári. Fitted Q-iteration in continuous action-space MDPs. In John C. Platt, Daphne Koller, Yoram Singer, and Sam T. Roweis, editors, NIPS. MIT Press, 2007.
4. Steven J. Bradtke and Andrew G. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(1-3):33-57, 1996.
6. Lehel Csató and Manfred Opper. Sparse on-line Gaussian processes. Neural Computation, 14(3):641-669, 2002.
7. Marc P. Deisenroth and Carl E. Rasmussen. PILCO: A model-based and data-efficient approach to policy search. In L. Getoor and T. Scheffer, editors, Proceedings of the 28th International Conference on Machine Learning, Bellevue, WA, USA, June 2011.
8. Marc Peter Deisenroth, Carl Edward Rasmussen, and Jan Peters. Gaussian process dynamic programming. Neurocomputing, 72(7-9):1508-1524, 2009.
9. Mohammad Ghavamzadeh and Yaakov Engel. Bayesian policy gradient algorithms. In B. Schölkopf, J. Platt, and T. Hoffman, editors, NIPS '07: Advances in Neural Information Processing Systems 19, pages 457-464, Cambridge, MA, 2007. MIT Press.
10. G.S. Hornby, S. Takamura, J. Yokono, O. Hanagata, T. Yamamoto, and M. Fujita. Evolving robust gaits with AIBO. In IEEE International Conference on Robotics and Automation (ICRA 2000), pages 3040-3045, 2000.
12. Hunor Jakab and Lehel Csató. Improving Gaussian process value function approximation in policy gradient algorithms. In Timo Honkela, Włodzisław Duch, Mark Girolami, and Samuel Kaski, editors, Artificial Neural Networks and Machine Learning - ICANN 2011, volume 6792 of Lecture Notes in Computer Science, pages 221-228. Springer, 2011.
13. Sham Kakade. A natural policy gradient. In Advances in Neural Information Processing Systems 14, pages 1531-1538, Cambridge, MA, 2002. MIT Press.
16. Radford M. Neal. MCMC using Hamiltonian dynamics. In Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng, editors, Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC Press, 2010.
17. Jan Peters and Stefan Schaal. Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4):682-697, 2008.
22. Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In Sara A. Solla, Todd K. Leen, and Klaus-Robert Müller, editors, NIPS '99: Advances in Neural Information Processing Systems, pages 1057-1063, 1999.
23. Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.