-
4
-
-
0036832950
-
Technical update: Least-squares temporal difference learning
-
J. Boyan, "Technical update: Least-squares temporal difference learning," Machine Learning, vol. 49, pp. 233-246, 2002.
-
(2002)
Machine Learning
, vol.49
, pp. 233-246
-
-
Boyan, J.1
-
5
-
-
0037288398
-
Least-squares policy evaluation algorithms with linear function approximation
-
A. Nedić and D. P. Bertsekas, "Least-squares policy evaluation algorithms with linear function approximation," Discrete Event Dynamic Systems: Theory and Applications, vol. 13, no. 1-2, pp. 79-110, 2003.
-
(2003)
Discrete Event Dynamic Systems: Theory and Applications
, vol.13
, Issue.1-2
, pp. 79-110
-
-
Nedić, A.1
Bertsekas, D.P.2
-
7
-
-
21844465127
-
Tree-based batch mode reinforcement learning
-
D. Ernst, P. Geurts, and L. Wehenkel, "Tree-based batch mode reinforcement learning," Journal of Machine Learning Research, vol. 6, pp. 503-556, 2005.
-
(2005)
Journal of Machine Learning Research
, vol.6
, pp. 503-556
-
-
Ernst, D.1
Geurts, P.2
Wehenkel, L.3
-
8
-
-
67949109470
-
Convergence results for some temporal difference methods based on least squares
-
H. Yu and D. P. Bertsekas, "Convergence results for some temporal difference methods based on least squares," IEEE Transactions on Automatic Control, vol. 54, no. 7, pp. 1515-1531, 2009.
-
(2009)
IEEE Transactions on Automatic Control
, vol.54
, Issue.7
, pp. 1515-1531
-
-
Yu, H.1
Bertsekas, D.P.2
-
9
-
-
77957782880
-
Online least-squares policy iteration for reinforcement learning control
-
Baltimore, US, 30 June - 2 July, accepted for publication
-
L. Buşoniu, D. Ernst, B. De Schutter, and R. Babuška, "Online least-squares policy iteration for reinforcement learning control," in Proceedings 2010 American Control Conference (ACC-10), Baltimore, US, 30 June - 2 July 2010, accepted for publication.
-
(2010)
Proceedings 2010 American Control Conference (ACC-10)
-
-
Buşoniu, L.1
Ernst, D.2
Schutter, B.D.3
Babuška, R.4
-
10
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
R. S. Sutton, "Learning to predict by the method of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
11
-
-
84899834143
-
Online exploration in least-squares policy iteration
-
Budapest, Hungary, 10-15 May
-
L. Li, M. L. Littman, and C. R. Mansley, "Online exploration in least-squares policy iteration," in Proceedings 8th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-09), vol. 2, Budapest, Hungary, 10-15 May 2009, pp. 733-739.
-
(2009)
Proceedings 8th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-09)
, vol.2
, pp. 733-739
-
-
Li, L.1
Littman, M.L.2
Mansley, C.R.3
-
12
-
-
34548765672
-
Kernelizing LSPE(λ)
-
Honolulu, US, 1-5 April
-
T. Jung and D. Polani, "Kernelizing LSPE(λ)," in Proceedings 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL-07), Honolulu, US, 1-5 April 2007, pp. 338-345.
-
(2007)
Proceedings 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL-07)
, pp. 338-345
-
-
Jung, T.1
Polani, D.2