1. L. Baird. Residual algorithms: Reinforcement learning with function approximation. In ICML'95, 1995.
2. B. Boots and G. J. Gordon. Predictive state temporal difference learning. In NIPS'10, 2010.
3. J. A. Boyan. Technical update: Least-squares temporal difference learning. Machine Learning, 49:233-246, 2002. ISSN 0885-6125.
4. S. J. Bradtke, A. G. Barto, and L. P. Kaelbling. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22:33-57, 1996.
5. A. Geramifard, M. Bowling, and R. S. Sutton. Incremental least-squares temporal difference learning. In AAAI'06, 2006.
7. P. W. Keller, S. Mannor, and D. Precup. Automatic basis function construction for approximate dynamic programming and reinforcement learning. In ICML'06, 2006.
8. Z. J. Kolter and A. Y. Ng. Regularization and feature selection in least-squares temporal difference learning. In ICML'09, 2009.
10. H. R. Maei and R. S. Sutton. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In AGI'10, 2010.
11. S. Mahadevan and B. Liu. Basis construction from power series expansions of value functions. In NIPS'10, 2010.
12. S. Mahadevan, M. Maggioni, and C. Guestrin. Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes. Journal of Machine Learning Research, 8: 2007, 2006.
14. R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
16. R. S. Sutton, C. Szepesvári, and H. R. Maei. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In NIPS'08, 2008.
17. R. S. Sutton, H. R. Maei, D. Precup, S. Bhatnagar, D. Silver, C. Szepesvári, and E. Wiewiora. Fast gradient-descent methods for temporal-difference learning with linear function approximation. In ICML'09, 2009.
18. J.-H. Wu and R. Givan. Feature-discovering approximate value iteration methods. In SARA'05, pages 321-331, 2005.