1. Baird, L. (1995). Residual algorithms: Reinforcement learning with function approximation. Twelfth International Conference on Machine Learning (pp. 30-37). San Francisco: Morgan Kaufmann Publishers.
2. Boyan, J. (2002). Technical update: Least-squares temporal difference learning. Machine Learning, 49, 233-246.
3. Bradtke, S. J., & Barto, A. G. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22, 33-57.
4. Choi, D., & Van Roy, B. (2006). A generalized Kalman filter for fixed point approximation and efficient temporal difference learning. Discrete Event Dynamic Systems, 16, 207-239.
5. Dietterich, T. G. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
10. Munos, R., & Moore, A. (2002). Variable resolution discretization in optimal control. Machine Learning, 49, 291-323.
11. Perkins, T. J., & Precup, D. (2002). A convergent form of approximate policy iteration. Neural Information Processing Systems (pp. 1595-1602). Vancouver, British Columbia, Canada: MIT Press.
12. Potts, D., & Sammut, C. (2005). Incremental learning of linear model trees. Machine Learning, 61, 5-48.
16. Tsitsiklis, J. N., & Van Roy, B. (1997). An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42, 674-690.
18. Xu, X., He, H., & Hu, D. (2002). Efficient reinforcement learning using recursive least squares methods. Journal of Artificial Intelligence Research, 16, 259-292.