[1] Justin A. Boyan. Technical update: Least-squares temporal difference learning. Machine Learning, 49(2-3):233-246, 2002.
[2] Steven J. Bradtke and Andrew G. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(1-3):33-57, 1996.
[3] T. Croonenborghs, K. Driessens, and M. Bruynooghe. Learning relational options for inductive transfer in relational reinforcement learning. Lecture Notes in Computer Science, 4894:88, 2008.
[4] Thomas G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227-303, 2000.
[7] Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter-Wakefield, and Michael L. Littman. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning. In ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 752-759, New York, NY, USA, 2008. ACM.
[10] Richard S. Sutton, Doina Precup, and Satinder Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112:181-211, 1999.
[11] Richard S. Sutton. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, pages 216-224, 1990.
[12] Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, and Michael Bowling. Dyna-style planning with linear function approximation and prioritized sweeping. In Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, pages 528-536, 2008.
[13] John N. Tsitsiklis and Benjamin Van Roy. An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42:674-690, 1997.