-
1
-
-
0141596576
-
Policy invariance under reward transformations: Theory and application to reward shaping
-
Ng, A.Y., Harada, D., Russell, S.J.: Policy invariance under reward transformations: Theory and application to reward shaping. In: Proceedings of the 16th International Conference on Machine Learning, pp. 278-287 (1999)
-
(1999)
Proceedings of the 16th International Conference on Machine Learning
, pp. 278-287
-
-
Ng, A.Y.1
Harada, D.2
Russell, S.J.3
-
5
-
-
0346942368
-
Decision-theoretic planning: Structural assumptions and computational leverage
-
Boutilier, C., Dean, T., Hanks, S.: Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research 11, 1-94 (1999)
-
(1999)
Journal of Artificial Intelligence Research
, vol.11
, pp. 1-94
-
-
Boutilier, C.1
Dean, T.2
Hanks, S.3
-
6
-
-
0026206780
-
An optimal one-way multigrid algorithm for discrete-time stochastic control
-
Chow, C.S., Tsitsiklis, J.N.: An optimal one-way multigrid algorithm for discrete-time stochastic control. IEEE Transactions on Automatic Control 36(8), 898-914 (1991)
-
(1991)
IEEE Transactions on Automatic Control
, vol.36
, Issue.8
, pp. 898-914
-
-
Chow, C.S.1
Tsitsiklis, J.N.2
-
7
-
-
0042353224
-
Multigrid Q-learning
-
Technical Report CS-94-121, Colorado State University
-
Anderson, C., Crawford-Hines, S.: Multigrid Q-learning. Technical Report CS-94-121, Colorado State University (1994)
-
(1994)
-
-
Anderson, C.1
Crawford-Hines, S.2
-
8
-
-
0033170372
-
Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton, R.S., Precup, D., Singh, S.P.: Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112(1-2), 181-211 (1999)
-
(1999)
Artificial Intelligence
, vol.112
, Issue.1-2
, pp. 181-211
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.P.3
-
10
-
-
34250717446
-
-
Epshteyn, A., DeJong, G.: Qualitative reinforcement learning. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 305-312 (2006)
-
-
-
-
11
-
-
0036832953
-
Variable resolution discretization in optimal control
-
Munos, R., Moore, A.: Variable resolution discretization in optimal control. Machine Learning 49(2-3), 291-323 (2002)
-
(2002)
Machine Learning
, vol.49
, Issue.2-3
, pp. 291-323
-
-
Munos, R.1
Moore, A.2
-
14
-
-
84880688141
-
Multi-value-functions: Efficient automatic action hierarchies for multiple goal MDPs
-
Moore, A., Baird, L., Kaelbling, L.P.: Multi-value-functions: Efficient automatic action hierarchies for multiple goal MDPs. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 1316-1323 (1999)
-
(1999)
Proceedings of the International Joint Conference on Artificial Intelligence
, pp. 1316-1323
-
-
Moore, A.1
Baird, L.2
Kaelbling, L.P.3
-
17
-
-
0002278788
-
Hierarchical reinforcement learning with the MAXQ value function decomposition
-
Dietterich, T.G.: Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research 13, 227-303 (2000)
-
(2000)
Journal of Artificial Intelligence Research
, vol.13
, pp. 227-303
-
-
Dietterich, T.G.1