-
1
-
-
0029679044
-
Reinforcement learning: A survey
-
L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," Journal of Artificial Intelligence Research, Vol. 4, pp. 237-285, 1996. (Pubitemid 126646155)
-
(1996)
Journal of Artificial Intelligence Research
, vol.4
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.L.2
Moore, A.W.3
-
4
-
-
85012688561
-
-
Princeton, NJ: Princeton University Press
-
R. E. Bellman, Dynamic Programming. Princeton, NJ: Princeton University Press, 1957.
-
(1957)
Dynamic Programming
-
-
Bellman, R.E.1
-
5
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, Vol. 3, pp. 9-44, 1988.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
6
-
-
34249833101
-
Q-learning
-
C. Watkins and P. Dayan, "Q-learning," Machine Learning, Vol. 8, no. 3-4, pp. 279-292, 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 279-292
-
-
Watkins, C.1
Dayan, P.2
-
7
-
-
85153940465
-
Generalization in reinforcement learning: Safely approximating the value function
-
G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press
-
J. A. Boyan and A. W. Moore, "Generalization in reinforcement learning: Safely approximating the value function," in Advances in Neural Information Processing Systems 7, G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press, 1995, pp. 369-376.
-
(1995)
Advances in Neural Information Processing Systems
, vol.7
, pp. 369-376
-
-
Boyan, J.A.1
Moore, A.W.2
-
8
-
-
84880694195
-
Stable function approximation in dynamic programming
-
A. Prieditis and S. Russell, Eds. Morgan Kaufmann Publishers, San Francisco, CA
-
G. Gordon, "Stable function approximation in dynamic programming," in Machine Learning: Proceedings of the Twelfth International Conference, A. Prieditis and S. Russell, Eds. Morgan Kaufmann Publishers, San Francisco, CA, 1995, pp. 261-268.
-
(1995)
Machine Learning: Proceedings of the Twelfth International Conference
, pp. 261-268
-
-
Gordon, G.1
-
9
-
-
85151728371
-
Residual algorithms: Reinforcement learning with function approximation
-
A. Prieditis and S. Russell, Eds. Morgan Kaufmann Publishers, San Francisco, CA
-
L. Baird, "Residual algorithms: Reinforcement learning with function approximation," in Machine Learning: Proceedings of the Twelfth International Conference, A. Prieditis and S. Russell, Eds. Morgan Kaufmann Publishers, San Francisco, CA, 1995, pp. 30-37.
-
(1995)
Machine Learning: Proceedings of the Twelfth International Conference
, pp. 30-37
-
-
Baird, L.1
-
11
-
-
85156221438
-
Generalization in reinforcement learning: Successful examples using sparse coarse coding
-
R. S. Sutton, "Generalization in reinforcement learning: Successful examples using sparse coarse coding," in Advances in Neural Information Processing Systems 8, 1996, pp. 1038-1044.
-
(1996)
Advances in Neural Information Processing Systems
, vol.8
, pp. 1038-1044
-
-
Sutton, R.S.1
-
12
-
-
0033901602
-
Convergence results for single-step on-policy reinforcement-learning algorithms
-
DOI 10.1023/A:1007678930559
-
S. Singh, T. Jaakkola, M. L. Littman, and C. Szepesvári, "Convergence results for single-step on-policy reinforcement-learning algorithms," Machine Learning, Vol. 38, no. 3, pp. 287-308, 2000. (Pubitemid 30572449)
-
(2000)
Machine Learning
, vol.38
, Issue.3
, pp. 287-308
-
-
Singh, S.1
Jaakkola, T.2
Littman, M.L.3
Szepesvari, C.4