-
3
-
-
0001133021
-
Generalization in reinforcement learning: Safely approximating the value function
-
G. Tesauro, D. S. Touretzky, and T. K. Leen (eds), MIT Press
-
Boyan J. A. and Moore A. W.: 1995, Generalization in reinforcement learning: Safely approximating the value function, in: G. Tesauro, D. S. Touretzky, and T. K. Leen (eds), Advances in Neural Information Processing Systems Vol. 7, MIT Press.
-
(1995)
Advances in Neural Information Processing Systems
, vol.7
-
-
Boyan, J.A.1
Moore, A.W.2
-
5
-
-
84968515237
-
Scattered data interpolation: Tests of some methods
-
Franke R.: 1982, Scattered data interpolation: Tests of some methods. Mathematics of Computation 38(157), 181-200.
-
(1982)
Mathematics of Computation
, vol.38
, Issue.157
, pp. 181-200
-
-
Franke, R.1
-
6
-
-
0000439891
-
On the convergence of stochastic iterative dynamic programming algorithms
-
Jaakkola T., Jordan M. I., and Singh S. P.: 1994, On the convergence of stochastic iterative dynamic programming algorithms, Neural Computation 6(6), 1185-1201.
-
(1994)
Neural Computation
, vol.6
, Issue.6
, pp. 1185-1201
-
-
Jaakkola, T.1
Jordan, M.I.2
Singh, S.P.3
-
8
-
-
0000123778
-
Self-improving reactive agents based on reinforcement learning, planning and teaching
-
Lin L.-J.: 1992, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine Learning 8, 293-321.
-
(1992)
Machine Learning
, vol.8
, pp. 293-321
-
-
Lin, L.-J.1
-
10
-
-
0026880130
-
Automatic programming of behavior-based robots using reinforcement learning
-
Mahadevan S. and Connell J.: 1992, Automatic programming of behavior-based robots using reinforcement learning, Artificial Intelligence 55, 311-365.
-
(1992)
Artificial Intelligence
, vol.55
, pp. 311-365
-
-
Mahadevan, S.1
Connell, J.2
-
11
-
-
17144419347
-
The NSF workshop on reinforcement learning: Summary and observations
-
in press
-
Mahadevan S. and Kaelbling L. P.: 1996, The NSF workshop on reinforcement learning: Summary and observations, AI Magazine, in press.
-
(1996)
AI Magazine
-
-
Mahadevan, S.1
Kaelbling, L.P.2
-
13
-
-
0039753967
-
Attentional mechanisms as a strategy for generalisation in the Q-learning algorithm
-
F. Fogelman-Soulié and P. Gallinari (eds), EC2 et Cie
-
Ribeiro C. H. C.: 1995, Attentional mechanisms as a strategy for generalisation in the Q-learning algorithm, in: F. Fogelman-Soulié and P. Gallinari (eds), Procs. of the International Conf. on Artificial Neural Networks (ICANN'95), Vol. 1, EC2 et Cie, pp. 455-460.
-
(1995)
Procs. of the International Conf. on Artificial Neural Networks (ICANN'95)
, vol.1
, pp. 455-460
-
-
Ribeiro, C.H.C.1
-
14
-
-
0014432211
-
A two-dimensional interpolation function for irregularly spaced data
-
Shepard D.: 1968, A two-dimensional interpolation function for irregularly spaced data, in: Procs. of the 23rd National Conf. ACM, pp. 517-523.
-
(1968)
Procs. of the 23rd National Conf. ACM
, pp. 517-523
-
-
Shepard, D.1
-
15
-
-
85156221438
-
Generalization in reinforcement learning: Successful examples using sparse coarse coding
-
D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds), MIT Press
-
Sutton R. S.: 1996, Generalization in reinforcement learning: Successful examples using sparse coarse coding, in: D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds), Advances in Neural Information Processing Systems Vol. 8, MIT Press, pp. 1038-1044.
-
(1996)
Advances in Neural Information Processing Systems
, vol.8
, pp. 1038-1044
-
-
Sutton, R.S.1
-
16
-
-
0003629453
-
Generalized Markov decision processes: Dynamic-programming and reinforcement-learning algorithms
-
Brown University, Department of Computer Science, Providence
-
Szepesvári C. and Littman M. L.: 1996, Generalized Markov decision processes: Dynamic-programming and reinforcement-learning algorithms, CS-96-11, Brown University, Department of Computer Science, Providence.
-
(1996)
Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms, CS-96-11
-
-
Szepesvári, C.1
Littman, M.L.2
-
17
-
-
0001046225
-
Practical issues in temporal difference learning
-
Tesauro G.: 1992, Practical issues in temporal difference learning, Machine Learning 8, 257-277.
-
(1992)
Machine Learning
, vol.8
, pp. 257-277
-
-
Tesauro, G.1
-
19
-
-
0029752470
-
Feature-based methods for large scale dynamic programming
-
Tsitsiklis J. N. and Van Roy B.: 1996, Feature-based methods for large scale dynamic programming, Machine Learning 22, 59-94.
-
(1996)
Machine Learning
, vol.22
, pp. 59-94
-
-
Tsitsiklis, J.N.1
Van Roy, B.2