-
5
-
-
0020970738
-
Neuronlike elements that can solve difficult learning control problems
-
reprinted in: J.A. Anderson and E. Rosenfeld, Neurocomputing: Foundations of Research, MIT Press, Cambridge, MA, 1988
-
(1983)
IEEE Trans. Syst. Man Cybern.
, vol.13
, pp. 835-846
-
-
Barto1
Sutton2
Anderson3
-
9
-
-
84968468700
-
Polynomial approximation—a new computational technique in dynamic programming: allocation processes
-
(1963)
Math. Comp.
, vol.17
, pp. 155-161
-
-
Bellman1
Kalaba2
Kotkin3
-
15
-
-
0002227762
-
Penguins can make cake
-
(1989)
AI Mag.
, vol.10
, pp. 45-50
-
-
Chapman1
-
18
-
-
0041541978
-
A theoretical comparison of the efficiencies of two classical methods and a Monte Carlo method for computing one component of the solution of a set of linear algebraic equations
-
H.A. Meyer, Wiley, New York
-
(1954)
Symposium on Monte Carlo Methods
, pp. 191-233
-
-
Curtiss1
-
22
-
-
84916483603
-
Reinforcing connectionism: learning the statistical way
-
University of Edinburgh, Edinburgh, Scotland
-
(1991)
Ph.D. Thesis
-
-
Dayan1
-
23
-
-
0000430514
-
The convergence of TD(λ) for general λ
-
(1992)
Mach. Learn.
, vol.8
, pp. 341-362
-
-
Dayan1
-
25
-
-
0000104548
-
Contraction mappings in the theory underlying dynamic programming
-
(1967)
SIAM Review
, vol.9
, pp. 165-177
-
-
Denardo1
-
28
-
-
0024885107
-
Universal planning: an (almost) universally bad idea
-
(1989)
AI Mag.
, vol.10
, pp. 40-44
-
-
Ginsberg1
-
38
-
-
0003900353
-
Brain function and adaptive systems—a heterostatic theory
-
Air Force Cambridge Research Laboratories, Bedford, MA
-
(1972)
Tech. Report AFCRL-72-0164
-
-
Klopf1
-
48
-
-
85151437138
-
Programming robots using reinforcement learning and teaching
-
Anaheim, CA
-
(1991)
Proceedings AAAI-91
, pp. 781-786
-
-
Lin1
-
51
-
-
0000123778
-
Self-improving reactive agents based on reinforcement learning, planning and teaching
-
(1992)
Mach. Learn.
, vol.8
, pp. 293-321
-
-
Lin1
-
56
-
-
0013500961
-
Theory of neural-analog reinforcement systems and its application to the brain-model problem
-
Princeton University, Princeton, NJ
-
(1954)
Ph.D. Thesis
-
-
Minsky1
-
58
-
-
0003442587
-
Efficient memory-based learning for robot control
-
University of Cambridge, Cambridge, England
-
(1990)
Ph.D. Thesis
-
-
Moore1
-
66
-
-
0344252216
-
Adaptive confidence and adaptive curiosity
-
Institut für Informatik, Technische Universität München, 8000 München 2, Germany
-
(1991)
Tech. Report FKI-149-91
-
-
Schmidhuber1
-
68
-
-
0008487586
-
In defense of reaction plans as caches
-
(1989)
AI Mag.
, vol.10
, pp. 51-60
-
-
Schoppers1
-
69
-
-
0028497385
-
An upper bound on the loss from approximate optimal value functions. technical note
-
(1994)
Mach. Learn.
, vol.16
, pp. 227-233
-
-
Singh1
Yee2
-
70
-
-
0003617454
-
Temporal credit assignment in reinforcement learning
-
University of Massachusetts, Amherst, MA
-
(1984)
Ph.D. Thesis
-
-
Sutton1
-
71
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
(1988)
Mach. Learn.
, vol.3
, pp. 9-44
-
-
Sutton1
-
74
-
-
0010714713
-
A Special Issue of Machine Learning on Reinforcement Learning
-
(1992)
Mach. Learn.
, vol.8
-
-
Sutton1
-
76
-
-
0019537951
-
Toward a modern theory of adaptive networks: expectation and prediction
-
(1981)
Psychol. Rev.
, vol.88
, pp. 135-170
-
-
Sutton1
Barto2
-
81
-
-
0001046225
-
Practical issues in temporal difference learning
-
(1992)
Mach. Learn.
, vol.8
, pp. 257-277
-
-
Tesauro1
-
85
-
-
0004049893
-
Learning from delayed rewards
-
Cambridge University, Cambridge, England
-
(1989)
Ph.D. Thesis
-
-
Watkins1
-
88
-
-
0003529238
-
Beyond regression: new tools for prediction and analysis in the behavioral sciences
-
Harvard University, Cambridge, MA
-
(1974)
Ph.D. Thesis
-
-
Werbos1
-
92
-
-
0000903748
-
Generalization of back propagation with applications to a recurrent gas market model
-
(1988)
Neural Networks
, vol.1
, pp. 339-356
-
-
Werbos1
-
96
-
-
0017524329
-
An adaptive optimal controller for discrete-time Markov environments
-
(1977)
Inform. Control
, vol.34
, pp. 286-295
-
-
Witten1
-
98
-
-
5844332810
-
Abstraction in control learning
-
Department of Computer Science, University of Massachusetts, Amherst, MA
-
(1992)
Tech. Report 92-16
-
-
Yee1