-
1
-
-
0003997198
-
Strategy learning with multilayer connectionist representations
-
GTE Laboratories Incorporated, Computer and Intelligent Systems Laboratory, 40 Sylvan Road, Waltham, MA 02254
-
Anderson, C. W. (1988). Strategy learning with multilayer connectionist representations. Technical Report 87-509.3, GTE Laboratories Incorporated, Computer and Intelligent Systems Laboratory, 40 Sylvan Road, Waltham, MA 02254.
-
(1988)
Technical Report 87-509.3
-
-
Anderson, C.W.1
-
2
-
-
0020970738
-
Neuronlike elements that can solve difficult learning control problems
-
Barto, A. G., Sutton, R. S. & Anderson, C. W. (1983). Neuronlike elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, 13:835-846.
-
(1983)
IEEE Transactions on Systems, Man, and Cybernetics
, vol.13
, pp. 835-846
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
3
-
-
6344250104
-
Incremental Dynamic Programming for On-Line Adaptive Optimal Control
-
PhD thesis, University of Massachusetts, Computer Science Dept.
-
Bradtke, S. J., (1994). Incremental Dynamic Programming for On-Line Adaptive Optimal Control. PhD thesis, University of Massachusetts, Computer Science Dept. Technical Report 94-62.
-
(1994)
Technical Report 94-62
-
-
Bradtke, S.J.1
-
4
-
-
84996565038
-
Learning rate schedules for faster stochastic gradient search
-
Proceedings of the 1992 IEEE Workshop. IEEE Press
-
Darken, C., Chang, I. & Moody, J. (1992). Learning rate schedules for faster stochastic gradient search. In Neural Networks for Signal Processing 2 - Proceedings of the 1992 IEEE Workshop. IEEE Press.
-
(1992)
Neural Networks for Signal Processing
, vol.2
-
-
Darken, C.1
Chang, I.2
Moody, J.3
-
5
-
-
0000430514
-
The convergence of TD(λ) for general λ
-
Dayan, P. (1992). The convergence of TD(λ) for general λ. Machine Learning, 8:341-362.
-
(1992)
Machine Learning
, vol.8
, pp. 341-362
-
-
Dayan, P.1
-
8
-
-
0000439891
-
On the convergence of stochastic iterative dynamic programming algorithms
-
Jaakkola, T., Jordan, M. I. & Singh, S. P. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6).
-
(1994)
Neural Computation
, vol.6
, Issue.6
-
-
Jaakkola, T.1
Jordan, M.I.2
Singh, S.P.3
-
11
-
-
27144479240
-
Expectation driven learning with an associative memory
-
Lukes, G., Thompson, B. & Werbos, P., (1990) Expectation driven learning with an associative memory. In Proceedings of the International Joint Conference on Neural Networks, pages 1:521-524.
-
(1990)
Proceedings of the International Joint Conference on Neural Networks
, vol.1
, pp. 521-524
-
-
Lukes, G.1
Thompson, B.2
Werbos, P.3
-
14
-
-
0003617454
-
-
PhD thesis, Department of Computer and Information Science, University of Massachusetts at Amherst, Amherst, MA 01003
-
Sutton, R. S. (1984). Temporal Credit Assignment in Reinforcement Learning. PhD thesis, Department of Computer and Information Science, University of Massachusetts at Amherst, Amherst, MA 01003.
-
(1984)
Temporal Credit Assignment in Reinforcement Learning
-
-
Sutton, R.S.1
-
15
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
Sutton, R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning, 3:9-44.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
16
-
-
0001046225
-
Practical issues in temporal difference learning
-
Tesauro, G. J. (1992). Practical issues in temporal difference learning. Machine Learning, 8(3/4):257-277.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 257-277
-
-
Tesauro, G.J.1
-
17
-
-
27144547178
-
Asynchronous stochastic approximation and Q-learning
-
Laboratory for Information and Decision Systems, MIT, Cambridge, MA
-
Tsitsiklis, J. N. (1993). Asynchronous stochastic approximation and Q-learning. Technical Report LIDS-P-2172, Laboratory for Information and Decision Systems, MIT, Cambridge, MA.
-
(1993)
Technical Report LIDS-P-2172
-
-
Tsitsiklis, J.N.1
-
19
-
-
34249833101
-
Q-learning
-
May 1992
-
Watkins, C. J. C. H. & Dayan, P. (1992). Q-learning. Machine Learning, 8(3/4):279-292, May 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 279-292
-
-
Watkins, C.J.C.H.1
Dayan, P.2
-
20
-
-
0023169119
-
Building and understanding adaptive systems: A statistical/numerical approach to factory automation and brain research
-
Werbos, P. J. (1987). Building and understanding adaptive systems: A statistical/numerical approach to factory automation and brain research. IEEE Transactions on Systems, Man, and Cybernetics, 17(1):7-20.
-
(1987)
IEEE Transactions on Systems, Man, and Cybernetics
, vol.17
, Issue.1
, pp. 7-20
-
-
Werbos, P.J.1
-
21
-
-
0000903748
-
Generalization of backpropagation with application to a recurrent gas market model
-
1988
-
Werbos, P.J. (1988). Generalization of backpropagation with application to a recurrent gas market model. Neural Networks, 1(4):339-356, 1988.
-
(1988)
Neural Networks
, vol.1
, Issue.4
, pp. 339-356
-
-
Werbos, P.J.1
-
22
-
-
0025229247
-
Consistency of HDP applied to a simple reinforcement learning problem
-
Werbos, P.J. (1990) Consistency of HDP applied to a simple reinforcement learning problem. Neural Networks, 3(2):179-190.
-
(1990)
Neural Networks
, vol.3
, Issue.2
, pp. 179-190
-
-
Werbos, P.J.1
-
23
-
-
0002031779
-
Approximate dynamic programming for real time control and neural modeling
-
D. A. White and D. A. Sofge, editors, Van Nostrand Reinhold, New York
-
Werbos, P. J. (1992). Approximate dynamic programming for real time control and neural modeling. In D. A. White and D. A. Sofge, editors, Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, pages 493-525. Van Nostrand Reinhold, New York.
-
(1992)
Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches
, pp. 493-525
-
-
Werbos, P.J.1