1. F.-Y. Wang, H. Zhang, and D. Liu. Adaptive dynamic programming: an introduction. IEEE Computational Intelligence Magazine, 4(2):39-47, 2009.
2. P.J. Werbos. Neural networks, system identification, and control in the chemical process industries. In White and Sofge, editors, Handbook of Intelligent Control, pp. 283-356. Van Nostrand Reinhold, New York, 1992.
3. R.E. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
4. P.J. Werbos. Approximating dynamic programming for real-time control and neural modeling. In White and Sofge, editors, Handbook of Intelligent Control, pp. 493-525. Van Nostrand Reinhold, New York, 1992.
6. R.S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
7. C.J.C.H. Watkins. Learning from Delayed Rewards. PhD thesis, Cambridge University, 1989.
8. P.J. Werbos. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550-1560, 1990.
9. P.J. Werbos. Stable adaptive control using new critic designs. eprint arXiv:adap-org/9810001, 1998.
10. J.N. Tsitsiklis and B. Van Roy. An analysis of temporal-difference learning with function approximation. Technical Report LIDS-P-2322, 1996.
11. M. Fairbank and E. Alonso. The divergence of reinforcement learning algorithms with value-iteration and function approximation. eprint arXiv:1107.4606, 2011.
12. S. Ferrari and R.F. Stengel. Model-based adaptive critic designs. In J. Si et al., editors, Handbook of Learning and Approximate Dynamic Programming, pp. 65-96. Wiley-IEEE Press, New York, 2004.
13. M. Fairbank and E. Alonso. The local optimality of reinforcement learning by value gradients and its relationship to policy gradient learning. eprint arXiv:1101.0428, 2011.
14. M. Fairbank. Reinforcement learning by value gradients. eprint arXiv:0803.3539, 2008.
15. K. Doya. Reinforcement learning in continuous time and space. Neural Computation, 12(1):219-245, 2000.
16. A. Heydari and S.N. Balakrishnan. Finite-horizon input-constrained nonlinear optimal control using single network adaptive critics. In American Control Conference (ACC), pp. 3047-3052, 2011.
17. S.E. Fahlman. Faster-learning variations on back-propagation: an empirical study. In Proceedings of the 1988 Connectionist Summer School, pp. 38-51, San Mateo, CA, 1988. Morgan Kaufmann.
18. C.M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.