-
1
-
-
66449130966
-
Adaptive dynamic programming: An introduction
-
May
-
F.-Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
-
(2009)
IEEE Comput. Intell. Mag.
, vol.4
, Issue.2
, pp. 39-47
-
-
Wang, F.-Y.1
Zhang, H.2
Liu, D.3
-
2
-
-
85012688561
-
-
Princeton NJ USA: Princeton Univ. Press
-
R. E. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton Univ. Press, 1957.
-
(1957)
Dynamic Programming
-
-
Bellman, R.E.1
-
3
-
-
0002031779
-
Approximate dynamic programming for real-time control and neural modeling
-
D. White and D. Sofge, Eds. New York, NY, USA: Van Nostrand Reinhold ch. 13
-
P. J. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control, D. White and D. Sofge, Eds. New York, NY, USA: Van Nostrand Reinhold, 1992, ch. 13, pp. 493-525.
-
(1992)
Handbook of Intelligent Control
, pp. 493-525
-
-
Werbos, P.J.1
-
4
-
-
0031236002
-
Adaptive critic designs
-
PII S1045922797052430
-
D. Prokhorov and D. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997.
-
(1997)
IEEE Transactions on Neural Networks
, vol.8
, Issue.5
, pp. 997-1007
-
-
Prokhorov, D.V.1
Wunsch II, D.C.2
-
5
-
-
85032189594
-
Model-based adaptive critic designs
-
J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley
-
S. Ferrari and R. F. Stengel, "Model-based adaptive critic designs," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley, 2004, pp. 65-96.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
, pp. 65-96
-
-
Ferrari, S.1
Stengel, R.F.2
-
6
-
-
84876158475
-
Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks
-
Oct 2012
-
M. Fairbank, E. Alonso, and D. Prokhorov, "Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 10, pp. 1671-1678, Oct. 2012.
-
IEEE Trans. Neural Netw. Learn. Syst.
, vol.23
, Issue.10
, pp. 1671-1678
-
-
Fairbank, M.1
Alonso, E.2
Prokhorov, D.3
-
8
-
-
84865069763
-
Value-gradient learning
-
Jun.
-
M. Fairbank and E. Alonso, "Value-gradient learning," in Proc. IEEE IJCNN, Jun. 2012, pp. 3062-3069.
-
(2012)
Proc. IEEE IJCNN
, pp. 3062-3069
-
-
Fairbank, M.1
Alonso, E.2
-
9
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, 1988.
-
(1988)
Mach. Learn.
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.S.1
-
10
-
-
0037561866
-
Dual heuristic programming excitation neurocontrol for generators in a multimachine power system
-
Mar./Apr
-
G. K. Venayagamoorthy and D. C. Wunsch, "Dual heuristic programming excitation neurocontrol for generators in a multimachine power system," IEEE Trans. Ind. Appl., vol. 39, no. 2, pp. 382-394, Mar./Apr. 2003.
-
(2003)
IEEE Trans. Ind. Appl.
, vol.39
, Issue.2
, pp. 382-394
-
-
Venayagamoorthy, G.K.1
Wunsch, D.C.2
-
11
-
-
0030702730
-
Training strategies for critic and action neural networks in dual heuristic programming method
-
Jun.
-
G. G. Lendaris and C. Paintz, "Training strategies for critic and action neural networks in dual heuristic programming method," in Proc. Int. Conf. Neural Netw., Jun. 1997, pp. 712-717.
-
(1997)
Proc. Int. Conf. Neural Netw.
, pp. 712-717
-
-
Lendaris, G.G.1
Paintz, C.2
-
12
-
-
0003716450
-
-
New York, NY, USA: Wiley
-
L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes, vol. 4. New York, NY, USA: Wiley, 1962.
-
(1962)
The Mathematical Theory of Optimal Processes
, vol.4
-
-
Pontryagin, L.S.1
Boltyanskii, V.G.2
Gamkrelidze, R.V.3
Mishchenko, E.F.4
-
14
-
-
84865077338
-
A comparison of learning speed and ability to cope without exploration between DHP and TD(0)
-
Jun.
-
M. Fairbank and E. Alonso, "A comparison of learning speed and ability to cope without exploration between DHP and TD(0)," in Proc. IEEE IJCNN, Jun. 2012, pp. 1478-1485.
-
(2012)
Proc. IEEE IJCNN
, pp. 1478-1485
-
-
Fairbank, M.1
Alonso, E.2
-
15
-
-
0008011457
-
Neural networks, system identification, and control in the chemical process industries
-
D. White and D. Sofge, Eds. New York, NY, USA: Van Nostrand Reinhold ch. 10
-
P. J. Werbos, "Neural networks, system identification, and control in the chemical process industries," in Handbook of Intelligent Control, D. White and D. Sofge, Eds. New York, NY, USA: Van Nostrand Reinhold, 1992, ch. 10, pp. 283-356.
-
(1992)
Handbook of Intelligent Control
, pp. 283-356
-
-
Werbos, P.J.1
-
17
-
-
49049089962
-
Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
-
Aug
-
A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
-
(2008)
IEEE Trans. Syst., Man, Cybern., B, Cybern.
, vol.38
, Issue.4
, pp. 943-949
-
-
Al-Tamimi, A.1
Lewis, F.L.2
Abu-Khalaf, M.3
-
18
-
-
80053166137
-
Finite-horizon input-constrained nonlinear optimal control using single network adaptive critics
-
Jun./Jul.
-
A. Heydari and S. N. Balakrishnan, "Finite-horizon input-constrained nonlinear optimal control using single network adaptive critics," in Proc. ACC, Jun./Jul. 2011, pp. 3047-3052.
-
(2011)
Proc. ACC
, pp. 3047-3052
-
-
Heydari, A.1
Balakrishnan, S.N.2
-
20
-
-
0032687566
-
Stable adaptive control using new critic designs
-
Mar, ArXiv:adap-org/9810001
-
P. J. Werbos, "Stable adaptive control using new critic designs," Proc. SPIE, vol. 3728, p. 510, Mar. 1999, arXiv:adap-org/9810001.
-
(1999)
Proc. SPIE
, vol.3728
, pp. 510
-
-
Werbos, P.J.1
-
21
-
-
85151728371
-
Residual algorithms: Reinforcement learning with function approximation
-
L. C. Baird, "Residual algorithms: Reinforcement learning with function approximation," in Proc. Int. Conf. Mach. Learn., 1995, pp. 30-37.
-
(1995)
Proc. Int. Conf. Mach. Learn.
, pp. 30-37
-
-
Baird, L.C.1
-
22
-
-
0025229247
-
Consistency of HDP applied to a simple reinforcement learning problem
-
Jan
-
P. J. Werbos, "Consistency of HDP applied to a simple reinforcement learning problem," Neural Netw., vol. 3, pp. 179-189, Jan. 1990.
-
(1990)
Neural Netw.
, vol.3
, pp. 179-189
-
-
Werbos, P.J.1
-
23
-
-
0025503558
-
Backpropagation through time: What it does and how to do it
-
Oct
-
P. J. Werbos, "Backpropagation through time: What it does and how to do it," Proc. IEEE, vol. 78, no. 10, pp. 1550-1560, Oct. 1990.
-
(1990)
Proc. IEEE
, vol.78
, Issue.10
, pp. 1550-1560
-
-
Werbos, P.J.1
-
24
-
-
0033629916
-
Reinforcement learning in continuous time and space
-
K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
-
(2000)
Neural Comput.
, vol.12
, Issue.1
, pp. 219-245
-
-
Doya, K.1
-
25
-
-
0027554566
-
Temporal-difference methods and Markov models
-
Mar./Apr
-
E. Barnard, "Temporal-difference methods and Markov models," IEEE Trans. Syst., Man, Cybern., vol. 23, no. 2, pp. 357-365, Mar./Apr. 1993.
-
(1993)
IEEE Trans. Syst., Man, Cybern.
, vol.23
, Issue.2
, pp. 357-365
-
-
Barnard, E.1
-
26
-
-
84886350301
-
Approximating optimal control with value gradient learning
-
F. Lewis and D. Liu, Eds. New York, NY, USA: Wiley, Sections 7.3.4 and 7.4.3
-
M. Fairbank, D. Prokhorov, and E. Alonso, "Approximating optimal control with value gradient learning," in Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, F. Lewis and D. Liu, Eds. New York, NY, USA: Wiley, 2012, Sections 7.3.4 and 7.4.3.
-
(2012)
Reinforcement Learning and Approximate Dynamic Programming for Feedback Control
-
-
Fairbank, M.1
Prokhorov, D.2
Alonso, E.3
-
27
-
-
84943274699
-
A direct adaptive method for faster backpropagation learning: The RPROP algorithm
-
San Francisco, CA, USA, Apr.
-
M. Riedmiller and H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm," in Proc. IEEE Int. Conf. Neural Netw., San Francisco, CA, USA, Apr. 1993, pp. 586-591.
-
(1993)
Proc. IEEE Int. Conf. Neural Netw.
, pp. 586-591
-
-
Riedmiller, M.1
Braun, H.2
-
28
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. 13, no. 5, pp. 834-846, Sep./Oct. 1983.
-
(1983)
IEEE Transactions on Systems, Man and Cybernetics
, vol.13
, Issue.5
, pp. 834-846
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
29
-
-
76649091717
-
Correct equations for the dynamics of the cart-pole system
-
Cluj-Napoca, Romania, Tech. Rep.
-
R. V. Florian, "Correct equations for the dynamics of the cart-pole system," Center for Cognitive and Neural Studies (Coneural), Cluj-Napoca, Romania, Tech. Rep., 2007.
-
(2007)
Center for Cognitive and Neural Studies (Coneural)
-
-
Florian, R.V.1