-
1
-
-
66449130966
-
Adaptive dynamic programming: An introduction
-
May
-
F.-Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
-
(2009)
IEEE Comput. Intell. Mag
, vol.4
, Issue.2
, pp. 39-47
-
-
Wang, F.-Y.1
Zhang, H.2
Liu, D.3
-
2
-
-
84907812505
-
Neurocontrol: Where it is going and why it is crucial
-
North Holland, Amsterdam, I. Aleksander, Ed. New York, NY, USA: Taylor
-
P. J. Werbos, "Neurocontrol: Where it is going and why it is crucial," in Artificial Neural Networks II (ICANN-2), North Holland, Amsterdam, I. Aleksander, Ed. New York, NY, USA: Taylor, 1992, pp. 61-68.
-
(1992)
Artificial Neural Networks II (ICANN-2)
, pp. 61-68
-
-
Werbos, P.J.1
-
3
-
-
0033284518
-
Neural networks for control
-
M. Hagan and H. Demuth, "Neural networks for control," in Proc. Amer. Control Conf., vol. 3. 1999, pp. 1642-1656.
-
(1999)
Proc. Amer. Control Conf
, vol.3
, pp. 1642-1656
-
-
Hagan, M.1
Demuth, H.2
-
5
-
-
84883537695
-
Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers
-
Dec
-
F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, "Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers," IEEE Control Syst., vol. 32, no. 6, pp. 76-105, Dec. 2012.
-
(2012)
IEEE Control Syst
, vol.32
, Issue.6
, pp. 76-105
-
-
Lewis, F.L.1
Vrabie, D.2
Vamvoudakis, K.G.3
-
6
-
-
0002031779
-
Approximating dynamic programming for real-time control and neural modeling
-
D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, ch. 13
-
P. J. Werbos, "Approximating dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control, D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, 1992, pp. 493-525, ch. 13.
-
(1992)
Handbook of Intelligent Control
, pp. 493-525
-
-
Werbos, P.J.1
-
7
-
-
0031236002
-
Adaptive critic designs
-
Sep
-
D. Prokhorov and D. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997.
-
(1997)
IEEE Trans. Neural Netw
, vol.8
, Issue.5
, pp. 997-1007
-
-
Prokhorov, D.1
Wunsch, D.2
-
8
-
-
0025503558
-
Backpropagation through time: What it does and how to do it
-
Oct
-
P. J. Werbos, "Backpropagation through time: What it does and how to do it," Proc. IEEE, vol. 78, no. 10, pp. 1550-1560, Oct. 1990.
-
(1990)
Proc. IEEE
, vol.78
, Issue.10
, pp. 1550-1560
-
-
Werbos, P.J.1
-
9
-
-
84865069763
-
Value-gradient learning
-
Jun
-
M. Fairbank and E. Alonso, "Value-gradient learning," in Proc. IEEE IJCNN, Jun. 2012, pp. 3062-3069.
-
(2012)
Proc. IEEE IJCNN
, pp. 3062-3069
-
-
Fairbank, M.1
Alonso, E.2
-
10
-
-
84886350301
-
Approximating optimal control with value gradient learning
-
F. Lewis and D. Liu, Eds. New York, NY, USA: Wiley
-
M. Fairbank, D. Prokhorov, and E. Alonso, "Approximating optimal control with value gradient learning," in Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, F. Lewis and D. Liu, Eds. New York, NY, USA: Wiley, 2012, pp. 142-161.
-
(2012)
Reinforcement Learning and Approximate Dynamic Programming for Feedback Control
, pp. 142-161
-
-
Fairbank, M.1
Prokhorov, D.2
Alonso, E.3
-
12
-
-
84887996993
-
An equivalence between adaptive dynamic programming with a critic and backpropagation through time
-
Dec
-
M. Fairbank, E. Alonso, and D. Prokhorov, "An equivalence between adaptive dynamic programming with a critic and backpropagation through time," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 12, pp. 2088-2100, Dec. 2013.
-
(2013)
IEEE Trans. Neural Netw. Learn. Syst
, vol.24
, Issue.12
, pp. 2088-2100
-
-
Fairbank, M.1
Alonso, E.2
Prokhorov, D.3
-
13
-
-
34548772284
-
Backpropagation through time and derivative adaptive critics: A common framework for comparison
-
J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley
-
D. Prokhorov, "Backpropagation through time and derivative adaptive critics: A common framework for comparison," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley, 2004.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
-
-
Prokhorov, D.1
-
15
-
-
33646384929
-
Policy gradient in continuous time
-
Jan
-
R. Munos, "Policy gradient in continuous time," J. Mach. Learn. Res., vol. 7, pp. 413-427, Jan. 2006.
-
(2006)
J. Mach. Learn. Res
, vol.7
, pp. 413-427
-
-
Munos, R.1
-
16
-
-
84876158475
-
Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks
-
Oct
-
M. Fairbank, E. Alonso, and D. Prokhorov, "Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 10, pp. 1671-1678, Oct. 2012.
-
(2012)
IEEE Trans. Neural Netw. Learn. Syst
, vol.23
, Issue.10
, pp. 1671-1678
-
-
Fairbank, M.1
Alonso, E.2
Prokhorov, D.3
-
17
-
-
14844340822
-
Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
-
M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
-
(2005)
Automatica
, vol.41
, Issue.5
, pp. 779-791
-
-
Abu-Khalaf, M.1
Lewis, F.L.2
-
18
-
-
33847648898
-
Adaptive critic designs for discrete-time zero-sum games with application to H∞ control
-
Feb
-
A. Al-Tamimi, M. Abu-Khalaf, and F. L. Lewis, "Adaptive critic designs for discrete-time zero-sum games with application to H∞ control," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 37, no. 1, pp. 240-247, Feb. 2007.
-
(2007)
IEEE Trans. Syst., Man, Cybern., B, Cybern
, vol.37
, Issue.1
, pp. 240-247
-
-
Al-Tamimi, A.1
Abu-Khalaf, M.2
Lewis, F.L.3
-
19
-
-
49049089962
-
Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
-
Aug
-
A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
-
(2008)
IEEE Trans. Syst., Man, Cybern., B, Cybern
, vol.38
, Issue.4
, pp. 943-949
-
-
Al-Tamimi, A.1
Lewis, F.L.2
Abu-Khalaf, M.3
-
20
-
-
70349253929
-
Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints
-
Sep
-
H. Zhang, Y. Luo, and D. Liu, "Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints," IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1490-1503, Sep. 2009.
-
(2009)
IEEE Trans. Neural Netw
, vol.20
, Issue.9
, pp. 1490-1503
-
-
Zhang, H.1
Luo, Y.2
Liu, D.3
-
21
-
-
49049119493
-
A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm
-
Aug
-
H. Zhang, Q. Wei, and Y. Luo, "A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 937-942, Aug. 2008.
-
(2008)
IEEE Trans. Syst., Man, Cybern., B, Cybern
, vol.38
, Issue.4
, pp. 937-942
-
-
Zhang, H.1
Wei, Q.2
Luo, Y.3
-
22
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Mach. Learn., vol. 8, nos. 3-4, pp. 229-256, 1992.
-
(1992)
Mach. Learn
, vol.8
, Issue.3-4
, pp. 229-256
-
-
Williams, R.J.1
-
24
-
-
0004049893
-
-
Ph.D. dissertation, Dept. Comput. Sci., Cambridge Univ., Cambridge, U.K.
-
C. J. C. H. Watkins, "Learning from delayed rewards," Ph.D. dissertation, Dept. Comput. Sci., Cambridge Univ., Cambridge, U.K., 1989.
-
(1989)
Learning from Delayed Rewards
-
-
Watkins, C.J.C.H.1
-
25
-
-
85032189594
-
Model-based adaptive critic designs
-
J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley
-
S. Ferrari and R. F. Stengel, "Model-based adaptive critic designs," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley, 2004, pp. 65-96.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
, pp. 65-96
-
-
Ferrari, S.1
Stengel, R.F.2
-
26
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, 1988.
-
(1988)
Mach. Learn
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.S.1
-
27
-
-
0033629916
-
Reinforcement learning in continuous time and space
-
K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
-
(2000)
Neural Comput
, vol.12
, Issue.1
, pp. 219-245
-
-
Doya, K.1
-
28
-
-
0003529238
-
-
Ph.D. dissertation, Dept. Comput. Sci., Harvard Univ., Cambridge, MA, USA
-
P. J. Werbos, "Beyond regression: New tools for prediction and analysis in the behavioral sciences," Ph.D. dissertation, Dept. Comput. Sci., Harvard Univ., Cambridge, MA, USA, 1974.
-
(1974)
Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences
-
-
Werbos, P.J.1
-
29
-
-
0022471098
-
Learning representations by back-propagating errors
-
Oct
-
D. Rumelhart, G. Hinton, and R. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533-536, Oct. 1986.
-
(1986)
Nature
, vol.323
, Issue.6088
, pp. 533-536
-
-
Rumelhart, D.1
Hinton, G.2
Williams, R.3
-
30
-
-
0027554566
-
Temporal-difference methods and Markov models
-
Mar./Apr
-
E. Barnard, "Temporal-difference methods and Markov models," IEEE Trans. Syst., Man, Cybern., vol. 23, no. 2, pp. 357-365, Mar./Apr. 1993.
-
(1993)
IEEE Trans. Syst., Man, Cybern
, vol.23
, Issue.2
, pp. 357-365
-
-
Barnard, E.1
-
32
-
-
84865077338
-
A comparison of learning speed and ability to cope without exploration between DHP and TD(0)
-
Jun
-
M. Fairbank and E. Alonso, "A comparison of learning speed and ability to cope without exploration between DHP and TD(0)," in Proc. IEEE Int. Joint Conf. Neural Netw., Jun. 2012, pp. 1478-1485.
-
(2012)
Proc. IEEE Int. Joint Conf. Neural Netw
, pp. 1478-1485
-
-
Fairbank, M.1
Alonso, E.2
-
33
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
Sep./Oct
-
A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. 13, no. 5, pp. 834-846, Sep./Oct. 1983.
-
(1983)
IEEE Trans. Syst., Man, Cybern
, vol.13
, Issue.5
, pp. 834-846
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
34
-
-
0030702730
-
Training strategies for critic and action neural networks in dual heuristic programming method
-
Jun
-
G. G. Lendaris and C. Paintz, "Training strategies for critic and action neural networks in dual heuristic programming method," in Proc. Int. Conf. Neural Netw., Jun. 1997, pp. 712-717.
-
(1997)
Proc. Int. Conf. Neural Netw
, pp. 712-717
-
-
Lendaris, G.G.1
Paintz, C.2
-
35
-
-
76649091717
-
Correct equations for the dynamics of the cart-pole system
-
Univ. Bucharest, Cluj-Napoca, Romania, Tech. Rep
-
R. V. Florian, "Correct equations for the dynamics of the cart-pole system," Center for Cognitive and Neural Studies (Coneural), Univ. Bucharest, Cluj-Napoca, Romania, Tech. Rep., 2007.
-
(2007)
Center for Cognitive and Neural Studies (Coneural)
-
-
Florian, R.V.1