-
1
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
R. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, 1988.
-
(1988)
Mach. Learn.
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.1
-
3
-
-
0015667648
-
Punish/reward: Learning with a critic in adaptive threshold systems
-
B. Widrow, N. Gupta, and S. Maitra, "Punish/reward: Learning with a critic in adaptive threshold systems," IEEE Trans. Syst. Man Cybern., vol. 3, no. 5, pp. 455-465, 1973.
-
(1973)
IEEE Trans. Syst. Man Cybern.
, vol.3
, Issue.5
, pp. 455-465
-
-
Widrow, B.1
Gupta, N.2
Maitra, S.3
-
4
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
A. Barto, R. Sutton, and C. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst. Man Cybern., vol. 13, no. 5, pp. 834-846, 1983.
-
(1983)
IEEE Trans. Syst. Man Cybern.
, vol.13
, Issue.5
, pp. 834-846
-
-
Barto, A.1
Sutton, R.2
Anderson, C.3
-
5
-
-
0026852362
-
Reinforcement learning is direct adaptive optimal control
-
R. Sutton, A. Barto, and R. Williams, "Reinforcement learning is direct adaptive optimal control," IEEE Contr. Syst. Mag., vol. 12, no. 2, pp. 19-22, 1992.
-
(1992)
IEEE Contr. Syst. Mag.
, vol.12
, Issue.2
, pp. 19-22
-
-
Sutton, R.1
Barto, A.2
Williams, R.3
-
7
-
-
0002011091
-
A menu of designs for reinforcement learning over time
-
ser. MIT Press Series In Neural Network Modeling And Connectionism. Cambridge, MA, USA: MIT Press
-
P. Werbos, "A menu of designs for reinforcement learning over time," in Neural networks for control, ser. MIT Press Series In Neural Network Modeling And Connectionism. Cambridge, MA, USA: MIT Press, 1990, pp. 67-95.
-
(1990)
Neural Networks for Control
, pp. 67-95
-
-
Werbos, P.1
-
9
-
-
0031236002
-
Adaptive critic designs
-
Sep.
-
D. V. Prokhorov and D. C. Wunsch II, "Adaptive critic designs," IEEE Trans. Neural Networks, vol. 8, no. 5, pp. 997-1007, Sep. 1997.
-
(1997)
IEEE Trans. Neural Networks
, vol.8
, Issue.5
, pp. 997-1007
-
-
Prokhorov, D.V.1
Wunsch, D.C.2
-
10
-
-
84921399937
-
-
Wiley-IEEE Press
-
J. Si, A. Barto, W. Powell, and D. Wunsch, Eds., Handbook of Learning and Approximate Dynamic Programming. Wiley-IEEE Press, 2004.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
-
-
Si, J.1
Barto, A.2
Powell, W.3
Wunsch, D.4
-
11
-
-
0030196717
-
Adaptive-critic-based neural networks for aircraft optimal control
-
S. Balakrishnan, "Adaptive-critic-based neural networks for aircraft optimal control," J. Guid. Contr. Dynam., vol. 19, no. 4, pp. 893-898, 1996.
-
(1996)
J. Guid. Contr. Dynam.
, vol.19
, Issue.4
, pp. 893-898
-
-
Balakrishnan, S.1
-
12
-
-
0033685661
-
Adaptive critic design for intelligent steering and speed control of a 2-axle vehicle
-
G. Lendaris, L. Schultz, and T. Shannon, "Adaptive critic design for intelligent steering and speed control of a 2-axle vehicle," in Int. Joint Conf. Neural Netw., 2000, pp. 73-78.
-
(2000)
Int. Joint Conf. Neural Netw.
, pp. 73-78
-
-
Lendaris, G.1
Schultz, L.2
Shannon, T.3
-
14
-
-
0036641793
-
State-constrained agile missile control with adaptive-critic-based neural networks
-
D. Han and S. Balakrishnan, "State-constrained agile missile control with adaptive-critic-based neural networks," IEEE Trans. Control Syst. Technol., vol. 10, no. 4, pp. 481-489, 2002.
-
(2002)
IEEE Trans. Control Syst. Technol.
, vol.10
, Issue.4
, pp. 481-489
-
-
Han, D.1
Balakrishnan, S.2
-
15
-
-
34047138362
-
Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints
-
P. He and S. Jagannathan, "Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints," IEEE Trans. Syst. Man Cybern. Part B Cybern., vol. 37, no. 2, pp. 425-436, 2007.
-
(2007)
IEEE Trans. Syst. Man Cybern. Part B Cybern.
, vol.37
, Issue.2
, pp. 425-436
-
-
He, P.1
Jagannathan, S.2
-
16
-
-
49049089962
-
Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
-
Aug.
-
A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst. Man Cybern. Part B Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
-
(2008)
IEEE Trans. Syst. Man Cybern. Part B Cybern.
, vol.38
, Issue.4
, pp. 943-949
-
-
Al-Tamimi, A.1
Lewis, F.L.2
Abu-Khalaf, M.3
-
17
-
-
0004370245
-
-
Wright Lab, Wright-Patterson Air Force Base, OH, Tech. Rep.
-
L. Baird, "Advantage updating," Wright Lab, Wright-Patterson Air Force Base, OH, Tech. Rep., 1993.
-
(1993)
Advantage Updating
-
-
Baird, L.1
-
18
-
-
0033629916
-
Reinforcement learning in continuous time and space
-
K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
-
(2000)
Neural Comput.
, vol.12
, Issue.1
, pp. 219-245
-
-
Doya, K.1
-
19
-
-
0036588686
-
Adaptive dynamic programming
-
J. Murray, C. Cox, G. Lendaris, and R. Saeks, "Adaptive dynamic programming," IEEE Trans. Syst. Man Cybern. Part C Appl. Rev., vol. 32, no. 2, pp. 140-153, 2002.
-
(2002)
IEEE Trans. Syst. Man Cybern. Part C Appl. Rev.
, vol.32
, Issue.2
, pp. 140-153
-
-
Murray, J.1
Cox, C.2
Lendaris, G.3
Saeks, R.4
-
20
-
-
0031332446
-
Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
-
R. Beard, G. Saridis, and J. Wen, "Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation," Automatica, vol. 33, pp. 2159-2178, 1997.
-
(1997)
Automatica
, vol.33
, pp. 2159-2178
-
-
Beard, R.1
Saridis, G.2
Wen, J.3
-
21
-
-
14844340822
-
Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
-
M. Abu-Khalaf and F. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
-
(2005)
Automatica
, vol.41
, Issue.5
, pp. 779-791
-
-
Abu-Khalaf, M.1
Lewis, F.2
-
22
-
-
34249047468
-
Continuous-time adaptive critics
-
T. Hanselmann, L. Noakes, and A. Zaknich, "Continuous-time adaptive critics," IEEE Trans. Neural Networks, vol. 18, no. 3, pp. 631-647, 2007.
-
(2007)
IEEE Trans. Neural Networks
, vol.18
, Issue.3
, pp. 631-647
-
-
Hanselmann, T.1
Noakes, L.2
Zaknich, A.3
-
23
-
-
67349145396
-
Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
-
D. Vrabie and F. Lewis, "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems," Neural Networks, vol. 22, no. 3, pp. 237-246, 2009.
-
(2009)
Neural Networks
, vol.22
, Issue.3
, pp. 237-246
-
-
Vrabie, D.1
Lewis, F.2
-
25
-
-
0002031779
-
Approximate dynamic programming for real-time control and neural modeling
-
D. A. White and D. A. Sofge, Eds. New York: Van Nostrand Reinhold
-
P. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, D. A. White and D. A. Sofge, Eds. New York: Van Nostrand Reinhold, 1992.
-
(1992)
Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches
-
-
Werbos, P.1
-
26
-
-
0004469897
-
Neurons with graded response have collective computational properties like those of two-state neurons
-
J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proc. Nat. Acad. Sci. U.S.A., vol. 81, no. 10, p. 3088, 1984.
-
(1984)
Proc. Nat. Acad. Sci. U.S.A.
, vol.81
, Issue.10
, p. 3088
-
-
Hopfield, J.1
-
28
-
-
0003581164
-
Identification and control of nonlinear systems using neural network models: Design and stability analysis
-
University of Southern California
-
M. Polycarpou and P. Ioannou, "Identification and control of nonlinear systems using neural network models: Design and stability analysis," Systems Report 91-09-01, University of Southern California, 1991.
-
(1991)
Systems Report 91-09-01
-
-
Polycarpou, M.1
Ioannou, P.2
-
29
-
-
0024861871
-
Approximation by superpositions of a sigmoidal function
-
G. Cybenko, "Approximation by superpositions of a sigmoidal function," Math. Control Signals Syst., vol. 2, pp. 303-314, 1989.
-
(1989)
Math. Control Signals Syst.
, vol.2
, pp. 303-314
-
-
Cybenko, G.1
-
30
-
-
0000466705
-
Nonlinear network structures for feedback control
-
F. L. Lewis, "Nonlinear network structures for feedback control," Asian J. Control, vol. 1, no. 4, pp. 205-228, 1999.
-
(1999)
Asian J. Control
, vol.1
, Issue.4
, pp. 205-228
-
-
Lewis, F.L.1
-
32
-
-
0003427482
-
-
S. Haykin, Ed. John Wiley & Sons
-
M. Krstic, P. V. Kokotovic, and I. Kanellakopoulos, Nonlinear and Adaptive Control Design, S. Haykin, Ed. John Wiley & Sons, 1995.
-
(1995)
Nonlinear and Adaptive Control Design
-
-
Krstic, M.1
Kokotovic, P.V.2
Kanellakopoulos, I.3
-
34
-
-
0004178386
-
-
3rd ed. Upper Saddle River, NJ: Prentice Hall
-
H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2002.
-
(2002)
Nonlinear Systems
-
-
Khalil, H.K.1