-
2
-
-
66449130966
-
Adaptive dynamic programming: An introduction
-
May
-
F. Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
-
(2009)
IEEE Comput. Intell. Mag.
, vol.4
, Issue.2
, pp. 39-47
-
-
Wang, F.Y.1
Zhang, H.2
Liu, D.3
-
5
-
-
67349247013
-
Intelligence in the brain: A theory of how it works and how to build it
-
Apr.
-
P. J. Werbos, "Intelligence in the brain: A theory of how it works and how to build it," Neural Netw., vol. 22, no. 3, pp. 200-212, Apr. 2009.
-
(2009)
Neural Netw.
, vol.22
, Issue.3
, pp. 200-212
-
-
Werbos, P.J.1
-
8
-
-
26844483839
-
A self-learning call admission control scheme for CDMA cellular networks
-
DOI 10.1109/TNN.2005.853408
-
D. Liu, Y. Zhang, and H. Zhang, "A self-learning call admission control scheme for CDMA cellular networks," IEEE Trans. Neural Netw., vol. 16, no. 5, pp. 1219-1228, Sep. 2005. (Pubitemid 41444623)
-
(2005)
IEEE Transactions on Neural Networks
, vol.16
, Issue.5
, pp. 1219-1228
-
-
Liu, D.1
Zhang, Y.2
Zhang, H.3
-
9
-
-
0032208335
-
Elevator group control using multiple reinforcement learning agents
-
Nov.
-
R. H. Crites and A. G. Barto, "Elevator group control using multiple reinforcement learning agents," Mach. Learn., vol. 33, nos. 2-3, pp. 235-262, Nov. 1998.
-
(1998)
Mach. Learn.
, vol.33
, Issue.2-3
, pp. 235-262
-
-
Crites, R.H.1
Barto, A.G.2
-
10
-
-
0000985504
-
TD-Gammon, a self-teaching backgammon program, achieves master-level play
-
Mar.
-
G. Tesauro, "TD-Gammon, a self-teaching backgammon program, achieves master-level play," Neural Comput., vol. 6, no. 2, pp. 215-219, Mar. 1994.
-
(1994)
Neural Comput.
, vol.6
, Issue.2
, pp. 215-219
-
-
Tesauro, G.1
-
11
-
-
49649121741
-
Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation
-
Aug.
-
P. Shih, B. C. Kaul, S. Jagannathan, and J. A. Drallmeier, "Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1369-1388, Aug. 2008.
-
(2008)
IEEE Trans. Neural Netw.
, vol.19
, Issue.8
, pp. 1369-1388
-
-
Shih, P.1
Kaul, B.C.2
Jagannathan, S.3
Drallmeier, J.A.4
-
13
-
-
85046476577
-
-
Boca Raton, FL: CRC Press
-
L. Busoniu, R. Babuska, B. De Schutter, and D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators. Boca Raton, FL: CRC Press, 2010.
-
(2010)
Reinforcement Learning and Dynamic Programming Using Function Approximators
-
-
Busoniu, L.1
Babuska, R.2
De Schutter, B.3
Ernst, D.4
-
14
-
-
71149099079
-
Fast gradient-descent methods for temporal-difference learning with linear function approximation
-
R. S. Sutton, H. R. Maei, D. Precup, S. Bhatnagar, D. Silver, C. Szepesvari, and E. Wiewiora, "Fast gradient-descent methods for temporal-difference learning with linear function approximation," in Proc. Int. Conf. Mach. Learn., 2009, pp. 993-1000.
-
Proc. Int. Conf. Mach. Learn., 2009
, pp. 993-1000
-
-
Sutton, R.S.1
Maei, H.R.2
Precup, D.3
Bhatnagar, S.4
Silver, D.5
Szepesvari, C.6
Wiewiora, E.7
-
15
-
-
0013535965
-
Infinite-horizon policy-gradient estimation
-
Jul.
-
J. Baxter and P. L. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, no. 1, pp. 319-350, Jul. 2001.
-
(2001)
J. Artif. Intell. Res.
, vol.15
, Issue.1
, pp. 319-350
-
-
Baxter, J.1
Bartlett, P.L.2
-
17
-
-
0031236002
-
Adaptive critic designs
-
Jul.
-
D. V. Prokhorov and D. C. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Jul. 1997.
-
(1997)
IEEE Trans. Neural Netw.
, vol.8
, Issue.5
, pp. 997-1007
-
-
Prokhorov, D.V.1
Wunsch, D.C.2
-
18
-
-
0020970738
-
Neuron-like adaptive elements that can solve difficult learning control problems
-
Sep.-Oct.
-
A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuron-like adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. 13, no. 5, pp. 834-846, Sep.-Oct. 1983.
-
(1983)
IEEE Trans. Syst., Man, Cybern.
, vol.13
, Issue.5
, pp. 834-846
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
19
-
-
0036565019
-
Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator
-
May
-
G. K. Venayagamoorthy, R. G. Harley, and D. C. Wunsch, "Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator," IEEE Trans. Neural Netw., vol. 13, no. 3, pp. 764-773, May 2002.
-
(2002)
IEEE Trans. Neural Netw.
, vol.13
, Issue.3
, pp. 764-773
-
-
Venayagamoorthy, G.K.1
Harley, R.G.2
Wunsch, D.C.3
-
20
-
-
70349116541
-
Reinforcement learning and adaptive dynamic programming for feedback control
-
Aug.
-
F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Aug. 2009.
-
(2009)
IEEE Circuits Syst. Mag.
, vol.9
, Issue.3
, pp. 32-50
-
-
Lewis, F.L.1
Vrabie, D.2
-
21
-
-
40649106649
-
Natural actor-critic
-
Mar.
-
J. Peters and S. Schaal, "Natural actor-critic," Neurocomputing, vol. 71, nos. 7-9, pp. 1180-1190, Mar. 2008.
-
(2008)
Neurocomputing
, vol.71
, Issue.7-9
, pp. 1180-1190
-
-
Peters, J.1
Schaal, S.2
-
22
-
-
78049336028
-
-
Alberta, Canada: Dept. Comput. Sci.
-
S. Bhatnagar, R. S. Sutton, M. Ghavamzadeh, and M. Lee, Natural Actor-Critic Algorithms. Alberta, Canada: Dept. Comput. Sci., 2009.
-
(2009)
Natural Actor-Critic Algorithms
-
-
Bhatnagar, S.1
Sutton, R.S.2
Ghavamzadeh, M.3
Lee, M.4
-
23
-
-
0030196717
-
Adaptive-critic-based neural networks for aircraft optimal control
-
S. N. Balakrishnan and V. Biega, "Adaptive-critic-based neural networks for aircraft optimal control," J. Guid., Control, Dynamics, vol. 19, no. 4, pp. 893-898, 1996. (Pubitemid 126539437)
-
(1996)
Journal of Guidance, Control, and Dynamics
, vol.19
, Issue.4
, pp. 893-898
-
-
Balakrishnan, S.N.1
Biega, V.2
-
24
-
-
0043026775
-
Helicopter trimming and tracking control using direct neural dynamic programming
-
Jul.
-
R. Enns and J. Si, "Helicopter trimming and tracking control using direct neural dynamic programming," IEEE Trans. Neural Netw., vol. 14, no. 4, pp. 929-939, Jul. 2003.
-
(2003)
IEEE Trans. Neural Netw.
, vol.14
, Issue.4
, pp. 929-939
-
-
Enns, R.1
Si, J.2
-
25
-
-
49049106959
-
Direct heuristic dynamic programming for damping oscillations in a large power system
-
Aug.
-
C. Lu, J. Si, and X. Xie, "Direct heuristic dynamic programming for damping oscillations in a large power system," IEEE Trans. Syst., Man, Cybern., Part B, Cybern., vol. 38, no. 4, pp. 1008-1013, Aug. 2008.
-
(2008)
IEEE Trans. Syst., Man, Cybern., Part B, Cybern.
, vol.38
, Issue.4
, pp. 1008-1013
-
-
Lu, C.1
Si, J.2
Xie, X.3
-
26
-
-
49649121741
-
Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation
-
Aug.
-
P. Shih, B. C. Kaul, S. Jagannathan, and J. A. Drallmeier, "Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1369-1388, Aug. 2008.
-
(2008)
IEEE Trans. Neural Netw.
, vol.19
, Issue.8
, pp. 1369-1388
-
-
Shih, P.1
Kaul, B.C.2
Jagannathan, S.3
Drallmeier, J.A.4
-
27
-
-
0031028741
-
Effective backpropagation training with variable stepsize
-
Jan.
-
G. D. Magoulas, M. N. Vrahatis, and G. S. Androulakis, "Effective backpropagation training with variable stepsize," Neural Netw., vol. 10, no. 1, pp. 69-82, Jan. 1997.
-
(1997)
Neural Netw.
, vol.10
, Issue.1
, pp. 69-82
-
-
Magoulas, G.D.1
Vrahatis, M.N.2
Androulakis, G.S.3
-
28
-
-
79960468564
-
Asymptotic tracking by a reinforcement learning-based adaptive critic controller
-
S. Bhasin, N. Sharma, P. Patre, and W. E. Dixon, "Asymptotic tracking by a reinforcement learning-based adaptive critic controller," J. Control Theory Appl., vol. 9, no. 3, pp. 400-409, 2011.
-
(2011)
J. Control Theory Appl.
, vol.9
, Issue.3
, pp. 400-409
-
-
Bhasin, S.1
Sharma, N.2
Patre, P.3
Dixon, W.E.4
-
29
-
-
77950630017
-
Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
-
May
-
K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, May 2010.
-
(2010)
Automatica
, vol.46
, Issue.5
, pp. 878-888
-
-
Vamvoudakis, K.G.1
Lewis, F.L.2
-
30
-
-
83655163786
-
Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method
-
Dec.
-
H. Zhang, L. Cui, X. Zhang, and Y. Luo, "Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226-2236, Dec. 2011.
-
(2011)
IEEE Trans. Neural Netw.
, vol.22
, Issue.12
, pp. 2226-2236
-
-
Zhang, H.1
Cui, L.2
Zhang, X.3
Luo, Y.4
-
34
-
-
0011812771
-
Kernel independent component analysis
-
Jul.
-
F. R. Bach and M. I. Jordan, "Kernel independent component analysis," J. Mach. Learn. Res., vol. 3, pp. 1-48, Jul. 2002.
-
(2002)
J. Mach. Learn. Res.
, vol.3
, pp. 1-48
-
-
Bach, F.R.1
Jordan, M.I.2
-
35
-
-
51049096780
-
Kernel methods in machine learning
-
T. Hofmann, B. Schölkopf, and A. J. Smola, "Kernel methods in machine learning," Ann. Statist., vol. 36, no. 3, pp. 1171-1220, 2008.
-
(2008)
Ann. Statist.
, vol.36
, Issue.3
, pp. 1171-1220
-
-
Hofmann, T.1
Schölkopf, B.2
Smola, A.J.3
-
36
-
-
0036832956
-
Kernel-based reinforcement learning
-
D. Ormoneit and S. Sen, "Kernel-based reinforcement learning," Mach. Learn., vol. 49, nos. 2-3, pp. 161-178, 2002.
-
(2002)
Mach. Learn.
, vol.49
, Issue.2-3
, pp. 161-178
-
-
Ormoneit, D.1
Sen, S.2
-
37
-
-
1942421151
-
Bayes meets Bellman: The Gaussian process approach to temporal difference learning
-
Y. Engel, S. Mannor, and R. Meir, "Bayes meets Bellman: The Gaussian process approach to temporal difference learning," in Proc. Int. Conf. Mach. Learn., 2003, pp. 154-161.
-
Proc. Int. Conf. Mach. Learn., 2003
, pp. 154-161
-
-
Engel, Y.1
Mannor, S.2
Meir, R.3
-
38
-
-
84899029004
-
Batch value function approximation via support vectors
-
Cambridge, MA: MIT Press
-
T. G. Dietterich and X. Wang, "Batch value function approximation via support vectors," in Advances in Neural Information Processing Systems 14, Cambridge, MA: MIT Press, 2002, pp. 1491-1498.
-
(2002)
Advances in Neural Information Processing Systems
, vol.14
, pp. 1491-1498
-
-
Dietterich, T.G.1
Wang, X.2
-
39
-
-
84899026055
-
Gaussian processes in reinforcement learning
-
S. Thrun, L. K. Saul, and B. Schölkopf, Eds., Cambridge, MA: MIT Press
-
C. E. Rasmussen and M. Kuss, "Gaussian processes in reinforcement learning," in Advances in Neural Information Processing Systems 16, S. Thrun, L. K. Saul, and B. Schölkopf, Eds., Cambridge, MA: MIT Press, 2004, pp. 751-759.
-
(2004)
Advances in Neural Information Processing Systems
, vol.16
, pp. 751-759
-
-
Rasmussen, C.E.1
Kuss, M.2
-
40
-
-
4644323293
-
Least-squares policy iteration
-
Dec.
-
M. G. Lagoudakis and R. Parr, "Least-squares policy iteration," J. Mach. Learn. Res., vol. 4, pp. 1107-1149, Dec. 2003.
-
(2003)
J. Mach. Learn. Res.
, vol.4
, pp. 1107-1149
-
-
Lagoudakis, M.G.1
Parr, R.2
-
41
-
-
34547098844
-
Kernel-based least-squares policy iteration for reinforcement learning
-
DOI 10.1109/TNN.2007.899161, Neural Networks for Feedback Control Systems
-
X. Xu, D. Hu, and X. Lu, "Kernel-based least-squares policy iteration for reinforcement learning," IEEE Trans. Neural Netw., vol. 18, no. 4, pp. 973-992, Jul. 2007. (Pubitemid 47098876)
-
(2007)
IEEE Transactions on Neural Networks
, vol.18
, Issue.4
, pp. 973-992
-
-
Xu, X.1
Hu, D.2
Lu, X.3
-
42
-
-
3543096272
-
The kernel recursive least-squares algorithm
-
Aug.
-
Y. Engel, S. Mannor, and R. Meir, "The kernel recursive least-squares algorithm," IEEE Trans. Signal Process., vol. 52, no. 8, pp. 2275-2285, Aug. 2004.
-
(2004)
IEEE Trans. Signal Process.
, vol.52
, Issue.8
, pp. 2275-2285
-
-
Engel, Y.1
Mannor, S.2
Meir, R.3
-
43
-
-
0041345290
-
Efficient reinforcement learning using recursive least-squares methods
-
X. Xu, H. G. He, and D. W. Hu, "Efficient reinforcement learning using recursive least-squares methods," J. Artif. Intell. Res., vol. 16, pp. 259-292, Jun. 2002. (Pubitemid 43057174)
-
(2002)
Journal of Artificial Intelligence Research
, vol.16
, pp. 259-292
-
-
Xu, X.1
He, H.-G.2
Hu, D.3
-
44
-
-
33750328566
-
Kernel least-squares temporal difference learning
-
X. Xu, T. Xie, D. Hu, and X. Lu, "Kernel least-squares temporal difference learning," Int. J. Inf. Technol., vol. 11, no. 9, pp. 54-63, 2005.
-
(2005)
Int. J. Inf. Technol.
, vol.11
, Issue.9
, pp. 54-63
-
-
Xu, X.1
Xie, T.2
Hu, D.3
Lu, X.4
-
45
-
-
0031143730
-
An analysis of temporal difference learning with function approximation
-
May
-
J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997.
-
(1997)
IEEE Trans. Autom. Control
, vol.42
, Issue.5
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
46
-
-
0037288398
-
Least squares policy evaluation algorithms with linear function approximation
-
Jan.-Apr.
-
A. Nedic and D. P. Bertsekas, "Least squares policy evaluation algorithms with linear function approximation," Discrete Event Dyn. Syst., vol. 13, nos. 1-2, pp. 79-110, Jan.-Apr. 2003.
-
(2003)
Discrete Event Dyn. Syst.
, vol.13
, Issue.1-2
, pp. 79-110
-
-
Nedic, A.1
Bertsekas, D.P.2
-
47
-
-
84875270081
-
Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update
-
Jul.
-
T. Dierks and S. Jagannathan, "Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 7, pp. 1118-1129, Jul. 2012.
-
(2012)
IEEE Trans. Neural Netw. Learn. Syst.
, vol.23
, Issue.7
, pp. 1118-1129
-
-
Dierks, T.1
Jagannathan, S.2
-
48
-
-
84875242151
-
Sensitivity-based adaptive learning rules for binary feedforward neural networks
-
Mar.
-
S. Zhong, X. Zeng, S. Wu, and L. Han, "Sensitivity-based adaptive learning rules for binary feedforward neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 3, pp. 480-491, Mar. 2012.
-
(2012)
IEEE Trans. Neural Netw. Learn. Syst.
, vol.23
, Issue.3
, pp. 480-491
-
-
Zhong, S.1
Zeng, X.2
Wu, S.3
Han, L.4