-
2
-
-
0029679044
-
Reinforcement learning: A survey
-
L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, May 1996. (Pubitemid 126646155)
-
(1996)
Journal of Artificial Intelligence Research
, vol.4
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.L.2
Moore, A.W.3
-
4
-
-
84873991429
-
Feedback optimal control of distributed parameter systems by using finite-dimensional approximation schemes
-
Jun.
-
A. Alessandri, M. Gaggero, and R. Zoppoli, "Feedback optimal control of distributed parameter systems by using finite-dimensional approximation schemes," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 6, pp. 984-996, Jun. 2012.
-
(2012)
IEEE Trans. Neural Netw. Learn. Syst.
, vol.23
, Issue.6
, pp. 984-996
-
-
Alessandri, A.1
Gaggero, M.2
Zoppoli, R.3
-
5
-
-
66449130966
-
Adaptive dynamic programming: An introduction
-
May
-
F. Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
-
(2009)
IEEE Comput. Intell. Mag.
, vol.4
, Issue.2
, pp. 39-47
-
-
Wang, F.Y.1
Zhang, H.2
Liu, D.3
-
6
-
-
49049105169
-
Ensemble algorithms in reinforcement learning
-
Aug.
-
M. A. Wiering and H. van Hasselt, "Ensemble algorithms in reinforcement learning," IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 38, no. 4, pp. 930-936, Aug. 2008.
-
(2008)
IEEE Trans. Syst. Man Cybern. B, Cybern.
, vol.38
, Issue.4
, pp. 930-936
-
-
Wiering, M.A.1
Van Hasselt, H.2
-
7
-
-
70349116541
-
Reinforcement learning and adaptive dynamic programming for feedback control
-
Jul.-Sep.
-
F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Jul.-Sep. 2009.
-
(2009)
IEEE Circuits Syst. Mag.
, vol.9
, Issue.3
, pp. 32-50
-
-
Lewis, F.L.1
Vrabie, D.2
-
8
-
-
71149099079
-
Fast gradient-descent methods for temporal-difference learning with linear function approximation
-
R. S. Sutton, H. R. Maei, D. Precup, S. Bhatnagar, D. Silver, C. Szepesvári, and E. Wiewiora, "Fast gradient-descent methods for temporal-difference learning with linear function approximation," in Proc. Int. Conf. Mach. Learn., 2009, pp. 993-1000.
-
(2009)
Proc. Int. Conf. Mach. Learn.
, pp. 993-1000
-
-
Sutton, R.S.1
Maei, H.R.2
Precup, D.3
Bhatnagar, S.4
Silver, D.5
Szepesvári, C.6
Wiewiora, E.7
-
9
-
-
0013535965
-
Infinite-horizon policy-gradient estimation
-
Nov.
-
J. Baxter and P. L. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, pp. 319-350, Nov. 2001.
-
(2001)
J. Artif. Intell. Res.
, vol.15
, pp. 319-350
-
-
Baxter, J.1
Bartlett, P.L.2
-
10
-
-
84898938510
-
Actor-critic algorithms
-
Cambridge, MA, USA: MIT Press
-
V. R. Konda and J. N. Tsitsiklis, "Actor-critic algorithms," in Advances in Neural Information Processing Systems, Cambridge, MA, USA: MIT Press, 2000.
-
(2000)
Advances in Neural Information Processing Systems
-
-
Konda, V.R.1
Tsitsiklis, J.N.2
-
11
-
-
0031236002
-
Adaptive critic designs
-
PII S1045922797052430
-
D. V. Prokhorov and D. C. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997. (Pubitemid 127763331)
-
(1997)
IEEE Transactions on Neural Networks
, vol.8
, Issue.5
, pp. 997-1007
-
-
Prokhorov, D.V.1
Wunsch II, D.C.2
-
12
-
-
79960468564
-
Asymptotic tracking by a reinforcement learning-based adaptive critic controller
-
Aug.
-
S. Bhasin, N. Sharma, P. Patre, and W. E. Dixon, "Asymptotic tracking by a reinforcement learning-based adaptive critic controller," J. Control Theory Appl., vol. 9, no. 3, pp. 400-409, Aug. 2011.
-
(2011)
J. Control Theory Appl.
, vol.9
, Issue.3
, pp. 400-409
-
-
Bhasin, S.1
Sharma, N.2
Patre, P.3
Dixon, W.E.4
-
13
-
-
74249090869
-
Adaptive critic design for energy minimization of portable video communication devices
-
Jun.
-
Z. Sun, "Adaptive critic design for energy minimization of portable video communication devices," IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 1, pp. 27-37, Jun. 2010.
-
(2010)
IEEE Trans. Circuits Syst. Video Technol.
, vol.20
, Issue.1
, pp. 27-37
-
-
Sun, Z.1
-
14
-
-
84863534247
-
Adaptive optimal control for designing automatic train regulation for metro line
-
Sep.
-
J.-W. Sheu and W.-S. Lin, "Adaptive optimal control for designing automatic train regulation for metro line," IEEE Trans. Control Syst. Technol., vol. 20, no. 5, pp. 1319-1327, Sep. 2012.
-
(2012)
IEEE Trans. Control Syst. Technol.
, vol.20
, Issue.5
, pp. 1319-1327
-
-
Sheu, J.-W.1
Lin, W.-S.2
-
15
-
-
83655163786
-
Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method
-
Dec.
-
H. Zhang, L. Cui, X. Zhang, and Y. Luo, "Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226-2236, Dec. 2011.
-
(2011)
IEEE Trans. Neural Netw.
, vol.22
, Issue.12
, pp. 2226-2236
-
-
Zhang, H.1
Cui, L.2
Zhang, X.3
Luo, Y.4
-
16
-
-
40649106649
-
Natural actor-critic
-
Mar.
-
J. Peters and S. Schaal, "Natural actor-critic," Neurocomputing, vol. 71, nos. 7-9, pp. 1180-1190, Mar. 2008.
-
(2008)
Neurocomputing
, vol.71
, Issue.7-9
, pp. 1180-1190
-
-
Peters, J.1
Schaal, S.2
-
17
-
-
0036565019
-
Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator
-
DOI 10.1109/TNN.2002.1000146, PII S1045922702044417
-
G. K. Venayagamoorthy, R. G. Harley, and D. C. Wunsch, "Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator," IEEE Trans. Neural Netw., vol. 13, no. 3, pp. 764-773, May 2002. (Pubitemid 34669664)
-
(2002)
IEEE Transactions on Neural Networks
, vol.13
, Issue.3
, pp. 764-773
-
-
Venayagamoorthy, G.K.1
Harley, R.G.2
Wunsch, D.C.3
-
18
-
-
77950630017
-
Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
-
May
-
K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, May 2010.
-
(2010)
Automatica
, vol.46
, Issue.5
, pp. 878-888
-
-
Vamvoudakis, K.G.1
Lewis, F.L.2
-
19
-
-
0043026775
-
Helicopter trimming and tracking control using direct neural dynamic programming
-
Jul.
-
R. Enns and J. Si, "Helicopter trimming and tracking control using direct neural dynamic programming," IEEE Trans. Neural Netw., vol. 14, no. 4, pp. 929-939, Jul. 2003.
-
(2003)
IEEE Trans. Neural Netw.
, vol.14
, Issue.4
, pp. 929-939
-
-
Enns, R.1
Si, J.2
-
20
-
-
49049106959
-
Direct heuristic dynamic programming for damping oscillations in a large power system
-
Aug.
-
C. Lu, J. Si, and X. Xie, "Direct heuristic dynamic programming for damping oscillations in a large power system," IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 38, no. 4, pp. 1008-1013, Aug. 2008.
-
(2008)
IEEE Trans. Syst. Man Cybern. B, Cybern.
, vol.38
, Issue.4
, pp. 1008-1013
-
-
Lu, C.1
Si, J.2
Xie, X.3
-
21
-
-
49649121741
-
Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation
-
Aug.
-
P. Shih, B. C. Kaul, S. Jagannathan, and J. A. Drallmeier, "Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1369-1388, Aug. 2008.
-
(2008)
IEEE Trans. Neural Netw.
, vol.19
, Issue.8
, pp. 1369-1388
-
-
Shih, P.1
Kaul, B.C.2
Jagannathan, S.3
Drallmeier, J.A.4
-
22
-
-
0036832956
-
Kernel-based reinforcement learning
-
DOI 10.1023/A:1017928328829
-
D. Ormoneit and S. Sen, "Kernel-based reinforcement learning," Mach. Learn., vol. 49, nos. 2-3, pp. 161-178, Nov. 2002. (Pubitemid 34325684)
-
(2002)
Machine Learning
, vol.49
, Issue.2-3
, pp. 161-178
-
-
Ormoneit, D.1
Sen, S.2
-
23
-
-
1942421151
-
Bayes meets Bellman: The Gaussian process approach to temporal difference learning
-
Y. Engel, S. Mannor, and R. Meir, "Bayes meets Bellman: The Gaussian process approach to temporal difference learning," in Proc. 20th Int. Conf. Mach. Learn., 2003, pp. 154-161.
-
(2003)
Proc. 20th Int. Conf. Mach. Learn.
, pp. 154-161
-
-
Engel, Y.1
Mannor, S.2
Meir, R.3
-
24
-
-
84899029004
-
Batch value function approximation via support vectors
-
Cambridge, MA, USA: MIT Press
-
T. G. Dietterich and X. Wang, "Batch value function approximation via support vectors," in Advances in Neural Information Processing Systems, Cambridge, MA, USA: MIT Press, 2002.
-
(2002)
Advances in Neural Information Processing Systems
-
-
Dietterich, T.G.1
Wang, X.2
-
25
-
-
84899026055
-
Gaussian Processes in reinforcement learning
-
Cambridge, MA, USA: MIT Press
-
C. E. Rasmussen and M. Kuss, "Gaussian Processes in reinforcement learning," in Advances in Neural Information Processing Systems, Cambridge, MA, USA: MIT Press, 2004.
-
(2004)
Advances in Neural Information Processing Systems
-
-
Rasmussen, C.E.1
Kuss, M.2
-
26
-
-
34547098844
-
Kernel-based least squares policy iteration for reinforcement learning
-
DOI 10.1109/TNN.2007.899161, Neural Networks for Feedback Control Systems
-
X. Xu, D. Hu, and X. Lu, "Kernel-based least squares policy iteration for reinforcement learning," IEEE Trans. Neural Netw., vol. 18, no. 4, pp. 973-992, Jul. 2007. (Pubitemid 47098876)
-
(2007)
IEEE Transactions on Neural Networks
, vol.18
, Issue.4
, pp. 973-992
-
-
Xu, X.1
Hu, D.2
Lu, X.3
-
29
-
-
70349984547
-
Natural actor-critic algorithms
-
Nov.
-
S. Bhatnagar, R. S. Sutton, M. Ghavamzadeh, and M. Lee, "Natural actor-critic algorithms," Automatica, vol. 45, no. 11, pp. 2471-2482, Nov. 2009.
-
(2009)
Automatica
, vol.45
, Issue.11
, pp. 2471-2482
-
-
Bhatnagar, S.1
Sutton, R.S.2
Ghavamzadeh, M.3
Lee, M.4
-
30
-
-
77951149420
-
The design and implementation of a wheeled inverted pendulum using an adaptive output recurrent cerebellar model articulation controller
-
May
-
C. H. Chiu, "The design and implementation of a wheeled inverted pendulum using an adaptive output recurrent cerebellar model articulation controller," IEEE Trans. Ind. Electron., vol. 57, no. 5, pp. 1814-1822, May 2010.
-
(2010)
IEEE Trans. Ind. Electron.
, vol.57
, Issue.5
, pp. 1814-1822
-
-
Chiu, C.H.1
-
31
-
-
1642415925
-
Control under constraints: An application of the command governor approach to an inverted pendulum
-
Jan.
-
A. Casavola, E. Mosca, and M. Papini, "Control under constraints: An application of the command governor approach to an inverted pendulum," IEEE Trans. Control Syst. Technol., vol. 12, no. 1, pp. 193-204, Jan. 2004.
-
(2004)
IEEE Trans. Control Syst. Technol.
, vol.12
, Issue.1
, pp. 193-204
-
-
Casavola, A.1
Mosca, E.2
Papini, M.3
-
32
-
-
33750417910
-
Adaptive fuzzy control of the inverted pendulum problem
-
DOI 10.1109/TCST.2006.880217
-
M. I. El-Hawwary, A. L. Elshafei, H. M. Emara, and H. A. A. Fattah, "Adaptive fuzzy control of the inverted pendulum problem," IEEE Trans. Control Syst. Technol., vol. 14, no. 6, pp. 1135-1144, Nov. 2006. (Pubitemid 44637628)
-
(2006)
IEEE Transactions on Control Systems Technology
, vol.14
, Issue.6
, pp. 1135-1144
-
-
El-Hawwary, M.I.1
Elshafei, A.L.2
Emara, H.M.3
Fattah, H.A.A.4
-
33
-
-
84865325212
-
Efficient kernel models for learning and approximate minimization problems
-
Nov.
-
C. Cervellera, M. Gaggero, and D. Maccio, "Efficient kernel models for learning and approximate minimization problems," Neurocomputing, vol. 97, pp. 74-85, Nov. 2012.
-
(2012)
Neurocomputing
, vol.97
, pp. 74-85
-
-
Cervellera, C.1
Gaggero, M.2
Maccio, D.3