SCOPUS Information Retrieval Platform

IEEE Transactions on Control Systems Technology

Volume 22, Issue 1, 2014, Pages 146-156

Kernel-based approximate dynamic programming for real-time online learning control: An experimental study

(4) Xu, Xin a; Lian, Chuanqiang a; Zuo, Lei a; He, Haibo b

a NATIONAL UNIVERSITY OF DEFENSE TECHNOLOGY (China)

b The University of Rhode Island (United States)

Author keywords

Approximate dynamic programming (ADP); learning control; Markov decision processes (MDPs); online learning; reinforcement learning (RL)

Indexed keywords

APPROXIMATE DYNAMIC PROGRAMMING; DUAL HEURISTIC PROGRAMMING; FUNCTION APPROXIMATION TECHNIQUES; LEARNING CONTROL; MARKOV DECISION PROCESSES; MULTI-LAYER PERCEPTRON NEURAL NETWORKS; ONLINE LEARNING; UNCERTAIN NONLINEAR SYSTEMS;

HEURISTIC PROGRAMMING; LEARNING ALGORITHMS; MARKOV PROCESSES; NEURAL NETWORKS; NONLINEAR DYNAMICAL SYSTEMS; OPTIMAL CONTROL SYSTEMS; PENDULUMS; REAL TIME CONTROL; REINFORCEMENT LEARNING; UNCERTAINTY ANALYSIS;

E-LEARNING;

EID: 84891557251 PISSN: 10636536 EISSN: None Source Type: Journal
DOI: 10.1109/TCST.2013.2246866 Document Type: Article

Times cited: (73)

References (33)

1
- 0004102479
- Cambridge, MA, USA: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction, Cambridge, MA, USA: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

2
- 0029679044
- Reinforcement learning: A survey
- L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, May 1996. (Pubitemid 126646155)
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

3
- 47349092417
- New York, USA: Wiley
- W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, New York, USA: Wiley, 2007.
- (2007) Approximate Dynamic Programming: Solving the Curses of Dimensionality
- Powell, W.B.¹

4
- 84873991429
- Feedback optimal control of distributed parameter systems by using finite-dimensional approximation schemes
- Jun.
- A. Alessandri, M. Gaggero, and R. Zoppoli, "Feedback optimal control of distributed parameter systems by using finite-dimensional approximation schemes," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 6, pp. 984-996, Jun. 2012.
- (2012) IEEE Trans. Neural Netw. Learn. Syst. , vol.23 , Issue.6 , pp. 984-996
- Alessandri, A.¹ Gaggero, M.² Zoppoli, R.³

5
- 66449130966
- Adaptive dynamic programming: An introduction
- May
- F. Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
- (2009) IEEE Comput. Intell. Mag. , vol.4 , Issue.2 , pp. 39-47
- Wang, F.Y.¹ Zhang, H.² Liu, D.³

6
- 49049105169
- Ensemble algorithms in reinforcement learning
- Aug.
- M. A. Wiering and H. van Hasselt, "Ensemble algorithms in reinforcement learning," IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 38, no. 4, pp. 930-936, Aug. 2008.
- (2008) IEEE Trans. Syst. Man Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 930-936
- Wiering, M.A.¹ Van Hasselt, H.²

7
- 70349116541
- Reinforcement learning and adaptive dynamic programming for feedback control
- Jul-Sep.
- F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Jul.-Sep. 2009.
- (2009) IEEE Circuits Syst. Mag. , vol.9 , Issue.3 , pp. 32-50
- Lewis, F.L.¹ Vrabie, D.²

8
- 71149099079
- Fast gradient-descent methods for temporal-difference learning with linear function approximation
- R. S. Sutton, H. R. Maei, D. Precup, S. Bhatnagar, D. Silver, C. Szepesvári, and E. Wiewiora, "Fast gradient-descent methods for temporal-difference learning with linear function approximation," in Proc. Int. Conf. Mach. Learn., 2009, pp. 993-1000.
- (2009) Proc. Int. Conf. Mach. Learn. , pp. 993-1000
- Sutton, R.S.¹ Maei, H.R.² Precup, D.³ Bhatnagar, S.⁴ Silver, D.⁵ Szepesvári, C.⁶ Wiewiora, E.⁷

9
- 0013535965
- Infinite-horizon policy-gradient estimation
- Nov.
- J. Baxter and P. L. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, pp. 319-350, Nov. 2001.
- (2001) J. Artif. Intell. Res. , vol.15 , pp. 319-350
- Baxter, J.¹ Bartlett, P.L.²

10
- 84898938510
- Actor-critic algorithms
- Cambridge, MA, USA: MIT Press
- V. R. Konda and J. N. Tsitsiklis, "Actor-critic algorithms," in Advances in Neural Information Processing Systems, Cambridge, MA, USA: MIT Press, 2000.
- (2000) Advances in Neural Information Processing Systems
- Konda, V.R.¹ Tsitsiklis, J.N.²

11
- 0031236002
- Adaptive critic designs
- PII S1045922797052430
- D. V. Prokhorov and D. C. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997. (Pubitemid 127763331)
- (1997) IEEE Transactions on Neural Networks , vol.8 , Issue.5 , pp. 997-1007
- Prokhorov, D.V.¹ Wunsch II, D.C.²

12
- 79960468564
- Asymptotic tracking by a reinforcement learning-based adaptive critic controller
- Aug.
- S. Bhasin, N. Sharma, P. Patre, and W. E. Dixon, "Asymptotic tracking by a reinforcement learning-based adaptive critic controller," J. Control Theory Appl., vol. 9, no. 3, pp. 400-409, Aug. 2011.
- (2011) J. Control Theory Appl. , vol.9 , Issue.3 , pp. 400-409
- Bhasin, S.¹ Sharma, N.² Patre, P.³ Dixon, W.E.⁴

13
- 74249090869
- Adaptive critic design for energy minimization of portable video communication devices
- Jun.
- Z. Sun, "Adaptive critic design for energy minimization of portable video communication devices," IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 1, pp. 27-37, Jun. 2010.
- (2010) IEEE Trans. Circuits Syst. Video Technol. , vol.20 , Issue.1 , pp. 27-37
- Sun, Z.¹

14
- 84863534247
- Adaptive optimal control for designing automatic train regulation for metro line
- Sep.
- J.-W. Sheu and W.-S. Lin, "Adaptive optimal control for designing automatic train regulation for metro line," IEEE Trans. Control Syst. Technol., vol. 20, no. 5, pp. 1319-1327, Sep. 2012.
- (2012) IEEE Trans. Control Syst. Technol. , vol.20 , Issue.5 , pp. 1319-1327
- Sheu, J.-W.¹ Lin, W.-S.²

15
- 83655163786
- Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method
- Dec.
- H. Zhang, L. Cui, X. Zhang, and Y. Luo, "Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226-2236, Dec. 2011.
- (2011) IEEE Trans. Neural Netw. , vol.22 , Issue.12 , pp. 2226-2236
- Zhang, H.¹ Cui, L.² Zhang, X.³ Luo, Y.⁴

16
- 40649106649
- Natural actor-critic
- Mar.
- J. Peters and S. Schaal, "Natural actor-critic," Neurocomputing, vol. 71, nos. 7-9, pp. 1180-1190, Mar. 2008.
- (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
- Peters, J.¹ Schaal, S.²

17
- 0036565019
- Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator
- DOI 10.1109/TNN.2002.1000146, PII S1045922702044417
- G. K. Venayagamoorthy, R. G. Harley, and D. C. Wunsch, "Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator," IEEE Trans. Neural Netw., vol. 13, no. 3, pp. 764-773, May 2002. (Pubitemid 34669664)
- (2002) IEEE Transactions on Neural Networks , vol.13 , Issue.3 , pp. 764-773
- Venayagamoorthy, G.K.¹ Harley, R.G.² Wunsch, D.C.³

18
- 77950630017
- Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
- May
- K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, May 2010.
- (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
- Vamvoudakis, K.G.¹ Lewis, F.L.²

19
- 0043026775
- Helicopter trimming and tracking control using direct neural dynamic programming
- Jul.
- R. Enns and J. Si, "Helicopter trimming and tracking control using direct neural dynamic programming," IEEE Trans. Neural Netw., vol. 14, no. 4, pp. 929-939, Jul. 2003.
- (2003) IEEE Trans. Neural Netw. , vol.14 , Issue.4 , pp. 929-939
- Enns, R.¹ Si, J.²

20
- 49049106959
- Direct heuristic dynamic programming for damping oscillations in a large power system
- Aug.
- C. Lu, J. Si, and X. Xie, "Direct heuristic dynamic programming for damping oscillations in a large power system," IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 38, no. 4, pp. 1008-1013, Aug. 2008.
- (2008) IEEE Trans. Syst. Man Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 1008-1013
- Lu, C.¹ Si, J.² Xie, X.³

21
- 49649121741
- Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation
- Aug.
- P. Shih, B. C. Kaul, S. Jagannathan, and J. A. Drallmeier, "Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1369-1388, Aug. 2008.
- (2008) IEEE Trans. Neural Netw. , vol.19 , Issue.8 , pp. 1369-1388
- Shih, P.¹ Kaul, B.C.² Jagannathan, S.³ Drallmeier, J.A.⁴

22
- 0036832956
- Kernel-based reinforcement learning
- DOI 10.1023/A:1017928328829
- D. Ormoneit and S. Sen, "Kernel-based reinforcement learning," Mach. Learn., vol. 49, nos. 2-3, pp. 161-178, Nov. 2002. (Pubitemid 34325684)
- (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 161-178
- Ormoneit, D.¹ Sen, S.²

23
- 1942421151
- Bayes meets Bellman: The Gaussian process approach to temporal difference learning
- Y. Engel, S. Mannor, and R. Meir, "Bayes meets Bellman: The Gaussian process approach to temporal difference learning," in Proc. 20th Int. Conf. Mach. Learn., 2003, pp. 154-161.
- (2003) Proc. 20th Int. Conf. Mach. Learn. , pp. 154-161
- Engel, Y.¹ Mannor, S.² Meir, R.³

24
- 84899029004
- Batch value function approximation via support vectors
- Cambridge, MA, USA: MIT Press
- T. G. Dietterich and X. Wang, "Batch value function approximation via support vectors," in Advances in Neural Information Processing Systems, Cambridge, MA, USA: MIT Press, 2002.
- (2002) Advances in Neural Information Processing Systems
- Dietterich, T.G.¹ Wang, X.²

25
- 84899026055
- Gaussian Processes in reinforcement learning
- Cambridge, MA, USA: MIT Press
- C. E. Rasmussen and M. Kuss, "Gaussian Processes in reinforcement learning," in Advances in Neural Information Processing Systems, Cambridge, MA, USA: MIT Press, 2004.
- (2004) Advances in Neural Information Processing Systems
- Rasmussen, C.E.¹ Kuss, M.²

26
- 34547098844
- Kernel-based least squares policy iteration for reinforcement learning
- DOI 10.1109/TNN.2007.899161, Neural Networks for Feedback Control Systems
- X. Xu, D. Hu, and X. Lu, "Kernel-based least squares policy iteration for reinforcement learning," IEEE Trans. Neural Netw., vol. 18, no. 4, pp. 973-992, Jul. 2007. (Pubitemid 47098876)
- (2007) IEEE Transactions on Neural Networks , vol.18 , Issue.4 , pp. 973-992
- Xu, X.¹ Hu, D.² Lu, X.³

27
- 0003991806
- New York, USA: Wiley
- V. Vapnik, Statistical Learning Theory, New York, USA: Wiley, 1998.
- (1998) Statistical Learning Theory
- Vapnik, V.¹

28
- 0004094721
- Cambridge, MA, USA: MIT Press
- B. Schölkopf and A. Smola, Learning with Kernels, Cambridge, MA, USA: MIT Press, 2002.
- (2002) Learning with Kernels
- Schölkopf, B.¹ Smola, A.²

29
- 70349984547
- Natural actor-critic algorithms
- Nov.
- S. Bhatnagar, R. S. Sutton, M. Ghavamzadeh, and M. Lee, "Natural actor-critic algorithms," Automatica, vol. 45, no. 11, pp. 2471-2482, Nov. 2009.
- (2009) Automatica , vol.45 , Issue.11 , pp. 2471-2482
- Bhatnagar, S.¹ Sutton, R.S.² Ghavamzadeh, M.³ Lee, M.⁴

30
- 77951149420
- The design and implementation of a wheeled inverted pendulum using an adaptive output recurrent cerebellar model articulation controller
- May
- C. H. Chiu, "The design and implementation of a wheeled inverted pendulum using an adaptive output recurrent cerebellar model articulation controller," IEEE Trans. Ind. Electron., vol. 57, no. 5, pp. 1814-1822, May 2010.
- (2010) IEEE Trans. Ind. Electron. , vol.57 , Issue.5 , pp. 1814-1822
- Chiu, C.H.¹

31
- 1642415925
- Control under constraints: An application of the command governor approach to an inverted pendulum
- Jan.
- A. Casavola, E. Mosca, and M. Papini, "Control under constraints: An application of the command governor approach to an inverted pendulum," IEEE Trans. Control Syst. Technol., vol. 12, no. 1, pp. 193-204, Jan. 2004.
- (2004) IEEE Trans. Control Syst. Technol. , vol.12 , Issue.1 , pp. 193-204
- Casavola, A.¹ Mosca, E.² Papini, M.³

32
- 33750417910
- Adaptive fuzzy control of the inverted pendulum problem
- DOI 10.1109/TCST.2006.880217
- M. I. El-Hawwary, A. L. Elshafei, H. M. Emara, and H. A. A. Fattah, "Adaptive fuzzy control of the inverted pendulum problem," IEEE Trans. Control Syst. Technol., vol. 14, no. 6, pp. 1135-1144, Nov. 2006. (Pubitemid 44637628)
- (2006) IEEE Transactions on Control Systems Technology , vol.14 , Issue.6 , pp. 1135-1144
- El-Hawwary, M.I.¹ Elshafei, A.L.² Emara, H.M.³ Fattah, H.A.A.⁴

33
- 84865325212
- Efficient kernel models for learning and approximate minimization problems
- Nov.
- C. Cervellera, M. Gaggero, and D. Maccio, "Efficient kernel models for learning and approximate minimization problems," Neurocomputing, vol. 97, pp. 74-85, Nov. 2012.
- (2012) Neurocomputing , vol.97 , pp. 74-85
- Cervellera, C.¹ Gaggero, M.² Maccio, D.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.