



Volume 22, Issue 1, 2014, Pages 146-156

Kernel-based approximate dynamic programming for real-time online learning control: An experimental study

Author keywords

Approximate dynamic programming (ADP); learning control; Markov decision processes (MDPs); online learning; reinforcement learning (RL)

Indexed keywords

APPROXIMATE DYNAMIC PROGRAMMING; DUAL HEURISTIC PROGRAMMING; FUNCTION APPROXIMATION TECHNIQUES; LEARNING CONTROL; MARKOV DECISION PROCESSES; MULTI-LAYER PERCEPTRON NEURAL NETWORKS; ONLINE LEARNING; UNCERTAIN NONLINEAR SYSTEMS;

EID: 84891557251     PISSN: 10636536     EISSN: None     Source Type: Journal    
DOI: 10.1109/TCST.2013.2246866     Document Type: Article
Times cited : (73)

References (33)
  • 4
    • 84873991429
    • Feedback optimal control of distributed parameter systems by using finite-dimensional approximation schemes
    • Jun.
    • A. Alessandri, M. Gaggero, and R. Zoppoli, "Feedback optimal control of distributed parameter systems by using finite-dimensional approximation schemes," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 6, pp. 984-996, Jun. 2012.
    • (2012) IEEE Trans. Neural Netw. Learn. Syst. , vol.23 , Issue.6 , pp. 984-996
    • Alessandri, A.1    Gaggero, M.2    Zoppoli, R.3
  • 5
    • 66449130966
    • Adaptive dynamic programming: An introduction
    • May
    • F. Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
    • (2009) IEEE Comput. Intell. Mag. , vol.4 , Issue.2 , pp. 39-47
    • Wang, F.Y.1    Zhang, H.2    Liu, D.3
  • 7
    • 70349116541
    • Reinforcement learning and adaptive dynamic programming for feedback control
    • Jul.-Sep.
    • F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Jul.-Sep. 2009.
    • (2009) IEEE Circuits Syst. Mag. , vol.9 , Issue.3 , pp. 32-50
    • Lewis, F.L.1    Vrabie, D.2
  • 9
    • 0013535965
    • Infinite-horizon policy-gradient estimation
    • Nov.
    • J. Baxter and P. L. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, pp. 319-350, Nov. 2001.
    • (2001) J. Artif. Intell. Res. , vol.15 , pp. 319-350
    • Baxter, J.1    Bartlett, P.L.2
  • 12
    • 79960468564
    • Asymptotic tracking by a reinforcement learning-based adaptive critic controller
    • Aug.
    • S. Bhasin, N. Sharma, P. Patre, and W. E. Dixon, "Asymptotic tracking by a reinforcement learning-based adaptive critic controller," J. Control Theory Appl., vol. 9, no. 3, pp. 400-409, Aug. 2011.
    • (2011) J. Control Theory Appl. , vol.9 , Issue.3 , pp. 400-409
    • Bhasin, S.1    Sharma, N.2    Patre, P.3    Dixon, W.E.4
  • 13
    • 74249090869
    • Adaptive critic design for energy minimization of portable video communication devices
    • Jun.
    • Z. Sun, "Adaptive critic design for energy minimization of portable video communication devices," IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 1, pp. 27-37, Jun. 2010.
    • (2010) IEEE Trans. Circuits Syst. Video Technol. , vol.20 , Issue.1 , pp. 27-37
    • Sun, Z.1
  • 14
    • 84863534247
    • Adaptive optimal control for designing automatic train regulation for metro line
    • Sep.
    • J.-W. Sheu and W.-S. Lin, "Adaptive optimal control for designing automatic train regulation for metro line," IEEE Trans. Control Syst. Technol., vol. 20, no. 5, pp. 1319-1327, Sep. 2012.
    • (2012) IEEE Trans. Control Syst. Technol. , vol.20 , Issue.5 , pp. 1319-1327
    • Sheu, J.-W.1    Lin, W.-S.2
  • 15
    • 83655163786
    • Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method
    • Dec.
    • H. Zhang, L. Cui, X. Zhang, and Y. Luo, "Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226-2236, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw. , vol.22 , Issue.12 , pp. 2226-2236
    • Zhang, H.1    Cui, L.2    Zhang, X.3    Luo, Y.4
  • 16
    • 40649106649
    • Natural actor-critic
    • Mar.
    • J. Peters and S. Schaal, "Natural actor-critic," Neurocomputing, vol. 71, nos. 7-9, pp. 1180-1190, Mar. 2008.
    • (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
    • Peters, J.1    Schaal, S.2
  • 17
    • 0036565019
    • Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator
    • DOI 10.1109/TNN.2002.1000146, PII S1045922702044417
    • G. K. Venayagamoorthy, R. G. Harley, and D. C. Wunsch, "Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator," IEEE Trans. Neural Netw., vol. 13, no. 3, pp. 764-773, May 2002. (Pubitemid 34669664)
    • (2002) IEEE Transactions on Neural Networks , vol.13 , Issue.3 , pp. 764-773
    • Venayagamoorthy, G.K.1    Harley, R.G.2    Wunsch, D.C.3
  • 18
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • May
    • K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, May 2010.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 19
    • 0043026775
    • Helicopter trimming and tracking control using direct neural dynamic programming
    • Jul.
    • R. Enns and J. Si, "Helicopter trimming and tracking control using direct neural dynamic programming," IEEE Trans. Neural Netw., vol. 14, no. 4, pp. 929-939, Jul. 2003.
    • (2003) IEEE Trans. Neural Netw. , vol.14 , Issue.4 , pp. 929-939
    • Enns, R.1    Si, J.2
  • 20
    • 49049106959
    • Direct heuristic dynamic programming for damping oscillations in a large power system
    • Aug.
    • C. Lu, J. Si, and X. Xie, "Direct heuristic dynamic programming for damping oscillations in a large power system," IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 38, no. 4, pp. 1008-1013, Aug. 2008.
    • (2008) IEEE Trans. Syst. Man Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 1008-1013
    • Lu, C.1    Si, J.2    Xie, X.3
  • 21
    • 49649121741
    • Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation
    • Aug.
    • P. Shih, B. C. Kaul, S. Jagannathan, and J. A. Drallmeier, "Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1369-1388, Aug. 2008.
    • (2008) IEEE Trans. Neural Netw. , vol.19 , Issue.8 , pp. 1369-1388
    • Shih, P.1    Kaul, B.C.2    Jagannathan, S.3    Drallmeier, J.A.4
  • 22
    • 0036832956
    • Kernel-based reinforcement learning
    • DOI 10.1023/A:1017928328829
    • D. Ormoneit and S. Sen, "Kernel-based reinforcement learning," Mach. Learn., vol. 49, nos. 2-3, pp. 161-178, Nov. 2002. (Pubitemid 34325684)
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 161-178
    • Ormoneit, D.1    Sen, S.2
  • 23
    • 1942421151
    • Bayes meets Bellman: The Gaussian process approach to temporal difference learning
    • Y. Engel, S. Mannor, and R. Meir, "Bayes meets Bellman: The Gaussian process approach to temporal difference learning," in Proc. 20th Int. Conf. Mach. Learn., 2003, pp. 154-161.
    • (2003) Proc. 20th Int. Conf. Mach. Learn. , pp. 154-161
    • Engel, Y.1    Mannor, S.2    Meir, R.3
  • 26
    • 34547098844
    • Kernel-based least squares policy iteration for reinforcement learning
    • DOI 10.1109/TNN.2007.899161, Neural Networks for Feedback Control Systems
    • X. Xu, D. Hu, and X. Lu, "Kernel-based least squares policy iteration for reinforcement learning," IEEE Trans. Neural Netw., vol. 18, no. 4, pp. 973-992, Jul. 2007. (Pubitemid 47098876)
    • (2007) IEEE Transactions on Neural Networks , vol.18 , Issue.4 , pp. 973-992
    • Xu, X.1    Hu, D.2    Lu, X.3
  • 29
    • 70349984547
    • Natural actor-critic algorithms
    • Nov.
    • S. Bhatnagar, R. S. Sutton, M. Ghavamzadeh, and M. Lee, "Natural actor-critic algorithms," Automatica, vol. 45, no. 11, pp. 2471-2482, Nov. 2009.
    • (2009) Automatica , vol.45 , Issue.11 , pp. 2471-2482
    • Bhatnagar, S.1    Sutton, R.S.2    Ghavamzadeh, M.3    Lee, M.4
  • 30
    • 77951149420
    • The design and implementation of a wheeled inverted pendulum using an adaptive output recurrent cerebellar model articulation controller
    • May
    • C. H. Chiu, "The design and implementation of a wheeled inverted pendulum using an adaptive output recurrent cerebellar model articulation controller," IEEE Trans. Ind. Electron., vol. 57, no. 5, pp. 1814-1822, May 2010.
    • (2010) IEEE Trans. Ind. Electron. , vol.57 , Issue.5 , pp. 1814-1822
    • Chiu, C.H.1
  • 31
    • 1642415925
    • Control under constraints: An application of the command governor approach to an inverted pendulum
    • Jan.
    • A. Casavola, E. Mosca, and M. Papini, "Control under constraints: An application of the command governor approach to an inverted pendulum," IEEE Trans. Control Syst. Technol., vol. 12, no. 1, pp. 193-204, Jan. 2004.
    • (2004) IEEE Trans. Control Syst. Technol. , vol.12 , Issue.1 , pp. 193-204
    • Casavola, A.1    Mosca, E.2    Papini, M.3
  • 33
    • 84865325212
    • Efficient kernel models for learning and approximate minimization problems
    • Nov.
    • C. Cervellera, M. Gaggero, and D. Maccio, "Efficient kernel models for learning and approximate minimization problems," Neurocomputing, vol. 97, pp. 74-85, Nov. 2012.
    • (2012) Neurocomputing , vol.97 , pp. 74-85
    • Cervellera, C.1    Gaggero, M.2    Maccio, D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.