Automatica, Volume 48, Issue 11, 2012, Pages 2850-2859

Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems

Author keywords

Adaptive control; LQR; Optimization under uncertainties; Policy iteration; Q learning

Indexed keywords

ADAPTIVE CONTROL; LQR; OPTIMIZATION UNDER UNCERTAINTY; POLICY ITERATION; Q-LEARNING

EID: 84867400046     PISSN: 0005-1098     EISSN: None     Source Type: Journal
DOI: 10.1016/j.automatica.2012.06.008     Document Type: Article
Times cited: 153

References (32)
  • 2
    • Baird, L. C. III (1994). Reinforcement learning in continuous-time: Advantage updating. In Proc. of ICNN, vol. 4 (pp. 2448-2453).
  • 5
    • Bertsekas, D. P.; & Tsitsiklis, J. N. (2005). Dynamic programming and suboptimal control: A survey from ADP to MPC. European Journal of Control, 11, 310-334.
  • 6
    • Bradtke, S. J.; & Ydstie, B. E. (1994). Adaptive linear quadratic control using policy iteration. In Proc. ACC (pp. 3475-3479).
  • 7
    • Dong, W.; & Farrell, J. A. (2009). Adaptive approximately optimal control of unknown nonlinear systems based on locally weighted learning. In Proc. CDC (pp. 345-350).
  • 8
    • Doya, K. (2000). Reinforcement learning in continuous time and space. Neural Computation, 12, 219-245.
  • 11
    • Kleinman, D. (1968). On the iterative technique for Riccati equation computations. IEEE Transactions on Automatic Control, 13(1), 114-115.
  • 14
    • Lee, J. M.; & Lee, J. H. (2004). Approximate dynamic programming strategies and their applicability for process control: A review and future directions. International Journal of Control, Automation, and Systems, 2(3), 263-278.
  • 15
    • Lee, J. Y.; Park, J. B.; & Choi, Y. H. (2009). Model-free approximate dynamic programming for continuous-time linear systems. In Proc. CDC (pp. 5009-5014).
  • 17
    • Lewis, F. L.; & Vamvoudakis, K. G. (2010). Reinforcement learning for partially observable dynamic processes: Adaptive dynamic programming using measured output data. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 41(1), 14-25.
  • 18
    • Lewis, F. L.; & Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 9(3), 32-50.
  • 19
    • Mehta, P.; & Meyn, S. (2009). Q-learning and Pontryagin's minimum principle. In Proc. CDC (pp. 3598-3605).
  • 25
    • Vamvoudakis, K. G.; & Lewis, F. L. (2010). Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica, 46(5), 878-888.
  • 26
    • Vrabie, D.; Pastravanu, O.; Abu-Khalaf, M.; & Lewis, F. L. (2009). Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica, 45(2), 477-484.
  • 30
    • Werbos, P. J. (1992). Approximate dynamic programming for real-time control and neural modeling. In D. A. White, & D. A. Sofge (Eds.), Handbook of Intelligent Control. New York: Van Nostrand Reinhold.
  • 32
    • Zhang, H.; Huang, J.; & Lewis, F. L. (2009). Algorithm and stability of ATC receding horizon control. In IEEE Symp. ADPRL (pp. 28-35).


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.