Volume 222, 2013, Pages 472-485

Simultaneous policy update algorithms for learning the solution of linear continuous-time H∞ state feedback control

Author keywords

Algebraic Riccati equation; H∞ state feedback control; Lyapunov equation; Offline; Online; Simultaneous policy update algorithm

Indexed keywords

ACTION POLICIES; COMPARATIVE SIMULATION; FIXED POINT EQUATION; INTERNAL SYSTEMS; LINEAR CONTINUOUS-TIME; LYAPUNOV EQUATION; MODEL BASED APPROACH; MODEL FREE; OFFLINE; ONLINE; ONLINE VERSIONS; ZERO-SUM GAME

EID: 84870062175     PISSN: 00200255     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ins.2012.08.012     Document Type: Article
Times cited: 81

References (33)
  • 1
    • 48949116222
    • Neurodynamic programming and zero-sum games for constrained control systems
    • (2008) IEEE Transactions on Neural Networks , vol.19 , Issue.7 , pp. 1243-1252
    • Abu-Khalaf, M.1    Lewis, F.L.2    Huang, J.3
  • 2
    • 0032202335
    • Successive Galerkin approximation algorithms for nonlinear optimal and robust control
    • (1998) International Journal of Control , vol.71 , Issue.5 , pp. 717-743
    • Beard, R.W.1    McLain, T.W.2
  • 3
    • 0031332446
    • Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
    • (1997) Automatica , vol.33 , Issue.12 , pp. 2159-2177
    • Beard, R.W.1    Saridis, G.N.2    Wen, J.3
  • 5
    • 79960439729
    • Approximate policy iteration: a survey and some new methods
    • (2011) Journal of Control Theory and Applications , vol.9 , Issue.3 , pp. 310-335
    • Bertsekas, D.P.1
  • 8
    • 79958784455
    • A numerical procedure to compute the stabilising solution of game theoretic Riccati equations of stochastic control
    • (2011) International Journal of Control , vol.84 , Issue.4 , pp. 783-800
    • Dragan, V.1    Ivanov, I.2
  • 9
    • 79953793452
    • A new iterative algorithm to solve periodic Riccati differential equations with sign indefinite quadratic terms
    • (2011) IEEE Transactions on Automatic Control , vol.56 , Issue.4 , pp. 929-934
    • Feng, Y.T.1    Varga, A.2    Anderson, B.D.O.3    Lovera, M.4
  • 10
    • 74449083177
    • An iterative algorithm to solve state-perturbed stochastic algebraic Riccati equations in LQ zero-sum games
    • (2010) Systems & Control Letters , vol.59 , Issue.1 , pp. 50-56
    • Feng, Y.T.1    Anderson, B.D.O.2
  • 12
    • 79953906172
    • Self-organizing state aggregation for architecture design of Q-learning
    • (2011) Information Sciences , vol.181 , Issue.13 , pp. 2813-2822
    • Hwang, K.-S.1    Lin, H.-Y.2    Hsu, Y.-P.3    Yu, H.-H.4
  • 13
  • 14
    • 51249194918
    • The method of successive approximation for functional equations
    • (1939) Acta Mathematica , vol.71 , Issue.1 , pp. 63-97
    • Kantorovitch, L.1
  • 15
    • 84914965022
    • On an iterative technique for Riccati equation computations
    • (1968) IEEE Transactions on Automatic Control , vol.13 , Issue.1 , pp. 114-115
    • Kleinman, D.L.1
  • 16
    • 56549098855
    • Computing the positive stabilizing solution to algebraic Riccati equations with an indefinite quadratic term via a recursive method
    • (2008) IEEE Transactions on Automatic Control , vol.53 , Issue.10 , pp. 2280-2291
    • Lanzon, A.1    Feng, Y.2    Anderson, B.D.O.3    Rotkowitz, M.4
  • 17
    • 70349116541
    • Reinforcement learning and adaptive dynamic programming for feedback control
    • (2009) IEEE Circuits and Systems Magazine , vol.9 , Issue.3 , pp. 32-50
    • Lewis, F.L.1    Vrabie, D.2
  • 19
    • 0002521058
    • A note on the convergence of Newton's method
    • (1974) SIAM Journal on Numerical Analysis , vol.11 , Issue.1 , pp. 34-36
    • Rall, L.B.1
  • 23
    • 0000816132
    • The Kantorovich theorem for Newton's method
    • (1971) The American Mathematical Monthly , vol.78 , Issue.4 , pp. 389-392
    • Tapia, R.A.1
  • 25
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 28
    • 79952312120
    • Hessian matrix distribution for Bayesian policy gradient reinforcement learning
    • (2011) Information Sciences , vol.181 , Issue.9 , pp. 1671-1685
    • Vien, N.A.1    Yu, H.2    Chung, T.C.3
  • 29
    • 79960443754
    • Adaptive dynamic programming for online solution of a zero-sum differential game
    • (2011) Journal of Control Theory and Applications , vol.9 , Issue.3 , pp. 353-360
    • Vrabie, D.1    Lewis, F.L.2
  • 30
    • 58349110975
    • Adaptive optimal control for continuous-time linear systems based on policy iteration
    • (2009) Automatica , vol.45 , Issue.2 , pp. 477-484
    • Vrabie, D.1    Pastravanu, O.2    Abu-Khalaf, M.3    Lewis, F.L.4
  • 32
    • 34250731840
    • A fuzzy actor-critic reinforcement learning network
    • (2007) Information Sciences , vol.177 , Issue.18 , pp. 3764-3781
    • Wang, X.1    Cheng, Y.2    Yi, J.3
  • 33
    • 78650805234
    • An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games
    • (2011) Automatica , vol.47 , Issue.1 , pp. 207-214
    • Zhang, H.1    Wei, Q.2    Liu, D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.