Volume 23, Issue 12, 2012, Pages 1884-1895

Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control

Author keywords

H∞ state feedback control; Hamilton-Jacobi-Isaacs equation; neural network; online; simultaneous policy update algorithm

Indexed keywords

HAMILTON-JACOBI-ISAACS; HAMILTON-JACOBI-ISAACS EQUATIONS; LEAST SQUARE METHODS; NONLINEAR PARTIAL DIFFERENTIAL EQUATIONS; ONLINE; REINFORCEMENT LEARNING TECHNIQUES; UNKNOWN ENVIRONMENTS; UPDATE ALGORITHMS;

EID: 84876909440     PISSN: 2162237X     EISSN: 21622388     Source Type: Journal    
DOI: 10.1109/TNNLS.2012.2217349     Document Type: Article
Times cited : (191)

References (49)
  • 4
    • 0029264110
    • H∞ control via measurement feedback for general nonlinear systems
    • Mar
    • A. Isidori and W. Kang, "H∞ control via measurement feedback for general nonlinear systems," IEEE Trans. Autom. Control, vol. 40, no. 3, pp. 466-472, Mar. 1995.
    • (1995) IEEE Trans. Autom. Control , vol.40 , Issue.3 , pp. 466-472
    • Isidori, A.1    Kang, W.2
  • 5
    • 1442313356
    • Global H∞ controllers for a class of nonlinear systems
    • Feb
    • G. Bianchini, R. Genesio, A. Parenti, and A. Tesi, "Global H∞ controllers for a class of nonlinear systems," IEEE Trans. Autom. Control, vol. 49, no. 2, pp. 244-249, Feb. 2004.
    • (2004) IEEE Trans. Autom. Control , vol.49 , Issue.2 , pp. 244-249
    • Bianchini, G.1    Genesio, R.2    Parenti, A.3    Tesi, A.4
  • 6
    • 0026883666
    • L2-gain analysis of nonlinear systems and nonlinear state-feedback H∞ control
    • Jun
    • A. J. van der Schaft, "L2-gain analysis of nonlinear systems and nonlinear state-feedback H∞ control," IEEE Trans. Autom. Control, vol. 37, no. 6, pp. 770-784, Jun. 1992.
    • (1992) IEEE Trans. Autom. Control , vol.37 , Issue.6 , pp. 770-784
    • van der Schaft, A.J.1
  • 7
    • 49049089962
    • Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
    • Aug
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern., B, Cybern , vol.38 , Issue.4 , pp. 943-949
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 8
    • 49049119493
    • A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm
    • Aug
    • H. Zhang, Q. Wei, and Y. Luo, "A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 937-942, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern., B, Cybern , vol.38 , Issue.4 , pp. 937-942
    • Zhang, H.1    Wei, Q.2    Luo, Y.3
  • 9
    • 49049091364
    • Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks
    • Aug
    • Q. M. Yang, J. B. Vance, and S. Jagannathan, "Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 994-1001, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern., B, Cybern , vol.38 , Issue.4 , pp. 994-1001
    • Yang, Q.M.1    Vance, J.B.2    Jagannathan, S.3
  • 10
    • 58349110975
    • Adaptive optimal control for continuous-time linear systems based on policy iteration
    • D. Vrabie, O. Pastravanu, M. Abu-Khalaf, and F. L. Lewis, "Adaptive optimal control for continuous-time linear systems based on policy iteration," Automatica, vol. 45, no. 2, pp. 477-484, 2009.
    • (2009) Automatica , vol.45 , Issue.2 , pp. 477-484
    • Vrabie, D.1    Pastravanu, O.2    Abu-Khalaf, M.3    Lewis, F.L.4
  • 11
    • 67349145396
    • Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
    • D. Vrabie and F. L. Lewis, "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems," Neural Netw., vol. 22, no. 3, pp. 237-246, 2009.
    • (2009) Neural Netw , vol.22 , Issue.3 , pp. 237-246
    • Vrabie, D.1    Lewis, F.L.2
  • 12
    • 70349253929
    • Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints
    • Sep
    • H. Zhang, Y. Luo, and D. Liu, "Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints," IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1490-1503, Sep. 2009.
    • (2009) IEEE Trans. Neural Netw , vol.20 , Issue.9 , pp. 1490-1503
    • Zhang, H.1    Luo, Y.2    Liu, D.3
  • 13
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, 2010.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 14
    • 78651311269
    • Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with ε-error bound
    • Jan.
    • F. Wang, N. Jin, D. Liu, and Q. Wei, "Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with ε-error bound," IEEE Trans. Neural Netw., vol. 22, no. 1, pp. 24-36, Jan. 2011.
    • (2011) IEEE Trans. Neural Netw , vol.22 , Issue.1 , pp. 24-36
    • Wang, F.1    Jin, N.2    Liu, D.3    Wei, Q.4
  • 15
    • 84875270081
    • Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update
    • Jul.
    • T. Dierks and S. Jagannathan, "Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 7, pp. 1118-1129, Jul. 2012.
    • (2012) IEEE Trans. Neural Netw. Learn. Syst , vol.23 , Issue.7 , pp. 1118-1129
    • Dierks, T.1    Jagannathan, S.2
  • 16
    • 83655163786
    • Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method
    • Dec.
    • H. Zhang, L. Cui, X. Zhang, and Y. Luo, "Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226-2236, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw , vol.22 , Issue.12 , pp. 2226-2236
    • Zhang, H.1    Cui, L.2    Zhang, X.3    Luo, Y.4
  • 17
    • 83655167263
    • Approximate dynamic programming for optimal stationary control with control-dependent noise
    • Dec.
    • Y. Jiang and Z. P. Jiang, "Approximate dynamic programming for optimal stationary control with control-dependent noise," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2392-2398, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw , vol.22 , Issue.12 , pp. 2392-2398
    • Jiang, Y.1    Jiang, Z.P.2
  • 18
    • 83855165164
    • Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming
    • Dec.
    • H. Zhang, R. Song, Q. Wei, and T. Zhang, "Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 1851-1862, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw , vol.22 , Issue.12 , pp. 1851-1862
    • Zhang, H.1    Song, R.2    Wei, Q.3    Zhang, T.4
  • 20
    • 70349116541
    • Reinforcement learning and adaptive dynamic programming for feedback control
    • Sep
    • F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Sep. 2009.
    • (2009) IEEE Circuits Syst. Mag , vol.9 , Issue.3 , pp. 32-50
    • Lewis, F.L.1    Vrabie, D.2
  • 21
    • 66449130966
    • Adaptive dynamic programming: An introduction
    • May
    • F. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
    • (2009) IEEE Comput. Intell. Mag , vol.4 , Issue.2 , pp. 39-47
    • Wang, F.1    Zhang, H.2    Liu, D.3
  • 24
    • 83855164075
    • Hierarchical approximate policy iteration with binary-tree state space decomposition
    • Dec.
    • X. Xu, C. Liu, S. X. Yang, and D. Hu, "Hierarchical approximate policy iteration with binary-tree state space decomposition," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 1863-1877, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw , vol.22 , Issue.12 , pp. 1863-1877
    • Xu, X.1    Liu, C.2    Yang, S.X.3    Hu, D.4
  • 25
    • 84876158475
    • Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks
    • Oct.
    • M. Fairbank, E. Alonso, and D. Prokhorov, "Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 10, pp. 1671-1676, Oct. 2012.
    • (2012) IEEE Trans. Neural Netw. Learn. Syst , vol.23 , Issue.10 , pp. 1671-1676
    • Fairbank, M.1    Alonso, E.2    Prokhorov, D.3
  • 26
    • 61849156874
    • A game theoretic algorithm to compute local stabilizing solutions to HJBI equations in nonlinear H∞ control
    • Y. Feng, B. Anderson, and M. Rotkowitz, "A game theoretic algorithm to compute local stabilizing solutions to HJBI equations in nonlinear H∞ control," Automatica, vol. 45, no. 4, pp. 881-888, 2009.
    • (2009) Automatica , vol.45 , Issue.4 , pp. 881-888
    • Feng, Y.1    Anderson, B.2    Rotkowitz, M.3
  • 27
    • 56549098855
    • Computing the positive stabilizing solution to algebraic Riccati equations with an indefinite quadratic term via a recursive method
    • Nov
    • A. Lanzon, Y. Feng, B. D. O. Anderson, and M. Rotkowitz, "Computing the positive stabilizing solution to algebraic Riccati equations with an indefinite quadratic term via a recursive method," IEEE Trans. Autom. Control, vol. 53, no. 10, pp. 2280-2291, Nov. 2008.
    • (2008) IEEE Trans. Autom. Control , vol.53 , Issue.10 , pp. 2280-2291
    • Lanzon, A.1    Feng, Y.2    Anderson, B.D.O.3    Rotkowitz, M.4
  • 28
    • 0029371239
    • Numerical approach to computing nonlinear H∞ control laws
    • J. Huang and C. Lin, "Numerical approach to computing nonlinear H∞ control laws," AIAA J. Guidance, Control, Dynamics, vol. 18, no. 5, pp. 989-994, 1995.
    • (1995) AIAA J. Guidance, Control, Dynamics , vol.18 , Issue.5 , pp. 989-994
    • Huang, J.1    Lin, C.2
  • 29
    • 0018441647
    • An approximation theory of optimal control for trainable manipulators
    • Mar
    • G. N. Saridis and C. G. Lee, "An approximation theory of optimal control for trainable manipulators," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 9, no. 3, pp. 152-159, Mar. 1979.
    • (1979) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.9 , Issue.3 , pp. 152-159
    • Saridis, G.N.1    Lee, C.G.2
  • 30
    • 0031332446
    • Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
    • R. Beard, G. N. Saridis, and J. Wen, "Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation," Automatica, vol. 33, no. 12, pp. 2159-2177, 1997.
    • (1997) Automatica , vol.33 , Issue.12 , pp. 2159-2177
    • Beard, R.1    Saridis, G.N.2    Wen, J.3
  • 31
    • 0032387028
    • Approximate solutions to the time-invariant Hamilton-Jacobi-Bellman equation
    • R. Beard, G. N. Saridis, and J. Wen, "Approximate solutions to the time-invariant Hamilton-Jacobi-Bellman equation," J. Optim. Theory Appl., vol. 96, no. 3, pp. 589-626, 1998.
    • (1998) J. Optim. Theory Appl , vol.96 , Issue.3 , pp. 589-626
    • Beard, R.1    Saridis, G.N.2    Wen, J.3
  • 32
    • 0032202335
    • Successive Galerkin approximation algorithms for nonlinear optimal and robust control
    • R. W. Beard and T. W. McLain, "Successive Galerkin approximation algorithms for nonlinear optimal and robust control," Int. J. Control, vol. 71, no. 5, pp. 717-743, 1998.
    • (1998) Int. J. Control , vol.71 , Issue.5 , pp. 717-743
    • Beard, R.W.1    McLain, T.W.2
  • 33
    • 84864463039
    • Online solution of nonlinear two-player zero-sum games using synchronous policy iteration
    • K. G. Vamvoudakis and F. L. Lewis, "Online solution of nonlinear two-player zero-sum games using synchronous policy iteration," Int. J. Robust Nonlinear Control, vol. 22, no. 13, pp. 1460-1483, 2011.
    • (2011) Int. J. Robust Nonlinear Control , vol.22 , Issue.13 , pp. 1460-1483
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 34
    • 33845759425
    • Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation
    • Dec
    • M. Abu-Khalaf, F. L. Lewis, and J. Huang, "Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation," IEEE Trans. Autom. Control, vol. 51, no. 12, pp. 1989-1995, Dec. 2006.
    • (2006) IEEE Trans. Autom. Control , vol.51 , Issue.12 , pp. 1989-1995
    • Abu-Khalaf, M.1    Lewis, F.L.2    Huang, J.3
  • 35
    • 48949116222
    • Neurodynamic programming and zero-sum games for constrained control systems
    • Jul
    • M. Abu-Khalaf, F. L. Lewis, and J. Huang, "Neurodynamic programming and zero-sum games for constrained control systems," IEEE Trans. Neural Netw., vol. 19, no. 7, pp. 1243-1252, Jul. 2008.
    • (2008) IEEE Trans. Neural Netw , vol.19 , Issue.7 , pp. 1243-1252
    • Abu-Khalaf, M.1    Lewis, F.L.2    Huang, J.3
  • 37
    • 79960443754
    • Adaptive dynamic programming for online solution of a zero-sum differential game
    • D. Vrabie and F. L. Lewis, "Adaptive dynamic programming for online solution of a zero-sum differential game," J. Control Theory Appl., vol. 9, no. 3, pp. 353-360, 2011.
    • (2011) J. Control Theory Appl , vol.9 , Issue.3 , pp. 353-360
    • Vrabie, D.1    Lewis, F.L.2
  • 39
    • 51249194918
    • The method of successive approximation for functional equations
    • L. Kantorovitch, "The method of successive approximation for functional equations," Acta Math., vol. 71, no. 1, pp. 63-97, 1939.
    • (1939) Acta Math , vol.71 , Issue.1 , pp. 63-97
    • Kantorovitch, L.1
  • 40
    • 0000816132
    • The Kantorovich theorem for Newton's method
    • R. A. Tapia, "The Kantorovich theorem for Newton's method," Amer. Math. Monthly, vol. 78, no. 4, pp. 389-392, 1971.
    • (1971) Amer. Math. Monthly , vol.78 , Issue.4 , pp. 389-392
    • Tapia, R.A.1
  • 41
    • 0002521058
    • A note on the convergence of Newton's method
    • L. B. Rall, "A note on the convergence of Newton's method," SIAM J. Numer. Anal., vol. 11, no. 1, pp. 34-36, 1974.
    • (1974) SIAM J. Numer. Anal , vol.11 , Issue.1 , pp. 34-36
    • Rall, L.B.1
  • 42
    • 0020970738
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • May
    • A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 13, no. 5, pp. 834-846, May 1983.
    • (1983) IEEE Trans. Syst., Man, Cybern., B, Cybern , vol.13 , Issue.5 , pp. 834-846
    • Barto, A.G.1    Sutton, R.S.2    Anderson, C.W.3
  • 43
    • 0042758707
    • Ph.D dissertation, Dept. Electr. Eng. & Comput. Sci., Massachusetts Inst. Technology, Cambridge
    • V. Konda, "On actor-critic algorithms," Ph.D dissertation, Dept. Electr. Eng. & Comput. Sci., Massachusetts Inst. Technology, Cambridge, 2002.
    • (2002) On Actor-critic Algorithms
    • Konda, V.1
  • 44
    • 4043069840
    • On actor-critic algorithms
    • V. Konda and J. N. Tsitsiklis, "On actor-critic algorithms," SIAM J. Control Optim., vol. 42, no. 4, pp. 1143-1166, 2003.
    • (2003) SIAM J. Control Optim , vol.42 , Issue.4 , pp. 1143-1166
    • Konda, V.1    Tsitsiklis, J.N.2
  • 45
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
    • (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
    • Abu-Khalaf, M.1    Lewis, F.L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.