Volume 22, Issue 13, 2012, Pages 1460-1483

Online solution of nonlinear two-player zero-sum games using synchronous policy iteration

Author keywords

approximate dynamic programming; Hamilton Jacobi Isaacs equation; Nash equilibrium; synchronous zero sum game policy iteration

Indexed keywords

ADAPTIVE LEARNING ALGORITHM; APPROXIMATE DYNAMIC PROGRAMMING; CLOSED LOOP STABILITY; COMPLEX NONLINEAR SYSTEM; CONTINUOUS TIME; GAME POLICIES; GAME PROBLEM; HAMILTON-JACOBI-ISAACS; HAMILTON-JACOBI-ISAACS EQUATIONS; INFINITE HORIZONS; NASH EQUILIBRIA; ON-LINE GAMING; OPTIMAL VALUE FUNCTIONS; OPTIMAL VALUES; PERSISTENCE OF EXCITATION; POLICY ITERATION; REAL TIME; SADDLE POINT; SIMULATION EXAMPLE; TUNING ALGORITHM; ZERO-SUM GAME;

EID: 84864463039     PISSN: 10498923     EISSN: 10991239     Source Type: Journal    
DOI: 10.1002/rnc.1760     Document Type: Article
Times cited: 192

References (34)
  • 5
    • 0033629916
    • Reinforcement learning in continuous time and space
    • Doya K. Reinforcement learning in continuous time and space. Neural Computation. 2000; 12(1): 219-245.
    • (2000) Neural Computation, vol. 12, issue 1, pp. 219-245
    • Doya, K.
  • 13
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • White D.A., Sofge D.A. (eds), Van Nostrand Reinhold, New York
    • Werbos PJ. 1992. Approximate dynamic programming for real-time control and neural modeling. In Handbook of Intelligent Control, White DA, Sofge DA (eds), Van Nostrand Reinhold, New York.
    • (1992) Handbook of Intelligent Control
    • Werbos, P.J.
  • 14
    • 77953770221
    • Ph.D. Thesis, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX
    • Vrabie D. 2009. Online adaptive optimal control for continuous time systems. Ph.D. Thesis, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX.
    • (2009) Online Adaptive Optimal Control for Continuous Time Systems
    • Vrabie, D.
  • 15
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • Vamvoudakis KG, Lewis FL. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica. 2010; 46(5): 878-888.
    • (2010) Automatica, vol. 46, issue 5, pp. 878-888
    • Vamvoudakis, K.G.; Lewis, F.L.
  • 20
    • 48949116222
    • Neurodynamic programming and zero-sum games for constrained control systems
    • Abu-Khalaf M, Lewis FL. Neurodynamic programming and zero-sum games for constrained control systems. IEEE Transactions on Neural Networks. 2008; 19(7): 1243-1252.
    • (2008) IEEE Transactions on Neural Networks, vol. 19, issue 7, pp. 1243-1252
    • Abu-Khalaf, M.; Lewis, F.L.
  • 22
    • 84914965022
    • On an iterative technique for Riccati equation computations
    • Kleinman D. On an iterative technique for Riccati equation computations. IEEE Transactions on Automatic Control. 1968; 13(1): 114-115.
    • (1968) IEEE Transactions on Automatic Control, vol. 13, issue 1, pp. 114-115
    • Kleinman, D.
  • 24
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • DOI 10.1016/j.automatica.2004.11.034, PII S0005109805000105
    • Abu-Khalaf M, Lewis FL. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica. 2005; 41(5): 779-791. (Pubitemid 40352391)
    • (2005) Automatica, vol. 41, issue 5, pp. 779-791
    • Abu-Khalaf, M.; Lewis, F.L.
  • 26
    • 0025627940
    • Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks
    • Hornik K, Stinchcombe M, White H. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks. 1990; 3(5): 551-560.
    • (1990) Neural Networks, vol. 3, issue 5, pp. 551-560
    • Hornik, K.; Stinchcombe, M.; White, H.
  • 33
    • 0004178386
    • Prentice-Hall: Upper Saddle River, NJ
    • Khalil HK. 1996. Nonlinear Systems. Prentice-Hall: Upper Saddle River, NJ.
    • (1996) Nonlinear Systems
    • Khalil, H.K.
  • 34
    • 79953151751
    • A model free robust policy iteration algorithm for optimal control of nonlinear systems
    • Atlanta, GA, 15-17 December
    • Bhasin S, Johnson M, Dixon WE. A model free robust policy iteration algorithm for optimal control of nonlinear systems. Proceedings of the 49th IEEE Conference on Decision and Control, Atlanta, GA, 15-17 December 2010; 3060-3065.
    • (2010) Proceedings of the 49th IEEE Conference on Decision and Control, pp. 3060-3065
    • Bhasin, S.; Johnson, M.; Dixon, W.E.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.