



Volume 26, Issue 8, 2015, Pages 1645-1658

Approximate N-Player Nonzero-Sum Game Solution for an Uncertain Continuous Nonlinear System

Author keywords

Actor critic (AC) methods; adaptive control; adaptive dynamic programming; differential games; optimal control

Indexed keywords

CONTINUOUS TIME SYSTEMS; GAME THEORY; LEAST SQUARES APPROXIMATIONS; NEURAL NETWORKS; OPTIMIZATION;

EID: 84937390462     PISSN: 2162237X     EISSN: 21622388     Source Type: Journal    
DOI: 10.1109/TNNLS.2014.2350835     Document Type: Article
Times cited: 80

References (59)
  • 3
    • 0004071782
    • (Classics in Applied Mathematics), 2nd ed Philadelphia, PA, USA: SIAM
    • T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory (Classics in Applied Mathematics), 2nd ed. Philadelphia, PA, USA: SIAM, 1999.
    • (1999) Dynamic Noncooperative Game Theory
    • Basar, T.1    Olsder, G.J.2
  • 4
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
    • (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
    • Abu-Khalaf, M.1    Lewis, F.L.2
  • 5
    • 0001316642
    • Further properties of nonzero-sum differential games
    • A. W. Starr and Y. C. Ho, "Further properties of nonzero-sum differential games," J. Optim. Theory Appl., vol. 3, no. 4, pp. 207-219, 1969.
    • (1969) J. Optim. Theory Appl. , vol.3 , Issue.4 , pp. 207-219
    • Starr, A.W.1    Ho, Y.C.2
  • 7
    • 0014509068
    • Toward a theory of many player differential games
    • J. H. Case, "Toward a theory of many player differential games," SIAM J. Control, vol. 7, no. 2, pp. 179-197, 1969.
    • (1969) SIAM J. Control , vol.7 , Issue.2 , pp. 179-197
    • Case, J.H.1
  • 10
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand Reinhold
    • P. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand Reinhold, 1992.
    • (1992) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches
    • Werbos, P.1
  • 13
    • 49049089962
    • Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
    • Aug
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 943-949
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 14
    • 33846781129
    • Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control," Automatica, vol. 43, no. 3, pp. 473-481, 2007.
    • (2007) Automatica , vol.43 , Issue.3 , pp. 473-481
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 16
    • 0015667648
    • Punish/reward: Learning with a critic in adaptive threshold systems
    • Sep
    • B. Widrow, N. Gupta, and S. Maitra, "Punish/reward: Learning with a critic in adaptive threshold systems," IEEE Trans. Syst. Man Cybern., vol. SMC-3, no. 5, pp. 455-465, Sep. 1973.
    • (1973) IEEE Trans. Syst. Man Cybern. , vol.SMC-3 , Issue.5 , pp. 455-465
    • Widrow, B.1    Gupta, N.2    Maitra, S.3
  • 17
    • 0020970738
    • Neuron-like adaptive elements that can solve difficult learning control problems
    • Sep./Oct
    • A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuron-like adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst. Man Cybern., vol. SMC-13, no. 5, pp. 834-846, Sep./Oct. 1983.
    • (1983) IEEE Trans. Syst. Man Cybern. , vol.SMC-13 , Issue.5 , pp. 834-846
    • Barto, A.G.1    Sutton, R.S.2    Anderson, C.W.3
  • 18
    • 0026852362
    • Reinforcement learning is direct adaptive optimal control
    • Apr
    • R. S. Sutton, A. G. Barto, and R. J. Williams, "Reinforcement learning is direct adaptive optimal control," IEEE Control Syst. Mag., vol. 12, no. 2, pp. 19-22, Apr. 1992.
    • (1992) IEEE Control Syst. Mag. , vol.12 , Issue.2 , pp. 19-22
    • Sutton, R.S.1    Barto, A.G.2    Williams, R.J.3
  • 19
    • 0033285710
    • Adaptive critic neural network for feedforward compensation
    • Jun
    • J. Campos and F. L. Lewis, "Adaptive critic neural network for feedforward compensation," in Proc. Amer. Control Conf., vol. 4. Jun. 1999, pp. 2813-2818.
    • (1999) Proc. Amer. Control Conf. , vol.4 , pp. 2813-2818
    • Campos, J.1    Lewis, F.L.2
  • 20
    • 33847648898
    • Adaptive critic designs for discrete-time zero-sum games with application to H∞ control
    • Feb
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Adaptive critic designs for discrete-time zero-sum games with application to H∞ control," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 1, pp. 240-247, Feb. 2007.
    • (2007) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.37 , Issue.1 , pp. 240-247
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 21
    • 0030196717
    • Adaptive-critic-based neural networks for aircraft optimal control
    • S. N. Balakrishnan and V. Biega, "Adaptive-critic-based neural networks for aircraft optimal control," J. Guid., Control, Dyn., vol. 19, no. 4, pp. 893-898, 1996.
    • (1996) J. Guid., Control, Dyn. , vol.19 , Issue.4 , pp. 893-898
    • Balakrishnan, S.N.1    Biega, V.2
  • 22
    • 0033685661
    • Adaptive critic design for intelligent steering and speed control of a 2-axle vehicle
    • Jul
    • G. G. Lendaris, L. Schultz, and T. Shannon, "Adaptive critic design for intelligent steering and speed control of a 2-axle vehicle," in Proc. Int. Joint Conf. Neural Netw., Jul. 2000, pp. 73-78.
    • (2000) Proc. Int. Joint Conf. Neural Netw. , pp. 73-78
    • Lendaris, G.G.1    Schultz, L.2    Shannon, T.3
  • 23
    • 0036060633
    • An adaptive critic global controller
    • S. Ferrari and R. F. Stengel, "An adaptive critic global controller," in Proc. Amer. Control Conf., vol. 4. 2002, pp. 2665-2670.
    • (2002) Proc. Amer. Control Conf. , vol.4 , pp. 2665-2670
    • Ferrari, S.1    Stengel, R.F.2
  • 24
    • 0036641793
    • State-constrained agile missile control with adaptive-critic-based neural networks
    • Jul
    • D. Han and S. N. Balakrishnan, "State-constrained agile missile control with adaptive-critic-based neural networks," IEEE Trans. Control Syst. Technol., vol. 10, no. 4, pp. 481-489, Jul. 2002.
    • (2002) IEEE Trans. Control Syst. Technol. , vol.10 , Issue.4 , pp. 481-489
    • Han, D.1    Balakrishnan, S.N.2
  • 25
    • 34047138362
    • Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints
    • Apr
    • P. He and S. Jagannathan, "Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 2, pp. 425-436, Apr. 2007.
    • (2007) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.37 , Issue.2 , pp. 425-436
    • He, P.1    Jagannathan, S.2
  • 26
    • 0004370245
    • Wright Lab., Wright-Patterson AFB, OH, Tech. Rep. WL-TR-93-1146
    • L. C. Baird, III, "Advantage updating," Wright Lab., Wright-Patterson AFB, OH, Tech. Rep. WL-TR-93-1146, 1993.
    • (1993) Advantage Updating
    • Baird, L.C.1
  • 27
    • 0033629916
    • Reinforcement learning in continuous time and space
    • K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
    • (2000) Neural Comput. , vol.12 , Issue.1 , pp. 219-245
    • Doya, K.1
  • 29
    • 0031332446
    • Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
    • R. W. Beard, G. N. Saridis, and J. T. Wen, "Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation," Automatica, vol. 33, no. 12, pp. 2159-2178, 1997.
    • (1997) Automatica , vol.33 , Issue.12 , pp. 2159-2178
    • Beard, R.W.1    Saridis, G.N.2    Wen, J.T.3
  • 30
    • 67349145396
    • Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
    • D. Vrabie and F. Lewis, "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems," Neural Netw., vol. 22, no. 3, pp. 237-246, 2009.
    • (2009) Neural Netw. , vol.22 , Issue.3 , pp. 237-246
    • Vrabie, D.1    Lewis, F.2
  • 31
    • 79953132013
    • Online synchronous policy iteration method for optimal control
    • W. Yu, Ed. New York, NY, USA: Springer-Verlag
    • K. G. Vamvoudakis and F. L. Lewis, "Online synchronous policy iteration method for optimal control," in Recent Advances in Intelligent Control Systems, W. Yu, Ed. New York, NY, USA: Springer-Verlag, 2009, pp. 357-374.
    • (2009) Recent Advances in Intelligent Control Systems , pp. 357-374
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 32
    • 79960468564
    • Asymptotic tracking by a reinforcement learning-based adaptive critic controller
    • S. Bhasin, N. Sharma, P. Patre, and W. Dixon, "Asymptotic tracking by a reinforcement learning-based adaptive critic controller," J. Control Theory Appl., vol. 9, no. 3, pp. 400-409, 2011.
    • (2011) J. Control Theory Appl. , vol.9 , Issue.3 , pp. 400-409
    • Bhasin, S.1    Sharma, N.2    Patre, P.3    Dixon, W.4
  • 33
    • 49249124071
    • A new approach to solve a class of continuous-time nonlinear quadratic zero-sum game using ADP
    • Apr
    • Q. Wei and H. Zhang, "A new approach to solve a class of continuous-time nonlinear quadratic zero-sum game using ADP," in Proc. IEEE Int. Conf. Netw. Sens. Control, Apr. 2008, pp. 507-512.
    • (2008) Proc. IEEE Int. Conf. Netw. Sens. Control , pp. 507-512
    • Wei, Q.1    Zhang, H.2
  • 34
    • 78650805234
    • An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games
    • H. Zhang, Q. Wei, and D. Liu, "An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games," Automatica, vol. 47, no. 1, pp. 207-214, 2010.
    • (2010) Automatica , vol.47 , Issue.1 , pp. 207-214
    • Zhang, H.1    Wei, Q.2    Liu, D.3
  • 35
    • 77955405588
    • Iteration algorithm for solving the optimal strategies of a class of nonaffine nonlinear quadratic zero-sum games
    • May
    • X. Zhang, H. Zhang, Y. Luo, and M. Dong, "Iteration algorithm for solving the optimal strategies of a class of nonaffine nonlinear quadratic zero-sum games," in Proc. IEEE Conf. Decision Control, May 2010, pp. 1359-1364.
    • (2010) Proc. IEEE Conf. Decision Control , pp. 1359-1364
    • Zhang, X.1    Zhang, H.2    Luo, Y.3    Dong, M.4
  • 37
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, 2010.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 38
    • 79953155097
    • Online solution of nonlinear two-player zero-sum games using synchronous policy iteration
    • Dec
    • K. G. Vamvoudakis and F. L. Lewis, "Online solution of nonlinear two-player zero-sum games using synchronous policy iteration," in Proc. IEEE Conf. Decision Control, Dec. 2010, pp. 3040-3047.
    • (2010) Proc. IEEE Conf. Decision Control , pp. 3040-3047
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 39
    • 79960897012
    • Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations
    • Mar
    • K. G. Vamvoudakis and F. L. Lewis, "Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations," Automatica, vol. 47, pp. 1556-1569, Mar. 2011.
    • (2011) Automatica , vol.47 , pp. 1556-1569
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 40
    • 0001547175
    • Value-function reinforcement learning in Markov games
    • M. L. Littman, "Value-function reinforcement learning in Markov games," Cognit. Syst. Res., vol. 2, no. 1, pp. 55-66, 2001.
    • (2001) Cognit. Syst. Res. , vol.2 , Issue.1 , pp. 55-66
    • Littman, M.L.1
  • 41
    • 84885835001
    • Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP
    • Feb
    • H. Zhang, L. Cui, and Y. Luo, "Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP," IEEE Trans. Cybern., vol. 43, no. 1, pp. 206-216, Feb. 2013.
    • (2013) IEEE Trans. Cybern. , vol.43 , Issue.1 , pp. 206-216
    • Zhang, H.1    Cui, L.2    Luo, Y.3
  • 42
    • 84860670757
    • Nonlinear two-player zero-sum game approximate solution using a policy iteration algorithm
    • Dec
    • M. Johnson, S. Bhasin, and W. E. Dixon, "Nonlinear two-player zero-sum game approximate solution using a policy iteration algorithm," in Proc. IEEE Conf. Decision Control, Dec. 2011, pp. 142-147.
    • (2011) Proc. IEEE Conf. Decision Control , pp. 142-147
    • Johnson, M.1    Bhasin, S.2    Dixon, W.E.3
  • 48
    • 54349114997
    • Asymptotic tracking for uncertain dynamic systems via a multilayer neural network feedforward and RISE feedback control structure
    • Oct
    • P. M. Patre, W. MacKunis, K. Kaiser, and W. E. Dixon, "Asymptotic tracking for uncertain dynamic systems via a multilayer neural network feedforward and RISE feedback control structure," IEEE Trans. Autom. Control, vol. 53, no. 9, pp. 2180-2185, Oct. 2008.
    • (2008) IEEE Trans. Autom. Control , vol.53 , Issue.9 , pp. 2180-2185
    • Patre, P.M.1    MacKunis, W.2    Kaiser, K.3    Dixon, W.E.4
  • 51
    • 0023123261
    • A calculus for computing Filippov's differential inclusion with application to the variable structure control of robot manipulators
    • Jan
    • B. E. Paden and S. S. Sastry, "A calculus for computing Filippov's differential inclusion with application to the variable structure control of robot manipulators," IEEE Trans. Circuits Syst., vol. 34, no. 1, pp. 73-82, Jan. 1987.
    • (1987) IEEE Trans. Circuits Syst. , vol.34 , Issue.1 , pp. 73-82
    • Paden, B.E.1    Sastry, S.S.2
  • 52
    • 84882449240
    • LaSalle-Yoshizawa corollaries for nonsmooth systems
    • Sep
    • N. Fischer, R. Kamalapurkar, and W. E. Dixon, "LaSalle-Yoshizawa corollaries for nonsmooth systems," IEEE Trans. Autom. Control, vol. 58, no. 9, pp. 2333-2338, Sep. 2013.
    • (2013) IEEE Trans. Autom. Control , vol.58 , Issue.9 , pp. 2333-2338
    • Fischer, N.1    Kamalapurkar, R.2    Dixon, W.E.3
  • 54
    • 0020306480
    • Exponential convergence of recursive least squares with exponential forgetting factor
    • Dec
    • R. M. Johnstone, C. R. Johnson, R. R. Bitmead, and B. D. O. Anderson, "Exponential convergence of recursive least squares with exponential forgetting factor," in Proc. IEEE Conf. Decision Control, vol. 21. Dec. 1982, pp. 994-997.
    • (1982) Proc. IEEE Conf. Decision Control , vol.21 , pp. 994-997
    • Johnstone, R.M.1    Johnson, C.R.2    Bitmead, R.R.3    Anderson, B.D.O.4
  • 56
    • 0037119843
    • Uniform exponential stability of linear time-varying systems: Revisited
    • A. Loría and E. Panteley, "Uniform exponential stability of linear time-varying systems: Revisited," Syst. Control Lett., vol. 47, no. 1, pp. 13-24, 2002.
    • (2002) Syst. Control Lett. , vol.47 , Issue.1 , pp. 13-24
    • Loría, A.1    Panteley, E.2
  • 57
    • 0004178386
    • 3rd ed. Upper Saddle River, NJ, USA: Prentice-Hall
    • H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ, USA: Prentice-Hall, 2002.
    • (2002) Nonlinear Systems
    • Khalil, H.K.1
  • 58
    • 62949149213
    • Constrained nonlinear optimal control: A converse HJB approach
    • California Inst. Technol., Pasadena, CA, USA, Tech. Rep. CaltechCDSTR: 1996.021
    • V. Nevistic and J. A. Primbs, "Constrained nonlinear optimal control: A converse HJB approach," Control and Dynamical Systems Group, California Inst. Technol., Pasadena, CA, USA, Tech. Rep. CaltechCDSTR: 1996.021, 1996.
    • (1996) Control and Dynamical Systems Group
    • Nevistic, V.1    Primbs, J.A.2
  • 59
    • 84871319455
    • A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
    • S. Bhasin, R. Kamalapurkar, M. Johnson, K. G. Vamvoudakis, F. L. Lewis, and W. E. Dixon, "A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems," Automatica, vol. 49, no. 1, pp. 89-92, 2013.
    • (2013) Automatica , vol.49 , Issue.1 , pp. 89-92
    • Bhasin, S.1    Kamalapurkar, R.2    Johnson, M.3    Vamvoudakis, K.G.4    Lewis, F.L.5    Dixon, W.E.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.