Volume 11, Issue 3, 2014, Pages 706-714

Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics

Author keywords

Adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; policy iteration; reinforcement learning; zero sum games

Indexed keywords

ALGEBRA; ALGORITHMS; CONTINUOUS TIME SYSTEMS; DYNAMIC PROGRAMMING; GAME THEORY; LEAST SQUARES APPROXIMATIONS; NEWTON-RAPHSON METHOD; RICCATI EQUATIONS; STATE FEEDBACK;

EID: 84904398037     PISSN: 15455955     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASE.2014.2300532     Document Type: Article
Times cited : (176)

References (49)
  • 1
    • 67349247013
    • Intelligence in the brain: A theory of how it works and how to build it
    • Apr.
    • P. J. Werbos, "Intelligence in the brain: A theory of how it works and how to build it," Neural Netw., vol. 22, no. 3, pp. 200-212, Apr. 2009.
    • (2009) Neural Netw. , vol.22 , Issue.3 , pp. 200-212
    • Werbos, P.J.1
  • 2
    • 66449130966
    • Adaptive dynamic programming: An introduction
    • May
    • F. Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
    • (2009) IEEE Comput. Intell. Mag. , vol.4 , Issue.2 , pp. 39-47
    • Wang, F.Y.1    Zhang, H.2    Liu, D.3
  • 3
    • 84883537695
    • Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers
    • Dec.
    • F. L. Lewis and D. Vrabie, "Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers," IEEE Control Syst. Mag., vol. 32, no. 6, pp. 76-105, Dec. 2012.
    • (2012) IEEE Control Syst. Mag. , vol.32 , Issue.6 , pp. 76-105
    • Lewis, F.L.1    Vrabie, D.2
  • 6
    • 49049089962
    • Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
    • Aug.
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 943-949
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 7
    • 70349253929
    • Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints
    • Sep.
    • H. Zhang, Y. Luo, and D. Liu, "Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints," IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1490-1503, Sep. 2009.
    • (2009) IEEE Trans. Neural Netw. , vol.20 , Issue.9 , pp. 1490-1503
    • Zhang, H.1    Luo, Y.2    Liu, D.3
  • 8
    • 78651311269
    • Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with ε-error bound
    • Dec.
    • F. Y. Wang, N. Jin, D. Liu, and Q. Wei, "Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with ε-error bound," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 1854-1862, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw. , vol.22 , Issue.12 , pp. 1854-1862
    • Wang, F.Y.1    Jin, N.2    Liu, D.3    Wei, Q.4
  • 9
    • 84864489666
    • Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming
    • Aug.
    • D. Wang, D. Liu, Q. Wei, D. Zhao, and N. Jin, "Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming," Automatica, vol. 48, no. 8, pp. 1825-1832, Aug. 2012.
    • (2012) Automatica , vol.48 , Issue.8 , pp. 1825-1832
    • Wang, D.1    Liu, D.2    Wei, Q.3    Zhao, D.4    Jin, N.5
  • 10
    • 84863467146
    • Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming
    • Jul.
    • D. Liu, D. Wang, D. Zhao, Q. Wei, and N. Jin, "Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming," IEEE Trans. Autom. Sci. Eng., vol. 9, no. 3, pp. 628-634, Jul. 2012.
    • (2012) IEEE Trans. Autom. Sci. Eng. , vol.9 , Issue.3 , pp. 628-634
    • Liu, D.1    Wang, D.2    Zhao, D.3    Wei, Q.4    Jin, N.5
  • 11
    • 84878421441
    • Optimal control for discrete-time affine nonlinear systems using general value iteration
    • Dec.
    • H. Li and D. Liu, "Optimal control for discrete-time affine nonlinear systems using general value iteration," IET Control Theory Appl., vol. 6, no. 18, pp. 2725-2736, Dec. 2012.
    • (2012) IET Control Theory Appl. , vol.6 , Issue.18 , pp. 2725-2736
    • Li, H.1    Liu, D.2
  • 12
    • 84862811062
    • An iterative-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state
    • Aug.
    • Q. Wei and D. Liu, "An iterative-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state," Neural Netw., vol. 32, no. 6, pp. 236-244, Aug. 2012.
    • (2012) Neural Netw. , vol.32 , Issue.6 , pp. 236-244
    • Wei, Q.1    Liu, D.2
  • 13
    • 84868467610
    • An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs
    • Jan.
    • D. Liu, D. Wang, and X. Yang, "An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs," Inf. Sci., vol. 220, pp. 331-342, Jan. 2013.
    • (2013) Inf. Sci. , vol.220 , pp. 331-342
    • Liu, D.1    Wang, D.2    Yang, X.3
  • 14
    • 84876066909
    • Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm
    • June
    • D. Liu, H. Li, and D. Wang, "Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm," Neurocomputing, vol. 110, pp. 92-100, June 2013.
    • (2013) Neurocomputing , vol.110 , pp. 92-100
    • Liu, D.1    Li, H.2    Wang, D.3
  • 15
    • 84881555023
    • Finite-approximation-error based optimal control approach for discrete-time nonlinear systems
    • Apr.
    • D. Liu and Q. Wei, "Finite-approximation-error based optimal control approach for discrete-time nonlinear systems," IEEE Trans. Cybern., vol. 43, no. 2, pp. 779-789, Apr. 2013.
    • (2013) IEEE Trans. Cybern. , vol.43 , Issue.2 , pp. 779-789
    • Liu, D.1    Wei, Q.2
  • 16
    • 0035273403
    • On-line learning control by association and reinforcement
    • DOI 10.1109/72.914523, PII S1045922701014047
    • J. Si and Y. T. Wang, "On-line learning control by association and reinforcement," IEEE Trans. Neural Netw., vol. 12, no. 2, pp. 264-276, Mar. 2001. (Pubitemid 32371483)
    • (2001) IEEE Transactions on Neural Networks , vol.12 , Issue.2 , pp. 264-276
    • Si, J.1    Wang, Y.-T.2
  • 17
    • 26844483839
    • A self-learning call admission control scheme for CDMA cellular networks
    • DOI 10.1109/TNN.2005.853408
    • D. Liu, Y. Zhang, and H. Zhang, "A self-learning call admission control scheme for CDMA cellular networks," IEEE Trans. Neural Netw., vol. 16, no. 5, pp. 1219-1228, Sep. 2005. (Pubitemid 41444623)
    • (2005) IEEE Transactions on Neural Networks , vol.16 , Issue.5 , pp. 1219-1228
    • Liu, D.1    Zhang, Y.2    Zhang, H.3
  • 18
    • 49049108697
    • Adaptive critic learning techniques for engine torque and air-fuel ratio control
    • Aug.
    • D. Liu, H. Javaherian, O. Kovalenko, and T. Huang, "Adaptive critic learning techniques for engine torque and air-fuel ratio control," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 988-993, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 988-993
    • Liu, D.1    Javaherian, H.2    Kovalenko, O.3    Huang, T.4
  • 19
    • 17644391408
    • Improving the performance of globalized dual heuristic programming for fault tolerant control through an online learning supervisor
    • Apr.
    • G. G. Yen and P. G. Delima, "Improving the performance of globalized dual heuristic programming for fault tolerant control through an online learning supervisor," IEEE Trans. Autom. Sci. Eng., vol. 2, no. 2, pp. 121-131, Apr. 2005.
    • (2005) IEEE Trans. Autom. Sci. Eng. , vol.2 , Issue.2 , pp. 121-131
    • Yen, G.G.1    Delima, P.G.2
  • 20
    • 34147169178
    • A multiresolution analysis-assisted reinforcement learning approach to run-by-run control
    • DOI 10.1109/TASE.2006.879915
    • R. Ganesan, T. K. Das, and K. M. Ramachandran, "A multiresolution analysis-assisted reinforcement learning approach to run-by-run control," IEEE Trans. Autom. Sci. Eng., vol. 4, no. 2, pp. 182-193, Apr. 2007. (Pubitemid 46574023)
    • (2007) IEEE Transactions on Automation Science and Engineering , vol.4 , Issue.2 , pp. 182-193
    • Ganesan, R.1    Das, T.K.2    Ramachandran, K.M.3
  • 21
    • 80053632509
    • Optimization of train regulation and energy usage of metro lines using an adaptive-optimal-control algorithm
    • Oct.
    • W. S. Lin and J. W. Sheu, "Optimization of train regulation and energy usage of metro lines using an adaptive-optimal-control algorithm," IEEE Trans. Autom. Sci. Eng., vol. 8, no. 4, pp. 855-864, Oct. 2011.
    • (2011) IEEE Trans. Autom. Sci. Eng. , vol.8 , Issue.4 , pp. 855-864
    • Lin, W.S.1    Sheu, J.W.2
  • 22
    • 84859774473
    • Real-time adaptive control of a flexible manipulator using reinforcement learning
    • Apr.
    • S. K. Pradhan and B. Subudhi, "Real-time adaptive control of a flexible manipulator using reinforcement learning," IEEE Trans. Autom. Sci. Eng., vol. 9, no. 2, pp. 237-249, Apr. 2012.
    • (2012) IEEE Trans. Autom. Sci. Eng. , vol.9 , Issue.2 , pp. 237-249
    • Pradhan, S.K.1    Subudhi, B.2
  • 23
    • 84876138680
    • Swarm intelligence approaches to optimal power flow problem with distributed generator failures in power networks
    • Apr.
    • Q. Kang, M. Zhou, J. An, and Q. Wu, "Swarm intelligence approaches to optimal power flow problem with distributed generator failures in power networks," IEEE Trans. Autom. Sci. Eng., vol. 10, no. 2, pp. 343-353, Apr. 2013.
    • (2013) IEEE Trans. Autom. Sci. Eng. , vol.10 , Issue.2 , pp. 343-353
    • Kang, Q.1    Zhou, M.2    An, J.3    Wu, Q.4
  • 24
    • 84892578275
    • Building energy management: Integrated control of active and passive heating, cooling, lighting, shading, and ventilation systems
    • Jul.
    • B. Sun, P. B. Luh, Q. Jia, Z. Jiang, F. Wang, and C. Song, "Building energy management: Integrated control of active and passive heating, cooling, lighting, shading, and ventilation systems," IEEE Trans. Autom. Sci. Eng., vol. 10, no. 3, pp. 588-602, Jul. 2013.
    • (2013) IEEE Trans. Autom. Sci. Eng. , vol.10 , Issue.3 , pp. 588-602
    • Sun, B.1    Luh, P.B.2    Jia, Q.3    Jiang, Z.4    Wang, F.5    Song, C.6
  • 25
    • 84892617907
    • Smart management of multiple energy systems in automotive painting shop
    • Jul.
    • Z. Xu, Q. Jia, H. Guan, and J. Shen, "Smart management of multiple energy systems in automotive painting shop," IEEE Trans. Autom. Sci. Eng., vol. 10, no. 3, pp. 603-614, Jul. 2013.
    • (2013) IEEE Trans. Autom. Sci. Eng. , vol.10 , Issue.3 , pp. 603-614
    • Xu, Z.1    Jia, Q.2    Guan, H.3    Shen, J.4
  • 26
    • 82655173881
    • A three-network architecture for on-line learning and optimization based on adaptive dynamic programming
    • Feb.
    • H. He, Z. Ni, and J. Fu, "A three-network architecture for on-line learning and optimization based on adaptive dynamic programming," Neurocomputing, vol. 78, no. 1, pp. 3-13, Feb. 2012.
    • (2012) Neurocomputing , vol.78 , Issue.1 , pp. 3-13
    • He, H.1    Ni, Z.2    Fu, J.3
  • 27
    • 84876149222
    • Adaptive learning in tracking control based on the dual critic network design
    • Jun.
    • Z. Ni, H. He, and J. Wen, "Adaptive learning in tracking control based on the dual critic network design," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 6, pp. 913-928, Jun. 2013.
    • (2013) IEEE Trans. Neural Netw. Learn. Syst. , vol.24 , Issue.6 , pp. 913-928
    • Ni, Z.1    He, H.2    Wen, J.3
  • 28
    • 0033629916
    • Reinforcement learning in continuous time and space
    • K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
    • (2000) Neural Comput. , vol.12 , Issue.1 , pp. 219-245
    • Doya, K.1
  • 29
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • May
    • K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, May 2010.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 30
    • 83655163786
    • Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method
    • Dec.
    • H. Zhang, L. Cui, X. Zhang, and Y. Luo, "Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226-2236, Dec. 2011.
    • (2011) IEEE Trans. Neural Netw. , vol.22 , Issue.12 , pp. 2226-2236
    • Zhang, H.1    Cui, L.2    Zhang, X.3    Luo, Y.4
  • 31
    • 84871319455
    • A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
    • Jan.
    • S. Bhasin, R. Kamalapurkar, M. Johnson, K. G. Vamvoudakis, F. L. Lewis, and W. E. Dixon, "A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems," Automatica, vol. 49, no. 1, pp. 82-92, Jan. 2013.
    • (2013) Automatica , vol.49 , Issue.1 , pp. 82-92
    • Bhasin, S.1    Kamalapurkar, R.2    Johnson, M.3    Vamvoudakis, K.G.4    Lewis, F.L.5    Dixon, W.E.6
  • 32
    • 58349110975
    • Adaptive optimal control for continuous-time linear systems based on policy iteration
    • Feb.
    • D. Vrabie, O. Pastravanu, M. Abu-Khalaf, and F. L. Lewis, "Adaptive optimal control for continuous-time linear systems based on policy iteration," Automatica, vol. 45, no. 2, pp. 477-484, Feb. 2009.
    • (2009) Automatica , vol.45 , Issue.2 , pp. 477-484
    • Vrabie, D.1    Pastravanu, O.2    Abu-Khalaf, M.3    Lewis, F.L.4
  • 33
    • 67349145396
    • Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
    • Apr.
    • D. Vrabie and F. L. Lewis, "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems," Neural Netw., vol. 22, no. 3, pp. 237-246, Apr. 2009.
    • (2009) Neural Netw. , vol.22 , Issue.3 , pp. 237-246
    • Vrabie, D.1    Lewis, F.L.2
  • 34
    • 77950806766
    • Q-learning and Pontryagin's minimum principle
    • Shanghai, China, Dec.
    • P. Mehta and S. Meyn, "Q-learning and Pontryagin's minimum principle," in Proc. IEEE Conf. Decision Control, Shanghai, China, Dec. 2009, pp. 3598-3605.
    • (2009) Proc. IEEE Conf. Decision Control , pp. 3598-3605
    • Mehta, P.1    Meyn, S.2
  • 35
    • 84867400046
    • Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems
    • Nov.
    • J. Y. Lee, J. B. Park, and Y. H. Choi, "Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems," Automatica, vol. 48, no. 11, pp. 2850-2859, Nov. 2012.
    • (2012) Automatica , vol.48 , Issue.11 , pp. 2850-2859
    • Lee, J.Y.1    Park, J.B.2    Choi, Y.H.3
  • 36
    • 84865467087
    • Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
    • Oct.
    • Y. Jiang and Z.-P. Jiang, "Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics," Automatica, vol. 48, no. 10, pp. 2699-2704, Oct. 2012.
    • (2012) Automatica , vol.48 , Issue.10 , pp. 2699-2704
    • Jiang, Y.1    Jiang, Z.-P.2
  • 37
    • 0003981511
    • Dynamic Noncooperative Game Theory
    • 2nd ed. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics
    • T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory, 2nd ed. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 1999.
    • (1999) Dynamic Noncooperative Game Theory
    • Basar, T.1    Olsder, G.J.2
  • 39
    • 33847648898
    • Adaptive critic designs for discrete-time zero-sum games with application to H∞ control
    • DOI 10.1109/TSMCB.2006.880135, Special Issue on Memetic Algorithms
    • A. Al-Tamimi, M. Abu-Khalaf, and F. L. Lewis, "Adaptive critic designs for discrete-time zero-sum games with application to H∞ control," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 1, pp. 240-247, Feb. 2007. (Pubitemid 46358495)
    • (2007) IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics , vol.37 , Issue.1 , pp. 240-247
    • Al-Tamimi, A.1    Abu-Khalaf, M.2    Lewis, F.L.3
  • 40
    • 33846781129
    • Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
    • DOI 10.1016/j.automatica.2006.09.019, PII S0005109806004249
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control," Automatica, vol. 43, no. 3, pp. 473-481, Mar. 2007. (Pubitemid 46209050)
    • (2007) Automatica , vol.43 , Issue.3 , pp. 473-481
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 41
    • 77955423822
    • Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI
    • Aug.
    • J.-H. Kim and F. L. Lewis, "Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI," Automatica, vol. 46, no. 8, pp. 1320-1326, Aug. 2010.
    • (2010) Automatica , vol.46 , Issue.8 , pp. 1320-1326
    • Kim, J.-H.1    Lewis, F.L.2
  • 42
    • 33845759425
    • Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation
    • DOI 10.1109/TAC.2006.884959
    • M. Abu-Khalaf, F. L. Lewis, and J. Huang, "Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation," IEEE Trans. Autom. Control, vol. 51, no. 12, pp. 1989-1995, Dec. 2006. (Pubitemid 46002295)
    • (2006) IEEE Transactions on Automatic Control , vol.51 , Issue.12 , pp. 1989-1995
    • Abu-Khalaf, M.1    Lewis, F.L.2    Huang, J.3
  • 43
    • 48949116222
    • Neurodynamic programming and zero-sum games for constrained control systems
    • Jul.
    • M. Abu-Khalaf, F. L. Lewis, and J. Huang, "Neurodynamic programming and zero-sum games for constrained control systems," IEEE Trans. Neural Netw., vol. 19, no. 7, pp. 1243-1252, Jul. 2008.
    • (2008) IEEE Trans. Neural Netw. , vol.19 , Issue.7 , pp. 1243-1252
    • Abu-Khalaf, M.1    Lewis, F.L.2    Huang, J.3
  • 44
    • 78650805234
    • An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games
    • Jan.
    • H. Zhang, Q. Wei, and D. Liu, "An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games," Automatica, vol. 47, no. 1, pp. 207-214, Jan. 2011.
    • (2011) Automatica , vol.47 , Issue.1 , pp. 207-214
    • Zhang, H.1    Wei, Q.2    Liu, D.3
  • 45
    • 84864463039
    • Online solution of nonlinear two-player zero-sum games using synchronous policy iteration
    • K. G. Vamvoudakis and F. L. Lewis, "Online solution of nonlinear two-player zero-sum games using synchronous policy iteration," Int. J. Robust Nonlinear Control, vol. 22, no. 13, pp. 1460-1483, 2011.
    • (2011) Int. J. Robust Nonlinear Control , vol.22 , Issue.13 , pp. 1460-1483
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 46
    • 79953143055
    • Optimal control of affine nonlinear continuous-time systems using an online Hamilton-Jacobi-Isaacs formulation
    • Atlanta, GA, USA, Dec.
    • T. Dierks and S. Jagannathan, "Optimal control of affine nonlinear continuous-time systems using an online Hamilton-Jacobi-Isaacs formulation," in Proc. IEEE Conf. Decision Control, Atlanta, GA, USA, Dec. 2010, pp. 3048-3053.
    • (2010) Proc. IEEE Conf. Decision Control , pp. 3048-3053
    • Dierks, T.1    Jagannathan, S.2
  • 47
    • 84876909440
    • Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control
    • Dec.
    • H. Wu and B. Luo, "Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 12, pp. 1884-1895, Dec. 2012.
    • (2012) IEEE Trans. Neural Netw. Learn. Syst. , vol.23 , Issue.12 , pp. 1884-1895
    • Wu, H.1    Luo, B.2
  • 48
    • 79960443754
    • Adaptive dynamic programming for online solution of a zero-sum differential game
    • D. Vrabie and F. L. Lewis, "Adaptive dynamic programming for online solution of a zero-sum differential game," J. Control Theory Appl., vol. 9, no. 3, pp. 353-360, 2011.
    • (2011) J. Control Theory Appl. , vol.9 , Issue.3 , pp. 353-360
    • Vrabie, D.1    Lewis, F.L.2
  • 49
    • 84870062175
    • Simultaneous policy update algorithms for learning the solution of linear continuous-time H∞ state feedback control
    • Feb.
    • H. Wu and B. Luo, "Simultaneous policy update algorithms for learning the solution of linear continuous-time H∞ state feedback control," Inf. Sci., vol. 222, no. 10, pp. 472-485, Feb. 2013.
    • (2013) Inf. Sci. , vol.222 , Issue.10 , pp. 472-485
    • Wu, H.1    Luo, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.