Volume 46, Issue 3, 2016, Pages 854-865

Experience Replay for Optimal Control of Nonzero-Sum Game Systems with Unknown Dynamics

Author keywords

Adaptive dynamic programming (ADP); experience replay; nonzero sum (NZS) games; optimal control; unknown dynamics

Indexed keywords

CLOSED LOOP SYSTEMS; DYNAMICS; NETWORK LAYERS; NONLINEAR EQUATIONS; ONLINE SYSTEMS; SYSTEM STABILITY;

EID: 84945951645     PISSN: 21682267     EISSN: None     Source Type: Journal    
DOI: 10.1109/TCYB.2015.2488680     Document Type: Article
Times cited: 199

References (38)
  • 2
    • 84862811062
    • An iterative optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state
    • Aug.
    • Q. Wei and D. Liu, "An iterative optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state," Neural Netw., vol. 32, no. 6, pp. 236-244, Aug. 2012.
    • (2012) Neural Netw. , vol.32 , Issue.6 , pp. 236-244
    • Wei, Q.1    Liu, D.2
  • 3
    • 84888019460
    • Full-range adaptive cruise control based on supervised adaptive dynamic programming
    • Feb.
    • D. Zhao et al., "Full-range adaptive cruise control based on supervised adaptive dynamic programming," Neurocomputing, vol. 125, pp. 57-67, Feb. 2014.
    • (2014) Neurocomputing , vol.125 , pp. 57-67
    • Zhao, D.1
  • 4
    • 49049108697
    • Adaptive critic learning techniques for engine torque and air-fuel ratio control
    • Aug.
    • D. Liu, H. Javaherian, O. Kovalenko, and T. Huang, "Adaptive critic learning techniques for engine torque and air-fuel ratio control," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 988-993, Aug. 2008.
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 988-993
    • Liu, D.1    Javaherian, H.2    Kovalenko, O.3    Huang, T.4
  • 6
    • 82455175244
    • DHP method for ramp metering of freeway traffic
    • Dec.
    • D. Zhao, X. Bai, F.-Y. Wang, J. Xu, and W. Yu, "DHP method for ramp metering of freeway traffic," IEEE Intell. Transp. Syst. Mag., vol. 12, no. 4, pp. 990-999, Dec. 2011.
    • (2011) IEEE Intell. Transp. Syst. Mag. , vol.12 , Issue.4 , pp. 990-999
    • Zhao, D.1    Bai, X.2    Wang, F.-Y.3    Xu, J.4    Yu, W.5
  • 7
    • 84961288449
    • Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems
    • Feb.
    • Y. Zhu, D. Zhao, and D. Liu, "Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems," Neurocomputing, vol. 149, pp. 124-131, Feb. 2015.
    • (2015) Neurocomputing , vol.149 , pp. 124-131
    • Zhu, Y.1    Zhao, D.2    Liu, D.3
  • 8
    • 84897594646
    • Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems
    • Mar.
    • D. Liu and Q. Wei, "Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 621-634, Mar. 2014.
    • (2014) IEEE Trans. Neural Netw. Learn. Syst. , vol.25 , Issue.3 , pp. 621-634
    • Liu, D.1    Wei, Q.2
  • 9
    • 84881026082
    • Distributed cooperative secondary control of microgrids using feedback linearization
    • Aug.
    • A. Bidram, A. Davoudi, F. L. Lewis, and J. M. Guerrero, "Distributed cooperative secondary control of microgrids using feedback linearization," IEEE Trans. Power Syst., vol. 28, no. 3, pp. 3462-3470, Aug. 2013.
    • (2013) IEEE Trans. Power Syst. , vol.28 , Issue.3 , pp. 3462-3470
    • Bidram, A.1    Davoudi, A.2    Lewis, F.L.3    Guerrero, J.M.4
  • 12
    • 84887283655
    • Insight into the so-called spatial reciprocity
    • Oct., Art. ID
    • Z. Wang, S. Kokubo, J. Tanimoto, E. Fukuda, and K. Shigaki, "Insight into the so-called spatial reciprocity," Phys. Rev. E, vol. 88, no. 4, Oct. 2013, Art. ID 042145.
    • (2013) Phys. Rev. E , vol.88 , Issue.4
    • Wang, Z.1    Kokubo, S.2    Tanimoto, J.3    Fukuda, E.4    Shigaki, K.5
  • 13
    • 84891330472
    • Impact of social punishment on cooperative behavior in complex networks
    • Oct. Art. ID
    • Z. Wang, C.-Y. Xia, S. Meloni, C.-S. Zhou, and Y. Moreno, "Impact of social punishment on cooperative behavior in complex networks," Sci. Rep., vol. 3, Oct. 2013, Art. ID 3055.
    • (2013) Sci. Rep. , vol.3
    • Wang, Z.1    Xia, C.-Y.2    Meloni, S.3    Zhou, C.-S.4    Moreno, Y.5
  • 14
    • 84883180371
    • Optimal interdependence between networks for the evolution of cooperation
    • Aug., Art. ID
    • Z. Wang, A. Szolnoki, and M. Perc, "Optimal interdependence between networks for the evolution of cooperation," Sci. Rep., vol. 3, Aug. 2013, Art. ID 2470.
    • (2013) Sci. Rep. , vol.3
    • Wang, Z.1    Szolnoki, A.2    Perc, M.3
  • 15
    • 84897997302
    • Self-organization towards optimally interdependent networks by means of coevolution
    • Art. ID
    • Z. Wang, A. Szolnoki, and M. Perc, "Self-organization towards optimally interdependent networks by means of coevolution," New J. Phys., vol. 16, no. 3, 2014, Art. ID 033041.
    • (2014) New J. Phys. , vol.16 , Issue.3
    • Wang, Z.1    Szolnoki, A.2    Perc, M.3
  • 16
    • 34250487269
    • Nonzero-sum differential games
    • A. W. Starr and Y.-C. Ho, "Nonzero-sum differential games," J. Optim. Theory Appl., vol. 3, no. 3, pp. 184-206, 1969.
    • (1969) J. Optim. Theory Appl. , vol.3 , Issue.3 , pp. 184-206
    • Starr, A.W.1    Ho, Y.-C.2
  • 18
    • 84937390462
    • Approximate N-player nonzero-sum game solution for an uncertain continuous nonlinear system
    • Aug.
    • M. Johnson, R. Kamalapurkar, S. Bhasin, and W. E. Dixon, "Approximate N-player nonzero-sum game solution for an uncertain continuous nonlinear system," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 8, pp. 1645-1658, Aug. 2015.
    • (2015) IEEE Trans. Neural Netw. Learn. Syst. , vol.26 , Issue.8 , pp. 1645-1658
    • Johnson, M.1    Kamalapurkar, R.2    Bhasin, S.3    Dixon, W.E.4
  • 19
    • 0014509068
    • Toward a theory of many player differential games
    • J. H. Case, "Toward a theory of many player differential games," SIAM J. Control, vol. 7, no. 2, pp. 179-197, 1969.
    • (1969) SIAM J. Control , vol.7 , Issue.2 , pp. 179-197
    • Case, J.H.1
  • 20
    • 79551575772
    • Mineola, NY, USA: Courier Corporation
    • A. Friedman, Differential Games. Mineola, NY, USA: Courier Corporation, 2013.
    • (2013) Differential Games
    • Friedman, A.1
  • 21
    • 79960897012
    • Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations
    • K. G. Vamvoudakis and F. L. Lewis, "Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations," Automatica, vol. 47, no. 8, pp. 1556-1569, 2011.
    • (2011) Automatica , vol.47 , Issue.8 , pp. 1556-1569
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 22
    • 84885835001
    • Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP
    • Feb.
    • H. Zhang, L. Cui, and Y. Luo, "Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP," IEEE Trans. Cybern., vol. 43, no. 1, pp. 206-216, Feb. 2013.
    • (2013) IEEE Trans. Cybern. , vol.43 , Issue.1 , pp. 206-216
    • Zhang, H.1    Cui, L.2    Luo, Y.3
  • 23
    • 79953133535
    • Integral reinforcement learning for online computation of feedback Nash strategies of nonzero-sum differential games
    • Atlanta, GA, USA
    • D. Vrabie and F. Lewis, "Integral reinforcement learning for online computation of feedback Nash strategies of nonzero-sum differential games," in Proc. IEEE Conf. Decis. Control (CDC), Atlanta, GA, USA, 2010, pp. 3066-3071.
    • (2010) Proc. IEEE Conf. Decis. Control (CDC) , pp. 3066-3071
    • Vrabie, D.1    Lewis, F.2
  • 24
    • 85027928575
    • Integral reinforcement learning for continuous-time input-affine nonlinear systems with simultaneous invariant explorations
    • May
    • J. Y. Lee, J. B. Park, and Y. H. Choi, "Integral reinforcement learning for continuous-time input-affine nonlinear systems with simultaneous invariant explorations," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 15, pp. 916-932, May 2015.
    • (2015) IEEE Trans. Neural Netw. Learn. Syst. , vol.26 , Issue.15 , pp. 916-932
    • Lee, J.Y.1    Park, J.B.2    Choi, Y.H.3
  • 25
    • 84904398037
    • Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics
    • Jul.
    • H. Li, D. Liu, and D. Wang, "Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics," IEEE Trans. Autom. Sci. Eng., vol. 11, no. 3, pp. 706-714, Jul. 2014.
    • (2014) IEEE Trans. Autom. Sci. Eng. , vol.11 , Issue.3 , pp. 706-714
    • Li, H.1    Liu, D.2    Wang, D.3
  • 26
    • 84960449514
    • Model-free optimal control for affine nonlinear systems with convergence analysis
    • Oct.
    • D. Zhao, Z. Xia, and D. Wang, "Model-free optimal control for affine nonlinear systems with convergence analysis," IEEE Trans. Autom. Sci. Eng., vol. 12, no. 4, pp. 1461-1468, Oct. 2014.
    • (2014) IEEE Trans. Autom. Sci. Eng. , vol.12 , Issue.4 , pp. 1461-1468
    • Zhao, D.1    Xia, Z.2    Wang, D.3
  • 27
    • 84863467146
    • Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming
    • Jul.
    • D. Liu, D. Wang, D. Zhao, Q. Wei, and N. Jin, "Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming," IEEE Trans. Autom. Sci. Eng., vol. 9, no. 3, pp. 628-634, Jul. 2012.
    • (2012) IEEE Trans. Autom. Sci. Eng. , vol.9 , Issue.3 , pp. 628-634
    • Liu, D.1    Wang, D.2    Zhao, D.3    Wei, Q.4    Jin, N.5
  • 28
    • 84904706555
    • Online synchronous approximate optimal learning algorithm for multi-player non-zero-sum games with unknown dynamics
    • Aug.
    • D. Liu, H. Li, and D. Wang, "Online synchronous approximate optimal learning algorithm for multi-player non-zero-sum games with unknown dynamics," IEEE Trans. Syst., Man, Cybern., Syst., vol. 44, no. 8, pp. 1015-1027, Aug. 2014.
    • (2014) IEEE Trans. Syst., Man, Cybern., Syst. , vol.44 , Issue.8 , pp. 1015-1027
    • Liu, D.1    Li, H.2    Wang, D.3
  • 29
    • 84862815087
    • Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses
    • H. Xu, S. Jagannathan, and F. L. Lewis, "Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses," Automatica, vol. 48, no. 6, pp. 1017-1030, 2012.
    • (2012) Automatica , vol.48 , Issue.6 , pp. 1017-1030
    • Xu, H.1    Jagannathan, S.2    Lewis, F.L.3
  • 30
    • 84857501996
    • Experience replay for real-time reinforcement learning control
    • Mar.
    • S. Adam, L. Busoniu, and R. Babuska, "Experience replay for real-time reinforcement learning control," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 42, no. 2, pp. 201-212, Mar. 2012.
    • (2012) IEEE Trans. Syst., Man, Cybern. C, Appl. Rev. , vol.42 , Issue.2 , pp. 201-212
    • Adam, S.1    Busoniu, L.2    Babuska, R.3
  • 31
    • 79953141961
    • Concurrent learning for convergence in adaptive control without persistency of excitation
    • Atlanta, GA, USA
    • G. Chowdhary and E. Johnson, "Concurrent learning for convergence in adaptive control without persistency of excitation," in Proc. IEEE. Conf. Decis. Control (CDC), Atlanta, GA, USA, 2010, pp. 3674-3679.
    • (2010) Proc. IEEE. Conf. Decis. Control (CDC) , pp. 3674-3679
    • Chowdhary, G.1    Johnson, E.2
  • 32
    • 84885176157
    • Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks
    • Oct.
    • H. Modares, F. L. Lewis, and M.-B. Naghibi-Sistani, "Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 10, pp. 1513-1525, Oct. 2013.
    • (2013) IEEE Trans. Neural Netw. Learn. Syst. , vol.24 , Issue.10 , pp. 1513-1525
    • Modares, H.1    Lewis, F.L.2    Naghibi-Sistani, M.-B.3
  • 33
    • 84893708995
    • Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems
    • H. Modares, F. L. Lewis, and M.-B. Naghibi-Sistani, "Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems," Automatica, vol. 50, no. 1, pp. 193-202, 2014.
    • (2014) Automatica , vol.50 , Issue.1 , pp. 193-202
    • Modares, H.1    Lewis, F.L.2    Naghibi-Sistani, M.-B.3
  • 34
    • 84961977508
    • Reinforcement learning and neural networks for multi-agent nonzero-sum games of nonlinear constrained-input systems
    • Oct.
    • S. Yasini, M. B. N. Sitani, and A. Kirampor, "Reinforcement learning and neural networks for multi-agent nonzero-sum games of nonlinear constrained-input systems," Int. J. Mach. Learn. Cybern., pp. 1-14, Oct. 2014.
    • (2014) Int. J. Mach. Learn. Cybern. , pp. 1-14
    • Yasini, S.1    Sitani, M.B.N.2    Kirampor, A.3
  • 35
    • 84921346879
    • Concurrent learning-based approximate feedback-Nash equilibrium solution of N-player nonzero-sum differential games
    • Jul.
    • R. Kamalapurkar, J. R. Klotz, and W. E. Dixon, "Concurrent learning-based approximate feedback-Nash equilibrium solution of N-player nonzero-sum differential games," IEEE/CAA J. Autom. Sinica, vol. 1, no. 3, pp. 239-247, Jul. 2014.
    • (2014) IEEE/CAA J. Autom. Sinica , vol.1 , Issue.3 , pp. 239-247
    • Kamalapurkar, R.1    Klotz, J.R.2    Dixon, W.E.3
  • 38
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
    • (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
    • Abu-Khalaf, M.1    Lewis, F.L.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.