



Volume 47, Issue 5, 2017, Pages 1224-1237

Discrete-time deterministic Q-learning: A novel convergence analysis

Author keywords

Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; neural networks (NNs); neuro dynamic programming; optimal control; Q learning

Indexed keywords

ALGORITHMS; ITERATIVE METHODS;

EID: 84963604827     PISSN: 21682267     EISSN: None     Source Type: Journal    
DOI: 10.1109/TCYB.2016.2542923     Document Type: Article
Times cited: 170

References (62)
  • 1
    • 0002557583
    • Advanced forecasting methods for global crisis warning and models of intelligence
    • P. J. Werbos, "Advanced forecasting methods for global crisis warning and models of intelligence," Gen. Syst. Yearbook, vol. 22, pp. 25-38, 1977
    • (1977) Gen. Syst. Yearbook , vol.22 , pp. 25-38
    • Werbos, P.J.1
  • 2
    • 0002011091
    • A menu of designs for reinforcement learning over time
    • W. T. Miller, R. S. Sutton, and P. J. Werbos, Eds. Cambridge, MA, USA MIT Press
    • P. J. Werbos, "A menu of designs for reinforcement learning over time," in Neural Networks for Control, W. T. Miller, R. S. Sutton, and P. J. Werbos, Eds. Cambridge, MA, USA: MIT Press, 1991, pp. 67-95
    • (1991) Neural Networks for Control , pp. 67-95
    • Werbos, P.J.1
  • 3
    • 84959450562
    • An event-triggered ADP control approach for continuous-time system with unknown internal states
    • to be published
    • X. Zhong and H. He, "An event-triggered ADP control approach for continuous-time system with unknown internal states," IEEE Trans. Cybern., to be published, doi: 10.1109/TCYB.2016.2523878
    • IEEE Trans. Cybern
    • Zhong, X.1    He, H.2
  • 4
    • 84906781179
    • Adaptive dynamic programming for a class of complex-valued nonlinear systems
    • Sep
    • R. Song, W. Xiao, H. Zhang, and C. Sun, "Adaptive dynamic programming for a class of complex-valued nonlinear systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 9, pp. 1733-1739, Sep. 2014
    • (2014) IEEE Trans. Neural Netw. Learn. Syst , vol.25 , Issue.9 , pp. 1733-1739
    • Song, R.1    Xiao, W.2    Zhang, H.3    Sun, C.4
  • 5
    • 84939617304
    • Data-driven zero-sum neuro-optimal control for a class of continuous-time unknown nonlinear systems with disturbance using ADP
    • Feb
    • Q. Wei, R. Song, and P. Yan, "Data-driven zero-sum neuro-optimal control for a class of continuous-time unknown nonlinear systems with disturbance using ADP," IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 2, pp. 444-458, Feb. 2016
    • (2016) IEEE Trans. Neural Netw. Learn. Syst , vol.27 , Issue.2 , pp. 444-458
    • Wei, Q.1    Song, R.2    Yan, P.3
  • 6
    • 85027955915
    • GrDHP: A general utility function representation for dual heuristic dynamic programming
    • Mar
    • Z. Ni, H. He, D. Zhao, X. Xu, and D. V. Prokhorov, "GrDHP: A general utility function representation for dual heuristic dynamic programming," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 3, pp. 614-627, Mar. 2015
    • (2015) IEEE Trans. Neural Netw. Learn. Syst , vol.26 , Issue.3 , pp. 614-627
    • Ni, Z.1    He, H.2    Zhao, D.3    Xu, X.4    Prokhorov, D.V.5
  • 7
    • 84906778934
    • Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification
    • Oct
    • Q. Wei and D. Liu, "Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification," IEEE Trans. Autom. Sci. Eng., vol. 11, no. 4, pp. 1020-1036, Oct. 2014
    • (2014) IEEE Trans. Autom. Sci. Eng , vol.11 , Issue.4 , pp. 1020-1036
    • Wei, Q.1    Liu, D.2
  • 8
    • 84887990637
    • Goal representation heuristic dynamic programming on maze navigation
    • Dec
    • Z. Ni, H. He, J. Wen, and X. Xu, "Goal representation heuristic dynamic programming on maze navigation," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 12, pp. 2038-2050, Dec. 2013
    • (2013) IEEE Trans. Neural Netw. Learn. Syst , vol.24 , Issue.12 , pp. 2038-2050
    • Ni, Z.1    He, H.2    Wen, J.3    Xu, X.4
  • 9
    • 85027700528
    • Value and policy iterations in optimal control and adaptive dynamic programming
    • to be published
    • D. P. Bertsekas, "Value and policy iterations in optimal control and adaptive dynamic programming," IEEE Trans. Neural Netw. Learn. Syst., to be published, doi: 10.1109/TNNLS.2015.2503980
    • IEEE Trans. Neural Netw. Learn. Syst
    • Bertsekas, D.P.1
  • 10
    • 84883537695
    • Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers
    • Dec
    • F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, "Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers," IEEE Control Syst., vol. 32, no. 6, pp. 76-105, Dec. 2012
    • (2012) IEEE Control Syst , vol.32 , Issue.6 , pp. 76-105
    • Lewis, F.L.1    Vrabie, D.2    Vamvoudakis, K.G.3
  • 12
    • 84912026937
    • Revisiting approximate dynamic programming and its convergence
    • Dec
    • A. Heydari, "Revisiting approximate dynamic programming and its convergence," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2733-2743, Dec. 2014
    • (2014) IEEE Trans. Cybern , vol.44 , Issue.12 , pp. 2733-2743
    • Heydari, A.1
  • 13
    • 84875270081
    • Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update
    • Jul
    • T. Dierks and S. Jagannathan, "Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 7, pp. 1118-1129, Jul. 2012
    • (2012) IEEE Trans. Neural Netw. Learn. Syst , vol.23 , Issue.7 , pp. 1118-1129
    • Dierks, T.1    Jagannathan, S.2
  • 14
    • 84893708995
    • Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems
    • Jan
    • H. Modares, F. L. Lewis, and M.-B. Naghibi-Sistani, "Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems," Automatica, vol. 50, no. 1, pp. 193-202, Jan. 2014
    • (2014) Automatica , vol.50 , Issue.1 , pp. 193-202
    • Modares, H.1    Lewis, F.L.2    Naghibi-Sistani, M.-B.3
  • 15
    • 84908432682
    • Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning
    • Nov
    • H. Modares and F. L. Lewis, "Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning," IEEE Trans. Autom. Control, vol. 59, no. 11, pp. 3051-3056, Nov. 2014
    • (2014) IEEE Trans. Autom. Control , vol.59 , Issue.11 , pp. 3051-3056
    • Modares, H.1    Lewis, F.L.2
  • 16
    • 84912122528
    • Finite-approximation-error-based discrete-time iterative adaptive dynamic programming
    • Dec
    • Q. Wei, F. Y. Wang, D. Liu, and X. Yang, "Finite-approximation-error-based discrete-time iterative adaptive dynamic programming," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2820-2833, Dec. 2014
    • (2014) IEEE Trans. Cybern , vol.44 , Issue.12 , pp. 2820-2833
    • Wei, Q.1    Wang, F.Y.2    Liu, D.3    Yang, X.4
  • 17
    • 85017730584
    • Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation
    • to be published
    • K. G. Vamvoudakis, M. F. Miranda, and J. P. Hespanha, "Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation," IEEE Trans. Neural Netw. Learn. Syst., to be published, doi: 10.1109/TNNLS.2015.2487972
    • IEEE Trans. Neural Netw. Learn. Syst
    • Vamvoudakis, K.G.1    Miranda, M.F.2    Hespanha, J.P.3
  • 18
    • 79960897012
    • Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations
    • Aug
    • K. G. Vamvoudakis and F. L. Lewis, "Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations," Automatica, vol. 47, no. 8, pp. 1556-1569, Aug. 2011
    • (2011) Automatica , vol.47 , Issue.8 , pp. 1556-1569
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 19
    • 84885835001
    • Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP
    • Feb
    • H. Zhang, L. Cui, and Y. Luo, "Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP," IEEE Trans. Cybern., vol. 43, no. 1, pp. 206-216, Feb. 2013
    • (2013) IEEE Trans. Cybern , vol.43 , Issue.1 , pp. 206-216
    • Zhang, H.1    Cui, L.2    Luo, Y.3
  • 20
    • 85027929469
    • Multiple actor-critic structures for continuous-time optimal control using input-output data
    • Apr
    • R. Song et al., "Multiple actor-critic structures for continuous-time optimal control using input-output data," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 4, pp. 851-865, Apr. 2015
    • (2015) IEEE Trans. Neural Netw. Learn. Syst , vol.26 , Issue.4 , pp. 851-865
    • Song, R.1
  • 21
    • 84904739156
    • Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning
    • Jul
    • H. Modares and F. L. Lewis, "Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning," Automatica, vol. 50, no. 7, pp. 1780-1792, Jul. 2014
    • (2014) Automatica , vol.50 , Issue.7 , pp. 1780-1792
    • Modares, H.1    Lewis, F.L.2
  • 22
    • 84897594646
    • Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems
    • Mar
    • D. Liu and Q. Wei, "Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 621-634, Mar. 2014
    • (2014) IEEE Trans. Neural Netw. Learn. Syst , vol.25 , Issue.3 , pp. 621-634
    • Liu, D.1    Wei, Q.2
  • 23
    • 33747862706
    • Relaxing dynamic programming
    • Aug
    • B. Lincoln and A. Rantzer, "Relaxing dynamic programming," IEEE Trans. Autom. Control, vol. 51, no. 8, pp. 1249-1260, Aug. 2006
    • (2006) IEEE Trans. Autom. Control , vol.51 , Issue.8 , pp. 1249-1260
    • Lincoln, B.1    Rantzer, A.2
  • 24
    • 84930506123
    • Multibattery optimal coordination control for home energy management systems via distributed iterative adaptive dynamic programming
    • Jul
    • Q. Wei, D. Liu, G. Shi, and Y. Liu, "Multibattery optimal coordination control for home energy management systems via distributed iterative adaptive dynamic programming," IEEE Trans. Ind. Electron., vol. 62, no. 7, pp. 4203-4214, Jul. 2015
    • (2015) IEEE Trans. Ind. Electron , vol.62 , Issue.7 , pp. 4203-4214
    • Wei, Q.1    Liu, D.2    Shi, G.3    Liu, Y.4
  • 25
    • 84924872284
    • A novel dual iterative Q-learning method for optimal battery management in smart residential environments
    • Apr
    • Q. Wei, D. Liu, and G. Shi, "A novel dual iterative Q-learning method for optimal battery management in smart residential environments," IEEE Trans. Ind. Electron., vol. 62, no. 4, pp. 2509-2518, Apr. 2015
    • (2015) IEEE Trans. Ind. Electron , vol.62 , Issue.4 , pp. 2509-2518
    • Wei, Q.1    Liu, D.2    Shi, G.3
  • 26
    • 49049089962
    • Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
    • Aug
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.38 , Issue.4 , pp. 943-949
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 27
    • 49049119493
    • A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm
    • Aug
    • H. Zhang, Q. Wei, and Y. Luo, "A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 937-942, Aug. 2008
    • (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.38 , Issue.4 , pp. 937-942
    • Zhang, H.1    Wei, Q.2    Luo, Y.3
  • 28
    • 18444379381
    • Approximate dynamic programming-based approaches for input-output data-driven control of nonlinear processes
    • Jul
    • J. M. Lee and J. H. Lee, "Approximate dynamic programming-based approaches for input-output data-driven control of nonlinear processes," Automatica, vol. 41, no. 7, pp. 1281-1288, Jul. 2005
    • (2005) Automatica , vol.41 , Issue.7 , pp. 1281-1288
    • Lee, J.M.1    Lee, J.H.2
  • 30
    • 84885176157
    • Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks
    • Oct
    • H. Modares, F. L. Lewis, and M. B. Naghibi-Sistani, "Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 10, pp. 1513-1525, Oct. 2013
    • (2013) IEEE Trans. Neural Netw. Learn. Syst , vol.24 , Issue.10 , pp. 1513-1525
    • Modares, H.1    Lewis, F.L.2    Naghibi-Sistani, M.B.3
  • 31
    • 84908658175
    • A novel iterative θ-adaptive dynamic programming for discrete-time nonlinear systems
    • Oct
    • Q. Wei and D. Liu, "A novel iterative θ-adaptive dynamic programming for discrete-time nonlinear systems," IEEE Trans. Autom. Sci. Eng., vol. 11, no. 4, pp. 1176-1190, Oct. 2014
    • (2014) IEEE Trans. Autom. Sci. Eng , vol.11 , Issue.4 , pp. 1176-1190
    • Wei, Q.1    Liu, D.2
  • 32
    • 84902352795
    • Data-driven neuro-optimal temperature control of water-gas shift reaction using stable iterative adaptive dynamic programming
    • Nov
    • Q. Wei and D. Liu, "Data-driven neuro-optimal temperature control of water-gas shift reaction using stable iterative adaptive dynamic programming," IEEE Trans. Ind. Electron., vol. 61, no. 11, pp. 6399-6408, Nov. 2014
    • (2014) IEEE Trans. Ind. Electron , vol.61 , Issue.11 , pp. 6399-6408
    • Wei, Q.1    Liu, D.2
  • 33
    • 84928747516
    • Off-policy actor-critic structure for optimal control of unknown systems with disturbances
    • to be published
    • R. Song, F. L. Lewis, Q. Wei, and H. Zhang, "Off-policy actor-critic structure for optimal control of unknown systems with disturbances," IEEE Trans. Cybern., to be published, doi: 10.1109/TCYB.2015.2421338
    • IEEE Trans. Cybern
    • Song, R.1    Lewis, F.L.2    Wei, Q.3    Zhang, H.4
  • 34
    • 0004049893
    • Ph.D. dissertation Cambridge Univ., Cambridge, U.K
    • C. Watkins, "Learning from delayed rewards," Ph.D. dissertation, Cambridge Univ., Cambridge, U.K., 1989
    • (1989) Learning from Delayed Rewards
    • Watkins, C.1
  • 35
    • 34249833101
    • Q-learning
    • May
    • C. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, nos. 3-4, pp. 279-292, May 1992
    • (1992) Mach. Learn , vol.8 , Issue.3-4 , pp. 279-292
    • Watkins, C.1    Dayan, P.2
  • 36
    • 33846781129
    • Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
    • Mar
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control," Automatica, vol. 43, no. 3, pp. 473-481, Mar. 2007
    • (2007) Automatica , vol.43 , Issue.3 , pp. 473-481
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 37
    • 77955423822
    • Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI
    • Aug
    • J.-H. Kim and F. L. Lewis, "Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI," Automatica, vol. 46, no. 8, pp. 1320-1326, Aug. 2010
    • (2010) Automatica , vol.46 , Issue.8 , pp. 1320-1326
    • Kim, J.-H.1    Lewis, F.L.2
  • 39
    • 0031236002
    • Adaptive critic designs
    • Sep
    • D. V. Prokhorov and D. C. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997
    • (1997) IEEE Trans. Neural Netw , vol.8 , Issue.5 , pp. 997-1007
    • Prokhorov, D.V.1    Wunsch, D.C.2
  • 40
    • 77955828918
    • An adaptive Q-learning algorithm developed for agent-based computational modeling of electricity market
    • Sep
    • M. Rahimiyan and H. R. Mashhadi, "An adaptive Q-learning algorithm developed for agent-based computational modeling of electricity market," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 5, pp. 547-556, Sep. 2010
    • (2010) IEEE Trans. Syst., Man, Cybern. C, Appl. Rev , vol.40 , Issue.5 , pp. 547-556
    • Rahimiyan, M.1    Mashhadi, H.R.2
  • 41
    • 79958173163
    • Reinforcement learning with function approximation for traffic signal control
    • Jun
    • L. A. Prashanth and S. Bhatnagar, "Reinforcement learning with function approximation for traffic signal control," IEEE Trans. Intell. Transp. Syst., vol. 12, no. 2, pp. 412-421, Jun. 2011
    • (2011) IEEE Trans. Intell. Transp. Syst , vol.12 , Issue.2 , pp. 412-421
    • Prashanth, L.A.1    Bhatnagar, S.2
  • 42
    • 84897585055
    • QD-learning: A collaborative distributed strategy for multi-agent reinforcement learning through consensus + innovations
    • Jul
    • S. Kar, J. M. F. Moura, and H. V. Poor, "QD-learning: A collaborative distributed strategy for multi-agent reinforcement learning through consensus + innovations," IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1848-1862, Jul. 2013
    • (2013) IEEE Trans. Signal Process , vol.61 , Issue.7 , pp. 1848-1862
    • Kar, S.1    Moura, J.M.F.2    Poor, H.V.3
  • 43
    • 84872594962
    • A self-learning scheme for residential energy system control and management
    • Feb
    • T. Huang and D. Liu, "A self-learning scheme for residential energy system control and management," Neural Comput. Appl., vol. 22, no. 2, pp. 259-269, Feb. 2013
    • (2013) Neural Comput. Appl , vol.22 , Issue.2 , pp. 259-269
    • Huang, T.1    Liu, D.2
  • 44
    • 84930645788
    • Hybrid three-phase/single-phase microgrid architecture with power management capabilities
    • Oct
    • Q. Sun, J. Zhou, J. M. Guerrero, and H. Zhang, "Hybrid three-phase/single-phase microgrid architecture with power management capabilities," IEEE Trans. Power Electron., vol. 30, no. 10, pp. 5964-5977, Oct. 2015
    • (2015) IEEE Trans. Power Electron , vol.30 , Issue.10 , pp. 5964-5977
    • Sun, Q.1    Zhou, J.2    Guerrero, J.M.3    Zhang, H.4
  • 45
    • 84959484631
    • A multiagent-based consensus algorithm for distributed coordinated control of distributed generators in the energy Internet
    • Nov
    • Q. Sun, R. Han, H. Zhang, J. Zhou, and J. M. Guerrero, "A multiagent-based consensus algorithm for distributed coordinated control of distributed generators in the energy Internet," IEEE Trans. Smart Grid, vol. 6, no. 6, pp. 3006-3019, Nov. 2015, doi: 10.1109/TSG.2015.2412779
    • (2015) IEEE Trans. Smart Grid , vol.6 , Issue.6 , pp. 3006-3019
    • Sun, Q.1    Han, R.2    Zhang, H.3    Zhou, J.4    Guerrero, J.M.5
  • 46
    • 85025171615
    • A novel energy function-based stability evaluation and nonlinear control for energy Internet
    • to be published
    • Q. Sun, Y. Zhang, H. He, D. Ma, and H. Zhang, "A novel energy function-based stability evaluation and nonlinear control for energy Internet," IEEE Trans. Smart Grid, to be published, doi: 10.1109/TSG.2015.2497691
    • IEEE Trans. Smart Grid
    • Sun, Q.1    Zhang, Y.2    He, H.3    Ma, D.4    Zhang, H.5
  • 47
    • 84892442931
    • A multiagent Q-learning-based optimal allocation approach for urban water resource management system
    • Jan
    • J. Ni, M. Liu, L. Ren, and S. X. Yang, "A multiagent Q-learning-based optimal allocation approach for urban water resource management system," IEEE Trans. Autom. Sci. Eng., vol. 11, no. 1, pp. 204-214, Jan. 2014
    • (2014) IEEE Trans. Autom. Sci. Eng , vol.11 , Issue.1 , pp. 204-214
    • Ni, J.1    Liu, M.2    Ren, L.3    Yang, S.X.4
  • 48
    • 0003787146
    • Princeton NJ USA: Princeton Univ. Press
    • R. E. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton Univ. Press, 1957
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 49
    • 85028172044
    • Adaptive neural network tracking control of uncertain nonlinear discrete-time systems with nonaffine dead-zone input
    • Mar
    • Y.-J. Liu and S. C. Tong, "Adaptive neural network tracking control of uncertain nonlinear discrete-time systems with nonaffine dead-zone input," IEEE Trans. Cybern., vol. 45, no. 3, pp. 497-505, Mar. 2015
    • (2015) IEEE Trans. Cybern , vol.45 , Issue.3 , pp. 497-505
    • Liu, Y.-J.1    Tong, S.C.2
  • 50
    • 84941079390
    • A unified approach to adaptive neural control for nonlinear discrete-time systems with nonlinear dead-zone input
    • Jan
    • Y. J. Liu, Y. Gao, S. C. Tong, and C. L. P. Chen, "A unified approach to adaptive neural control for nonlinear discrete-time systems with nonlinear dead-zone input," IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 1, pp. 139-150, Jan. 2016
    • (2016) IEEE Trans. Neural Netw. Learn. Syst , vol.27 , Issue.1 , pp. 139-150
    • Liu, Y.J.1    Gao, Y.2    Tong, S.C.3    Chen, C.L.P.4
  • 51
    • 84919600707
    • Reinforcement learning design-based adaptive tracking control with less learning parameters for nonlinear discrete-time MIMO systems
    • Jan
    • Y. J. Liu, L. Tang, S. C. Tong, C. L. P. Chen, and D. J. Li, "Reinforcement learning design-based adaptive tracking control with less learning parameters for nonlinear discrete-time MIMO systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 1, pp. 165-176, Jan. 2015
    • (2015) IEEE Trans. Neural Netw. Learn. Syst , vol.26 , Issue.1 , pp. 165-176
    • Liu, Y.J.1    Tang, L.2    Tong, S.C.3    Chen, C.L.P.4    Li, D.J.5
  • 52
    • 84897630989
    • A survey on CPG-inspired control models and system implementation
    • Mar
    • J. Yu, M. Tan, J. Chen, and J. Zhang, "A survey on CPG-inspired control models and system implementation," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 441-456, Mar. 2014
    • (2014) IEEE Trans. Neural Netw. Learn. Syst , vol.25 , Issue.3 , pp. 441-456
    • Yu, J.1    Tan, M.2    Chen, J.3    Zhang, J.4
  • 55
    • 84955579295
    • Finite-horizon near optimal adaptive control of uncertain linear discrete-time systems
    • Q. Zhao, H. Xu, and S. Jagannathan, "Finite-horizon near optimal adaptive control of uncertain linear discrete-time systems," Optimal Control Appl. Methods, vol. 36, no. 6, pp. 853-872, 2015, doi: 10.1002/oca.2143
    • (2015) Optimal Control Appl. Methods , vol.36 , Issue.6 , pp. 853-872
    • Zhao, Q.1    Xu, H.2    Jagannathan, S.3
  • 56
    • 84911399192
    • Stochastic optimal output feedback design for unknown linear discrete-time system zero-sum games under communication constraints
    • Sep
    • H. Xu, S. Jagannathan, and F. L. Lewis, "Stochastic optimal output feedback design for unknown linear discrete-time system zero-sum games under communication constraints," Asian J. Control, vol. 16, no. 5, pp. 1263-1276, Sep. 2014
    • (2014) Asian J. Control , vol.16 , Issue.5 , pp. 1263-1276
    • Xu, H.1    Jagannathan, S.2    Lewis, F.L.3
  • 57
    • 84862815087
    • Stochastic optimal control of unknown linear networked control systems in the presence of random delays and packet losses
    • Jun
    • H. Xu, S. Jagannathan, and F. L. Lewis, "Stochastic optimal control of unknown linear networked control systems in the presence of random delays and packet losses," Automatica, vol. 48, no. 6, pp. 1017-1030, Jun. 2012
    • (2012) Automatica , vol.48 , Issue.6 , pp. 1017-1030
    • Xu, H.1    Jagannathan, S.2    Lewis, F.L.3
  • 58
    • 84946780761
    • Global adaptive dynamic programming for continuous-time nonlinear systems
    • Nov
    • Y. Jiang and Z. P. Jiang, "Global adaptive dynamic programming for continuous-time nonlinear systems," IEEE Trans. Autom. Control, vol. 60, no. 11, pp. 2917-2929, Nov. 2015
    • (2015) IEEE Trans. Autom. Control , vol.60 , Issue.11 , pp. 2917-2929
    • Jiang, Y.1    Jiang, Z.P.2
  • 59
    • 84908120758
    • Adaptive dynamic programming and optimal control of nonlinear nonaffine systems
    • Oct
    • T. Bian, Y. Jiang, and Z.-P. Jiang, "Adaptive dynamic programming and optimal control of nonlinear nonaffine systems," Automatica, vol. 50, no. 10, pp. 2624-2632, Oct. 2014
    • (2014) Automatica , vol.50 , Issue.10 , pp. 2624-2632
    • Bian, T.1    Jiang, Y.2    Jiang, Z.-P.3
  • 60
    • 85028229548
    • Distributed cooperative optimal control for multiagent systems on directed graphs: An inverse optimal approach
    • Jul
    • H. Zhang, T. Feng, G. H. Yang, and H. Liang, "Distributed cooperative optimal control for multiagent systems on directed graphs: An inverse optimal approach," IEEE Trans. Cybern., vol. 45, no. 7, pp. 1315-1326, Jul. 2015
    • (2015) IEEE Trans. Cybern , vol.45 , Issue.7 , pp. 1315-1326
    • Zhang, H.1    Feng, T.2    Yang, G.H.3    Liang, H.4
  • 61
    • 84946811900
    • Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems
    • Mar
    • Q. Wei, D. Liu, and H. Lin, "Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems," IEEE Trans. Cybern., vol. 46, no. 3, pp. 840-853, Mar. 2016
    • (2016) IEEE Trans. Cybern , vol.46 , Issue.3 , pp. 840-853
    • Wei, Q.1    Liu, D.2    Lin, H.3
  • 62
    • 85027953921
    • Infinite horizon self-learning optimal control of nonaffine discrete-time nonlinear systems
    • Apr
    • Q. Wei, D. Liu, and X. Yang, "Infinite horizon self-learning optimal control of nonaffine discrete-time nonlinear systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 4, pp. 866-879, Apr. 2015
    • (2015) IEEE Trans. Neural Netw. Learn. Syst , vol.26 , Issue.4 , pp. 866-879
    • Wei, Q.1    Liu, D.2    Yang, X.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.