Volume 71, 2015, Pages 150-158

Reinforcement learning solution for HJB equation arising in constrained optimal control problem

Author keywords

Constrained optimal control; Data based; Hamilton Jacobi Bellman equation; Off policy reinforcement learning; The method of weighted residuals

Indexed keywords

CONTROL THEORY; DYNAMIC PROGRAMMING; OPTIMAL CONTROL SYSTEMS; TELECOMMUNICATION NETWORKS

EID: 84941097144     PISSN: 08936080     EISSN: 18792782     Source Type: Journal    
DOI: 10.1016/j.neunet.2015.08.007     Document Type: Article
Times cited: 110

References (43)
  • 1
    • 14844340822
    • Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
    • Abu-Khalaf M., Lewis F.L. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 2005, 41(5):779-791.
    • (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
    • Abu-Khalaf, M.1    Lewis, F.L.2
  • 2
    • 48949116222
    • Neurodynamic programming and zero-sum games for constrained control systems
    • Abu-Khalaf M., Lewis F.L., Huang J. Neurodynamic programming and zero-sum games for constrained control systems. IEEE Transactions on Neural Networks 2008, 19(7):1243-1252.
    • (2008) IEEE Transactions on Neural Networks , vol.19 , Issue.7 , pp. 1243-1252
    • Abu-Khalaf, M.1    Lewis, F.L.2    Huang, J.3
  • 4
    • 0031332446
    • Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
    • Beard R.W., Saridis G.N., Wen J.T. Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation. Automatica 1997, 33(12):2159-2177.
    • (1997) Automatica , vol.33 , Issue.12 , pp. 2159-2177
    • Beard, R.W.1    Saridis, G.N.2    Wen, J.T.3
  • 6
    • 39549085591
    • Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete-time systems
    • Chen Z., Jagannathan S. Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete-time systems. IEEE Transactions on Neural Networks 2008, 19(1):90-106.
    • (2008) IEEE Transactions on Neural Networks , vol.19 , Issue.1 , pp. 90-106
    • Chen, Z.1    Jagannathan, S.2
  • 7
    • 0033629916
    • Reinforcement learning in continuous time and space
    • Doya K. Reinforcement learning in continuous time and space. Neural Computation 2000, 12(1):219-245.
    • (2000) Neural Computation , vol.12 , Issue.1 , pp. 219-245
    • Doya, K.1
  • 10
    • 34047138362
    • Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints
    • He P., Jagannathan S. Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 2007, 37(2):425-436.
    • (2007) IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics , vol.37 , Issue.2 , pp. 425-436
    • He, P.1    Jagannathan, S.2
  • 11
    • 84880065287
    • Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics
    • Heydari A., Balakrishnan S.N. Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics. IEEE Transactions on Neural Networks and Learning Systems 2013, 24(1):147-157.
    • (2013) IEEE Transactions on Neural Networks and Learning Systems , vol.24 , Issue.1 , pp. 147-157
    • Heydari, A.1    Balakrishnan, S.N.2
  • 13
    • 84865467087
    • Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
    • Jiang Y., Jiang Z.-P. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica 2012, 48(10):2699-2704.
    • (2012) Automatica , vol.48 , Issue.10 , pp. 2699-2704
    • Jiang, Y.1    Jiang, Z.-P.2
  • 14
    • 84899471403
    • Robust adaptive dynamic programming and feedback stabilization of nonlinear systems
    • Jiang Y., Jiang Z.-P. Robust adaptive dynamic programming and feedback stabilization of nonlinear systems. IEEE Transactions on Neural Networks and Learning Systems 2014, 25(5):882-893.
    • (2014) IEEE Transactions on Neural Networks and Learning Systems , vol.25 , Issue.5 , pp. 882-893
    • Jiang, Y.1    Jiang, Z.-P.2
  • 15
    • 84867400046
    • Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems
    • Lee J.Y., Park J.B., Choi Y.H. Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems. Automatica 2012, 48(11):2850-2859.
    • (2012) Automatica , vol.48 , Issue.11 , pp. 2850-2859
    • Lee, J.Y.1    Park, J.B.2    Choi, Y.H.3
  • 16
    • 68249144754
    • Adaptive dynamic programming approach to experience-based systems identification and control
    • Lendaris G.G. Adaptive dynamic programming approach to experience-based systems identification and control. Neural Networks 2009, 22(5-6):822-832.
    • (2009) Neural Networks , vol.22 , Issue.5-6 , pp. 822-832
    • Lendaris, G.G.1
  • 18
    • 84893640946
    • Decentralized stabilization for a class of continuous-time nonlinear interconnected systems using online learning optimal control approach
    • Liu D., Wang D., Li H. Decentralized stabilization for a class of continuous-time nonlinear interconnected systems using online learning optimal control approach. IEEE Transactions on Neural Networks and Learning Systems 2014, 25(2):418-428.
    • (2014) IEEE Transactions on Neural Networks and Learning Systems , vol.25 , Issue.2 , pp. 418-428
    • Liu, D.1    Wang, D.2    Li, H.3
  • 19
    • 84868467610
    • An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs
    • Liu D., Wang D., Yang X. An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs. Information Sciences 2013, 220:331-342.
    • (2013) Information Sciences , vol.220 , pp. 331-342
    • Liu, D.1    Wang, D.2    Yang, X.3
  • 20
    • 84897594646
    • Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems
    • Liu D., Wei Q. Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems. IEEE Transactions on Neural Networks and Learning Systems 2014, 25(3):621-634.
    • (2014) IEEE Transactions on Neural Networks and Learning Systems , vol.25 , Issue.3 , pp. 621-634
    • Liu, D.1    Wei, Q.2
  • 22
    • 84919448289
    • Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design
    • Luo B., Wu H.-N., Huang T., Liu D. Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design. Automatica 2014, 50(12):3281-3290.
    • (2014) Automatica , vol.50 , Issue.12 , pp. 3281-3290
    • Luo, B.1    Wu, H.-N.2    Huang, T.3    Liu, D.4
  • 23
    • 0030392685
    • Constrained optimization and control of nonlinear systems: new results in optimal control
    • Lyashevskiy S. Constrained optimization and control of nonlinear systems: new results in optimal control. Proceedings of the 35th IEEE Conference on Decision and Control 1996, 541-546.
    • (1996) Proceedings of the 35th IEEE Conference on Decision and Control , pp. 541-546
    • Lyashevskiy, S.1
  • 24
    • 84881324637
    • Optimal control of nonlinear continuous-time systems: design of bounded controllers via generalized nonquadratic functionals
    • Lyshevski S.E. Optimal control of nonlinear continuous-time systems: design of bounded controllers via generalized nonquadratic functionals. Proceedings of the 1998 American control conference. Vol. 1 1998, 205-209. IEEE.
    • (1998) Proceedings of the 1998 American control conference. Vol. 1 , pp. 205-209
    • Lyshevski, S.E.1
  • 26
    • 84908432682
    • Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning
    • Modares H., Lewis F.L. Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning. IEEE Transactions on Automatic Control 2014, 59(11):3051-3056.
    • (2014) IEEE Transactions on Automatic Control , vol.59 , Issue.11 , pp. 3051-3056
    • Modares, H.1    Lewis, F.L.2
  • 28
    • 84893708995
    • Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems
    • Modares H., Lewis F.L., Naghibi-Sistani M.-B. Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems. Automatica 2014, 50(1):193-202.
    • (2014) Automatica , vol.50 , Issue.1 , pp. 193-202
    • Modares, H.1    Lewis, F.L.2    Naghibi-Sistani, M.-B.3
  • 30
    • 0011636441
    • A new algorithm for adaptive multidimensional integration
    • Lepage G.P. A new algorithm for adaptive multidimensional integration. Journal of Computational Physics 1978, 27(2):192-203.
    • (1978) Journal of Computational Physics , vol.27 , Issue.2 , pp. 192-203
    • Lepage, G.P.1
  • 35
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • Vamvoudakis K.G., Lewis F.L. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 2010, 46(5):878-888.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 36
    • 67349145396
    • Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
    • Vrabie D., Lewis F.L. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Networks 2009, 22(3):237-246.
    • (2009) Neural Networks , vol.22 , Issue.3 , pp. 237-246
    • Vrabie, D.1    Lewis, F.L.2
  • 37
    • 58349110975
    • Adaptive optimal control for continuous-time linear systems based on policy iteration
    • Vrabie D., Pastravanu O., Abu-Khalaf M., Lewis F.L. Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica 2009, 45(2):477-484.
    • (2009) Automatica , vol.45 , Issue.2 , pp. 477-484
    • Vrabie, D.1    Pastravanu, O.2    Abu-Khalaf, M.3    Lewis, F.L.4
  • 38
    • 84898803345
    • Policy iteration algorithm for online design of robust control for a class of continuous-time nonlinear systems
    • Wang D., Liu D., Li H. Policy iteration algorithm for online design of robust control for a class of continuous-time nonlinear systems. IEEE Transactions on Automation Science and Engineering 2014, 11(2):627-632.
    • (2014) IEEE Transactions on Automation Science and Engineering , vol.11 , Issue.2 , pp. 627-632
    • Wang, D.1    Liu, D.2    Li, H.3
  • 39
    • 84862811062
    • An iterative ε-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state
    • Wei Q., Liu D. An iterative ε-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state. Neural Networks 2012, 32:236-244.
    • (2012) Neural Networks , vol.32 , pp. 236-244
    • Wei, Q.1    Liu, D.2
  • 40
    • 84893949931
    • Reinforcement learning for adaptive optimal control of unknown continuous-time nonlinear systems with input constraints
    • Yang X., Liu D., Wang D. Reinforcement learning for adaptive optimal control of unknown continuous-time nonlinear systems with input constraints. International Journal of Control 2014, 87(3):553-566.
    • (2014) International Journal of Control , vol.87 , Issue.3 , pp. 553-566
    • Yang, X.1    Liu, D.2    Wang, D.3
  • 41
    • 84897950099
    • Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning
    • Yang X., Liu D., Wang D., Wei Q. Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning. Neural Networks 2014, 55:30-41.
    • (2014) Neural Networks , vol.55 , pp. 30-41
    • Yang, X.1    Liu, D.2    Wang, D.3    Wei, Q.4
  • 42
    • 70349253929
    • Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints
    • Zhang H., Luo Y., Liu D. Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints. IEEE Transactions on Neural Networks 2009, 20(9):1490-1503.
    • (2009) IEEE Transactions on Neural Networks , vol.20 , Issue.9 , pp. 1490-1503
    • Zhang, H.1    Luo, Y.2    Liu, D.3
  • 43
    • 84921361841
    • Near optimal output feedback control of nonlinear discrete-time systems based on reinforcement neural network learning
    • Zhao Q., Xu H., Jagannathan S. Near optimal output feedback control of nonlinear discrete-time systems based on reinforcement neural network learning. IEEE/CAA Journal of Automatica Sinica 2014, 1(4):372-384.
    • (2014) IEEE/CAA Journal of Automatica Sinica , vol.1 , Issue.4 , pp. 372-384
    • Zhao, Q.1    Xu, H.2    Jagannathan, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.