



Volume 26, Issue 5, 2015, Pages 916-932

Integral reinforcement learning for continuous-time input-affine nonlinear systems with simultaneous invariant explorations

Author keywords

Adaptive optimal control; continuous time (CT); exploration; policy iteration (PI); Q learning; reinforcement learning (RL)

Indexed keywords

ALGORITHMS; CLOSED LOOP SYSTEMS; CONTINUOUS TIME SYSTEMS; ITERATIVE METHODS; LEARNING ALGORITHMS; NATURAL RESOURCES EXPLORATION; NONLINEAR SYSTEMS; NUMERICAL METHODS; OPTIMAL CONTROL SYSTEMS; SOCIAL NETWORKING (ONLINE)

EID: 85027928575     PISSN: 2162237X     EISSN: 21622388     Source Type: Journal    
DOI: 10.1109/TNNLS.2014.2328590     Document Type: Article
Times cited: 113

References (36)
  • 1
    • 70349116541
    • Reinforcement learning and adaptive dynamic programming for feedback control
    • F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Jun. 2009.
    • (2009) IEEE Circuits Syst. Mag. , vol.9 , Issue.3 , pp. 32-50
    • Lewis, F.L.1    Vrabie, D.2
  • 5
    • 34249833101
    • Q-learning
    • C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, nos. 3-4, pp. 279-292, 1992.
    • (1992) Mach. Learn. , vol.8 , Issue.3-4 , pp. 279-292
    • Watkins, C.J.C.H.1    Dayan, P.2
  • 7
    • 0018441647
    • An approximation theory of optimal control for trainable manipulators
    • G. N. Saridis and C.-S. G. Lee, "An approximation theory of optimal control for trainable manipulators," IEEE Trans. Syst., Man Cybern., vol. 9, no. 3, pp. 152-159, Mar. 1979.
    • (1979) IEEE Trans. Syst., Man Cybern. , vol.9 , Issue.3 , pp. 152-159
    • Saridis, G.N.1    Lee, C.-S.G.2
  • 8
    • 0031332446
    • Galerkin approximation of the generalized Hamilton-Jacobi equation
    • R. W. Beard, G. N. Saridis, and J. T. Wen, "Galerkin approximation of the generalized Hamilton-Jacobi equation," Automatica, vol. 33, no. 12, pp. 2159-2177, 1996.
    • (1996) Automatica , vol.33 , Issue.12 , pp. 2159-2177
    • Beard, R.W.1    Saridis, G.N.2    Wen, J.T.3
  • 10
    • 85028203812
    • Invariantly admissible policy iteration for a class of nonlinear optimal control problems
    • J. Y. Lee, J. B. Park, and Y. H. Choi. (2014, Apr.). Invariantly admissible policy iteration for a class of nonlinear optimal control problems. Syst. Control Lett. [Online]. Available: http://arxiv.org/abs/1402.4187
    • (2014) Syst. Control Lett
    • Lee, J.Y.1    Park, J.B.2    Choi, Y.H.3
  • 12
    • 67349145396
    • Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
    • D. Vrabie and F. Lewis, "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems," Neural Netw., vol. 22, no. 3, pp. 237-246, 2009.
    • (2009) Neural Netw. , vol.22 , Issue.3 , pp. 237-246
    • Vrabie, D.1    Lewis, F.2
  • 13
    • 84867400046
    • Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems
    • J. Y. Lee, J. B. Park, and Y. H. Choi, "Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems," Automatica, vol. 48, no. 11, pp. 2850-2859, 2012.
    • (2012) Automatica , vol.48 , Issue.11 , pp. 2850-2859
    • Lee, J.Y.1    Park, J.B.2    Choi, Y.H.3
  • 14
    • 84865092901
    • Integral reinforcement learning with explorations for continuous-time nonlinear systems
    • J. Y. Lee, J. B. Park, and Y. H. Choi, "Integral reinforcement learning with explorations for continuous-time nonlinear systems," in Proc. Int. Joint Conf. Neural Netw. (IJCNN), 2012, pp. 1042-1047.
    • (2012) Proc. Int. Joint Conf. Neural Netw. (IJCNN) , pp. 1042-1047
    • Lee, J.Y.1    Park, J.B.2    Choi, Y.H.3
  • 15
    • 84865467087
    • Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
    • Y. Jiang and Z.-P. Jiang, "Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics," Automatica, vol. 48, no. 10, pp. 2699-2704, 2012.
    • (2012) Automatica , vol.48 , Issue.10 , pp. 2699-2704
    • Jiang, Y.1    Jiang, Z.-P.2
  • 16
    • 0028584964
    • Adaptive linear quadratic control using policy iteration
    • S. J. Bradtke, B. E. Ydstie, and A. G. Barto, "Adaptive linear quadratic control using policy iteration," in Proc. Amer. Control Conf. (ACC), vol. 3. Jul. 1994, pp. 3475-3479.
    • (1994) Proc. Amer. Control Conf. (ACC) , vol.3 , pp. 3475-3479
    • Bradtke, S.J.1    Ydstie, B.E.2    Barto, A.G.3
  • 17
    • 84893829511
    • On integral generalized policy iteration for continuous-time linear quadratic regulations
    • J. Y. Lee, J. B. Park, and Y. H. Choi, "On integral generalized policy iteration for continuous-time linear quadratic regulations," Automatica, vol. 50, no. 2, pp. 475-489, 2014.
    • (2014) Automatica , vol.50 , Issue.2 , pp. 475-489
    • Lee, J.Y.1    Park, J.B.2    Choi, Y.H.3
  • 18
    • 33846781129
    • Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
    • A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control," Automatica, vol. 43, no. 3, pp. 473-481, 2007.
    • (2007) Automatica , vol.43 , Issue.3 , pp. 473-481
    • Al-Tamimi, A.1    Lewis, F.L.2    Abu-Khalaf, M.3
  • 19
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • P. J. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, 1992, ch. 13, pp. 493-525.
    • (1992) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches , pp. 493-525
    • Werbos, P.J.1
  • 20
  • 22
    • 77950630017
    • Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    • K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, 2010.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
    • Vamvoudakis, K.G.1    Lewis, F.L.2
  • 23
    • 84939468993
    • Online adaptive algorithm for optimal control with integral reinforcement learning
    • K. G. Vamvoudakis, D. Vrabie, and F. L. Lewis, "Online adaptive algorithm for optimal control with integral reinforcement learning," Int. J. Robust Nonlinear Control, 2013, doi: 10.1002/rnc.3018.
    • (2013) Int. J. Robust Nonlinear Control
    • Vamvoudakis, K.G.1    Vrabie, D.2    Lewis, F.L.3
  • 24
    • 84871319455
    • A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
    • S. Bhasin, R. Kamalapurkar, M. Johnson, K. G. Vamvoudakis, F. L. Lewis, and W. E. Dixon, "A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems," Automatica, vol. 49, no. 1, pp. 82-92, 2013.
    • (2013) Automatica , vol.49 , Issue.1 , pp. 82-92
    • Bhasin, S.1    Kamalapurkar, R.2    Johnson, M.3    Vamvoudakis, K.G.4    Lewis, F.L.5    Dixon, W.E.6
  • 25
    • 84885176157
    • Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks
    • H. Modares, F. L. Lewis, and M.-B. Naghibi-Sistani, "Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 10, pp. 1513-1525, Oct. 2013.
    • (2013) IEEE Trans. Neural Netw. Learn. Syst. , vol.24 , Issue.10 , pp. 1513-1525
    • Modares, H.1    Lewis, F.L.2    Naghibi-Sistani, M.-B.3
  • 28
  • 29
    • 84876942592
    • Discrete-time neural inverse optimal control for nonlinear systems via passivation
    • F. Ornelas-Tellez, E. N. Sanchez, and A. G. Loukianov, "Discrete-time neural inverse optimal control for nonlinear systems via passivation," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 8, pp. 1327-1339, Aug. 2012.
    • (2012) IEEE Trans. Neural Netw. Learn. Syst. , vol.23 , Issue.8 , pp. 1327-1339
    • Ornelas-Tellez, F.1    Sanchez, E.N.2    Loukianov, A.G.3
  • 30
    • 84858698937
    • Inverse optimal neural control of a class of nonlinear systems with constrained inputs for trajectory tracking
    • L. J. Ricalde and E. N. Sanchez, "Inverse optimal neural control of a class of nonlinear systems with constrained inputs for trajectory tracking," Optim. Control Appl. Methods, vol. 33, no. 2, pp. 176-198, 2012.
    • (2012) Optim. Control Appl. Methods , vol.33 , Issue.2 , pp. 176-198
    • Ricalde, L.J.1    Sanchez, E.N.2
  • 31
    • 0022875454
    • Adaptive stabilization of linear systems via switching control
    • M. Fu and B. Barmish, "Adaptive stabilization of linear systems via switching control," IEEE Trans. Autom. Control, vol. 31, no. 12, pp. 1097-1103, Dec. 1986.
    • (1986) IEEE Trans. Autom. Control , vol.31 , Issue.12 , pp. 1097-1103
    • Fu, M.1    Barmish, B.2
  • 32
    • 77955518498
    • Control of unknown nonlinear systems with efficient transient performance using concurrent exploitation and exploration
    • E. B. Kosmatopoulos, "Control of unknown nonlinear systems with efficient transient performance using concurrent exploitation and exploration," IEEE Trans. Neural Netw., vol. 21, no. 8, pp. 1245-1261, Aug. 2010.
    • (2010) IEEE Trans. Neural Netw. , vol.21 , Issue.8 , pp. 1245-1261
    • Kosmatopoulos, E.B.1
  • 34
    • 0004178386
    • H. K. Khalil, Nonlinear Systems. Englewood Cliffs, NJ, USA: Prentice-Hall, 2002.
    • (2002) Nonlinear Systems.
    • Khalil, H.K.1
  • 36
    • 84914965022
    • On an iterative technique for Riccati equation computations
    • D. Kleinman, "On an iterative technique for Riccati equation computations," IEEE Trans. Autom. Control, vol. 13, no. 1, pp. 114-115, Feb. 1968.
    • (1968) IEEE Trans. Autom. Control , vol.13 , Issue.1 , pp. 114-115
    • Kleinman, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.