SCOPUS Information Search Platform

IEEE Transactions on Neural Networks and Learning Systems

Volume 26, Issue 5, 2015, Pages 916-932

Integral reinforcement learning for continuous-time input-affine nonlinear systems with simultaneous invariant explorations

Authors (3): Lee, Jae Young (a); Park, Jin Bae (a); Choi, Yoon Ho (b)

a Yonsei University (South Korea)

b Kyonggi University (South Korea)

Author keywords

Adaptive optimal control; continuous time (CT); exploration; policy iteration (PI); Q learning; reinforcement learning (RL)

Indexed keywords

ALGORITHMS; CLOSED LOOP SYSTEMS; CONTINUOUS TIME SYSTEMS; ITERATIVE METHODS; LEARNING ALGORITHMS; NATURAL RESOURCES EXPLORATION; NONLINEAR SYSTEMS; NUMERICAL METHODS; OPTIMAL CONTROL SYSTEMS; SOCIAL NETWORKING (ONLINE);

ADAPTIVE OPTIMAL CONTROL; CONTINUOUS-TIME; CONVERGENCE PROPERTIES; EXCITATION CONDITIONS; INPUT-TO-STATE STABILITY; NONLINEAR OPTIMAL CONTROL PROBLEMS; POLICY ITERATION; Q-LEARNING;

REINFORCEMENT LEARNING;

EID: 85027928575
PISSN: 2162237X
EISSN: 21622388
Source Type: Journal
DOI: 10.1109/TNNLS.2014.2328590
Document Type: Article

Times cited: 113

References (36)

1
- 70349116541
- Reinforcement learning and adaptive dynamic programming for feedback control
- F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Jun. 2009.
- (2009) IEEE Circuits Syst. Mag. , vol.9 , Issue.3 , pp. 32-50
- Lewis, F.L.¹ Vrabie, D.²

2
- 84921399937
- J. Si, A. G. Barto, W. B. Powell, and D. Wunsch, Handbook of Learning and Approximate Dynamic Programming. New York, NY, USA: Wiley, 2004.
- (2004) Handbook of Learning and Approximate Dynamic Programming
- Si, J.¹ Barto, A.G.² Powell, W.B.³ Wunsch, D.⁴

3
- 0004102479
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

4
- 47349092417
- W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality. New York, NY, USA: Wiley, 2007.
- (2007) Approximate Dynamic Programming: Solving the Curses of Dimensionality
- Powell, W.B.¹

5
- 34249833101
- Q-learning
- C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, nos. 3-4, pp. 279-292, 1992.
- (1992) Mach. Learn. , vol.8 , Issue.3-4 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

6
- 0029679044
- Reinforcement learning: A survey
- L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, May 1996.
- (1996) J. Artif. Intell. Res. , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

7
- 0018441647
- An approximation theory of optimal control for trainable manipulators
- G. N. Saridis and C.-S. G. Lee, "An approximation theory of optimal control for trainable manipulators," IEEE Trans. Syst., Man Cybern., vol. 9, no. 3, pp. 152-159, Mar. 1979.
- (1979) IEEE Trans. Syst., Man Cybern. , vol.9 , Issue.3 , pp. 152-159
- Saridis, G.N.¹ Lee, C.-S.G.²

8
- 0031332446
- Galerkin approximation of the generalized Hamilton-Jacobi equation
- R. W. Beard, G. N. Saridis, and J. T. Wen, "Galerkin approximation of the generalized Hamilton-Jacobi equation," Automatica, vol. 33, no. 12, pp. 2159-2177, 1997.
- (1997) Automatica , vol.33 , Issue.12 , pp. 2159-2177
- Beard, R.W.¹ Saridis, G.N.² Wen, J.T.³

9
- 0003785722
- R. W. Beard, "Improving the closed-loop performance of nonlinear systems," Ph.D. dissertation, Dept. Electr. Eng., Rensselaer Polytechnic Institute, Troy, NY, USA, 1995.
- (1995) Improving the Closed-Loop Performance of Nonlinear Systems
- Beard, R.W.¹

10
- 85028203812
- Invariantly admissible policy iteration for a class of nonlinear optimal control problems
- J. Y. Lee, J. B. Park, and Y. H. Choi. (2014, Apr.). Invariantly admissible policy iteration for a class of nonlinear optimal control problems. Syst. Control Lett. [Online]. Available: http://arxiv.org/abs/1402.4187
- (2014) Syst. Control Lett
- Lee, J.Y.¹ Park, J.B.² Choi, Y.H.³

11
- 0036588686
- Adaptive dynamic programming
- J. J. Murray, C. J. Cox, G. G. Lendaris, and R. Saeks, "Adaptive dynamic programming," IEEE Trans. Syst., Man, Cybern., Appl. Rev., vol. 32, no. 2, pp. 140-153, May 2002.
- (2002) IEEE Trans. Syst., Man, Cybern., Appl. Rev. , vol.32 , Issue.2 , pp. 140-153
- Murray, J.J.¹ Cox, C.J.² Lendaris, G.G.³ Saeks, R.⁴

12
- 67349145396
- Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
- D. Vrabie and F. Lewis, "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems," Neural Netw., vol. 22, no. 3, pp. 237-246, 2009.
- (2009) Neural Netw. , vol.22 , Issue.3 , pp. 237-246
- Vrabie, D.¹ Lewis, F.²

13
- 84867400046
- Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems
- J. Y. Lee, J. B. Park, and Y. H. Choi, "Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems," Automatica, vol. 48, no. 11, pp. 2850-2859, 2012.
- (2012) Automatica , vol.48 , Issue.11 , pp. 2850-2859
- Lee, J.Y.¹ Park, J.B.² Choi, Y.H.³

14
- 84865092901
- Integral reinforcement learning with explorations for continuous-time nonlinear systems
- J. Y. Lee, J. B. Park, and Y. H. Choi, "Integral reinforcement learning with explorations for continuous-time nonlinear systems," in Proc. Int. Joint Conf. Neural Netw. (IJCNN), 2012, pp. 1042-1047.
- (2012) Proc. Int. Joint Conf. Neural Netw. (IJCNN) , pp. 1042-1047
- Lee, J.Y.¹ Park, J.B.² Choi, Y.H.³

15
- 84865467087
- Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
- Y. Jiang and Z.-P. Jiang, "Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics," Automatica, vol. 48, no. 10, pp. 2699-2704, 2012.
- (2012) Automatica , vol.48 , Issue.10 , pp. 2699-2704
- Jiang, Y.¹ Jiang, Z.-P.²

16
- 0028584964
- Adaptive linear quadratic control using policy iteration
- S. J. Bradtke, B. E. Ydstie, and A. G. Barto, "Adaptive linear quadratic control using policy iteration," in Proc. Amer. Control Conf. (ACC), vol. 3. Jul. 1994, pp. 3475-3479.
- (1994) Proc. Amer. Control Conf. (ACC) , vol.3 , pp. 3475-3479
- Bradtke, S.J.¹ Ydstie, B.E.² Barto, A.G.³

17
- 84893829511
- On integral generalized policy iteration for continuous-time linear quadratic regulations
- J. Y. Lee, J. B. Park, and Y. H. Choi, "On integral generalized policy iteration for continuous-time linear quadratic regulations," Automatica, vol. 50, no. 2, pp. 475-489, 2014.
- (2014) Automatica , vol.50 , Issue.2 , pp. 475-489
- Lee, J.Y.¹ Park, J.B.² Choi, Y.H.³

18
- 33846781129
- Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
- A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control," Automatica, vol. 43, no. 3, pp. 473-481, 2007.
- (2007) Automatica , vol.43 , Issue.3 , pp. 473-481
- Al-Tamimi, A.¹ Lewis, F.L.² Abu-Khalaf, M.³

19
- 0002031779
- Approximate dynamic programming for real-time control and neural modeling
- P. J. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, 1992, ch. 13, pp. 493-525.
- (1992) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches , pp. 493-525
- Werbos, P.J.¹

20
- 0031236002
- Adaptive critic designs
- D. V. Prokhorov and D. C. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997.
- (1997) IEEE Trans. Neural Netw. , vol.8 , Issue.5 , pp. 997-1007
- Prokhorov, D.V.¹ Wunsch, D.C.²

21
- 77953770221
- D. Vrabie, "Online adaptive optimal control for continuous-time systems," Ph.D. dissertation, Dept. Electr. Eng., Univ. Texas Arlington, Arlington, TX, USA, 2010.
- (2010) Online Adaptive Optimal Control for Continuous-Time Systems
- Vrabie, D.¹

22
- 77950630017
- Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
- K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, 2010.
- (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
- Vamvoudakis, K.G.¹ Lewis, F.L.²

23
- 84939468993
- Online adaptive algorithm for optimal control with integral reinforcement learning
- K. G. Vamvoudakis, D. Vrabie, and F. L. Lewis, "Online adaptive algorithm for optimal control with integral reinforcement learning," Int. J. Robust Nonlinear Control, 2013, doi: 10.1002/rnc.3018.
- (2013) Int. J. Robust Nonlinear Control
- Vamvoudakis, K.G.¹ Vrabie, D.² Lewis, F.L.³

24
- 84871319455
- A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
- S. Bhasin, R. Kamalapurkar, M. Johnson, K. G. Vamvoudakis, F. L. Lewis, and W. E. Dixon, "A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems," Automatica, vol. 49, no. 1, pp. 82-92, 2013.
- (2013) Automatica , vol.49 , Issue.1 , pp. 82-92
- Bhasin, S.¹ Kamalapurkar, R.² Johnson, M.³ Vamvoudakis, K.G.⁴ Lewis, F.L.⁵ Dixon, W.E.⁶

25
- 84885176157
- Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks
- H. Modares, F. L. Lewis, and M.-B. Naghibi-Sistani, "Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 10, pp. 1513-1525, Oct. 2013.
- (2013) IEEE Trans. Neural Netw. Learn. Syst. , vol.24 , Issue.10 , pp. 1513-1525
- Modares, H.¹ Lewis, F.L.² Naghibi-Sistani, M.-B.³

26
- 0004163205
- F. L. Lewis and V. L. Syrmos, Optimal Control. New York, NY, USA: Wiley, 1995.
- (1995) Optimal Control
- Lewis, F.L.¹ Syrmos, V.L.²

27
- 24644490744
- D. E. Kirk, Optimal Control Theory: An Introduction. New York, NY, USA: Dover, 2004.
- (2004) Optimal Control Theory: An Introduction
- Kirk, D.E.¹

28
- 85012688561
- R. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton Univ., 1957.
- (1957) Dynamic Programming.
- Bellman, R.¹

29
- 84876942592
- Discrete-time neural inverse optimal control for nonlinear systems via passivation
- F. Ornelas-Tellez, E. N. Sanchez, and A. G. Loukianov, "Discrete-time neural inverse optimal control for nonlinear systems via passivation," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 8, pp. 1327-1339, Aug. 2012.
- (2012) IEEE Trans. Neural Netw. Learn. Syst. , vol.23 , Issue.8 , pp. 1327-1339
- Ornelas-Tellez, F.¹ Sanchez, E.N.² Loukianov, A.G.³

30
- 84858698937
- Inverse optimal neural control of a class of nonlinear systems with constrained inputs for trajectory tracking
- L. J. Ricalde and E. N. Sanchez, "Inverse optimal neural control of a class of nonlinear systems with constrained inputs for trajectory tracking," Optim. Control Appl. Methods, vol. 33, no. 2, pp. 176-198, 2012.
- (2012) Optim. Control Appl. Methods , vol.33 , Issue.2 , pp. 176-198
- Ricalde, L.J.¹ Sanchez, E.N.²

31
- 0022875454
- Adaptive stabilization of linear systems via switching control
- M. Fu and B. Barmish, "Adaptive stabilization of linear systems via switching control," IEEE Trans. Autom. Control, vol. 31, no. 12, pp. 1097-1103, Dec. 1986.
- (1986) IEEE Trans. Autom. Control , vol.31 , Issue.12 , pp. 1097-1103
- Fu, M.¹ Barmish, B.²

32
- 77955518498
- Control of unknown nonlinear systems with efficient transient performance using concurrent exploitation and exploration
- E. B. Kosmatopoulos, "Control of unknown nonlinear systems with efficient transient performance using concurrent exploitation and exploration," IEEE Trans. Neural Netw., vol. 21, no. 8, pp. 1245-1261, Aug. 2010.
- (2010) IEEE Trans. Neural Netw. , vol.21 , Issue.8 , pp. 1245-1261
- Kosmatopoulos, E.B.¹

33
- 0004106917
- P. A. Ioannou and J. Sun, Stable and Robust Adaptive Control. Englewood Cliffs, NJ, USA: Prentice-Hall, 1995.
- (1995) Stable and Robust Adaptive Control.
- Ioannou, P.A.¹ Sun, J.²

34
- 0004178386
- H. K. Khalil, Nonlinear Systems. Englewood Cliffs, NJ, USA: Prentice-Hall, 2002.
- (2002) Nonlinear Systems.
- Khalil, H.K.¹

35
- 62949149213
- Constrained nonlinear optimal control: A converse HJB approach
- V. Nevistić and J. A. Primbs, "Constrained nonlinear optimal control: A converse HJB approach," Dept. Control & Dynamical Syst., California Inst. Technology, Pasadena, CA, USA, Tech. Rep. CIT-CDS 96-021, 1996.
- (1996) Dept. Control & Dynamical Syst., California Inst. Technology
- Nevistić, V.¹ Primbs, J.A.²

36
- 84914965022
- On an iterative technique for Riccati equation computations
- D. Kleinman, "On an iterative technique for Riccati equation computations," IEEE Trans. Autom. Control, vol. 13, no. 1, pp. 114-115, Feb. 1968.
- (1968) IEEE Trans. Autom. Control , vol.13 , Issue.1 , pp. 114-115
- Kleinman, D.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.