SCOPUS Information Retrieval Platform

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Volume 42, Issue 2, 2012, Pages 377-390

Reinforcement learning controller design for affine nonlinear discrete-time systems using online approximators

(2) Yang, Qinmin a Jagannathan, Sarangapani b

a STATE KEY LABORATORY OF INDUSTRIAL CONTROL TECHNOLOGY (China)

b Architectural and Environmental Engineering (United States)

Author keywords

Adaptive critic; dynamic programming (DP); Lyapunov method; neural networks (NNs); online approximators (OLAs); online learning; reinforcement learning

Indexed keywords

ACTION NETWORK; ADAPTIVE CRITIC; APPROXIMATORS; BALANCING SYSTEM; BOUNDED DISTURBANCES; CONTROLLER DESIGNS; CRITIC NETWORK; HEURISTIC DYNAMIC PROGRAMMING; LEARNING CONTROLLERS; LYAPUNOV; LYAPUNOV THEORIES; MULTI-OUTPUT; MULTIINPUT; NONLINEAR DISCRETE-TIME SYSTEMS; ONLINE LEARNING; OPTIMAL SIGNALS; OUTPUT-FEEDBACK; RADIAL BASIS FUNCTIONS; RECURSIVE EQUATIONS; SEPARATION PRINCIPLE; SYSTEM STATE; TWO-LINK; UNIFORM ULTIMATE BOUNDEDNESS;

DIGITAL CONTROL SYSTEMS; DISCRETE TIME CONTROL SYSTEMS; DYNAMIC PROGRAMMING; FUZZY LOGIC; LYAPUNOV METHODS; NEURAL NETWORKS; NONLINEAR FEEDBACK; ONLINE SYSTEMS; RADIAL BASIS FUNCTION NETWORKS; REINFORCEMENT LEARNING; STATE FEEDBACK;

SEARCH ENGINES;

EID: 84859001250 PISSN: 10834419 EISSN: None Source Type: Journal
DOI: 10.1109/TSMCB.2011.2166384 Document Type: Article

Times cited: (183)

References (31)

1
- 84954213764
- Princeton, NJ: Princeton Univ. Press
- R. Bellman and S. Dreyfus, Applied Dynamic Programming. Princeton, NJ: Princeton Univ. Press, 1962.
- (1962) Applied Dynamic Programming
- Bellman, R.¹ Dreyfus, S.²

2
- 0039319294
- Suboptimal design of intentionally nonlinear controllers
- Oct
- Z. V. Rekasius, "Suboptimal design of intentionally nonlinear controllers," IEEE Trans. Autom. Control, vol. AC-9, no. 4, pp. 380-386, Oct. 1964.
- (1964) IEEE Trans. Autom. Control , vol.AC-9 , Issue.4 , pp. 380-386
- Rekasius, Z.V.¹

3
- 0002526302
- Construction of suboptimal control sequences
- R. J. Leake and R. Liu, "Construction of suboptimal control sequences," SIAM J. Control Optim., vol. 5, no. 1, pp. 54-63, 1967.
- (1967) SIAM J. Control Optim. , vol.5 , Issue.1 , pp. 54-63
- Leake, R.J.¹ Liu, R.²

4
- 0003448648
- Englewood Cliffs, NJ: Prentice-Hall
- D. Kirk, Optimal Control Theory: An Introduction. Englewood Cliffs, NJ: Prentice-Hall, 1970.
- (1970) Optimal Control Theory: An Introduction
- Kirk, D.¹

5
- 0004025786
- Philadelphia, PA: Taylor & Francis
- F. L. Lewis, S. Jagannathan, and A. Yesildirek, Neural Network Control of Robot Manipulators and Nonlinear Systems. Philadelphia, PA: Taylor & Francis, 1999.
- (1999) Neural Network Control of Robot Manipulators and Nonlinear Systems
- Lewis, F.L.¹ Jagannathan, S.² Yesildirek, A.³

6
- 0035273403
- On-line learning control by association and reinforcement
- Mar
- J. Si and Y. T. Wang, "On-line learning control by association and reinforcement," IEEE Trans. Neural Netw., vol. 12, no. 2, pp. 264-276, Mar. 2001.
- (2001) IEEE Trans. Neural Netw. , vol.12 , Issue.2 , pp. 264-276
- Si, J.¹ Wang, Y.T.²

7
- 0007908166
- Univ. Massachusetts, Amherst, MA, COINS Tech. Rep. 96-88 Dec
- J. C. Santamaria, R. S. Sutton, and A. Ram, "Experiments with reinforcement learning in problems with continuous state and action spaces," Univ. Massachusetts, Amherst, MA, COINS Tech. Rep. 96-88, Dec. 1996.
- (1996) Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces
- Santamaria, J.C.¹ Sutton, R.S.² Ram, A.³

8
- 0004040219
- Boca Raton, FL: CRC Press
- R. Luus, Iterative Dynamic Programming. Boca Raton, FL: CRC Press, 2000.
- (2000) Iterative Dynamic Programming
- Luus, R.¹

9
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Sep./Oct
- A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, no. 5, pp. 834-847, Sep./Oct. 1983.
- (1983) IEEE Trans. Syst., Man, Cybern. , vol.SMC-13 , Issue.5 , pp. 834-847
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

10
- 33847202724
- Learning to predict by the methods of temporal differences
- Aug
- R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, Aug. 1988.
- (1988) Mach. Learn. , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.S.¹

11
- 34249833101
- Q-learning
- May
- C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3/4, pp. 279-292, May 1992.
- (1992) Mach. Learn. , vol.8 , Issue.3-4 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

12
- 0004102479
- Cambridge: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

13
- 0030675610
- Efficient reinforcement learning: Model-based acrobot control
- Albuquerque, NM
- G. Boone, "Efficient reinforcement learning: Model-based acrobot control," in Proc. IEEE Int. Conf. Robot. Autom., Albuquerque, NM, 1997, pp. 229-234.
- (1997) Proc. IEEE Int. Conf. Robot. Autom. , pp. 229-234
- Boone, G.¹

14
- 13644265156
- Reinforcement learning-based output feedback control of nonlinear systems with input constraints
- Feb
- P. He and S. Jagannathan, "Reinforcement learning-based output feedback control of nonlinear systems with input constraints," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 35, no. 1, pp. 150-154, Feb. 2005.
- (2005) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.35 , Issue.1 , pp. 150-154
- He, P.¹ Jagannathan, S.²

15
- 70349615619
- Direct heuristic dynamic programming for nonlinear tracking control with filtered tracking error
- Dec
- L. Yang, J. Si, K. S. Tsakalis, and A. A. Rodriguez, "Direct heuristic dynamic programming for nonlinear tracking control with filtered tracking error," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 39, no. 6, pp. 1617-1622, Dec. 2009.
- (2009) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.39 , Issue.6 , pp. 1617-1622
- Yang, L.¹ Si, J.² Tsakalis, K.S.³ Rodriguez, A.A.⁴

16
- 0003565783
- Belmont, MA: Athena Scientific
- D. P. Bertsekas, Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific, 2000.
- (2000) Dynamic Programming and Optimal Control
- Bertsekas, D.P.¹

17
- 84921399937
- New York: Wiley-IEEE Press
- J. Si, A. G. Barto, W. B. Powell, and D. Wunsch, Eds., Handbook of Learning and Approximate Dynamic Programming. New York: Wiley-IEEE Press, 2004.
- (2004) Handbook of Learning and Approximate Dynamic Programming
- Si, J.¹ Barto, A.G.² Powell, W.B.³ Wunsch, D.⁴

18
- 0029403793
- Stochastic choice of basis functions in adaptive function approximation and the functional-link net
- Nov
- B. Igelnik and Y. H. Pao, "Stochastic choice of basis functions in adaptive function approximation and the functional-link net," IEEE Trans. Neural Networks, vol. 6, no. 6, pp. 1320-1329, Nov. 1995.
- (1995) IEEE Trans. Neural Networks , vol.6 , Issue.6 , pp. 1320-1329
- Igelnik, B.¹ Pao, Y.H.²

19
- 0031236002
- Adaptive critic designs
- Sep
- D. Prokhorov and D. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997.
- (1997) IEEE Trans. Neural Netw. , vol.8 , Issue.5 , pp. 997-1007
- Prokhorov, D.¹ Wunsch, D.²

20
- 0004059199
- Cambridge, MA: MIT Press
- W. T. Miller, R. S. Sutton, and P. J. Werbos, Eds., Neural Networks for Control. Cambridge, MA: MIT Press, 1990.
- (1990) Neural Networks for Control
- Miller, W.T.¹ Sutton, R.S.² Werbos, P.J.³

21
- 0023169119
- Building and understanding adaptive systems: A statistical/numerical approach to factory automation and brain research
- Jan
- P. J. Werbos, "Building and understanding adaptive systems: A statistical/numerical approach to factory automation and brain research," IEEE Trans. Syst., Man, Cybern., vol. SMC-17, no. 1, pp. 7-20, Jan. 1987.
- (1987) IEEE Trans. Syst., Man, Cybern. , vol.SMC-17 , Issue.1 , pp. 7-20
- Werbos, P.J.¹

22
- 0002557583
- Advanced forecasting methods for global crisis warning and models of intelligence
- P. J. Werbos, "Advanced forecasting methods for global crisis warning and models of intelligence," Gen. Syst. Yearbook, vol. 22, pp. 25-38, 1977.
- (1977) Gen. Syst. Yearbook , vol.22 , pp. 25-38
- Werbos, P.J.¹

23
- 0031281590
- Learning through reinforcement and replicator dynamics
- Nov
- T. Borgers and R. Sarin, "Learning through reinforcement and replicator dynamics," J. Economic Theory, vol. 77, no. 1, pp. 1-17, Nov. 1997.
- (1997) J. Economic Theory , vol.77 , Issue.1 , pp. 1-17
- Borgers, T.¹ Sarin, R.²

24
- 49049091364
- Control of nonaffine nonlinear discrete-time systems using reinforcement learning-based linearly parameterized neural networks
- Aug
- Q. Yang, J. B. Vance, and S. Jagannathan, "Control of nonaffine nonlinear discrete-time systems using reinforcement learning-based linearly parameterized neural networks," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 994-1001, Aug. 2008.
- (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 994-1001
- Yang, Q.¹ Vance, J.B.² Jagannathan, S.³

25
- 0032785795
- Discrete-time CMAC control of a feedback linearizable nonlinear systems under a persistence of excitation
- Jan
- S. Jagannathan, "Discrete-time CMAC control of a feedback linearizable nonlinear systems under a persistence of excitation," IEEE Trans. Neural Netw., vol. 10, no. 1, pp. 128-137, Jan. 1999.
- (1999) IEEE Trans. Neural Netw. , vol.10 , Issue.1 , pp. 128-137
- Jagannathan, S.¹

26
- 34047119733
- Boca Raton, FL: CRC Press
- S. Jagannathan, Neural Network Control of Nonlinear Discrete-time Systems. Boca Raton, FL: CRC Press, 2006.
- (2006) Neural Network Control of Nonlinear Discrete-time Systems
- Jagannathan, S.¹

27
- 79960462685
- Online optimal control of nonlinear discrete-time systems using approximate dynamic programming
- T. Dierks and S. Jagannathan, "Online optimal control of nonlinear discrete-time systems using approximate dynamic programming," J. Control Theory Appl., vol. 9, no. 3, pp. 361-369, 2011.
- (2011) J. Control Theory Appl. , vol.9 , Issue.3 , pp. 361-369
- Dierks, T.¹ Jagannathan, S.²

28
- 0003427482
- New York: Wiley
- M. Krstic, I. Kanellakopoulos, and P. Kokotovic, Nonlinear and Adaptive Control Design. New York: Wiley, 1995.
- (1995) Nonlinear and Adaptive Control Design
- Krstic, M.¹ Kanellakopoulos, I.² Kokotovic, P.³

29
- 0036588686
- Adaptive dynamic programming
- May
- J. J. Murray, C. J. Cox, G. G. Lendaris, and R. Saeks, "Adaptive dynamic programming," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 32, no. 2, pp. 140-153, May 2002.
- (2002) IEEE Trans. Syst., Man, Cybern. C, Appl. Rev. , vol.32 , Issue.2 , pp. 140-153
- Murray, J.J.¹ Cox, C.J.² Lendaris, G.G.³ Saeks, R.⁴

30
- 0004255876
- 2nd ed. Reading, MA: Addison-Wesley
- K. J. Astrom and B. Wittenmark, Adaptive Control, 2nd ed. Reading, MA: Addison-Wesley, 1994.
- (1994) Adaptive Control
- Astrom, K.J.¹ Wittenmark, B.²

31
- 49049089962
- Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
- Aug
- A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
- (2008) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.38 , Issue.4 , pp. 943-949
- Al-Tamimi, A.¹ Lewis, F.L.² Abu-Khalaf, M.³

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.