



IEEE Transactions on Neural Networks and Learning Systems, Volume 24, Issue 5, 2013, Pages 762-775

Online learning control using adaptive critic designs with sparse kernel machines

Author keywords

Adaptive critic designs; Approximate dynamic programming; Kernel machines; Learning control; Markov decision processes; Reinforcement learning

Indexed keywords

ADAPTIVE CRITIC DESIGNS; APPROXIMATE DYNAMIC PROGRAMMING; KERNEL MACHINE; LEARNING CONTROL; MARKOV DECISION PROCESSES;

EID: 84884922436     PISSN: 2162-237X     EISSN: 2162-2388     Source Type: Journal
DOI: 10.1109/TNNLS.2012.2236354     Document Type: Article
Times cited: 117

References (48)
  • 2
    • F. Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
  • 5
    • P. J. Werbos, "Intelligence in the brain: A theory of how it works and how to build it," Neural Netw., vol. 22, no. 3, pp. 200-212, Apr. 2009.
  • 8
    • D. Liu, Y. Zhang, and H. Zhang, "A self-learning call admission control scheme for CDMA cellular networks," IEEE Trans. Neural Netw., vol. 16, no. 5, pp. 1219-1228, Sep. 2005. DOI: 10.1109/TNN.2005.853408
  • 9
    • R. H. Crites and A. G. Barto, "Elevator group control using multiple reinforcement learning agents," Mach. Learn., vol. 33, nos. 2-3, pp. 235-262, Nov. 1998.
  • 10
    • G. Tesauro, "TD-Gammon, a self-teaching backgammon program, achieves master-level play," Neural Comput., vol. 6, no. 2, pp. 215-219, Mar. 1994.
  • 11
    • P. Shih, B. C. Kaul, S. Jagannathan, and J. A. Drallmeier, "Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1369-1388, Aug. 2008.
  • 15
    • J. Baxter and P. L. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, no. 1, pp. 319-350, Jul. 2001.
  • 17
    • D. V. Prokhorov and D. C. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Jul. 1997.
  • 18
    • A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuron-like adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. 13, no. 5, pp. 834-846, Sep.-Oct. 1983.
  • 19
    • G. K. Venayagamoorthy, R. G. Harley, and D. C. Wunsch, "Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator," IEEE Trans. Neural Netw., vol. 13, no. 3, pp. 764-773, May 2002.
  • 20
    • F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32-50, Aug. 2009.
  • 21
    • J. Peters and S. Schaal, "Natural actor-critic," Neurocomputing, vol. 71, nos. 7-9, pp. 1180-1190, Mar. 2008.
  • 23
    • S. N. Balakrishnan and V. Biega, "Adaptive-critic-based neural networks for aircraft optimal control," J. Guid., Control, Dynamics, vol. 19, no. 4, pp. 893-898, 1996.
  • 24
    • R. Enns and J. Si, "Helicopter trimming and tracking control using direct neural dynamic programming," IEEE Trans. Neural Netw., vol. 14, no. 4, pp. 929-939, Jul. 2003.
  • 25
    • C. Lu, J. Si, and X. Xie, "Direct heuristic dynamic programming for damping oscillations in a large power system," IEEE Trans. Syst., Man, Cybern., Part B, Cybern., vol. 38, no. 4, pp. 1008-1013, Aug. 2008.
  • 26
    • P. Shih, B. C. Kaul, S. Jagannathan, and J. A. Drallmeier, "Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1369-1388, Aug. 2008.
  • 27
    • G. D. Magoulas, M. N. Vrahatis, and G. S. Androulakis, "Effective backpropagation training with variable stepsize," Neural Netw., vol. 10, no. 1, pp. 69-82, Jan. 1997.
  • 28
    • S. Bhasin, N. Sharma, P. Patre, and W. E. Dixon, "Asymptotic tracking by a reinforcement learning-based adaptive critic controller," J. Control Theory Appl., vol. 9, no. 3, pp. 400-409, 2011.
  • 29
    • K. G. Vamvoudakis and F. L. Lewis, "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem," Automatica, vol. 46, no. 5, pp. 878-888, May 2010.
  • 30
    • H. Zhang, L. Cui, X. Zhang, and Y. Luo, "Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2226-2236, Dec. 2011.
  • 34
    • F. R. Bach and M. I. Jordan, "Kernel independent component analysis," J. Mach. Learn. Res., vol. 3, pp. 1-48, Jul. 2002.
  • 35
    • T. Hofmann, B. Schölkopf, and A. J. Smola, "Kernel methods in machine learning," Ann. Statist., vol. 36, no. 3, pp. 1171-1220, 2008.
  • 36
    • D. Ormoneit and S. Sen, "Kernel-based reinforcement learning," Mach. Learn., vol. 49, nos. 2-3, pp. 161-178, 2002.
  • 37
    • Y. Engel, S. Mannor, and R. Meir, "Bayes meets Bellman: The Gaussian process approach to temporal difference learning," in Proc. Int. Conf. Mach. Learn., 2003, pp. 154-161.
  • 38
    • T. G. Dietterich and X. Wang, "Batch value function approximation via support vectors," in Advances in Neural Information Processing Systems 14, Cambridge, MA: MIT Press, 2002, pp. 1491-1498.
  • 39
    • C. E. Rasmussen and M. Kuss, "Gaussian processes in reinforcement learning," in Advances in Neural Information Processing Systems 16, S. Thrun, L. K. Saul, and B. Schölkopf, Eds., Cambridge, MA: MIT Press, 2004, pp. 751-759.
  • 40
    • M. G. Lagoudakis and R. Parr, "Least-squares policy iteration," J. Mach. Learn. Res., vol. 4, pp. 1107-1149, Dec. 2003.
  • 41
    • X. Xu, D. Hu, and X. Lu, "Kernel-based least-squares policy iteration for reinforcement learning," IEEE Trans. Neural Netw., vol. 18, no. 4, pp. 973-992, Jul. 2007. DOI: 10.1109/TNN.2007.899161
  • 42
    • Y. Engel, S. Mannor, and R. Meir, "The kernel recursive least-squares algorithm," IEEE Trans. Signal Process., vol. 52, no. 8, pp. 2275-2285, Aug. 2004.
  • 43
    • X. Xu, H. G. He, and D. W. Hu, "Efficient reinforcement learning using recursive least-squares methods," J. Artif. Intell. Res., vol. 16, pp. 259-292, Jun. 2002.
  • 44
    • X. Xu, T. Xie, D. Hu, and X. Lu, "Kernel least-squares temporal difference learning," Int. J. Inf. Technol., vol. 11, no. 9, pp. 54-63, 2005.
  • 45
    • J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997.
  • 46
    • A. Nedic and D. P. Bertsekas, "Least squares policy evaluation algorithms with linear function approximation," Discrete Event Dyn. Syst., vol. 13, nos. 1-2, pp. 79-110, Jan.-Apr. 2003.
  • 47
    • T. Dierks and S. Jagannathan, "Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 7, pp. 1118-1129, Jul. 2012.
  • 48
    • S. Zhong, X. Zeng, S. Wu, and L. Han, "Sensitivity-based adaptive learning rules for binary feedforward neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 3, pp. 480-491, Mar. 2012.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.