SCOPUS Information Retrieval Platform

Volume , Issue , 2002, Pages 758-763

Residual-gradient-based neural reinforcement learning for the optimal control of an acrobot

Author keywords

Learning control; Reinforcement learning; Residual gradient; Underactuated robots

Indexed keywords

COMPUTER SIMULATION; CONVERGENCE OF NUMERICAL METHODS; INTELLIGENT ROBOTS; LEARNING ALGORITHMS; ROBOT LEARNING; TIME VARYING CONTROL SYSTEMS;

ACROBOT; LEARNING CONTROL; REINFORCEMENT LEARNING; UNDERACTUATED ROBOTS;

OPTIMAL CONTROL SYSTEMS;

EID: 0036911781 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (16)

References (14)

1
- 0029679044
- Reinforcement learning: A survey
- L.P. Kaelbling, M.L. Littman, A.W. Moore. Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research, Vol. 4, pp. 237-285, 1996.
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.P.; Littman, M.L.; Moore, A.W.

2
- 85151728371
- Residual algorithms: Reinforcement learning with function approximation
- L.C. Baird. Residual algorithms: Reinforcement learning with function approximation. Proc. International Conf. on Machine Learning '95. 1995, Morgan Kaufmann, San Francisco, CA.
- Proc. International Conf. on Machine Learning '95. 1995, Morgan Kaufmann, San Francisco, CA
- Baird, L.C.

4
- 0029751418
- The loss from imperfect value functions in expectation-based and minimax-based tasks
- M. Heger. The Loss From Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks. Machine Learning, 22, pp. 197-225, 1996.
- (1996) Machine Learning , vol.22 , pp. 197-225
- Heger, M.

6
- 34249833101
- Q-learning
- C. Watkins, P. Dayan. Q-Learning, Machine Learning, Vol. 8, pp. 279-292, 1992.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.; Dayan, P.

7
- 84880694195
- Stable function approximation in dynamic programming
- G. Gordon. Stable function approximation in dynamic programming. Proc. of International Conference on Machine Learning (ML95). 1995, Morgan Kaufmann, San Francisco, CA.
- Proc. of International Conference on Machine Learning (ML95). 1995, Morgan Kaufmann, San Francisco, CA
- Gordon, G.

8
- 0034389611
- Gradient convergence in gradient methods
- D. P. Bertsekas and J. N. Tsitsiklis. Gradient convergence in gradient methods, SIAM J. on Optimization, Vol. 10, pp. 627-642, 2000
- (2000) SIAM J. on Optimization , vol.10 , pp. 627-642
- Bertsekas, D.P.; Tsitsiklis, J.N.

9
- 0025421590
- Nonlinear controllers for non-integrable systems: The acrobot example
- J. Hauser, R.M. Murray. Nonlinear controllers for non-integrable systems: the acrobot example. Proc. of American Control Conference, San Diego, USA, 1990, pp. 669-671.
- Proc. of American Control Conference, San Diego, USA, 1990 , pp. 669-671
- Hauser, J.; Murray, R.M.

10
- 0041123319
- Pseudolinearization of the acrobot using spline functions
- S. Bortoff, M.W. Spong. Pseudolinearization of the acrobot using spline functions. Proc. of the IEEE Conf. on Decision and Control, Tucson, Arizona, 1992, pp. 593-598.
- Proc. of the IEEE Conf. on Decision and Control, Tucson, Arizona, 1992 , pp. 593-598
- Bortoff, S.; Spong, M.W.

11
- 0029255284
- The swing up control problem for the acrobot
- M.W. Spong. The swing up control problem for the acrobot. IEEE Control Systems Magazine, 15(1), pp. 49-55, 1995.
- (1995) IEEE Control Systems Magazine , vol.15 , Issue.1 , pp. 49-55
- Spong, M.W.

12
- 0033901602
- Convergence results for single-step on-policy reinforcement-learning algorithms
- S. Singh, T. Jaakkola, M.L. Littman and C. Szepesvari. Convergence Results for Single-step On-policy Reinforcement-learning Algorithms. Machine Learning, Vol. 38, pp. 287-308, 2000.
- (2000) Machine Learning , vol.38 , pp. 287-308
- Singh, S.; Jaakkola, T.; Littman, M.L.; Szepesvari, C.

14
- 0031143730
- An analysis of temporal difference learning with function approximation
- J.N. Tsitsiklis and B. Van Roy. An analysis of Temporal Difference Learning with Function Approximation. IEEE Transactions on Automatic Control, 42(5), pp. 674-690, 1997.
- (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.5 , pp. 674-690
- Tsitsiklis, J.N.; Van Roy, B.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.