2002, Pages 758-763

Residual-gradient-based neural reinforcement learning for the optimal control of an acrobot

Author keywords

Learning control; Reinforcement learning; Residual gradient; Under actuated robots

Indexed keywords

COMPUTER SIMULATION; CONVERGENCE OF NUMERICAL METHODS; INTELLIGENT ROBOTS; LEARNING ALGORITHMS; ROBOT LEARNING; TIME VARYING CONTROL SYSTEMS;

EID: 0036911781     Source Type: Conference Proceeding     Document Type: Conference Paper
Times cited: 16

References (14)
  • 3. L.C. Baird. Gradient descent for general reinforcement learning. In M. S. Kearns, S. A. Solla, and D. A. Cohn, editors, Advances in Neural Information Processing Systems 11, MIT Press, Cambridge, MA, 1999.
  • 4. M. Heger. The loss from imperfect value functions in expectation-based and minimax-based tasks. Machine Learning, 22, pp. 197-225, 1996.
  • 5. R. Sutton. Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8, MIT Press, pp. 1038-1044, 1996.
  • 11. M.W. Spong. The swing up control problem for the acrobot. IEEE Control Systems Magazine, 15(1), pp. 49-55, 1995.
  • 12. S. Singh, T. Jaakkola, M.L. Littman, and C. Szepesvari. Convergence results for single-step on-policy reinforcement-learning algorithms. Machine Learning, 38, pp. 287-308, 2000.
  • 13. R. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, MIT Press, pp. 1057-1063, 1999.
  • 14. J.N. Tsitsiklis and B. Van Roy. An analysis of temporal difference learning with function approximation. IEEE Transactions on Automatic Control, 42(5), pp. 674-690, 1997.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.