Volume 17, Issue 3, 2004, Pages 299-305

Reinforcement learning with via-point representation

Author keywords

Cart pole; Hierarchical reinforcement learning; Motor control; Robotics; Swing up; Via point

Indexed keywords

COMPUTER SIMULATION; HIERARCHICAL SYSTEMS; NEURAL NETWORKS; ROBOT APPLICATIONS;

EID: 1642352667     PISSN: 08936080     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.neunet.2003.11.004     Document Type: Article
Times cited: (31)

References (22)
  • 4
    • 85156231814
    • Temporal difference learning in continuous time and space
    • D.S. Touretzky, M.C. Mozer, M.E. Hasselmo (Eds.). Cambridge, MA: MIT Press
    • Doya K. Temporal difference learning in continuous time and space. Touretzky D.S., Mozer M.C., Hasselmo M.E. Advances in neural information processing systems. 8:1996;1073-1079 MIT Press, Cambridge, MA.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1073-1079
    • Doya, K.1
  • 5
    • 0000406101
    • Efficient nonlinear control with actor-tutor architecture
    • M.C. Mozer, M.I. Jordan (Eds.). Cambridge, MA: MIT Press
    • Doya K. Efficient nonlinear control with actor-tutor architecture. Mozer M.C., Jordan M.I. Advances in neural information processing systems. 9:1997;1012-1018 MIT Press, Cambridge, MA.
    • (1997) Advances in Neural Information Processing Systems , vol.9 , pp. 1012-1018
    • Doya, K.1
  • 6
    • 0033629916
    • Reinforcement learning in continuous time and space
    • Doya K. Reinforcement learning in continuous time and space. Neural Computation. 12:2000;243-269.
    • (2000) Neural Computation , vol.12 , pp. 243-269
    • Doya, K.1
  • 7
    • 0022417008
    • The coordination of arm movements: An experimentally confirmed mathematical model
    • Flash T., Hogan N. The coordination of arm movements: An experimentally confirmed mathematical model. Journal of Neuroscience. 5:1985;1688-1703.
    • (1985) Journal of Neuroscience , vol.5 , pp. 1688-1703
    • Flash, T.1    Hogan, N.2
  • 8
    • 0025600638
    • A stochastic reinforcement learning algorithm for learning real-valued functions
    • Gullapalli V. A stochastic reinforcement learning algorithm for learning real-valued functions. Neural Networks. 3:1990;671-692.
    • (1990) Neural Networks , vol.3 , pp. 671-692
    • Gullapalli, V.1
  • 9
    • 0032552114
    • Signal-dependent noise determines motor planning
    • Harris C.M., Wolpert D.M. Signal-dependent noise determines motor planning. Nature. 394:(20):1998;780-784.
    • (1998) Nature , vol.394 , Issue.20 , pp. 780-784
    • Harris, C.M.1    Wolpert, D.M.2
  • 10
    • 72749118903
    • Models of trajectory formation and temporal interaction of reach and grasp
    • Hoff B., Arbib M.A. Models of trajectory formation and temporal interaction of reach and grasp. Journal of Motor Behavior. 25:(3):1993;175-192.
    • (1993) Journal of Motor Behavior , vol.25 , Issue.3 , pp. 175-192
    • Hoff, B.1    Arbib, M.A.2
  • 12
    • 0003543129
    • Macro-actions in reinforcement learning: An empirical analysis
    • University of Massachusetts, Department of Computer Science.
    • McGovern, A., Sutton, R.S. (1998) Macro-actions in reinforcement learning: An empirical analysis. Technical Report 98-70, University of Massachusetts, Department of Computer Science.
    • (1998) Technical Report 98-70
    • McGovern, A.1    Sutton, R.S.2
  • 14
    • 0032191729
    • A tennis serve and upswing learning robot based on dynamic optimization theory
    • Miyamoto H., Kawato M. A tennis serve and upswing learning robot based on dynamic optimization theory. Neural Networks. 11:(7-8):1998;1331-1344.
    • (1998) Neural Networks , vol.11 , Issue.7-8 , pp. 1331-1344
    • Miyamoto, H.1    Kawato, M.2
  • 17
    • 0033151712
    • Is imitation learning the way to humanoid robots?
    • Schaal S. Is imitation learning the way to humanoid robots? Trends in Cognitive Sciences. 3:(6):1999;233-242.
    • (1999) Trends in Cognitive Sciences , vol.3 , Issue.6 , pp. 233-242
    • Schaal, S.1
  • 20
    • 0024314287
    • Formation and control of optimal trajectory in human multijoint arm movement - Minimum torque-change model
    • Uno Y., Kawato M., Suzuki R. Formation and control of optimal trajectory in human multijoint arm movement - minimum torque-change model. Biological Cybernetics. 61:1989;89-101.
    • (1989) Biological Cybernetics , vol.61 , pp. 89-101
    • Uno, Y.1    Kawato, M.2    Suzuki, R.3
  • 22
    • 0027884471
    • A neural network model for arm trajectory formation using forward and inverse dynamics models
    • Wada Y., Kawato M. A neural network model for arm trajectory formation using forward and inverse dynamics models. Neural Networks. 6:(7):1993;919-932.
    • (1993) Neural Networks , vol.6 , Issue.7 , pp. 919-932
    • Wada, Y.1    Kawato, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.