SCOPUS Information Retrieval Platform

Neural Networks

Volume 17, Issue 3, 2004, Pages 299-305

Reinforcement learning with via-point representation

(4) Miyamoto, Hiroyuki a,b; Morimoto, Jun c; Doya, Kenji c; Kawato, Mitsuo c

a JAPAN SCIENCE AND TECHNOLOGY AGENCY (Japan)

b KYUSHU INSTITUTE OF TECHNOLOGY (Japan)

c ADVANCED TELECOMMUNICATIONS RESEARCH INSTITUTE INTERNATIONAL (Japan)

Author keywords

Cart pole; Hierarchical reinforcement learning; Motor control; Robotics; Swing up; Via point

Indexed keywords

COMPUTER SIMULATION; HIERARCHICAL SYSTEMS; NEURAL NETWORKS; ROBOT APPLICATIONS;

MOTOR CONTROL; SPATIAL SCALES;

LEARNING SYSTEMS;

ANALYTICAL ERROR; ARCHITECTURE; ARTICLE; COMPUTER SIMULATION; LEARNING; MATHEMATICAL ANALYSIS; MOTOR CONTROL; PRIORITY JOURNAL; REINFORCEMENT; ROBOTICS;

COMPUTER SIMULATION; FEEDBACK, PSYCHOLOGICAL; HUMANS; MODELS, PSYCHOLOGICAL; PSYCHOMOTOR PERFORMANCE; REACTION TIME; REINFORCEMENT (PSYCHOLOGY);

EID: 1642352667 PISSN: 08936080 EISSN: None Source Type: Journal
DOI: 10.1016/j.neunet.2003.11.004 Document Type: Article

Times cited: (31)

References (22)

1
- 0030652809
- Learning tasks from a single demonstration
- Atkeson C.G., Schaal S. Learning tasks from a single demonstration. In IEEE International Conference on Robotics and Automation. 2:1997;1706-1712.
- (1997) In IEEE International Conference on Robotics and Automation , vol.2 , pp. 1706-1712
- Atkeson, C.G.¹ Schaal, S.²

2
- 0002130986
- Robot learning from demonstration
- Atkeson, C. G., & Schaal, S. (1997) Robot learning from demonstration. In International Conference on Machine Learning (ICML97).
- (1997) International Conference on Machine Learning (ICML97)
- Atkeson, C.G.¹ Schaal, S.²

3
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Barto A.G., Sutton R.S., Anderson C.W. Neuronlike adaptive elements that can solve difficult learning control problems. In IEEE Transactions on Systems, Man, and Cybernetics. 13:1983;834-846.
- (1983) In IEEE Transactions on Systems, Man, and Cybernetics , vol.13 , pp. 834-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

4
- 85156231814
- Temporal difference learning in continuous time and space
- D.S. Touretzky, M.C. Mozer, & M.E. Hasselmo. Cambridge, MA: MIT Press
- Doya K. Temporal difference learning in continuous time and space. Touretzky D.S., Mozer M.C., Hasselmo M.E. Advances in neural information processing systems. 8:1996;1073-1079 MIT Press, Cambridge, MA.
- (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1073-1079
- Doya, K.¹

5
- 0000406101
- Efficient nonlinear control with actor-tutor architecture
- M.C. Mozer, & M.I. Jordan. Cambridge, MA: MIT Press
- Doya K. Efficient nonlinear control with actor-tutor architecture. Mozer M.C., Jordan M.I. Advances in neural information processing systems. 9:1997;1012-1018 MIT Press, Cambridge, MA.
- (1997) Advances in Neural Information Processing Systems , vol.9 , pp. 1012-1018
- Doya, K.¹

6
- 0033629916
- Reinforcement learning in continuous time and space
- Doya K. Reinforcement learning in continuous time and space. Neural Computation. 12:2000;243-269.
- (2000) Neural Computation , vol.12 , pp. 243-269
- Doya, K.¹

7
- 0022417008
- The coordination of arm movements: An experimentally confirmed mathematical model
- Flash T., Hogan N. The coordination of arm movements: An experimentally confirmed mathematical model. Journal of Neuroscience. 5:1985;1688-1703.
- (1985) Journal of Neuroscience , vol.5 , pp. 1688-1703
- Flash, T.¹ Hogan, N.²

8
- 0025600638
- A stochastic reinforcement learning algorithm for learning real-valued functions
- Gullapalli V. A stochastic reinforcement learning algorithm for learning real-valued functions. Neural Networks. 3:1990;671-692.
- (1990) Neural Networks , vol.3 , pp. 671-692
- Gullapalli, V.¹

9
- 0032552114
- Signal-dependent noise determines motor planning
- Harris C.M., Wolpert D.M. Signal-dependent noise determines motor planning. Nature. 394:(20):1998;780-784.
- (1998) Nature , vol.394 , Issue.20 , pp. 780-784
- Harris, C.M.¹ Wolpert, D.M.²

10
- 72749118903
- Models of trajectory formation and temporal interaction of reach and grasp
- Hoff B., Arbib M.A. Models of trajectory formation and temporal interaction of reach and grasp. Journal of Motor Behavior. 25:(3):1993;175-192.
- (1993) Journal of Motor Behavior , vol.25 , Issue.3 , pp. 175-192
- Hoff, B.¹ Arbib, M.A.²

11
- 0001246127
- Optimization and learning in neural networks for formation and control of coordinated movement
- D. Meyer, & S. Kornblum. Cambridge, MA: MIT Press
- Kawato M. Optimization and learning in neural networks for formation and control of coordinated movement. Meyer D., Kornblum S. Attention and performance, XIV: synergies in experimental psychology, artificial intelligence, and cognitive neuroscience - A silver jubilee. 1992;821-849 MIT Press, Cambridge, MA.
- (1992) Attention and Performance, XIV: Synergies in Experimental Psychology, Artificial Intelligence, and Cognitive Neuroscience - A Silver Jubilee , pp. 821-849
- Kawato, M.¹

12
- 0003543129
- Macro-actions in reinforcement learning: An empirical analysis
- University of Massachusetts, Department of Computer Science.
- McGovern, A., Sutton, R.S. (1998) Macro-actions in reinforcement learning: An empirical analysis. Technical Report 98-70, University of Massachusetts, Department of Computer Science.
- (1998) Technical Report 98-70
- McGovern, A.¹ Sutton, R.S.²

13
- 12744263996
- Hierarchical optimal control of MDPs
- pp. 186-191
- McGovern A., Precup A.D., Ravindran B., Singh S., Sutton R.S. Hierarchical optimal control of MDPs. In Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems. 1998; pp. 186-191.
- (1998) In Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems
- McGovern, A.¹ Precup, A.D.² Ravindran, B.³ Singh, S.⁴ Sutton, R.S.⁵

14
- 0032191729
- A tennis serve and upswing learning robot based on dynamic optimization theory
- Miyamoto H., Kawato M. A tennis serve and upswing learning robot based on dynamic optimization theory. Neural Networks. 11:(7-8):1998;1331-1344.
- (1998) Neural Networks , vol.11 , Issue.7-8 , pp. 1331-1344
- Miyamoto, H.¹ Kawato, M.²

15
- 0030297195
- A Kendama learning robot based on dynamic optimization theory
- Miyamoto H., Schaal S., Gandolfo F., Gomi H., Koike Y., Osu R., Nakano E., Wada Y., Kawato M. A Kendama learning robot based on dynamic optimization theory. Neural Networks. 9:(8):1996;1281-1302.
- (1996) Neural Networks , vol.9 , Issue.8 , pp. 1281-1302
- Miyamoto, H.¹ Schaal, S.² Gandolfo, F.³ Gomi, H.⁴ Koike, Y.⁵ Osu, R.⁶ Nakano, E.⁷ Wada, Y.⁸ Kawato, M.⁹

16
- 0032312876
- Reinforcement learning of dynamic motor sequence: Learning to stand up
- Morimoto J., Doya K. Reinforcement learning of dynamic motor sequence: learning to stand up. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. 3:1998;1721-1726.
- (1998) In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems , vol.3 , pp. 1721-1726
- Morimoto, J.¹ Doya, K.²

17
- 0033151712
- Is imitation learning the way to humanoid robots?
- Schaal S. Is imitation learning the way to humanoid robots? Trends in Cognitive Sciences. 3:(6):1999;233-242.
- (1999) Trends in Cognitive Sciences , vol.3 , Issue.6 , pp. 233-242
- Schaal, S.¹

18
- 0004102479
- MIT Press, Cambridge, MA
- Sutton R.S., Barto A.G. Reinforcement learning, an introduction. A Bradford Book. 1998;MIT Press, Cambridge, MA.
- (1998) Reinforcement Learning, An Introduction. A Bradford Book
- Sutton, R.S.¹ Barto, A.G.²

19
- 0003814328
- PhD thesis. Massachusetts Institute of Technology.
- Todorov, E. V. (1998) Studies of goal directed movements. PhD thesis. Massachusetts Institute of Technology.
- (1998) Studies of Goal Directed Movements
- Todorov, E.V.¹

20
- 0024314287
- Formation and control of optimal trajectory in human multijoint arm movement - Minimum torque-change model
- Uno Y., Kawato M., Suzuki R. Formation and control of optimal trajectory in human multijoint arm movement - minimum torque-change model. Biological Cybernetics. 61:1989;89-101.
- (1989) Biological Cybernetics , vol.61 , pp. 89-101
- Uno, Y.¹ Kawato, M.² Suzuki, R.³

21
- 0011820281
- Minimum muscle-tension-change model which reproduces human arm movement
- in Japanese
- Uno Y., Suzuki R., Kawato M. Minimum muscle-tension-change model which reproduces human arm movement. Proceedings of the 4th Symposium on Biological and Physiological Engineering. 1989; in Japanese.
- (1989) Proceedings of the 4th Symposium on Biological and Physiological Engineering
- Uno, Y.¹ Suzuki, R.² Kawato, M.³

22
- 0027884471
- A neural network model for arm trajectory formation using forward and inverse dynamics models
- Wada Y., Kawato M. A neural network model for arm trajectory formation using forward and inverse dynamics models. Neural Networks. 6:(7):1993;919-932.
- (1993) Neural Networks , vol.6 , Issue.7 , pp. 919-932
- Wada, Y.¹ Kawato, M.²

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.