1. Andrieu, C., de Freitas, N., Doucet, A., & Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1), 5-43. doi:10.1023/A:1020281327116
2. Atkeson, C. G. (1994). Using local trajectory optimizers to speed up global optimization in dynamic programming. In Advances in neural information processing systems (Vol. 6, pp. 503-521), Denver, CO, USA.
5. Bagnell, J., Kakade, S., Ng, A., & Schneider, J. (2004). Policy search by dynamic programming. In Advances in neural information processing systems (Vol. 16), Vancouver, BC, CA.
6. Binder, J., Koller, D., Russell, S., & Kanazawa, K. (1997). Adaptive probabilistic networks with hidden variables. Machine Learning, 29(2-3), 213-244. doi:10.1023/A:1007421730016
7. Chiappa, S., Kober, J., & Peters, J. (2009). Using Bayesian dynamical systems for motion template libraries. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in neural information processing systems (Vol. 21, pp. 297-304).
8. DARPA (2010a). Learning locomotion (L2). http://www.darpa.mil/ipto/programs/ll/ll.asp.
11. Dayan, P., & Hinton, G. E. (1997). Using expectation-maximization for reinforcement learning. Neural Computation, 9(2), 271-278. doi:10.1162/neco.1997.9.2.271
13. El-Fakdi, A., Carreras, M., & Ridao, P. (2006). Towards direct policy search reinforcement learning for robot control. In Proceedings of the IEEE/RSJ 2006 international conference on intelligent robots and systems (IROS), Beijing, China.
15. Guenter, F., Hersch, M., Calinon, S., & Billard, A. (2007). Reinforcement learning for imitating constrained reaching movements. Advanced Robotics, Special Issue on Imitative Robots, 21(13), 1521-1544.
17. Hoffman, M., Doucet, A., de Freitas, N., & Jasra, A. (2007). Bayesian policy learning with trans-dimensional MCMC. In Advances in neural information processing systems (Vol. 20), Vancouver, BC, CA.
18. Ijspeert, A. J., Nakanishi, J., & Schaal, S. (2002). Movement imitation with nonlinear dynamical systems in humanoid robots. In Proceedings of the IEEE international conference on robotics and automation (ICRA) (Vol. 2, pp. 1398-1403), Washington, DC.
19. Ijspeert, A. J., Nakanishi, J., & Schaal, S. (2003). Learning attractor landscapes for learning motor primitives. In Advances in neural information processing systems (Vol. 15, pp. 1547-1554), Vancouver, BC, CA.
20. Jaakkola, T., Jordan, M. I., & Singh, S. P. (1994). Convergence of stochastic iterative dynamic programming algorithms. In J. D. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in neural information processing systems (Vol. 6, pp. 703-710). San Mateo: Morgan Kaufmann.
23. Kober, J., & Peters, J. (2009b). Policy search for motor primitives in robotics. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in neural information processing systems (Vol. 21, pp. 849-856).
24. Kober, J., Mohler, B., & Peters, J. (2008). Learning perceptual coupling for motor primitives. In Proceedings of the IEEE/RSJ 2008 international conference on intelligent robots and systems (IROS) (pp. 834-839), Nice, France. doi:10.1109/IROS.2008.4650953
26. Kwee, I., Hutter, M., & Schmidhuber, J. (2001). Gradient-based reinforcement planning in policy-search methods. In M. A. Wiering (Ed.), Cognitieve Kunstmatige Intelligentie: Vol. 27. Proceedings of the 5th European workshop on reinforcement learning (EWRL) (pp. 27-29), Lugano. Manno: Onderwijsinsituut CKI, Utrecht University.
27. Lawrence, G., Cowan, N., & Russell, S. (2003). Efficient gradient estimation for motor control learning. In Proceedings of the international conference on uncertainty in artificial intelligence (UAI) (pp. 354-361), Acapulco, Mexico.
30. Miyamoto, H., Schaal, S., Gandolfo, F., Gomi, H., Koike, Y., Osu, R., Nakano, E., Wada, Y., & Kawato, M. (1996). A Kendama learning robot based on bi-directional theory. Neural Networks, 9(8), 1281-1302. doi:10.1016/S0893-6080(96)00043-3
32. Ng, A. Y., Kim, H. J., Jordan, M. I., & Sastry, S. (2004). Inverted autonomous helicopter flight via reinforcement learning. In Proceedings of the international symposium on experimental robotics (ISER). Cambridge: MIT Press.
33. Park, D. H., Hoffmann, H., Pastor, P., & Schaal, S. (2008). Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields. In IEEE international conference on humanoid robots (HUMANOIDS) (pp. 91-98).
34. PASCAL2 (2010). Challenges. http://pascallin2.ecs.soton.ac.uk/Challenges/.
36. Peters, J. (2007). Machine learning of motor skills for robotics. PhD thesis, University of Southern California, Los Angeles, CA, USA.
37. Peters, J., & Schaal, S. (2006). Policy gradient methods for robotics. In Proceedings of the IEEE/RSJ 2006 international conference on intelligent robots and systems (IROS) (pp. 2219-2225), Beijing, China. doi:10.1109/IROS.2006.282564
38. Peters, J., & Schaal, S. (2007). Reinforcement learning by reward-weighted regression for operational space control. In Proceedings of the international conference on machine learning (ICML), Corvallis, OR, USA.
39. Peters, J., Vijayakumar, S., & Schaal, S. (2003). Reinforcement learning for humanoid robotics. In Proceedings of the IEEE-RAS international conference on humanoid robots (HUMANOIDS) (pp. 103-123), Karlsruhe, Germany.
40. Peters, J., Vijayakumar, S., & Schaal, S. (2005). Natural actor-critic. In Proceedings of the European conference on machine learning (ECML) (pp. 280-291), Porto, Portugal.
41. Rückstieß, T., Felder, M., & Schmidhuber, J. (2008). State-dependent exploration for policy gradient methods. In Proceedings of the European conference on machine learning (ECML) (pp. 234-249), Antwerp, Belgium.
43. Schaal, S., Atkeson, C. G., & Vijayakumar, S. (2002). Scalable techniques from nonparametric statistics for real-time robot learning. Applied Intelligence, 17(1), 49-60. doi:10.1023/A:1015727715131
44. Schaal, S., Peters, J., Nakanishi, J., & Ijspeert, A. J. (2003). Control, planning, learning, and imitation with dynamic movement primitives. In Proceedings of the workshop on bilateral paradigms on humans and humanoids, IEEE international conference on intelligent robots and systems (IROS), Las Vegas, NV, October 27-31, 2003.
45. Schaal, S., Mohajerian, P., & Ijspeert, A. J. (2007). Dynamics systems vs. optimal control - a unifying view. Progress in Brain Research, 165, 425-445. doi:10.1016/S0079-6123(06)65027-9
51. Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the international machine learning conference (pp. 9-44).
52. Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems (NIPS) (Vol. 13, pp. 1057-1063), Denver, CO, USA.
55. Tedrake, R., Zhang, T. W., & Seung, H. S. (2004). Stochastic policy gradient reinforcement learning on a simple 3D biped. In Proceedings of the 2004 IEEE/RSJ international conference on intelligent robots and systems (IROS) (Vol. 3, pp. 2849-2854).
57. Toussaint, M., & Goerick, C. (2007). Probabilistic inference for structured planning in robotics. In Proceedings of the IEEE/RSJ 2007 international conference on intelligent robots and systems (IROS), San Diego, CA, USA.
59. Vlassis, N., Toussaint, M., Kontes, G., & Piperidis, S. (2009). Learning model-free robot control by a Monte Carlo EM algorithm. Autonomous Robots, 27(2), 123-130. doi:10.1007/s10514-009-9132-0
60. Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229-256.