SCOPUS Information Retrieval Platform

Paladyn

Volume 4, Issue 1, 2013, Pages 49-61

Robot Skill Learning: From Reinforcement Learning to Evolution Strategies

(2) Stulp, Freek a,b Sigaud, Olivier c

a ENSTA PARISTECH (France)

b INRIA (France)

c UNIVERSITÉ PIERRE ET MARIE CURIE (France)

Author keywords

black box optimization; dynamic movement primitives; evolution strategies; reinforcement learning

Indexed keywords

EVOLUTIONARY ALGORITHMS; LEARNING ALGORITHMS; OPTIMIZATION; ROBOTS;

'CURRENT; BLACK-BOX OPTIMIZATION; DYNAMIC MOVEMENT PRIMITIVES; EVOLUTION STRATEGIES; IMPROVEMENT METHODS; REINFORCEMENT LEARNING ALGORITHMS; REINFORCEMENT LEARNINGS; ROBOT SKILLS; SKILL LEARNING; UTILITY FUNCTIONS;

REINFORCEMENT LEARNING;

EID: 84899573498 PISSN: None EISSN: 20814836 Source Type: Journal
DOI: 10.2478/pjbr-2013-0003 Document Type: Article

Times cited : (132)

References (41)

1
- 83555179019
- Technical report INRIA Saclay
- L. Arnold, A. Auger, N. Hansen, and Y. Ollivier. Information-geometric optimization algorithms: A unifying picture via invariance principles. Technical report, INRIA Saclay, 2011.
- (2011) Information-geometric Optimization Algorithms: A Unifying Picture Via Invariance Principles
- Arnold, L.¹ Auger, A.² Hansen, N.³ Ollivier, Y.⁴

2
- 0037288370
- Recent advances in hierarchical reinforcement learning
- A. Barto and S. Mahadevan. Recent advances in hierarchical reinforcement learning. Discrete event systems, 13(1-2):41-77, 2003.
- (2003) Discrete Event Systems , vol.13 , Issue.1-2 , pp. 41-77
- Barto, A.¹ Mahadevan, S.²

3
- 0037592480
- Evolution strategies-a comprehensive introduction
- Hans-Georg Beyer and Hans-Paul Schwefel. Evolution strategies-a comprehensive introduction. Natural Computing, 1(1):3-52, 2002.
- (2002) Natural Computing , vol.1 , Issue.1 , pp. 3-52
- Beyer, H.-G.¹ Schwefel, H.-P.²

4
- 79551686776
- Cross-entropy optimization of control policies with adaptive basis functions
- L. Busoniu, D. Ernst, B. De Schutter, and R. Babuska. Cross-entropy optimization of control policies with adaptive basis functions. IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, 41(1):196-209, 2011.
- (2011) IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics , vol.41 , Issue.1 , pp. 196-209
- Busoniu, L.¹ Ernst, D.² De Schutter, B.³ Babuska, R.⁴

5
- 44649193889
- Accelerated neural evolution through cooperatively coevolved synapses
- F. Gomez, J. Schmidhuber, and R. Miikkulainen. Accelerated neural evolution through cooperatively coevolved synapses. Journal of Machine Learning Research, 9:937-965, 2008.
- (2008) Journal of Machine Learning Research , vol.9 , pp. 937-965
- Gomez, F.¹ Schmidhuber, J.² Miikkulainen, R.³

6
- 0035377566
- Completely derandomized self-adaptation in evolution strategies
- N. Hansen and A. Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159-195, 2001.
- (2001) Evolutionary Computation , vol.9 , Issue.2 , pp. 159-195
- Hansen, N.¹ Ostermeier, A.²

7
- 34547475891
- June
- Nikolaus Hansen. The CMA evolution strategy: A tutorial, June 2011. http://www.lri.fr/hansen/cmatutorial.pdf.
- (2011) The CMA Evolution Strategy: A Tutorial
- Hansen, N.¹

8
- 56449106904
- Evolution strategies for direct policy search
- Berlin, Heidelberg. Springer-Verlag
- Verena Heidrich-Meisner and Christian Igel. Evolution strategies for direct policy search. In Proceedings of the 10th international conference on Parallel Problem Solving from Nature: PPSN X, pages 428-437, Berlin, Heidelberg, 2008. Springer-Verlag. ISBN 978-3-540-87699-1.
- (2008) Proceedings of the 10th international conference on Parallel Problem Solving from Nature: PPSN X , pp. 428-437
- Heidrich-Meisner, V.¹ Igel, C.²

9
- 84886993021
- Similarities and differences between policy gradient methods and evolution strategies
- Verena Heidrich-Meisner and Christian Igel. Similarities and differences between policy gradient methods and evolution strategies. In ESANN 2008, 16th European Symposium on Artificial Neural Networks, Bruges, Belgium, April 23-25, 2008, Proceedings, pages 149-154, 2008.
- (2008) ESANN 2008, 16th European Symposium on Artificial Neural Networks, Bruges, Belgium, April 23-25, 2008, Proceedings , pp. 149-154
- Heidrich-Meisner, V.¹ Igel, C.²

10
- 84875592161
- Dynamical Movement Primitives: Learning attractor models for motor behaviors
- A. Ijspeert, J. Nakanishi, P. Pastor, H. Hoffmann, and S. Schaal. Dynamical Movement Primitives: Learning attractor models for motor behaviors. Neural Computation, 25(2):328-373, 2013.
- (2013) Neural Computation , vol.25 , Issue.2 , pp. 328-373
- Ijspeert, A.¹ Nakanishi, J.² Pastor, P.³ Hoffmann, H.⁴ Schaal, S.⁵

11
- 0036059542
- Movement imitation with nonlinear dynamical systems in humanoid robots
- A. J. Ijspeert, J. Nakanishi, and S. Schaal. Movement imitation with nonlinear dynamical systems in humanoid robots. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2002.
- (2002) Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)
- Ijspeert, A.J.¹ Nakanishi, J.² Schaal, S.³

12
- 79958852534
- Characterizing reinforcement learning methods through parameterized learning problems
- Shivaram Kalyanakrishnan and Peter Stone. Characterizing reinforcement learning methods through parameterized learning problems. Machine Learning, 84(1-2):205-247, 2011.
- (2011) Machine Learning , vol.84 , Issue.1-2 , pp. 205-247
- Kalyanakrishnan, S.¹ Stone, P.²

13
- 29044440299
- Path integrals and symmetry breaking for optimal control theory
- 2005
- H.J. Kappen. Path integrals and symmetry breaking for optimal control theory. Journal of Statistical Mechanics: Theory and Experiment, 2005(11):P11011, 2005.
- (2005) Journal of Statistical Mechanics: Theory and Experiment , vol.11 , pp. P11011
- Kappen, H.J.¹

14
- 80053623760
- Learning stable non-linear dynamical systems with gaussian mixture models
- S. Mohammad Khansari-Zadeh and Aude Billard. Learning stable non-linear dynamical systems with gaussian mixture models. IEEE Transactions on Robotics, 2011.
- (2011) IEEE Transactions on Robotics
- Khansari-Zadeh, S.M.¹ Billard, A.²

15
- 78651495944
- Reinforcement learning to adjust robot movements to new situations
- June
- J. Kober, E. Oztop, and J. Peters. Reinforcement learning to adjust robot movements to new situations. In Proceedings of Robotics: Science and Systems, Zaragoza, Spain, June 2010.
- (2010) Proceedings of Robotics: Science and Systems, Zaragoza, Spain
- Kober, J.¹ Oztop, E.² Peters, J.³

16
- 78049390740
- Policy search for motor primitives in robotics
- J. Kober and J. Peters. Policy search for motor primitives in robotics. Machine Learning, 84:171-203, 2011.
- (2011) Machine Learning , vol.84 , pp. 171-203
- Kober, J.¹ Peters, J.²

17
- 84885895576
- Towards fast and adaptive optimal control policies for robots: A direct policy search approach
- Guimaraes, Portugal
- D. Marin and O. Sigaud. Towards fast and adaptive optimal control policies for robots: A direct policy search approach. In Proceedings Robotica, pages 21-26, Guimaraes, Portugal, 2012.
- (2012) Proceedings Robotica , pp. 21-26
- Marin, D.¹ Sigaud, O.²

18
- 84864436640
- Closed-loop primitives: A method to generate and recognize reaching actions from demonstration
- Mustafa Parlaktuna, Doruk Tunaoglu, Erol Sahin, and Emre Ugur. Closed-loop primitives: A method to generate and recognize reaching actions from demonstration. In International Conference on Robotics and Automation, pages 2015-2020, 2012.
- (2012) International Conference on Robotics and Automation , pp. 2015-2020
- Parlaktuna, M.¹ Tunaoglu, D.² Sahin, E.³ Ugur, E.⁴

19
- 84886998125
- Applying the episodic natural actor-critic architecture to motor primitive learning
- J. Peters and S. Schaal. Applying the episodic natural actor-critic architecture to motor primitive learning. In Proceedings of the 15th European Symposium on Artificial Neural Networks (ESANN 2007), pages 1-6, 2007.
- (2007) Proceedings of the 15th European Symposium on Artificial Neural Networks (ESANN 2007) , pp. 1-6
- Peters, J.¹ Schaal, S.²

20
- 40649106649
- Natural actor-critic
- Jan Peters and Stefan Schaal. Natural actor-critic. Neurocomputing, 71(7-9):1180-1190, 2008.
- (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
- Peters, J.¹ Schaal, S.²

21
- 44949241322
- Reinforcement learning of motor skills with policy gradients
- May. ISSN 0893-6080
- Jan Peters and Stefan Schaal. Reinforcement learning of motor skills with policy gradients. Neural networks: the official journal of the International Neural Network Society, 21(4):682-697, May 2008. ISSN 0893-6080.
- (2008) Neural Networks: The Official Journal of the International Neural Network Society , vol.21 , Issue.4 , pp. 682-697
- Peters, J.¹ Schaal, S.²

22
- 47349092417
- Wiley-Blackwell
- W. B. Powell. Approximate Dynamic Programming: Solving the curses of dimensionality, volume 703. Wiley-Blackwell, 2007.
- (2007) Approximate Dynamic Programming: Solving the curses of dimensionality, volume 703
- Powell, W.B.¹

23
- 34548763245
- Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark
- IEEE, April
- Martin Riedmiller, Jan Peters, and Stefan Schaal. Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark. In 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages 254-261. IEEE, April 2007. ISBN 1-4244-0706-0. URL
- (2007) 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning , pp. 254-261
- Riedmiller, M.¹ Peters, J.² Schaal, S.³

24
- 67650368177
- State-dependent exploration for policy gradient methods
- T. Rückstiess, M. Felder, and J. Schmidhuber. State-dependent exploration for policy gradient methods. In 19th European Conference on Machine Learning (ECML), 2010.
- (2010) 19th European Conference on Machine Learning (ECML)
- Rückstiess, T.¹ Felder, M.² Schmidhuber, J.³

25
- 85141643084
- Exploring parameter space in reinforcement learning. Paladyn
- ISSN 2080-9778
- Thomas Rückstiess, Frank Sehnke, Tom Schaul, Daan Wierstra, Yi Sun, and Jürgen Schmidhuber. Exploring parameter space in reinforcement learning. Paladyn. Journal of Behavioral Robotics, 1:14-24, 2010. ISSN 2080-9778.
- (2010) Journal of Behavioral Robotics , vol.1 , pp. 14-24
- Rückstiess, T.¹ Sehnke, F.² Schaul, T.³ Wierstra, D.⁴ Sun, Y.⁵ Schmidhuber, J.⁶

26
- 0031231885
- Experiments with reinforcement learning in problems with continuous state and action spaces
- J.C. Santamaría, R.S. Sutton, and A. Ram. Experiments with reinforcement learning in problems with continuous state and action spaces. Adaptive behavior, 6(2):163-217, 1997.
- (1997) Adaptive Behavior , vol.6 , Issue.2 , pp. 163-217
- Santamaría, J.C.¹ Sutton, R.S.² Ram, A.³

27
- 0003904118
- PhD thesis, TU Berlin
- H.-P. Schwefel. Evolutionsstrategie und numerische Optimierung. PhD thesis, TU Berlin, 1975.
- (1975) Evolutionsstrategie und Numerische Optimierung
- Schwefel, H.-P.¹

28
- 77950297907
- Parameter-exploring policy gradients
- Frank Sehnke, Christian Osendorfer, Thomas Rückstieß, Alex Graves, Jan Peters, and Jürgen Schmidhuber. Parameter-exploring policy gradients. Neural Networks, 23(4):551-559, 2010.
- (2010) Neural Networks , vol.23 , Issue.4 , pp. 551-559
- Sehnke, F.¹ Osendorfer, C.² Rückstieß, T.³ Graves, A.⁴ Peters, J.⁵ Schmidhuber, J.⁶

29
- 74049165047
- From motor learning to interaction learning in robots
- Springer-Verlag
- O. Sigaud and J. Peters. From motor learning to interaction learning in robots. In From Motor Learning to Interaction Learning in Robots, volume 264, pages 1-12. Springer-Verlag, 2010.
- (2010) From Motor Learning to Interaction Learning in Robots , vol.264 , pp. 1-12
- Sigaud, O.¹ Peters, J.²

30
- 84867115622
- Learning parameterized skills
- In John Langford and Joelle Pineau, editors, New York, NY, USA, July. Omnipress
- Bruno Da Silva, George Konidaris, and Andrew Barto. Learning parameterized skills. In John Langford and Joelle Pineau, editors, Proceedings of the 29th International Conference on Machine Learning (ICML-12), ICML '12, pages 1679-1686, New York, NY, USA, July 2012. Omnipress. ISBN 978-1-4503-1285-1.
- (2012) Proceedings of the 29th International Conference on Machine Learning (ICML-12), ICML '12 , pp. 1679-1686
- Da Silva, B.¹ Konidaris, G.² Barto, A.³

31
- 84867129779
- Path integral policy improvement with covariance matrix adaptation
- Freek Stulp and Olivier Sigaud. Path integral policy improvement with covariance matrix adaptation. In Proceedings of the 29th International Conference on Machine Learning (ICML), 2012.
- (2012) Proceedings of the 29th International Conference on Machine Learning (ICML)
- Stulp, F.¹ Sigaud, O.²

32
- 84455172101
- Learning motion primitive goals for robust manipulation
- Freek Stulp, Evangelos Theodorou, Mrinal Kalakrishnan, Peter Pastor, Ludovic Righetti, and Stefan Schaal. Learning motion primitive goals for robust manipulation. In International Conference on Intelligent Robots and Systems (IROS), 2011.
- (2011) International Conference on Intelligent Robots and Systems (IROS)
- Stulp, F.¹ Theodorou, E.² Kalakrishnan, M.³ Pastor, P.⁴ Righetti, L.⁵ Schaal, S.⁶

33
- 84870935597
- Reinforcement learning with sequences of motion primitives for robust manipulation
- King-Sun Fu Best Paper Award of the IEEE Transactions on Robotics for the year 2012
- Freek Stulp, Evangelos Theodorou, and Stefan Schaal. Reinforcement learning with sequences of motion primitives for robust manipulation. IEEE Transactions on Robotics, 28(6):1360-1370, 2012. King-Sun Fu Best Paper Award of the IEEE Transactions on Robotics for the year 2012.
- (2012) IEEE Transactions on Robotics , vol.28 , Issue.6 , pp. 1360-1370
- Stulp, F.¹ Theodorou, E.² Schaal, S.³

34
- 0004102479
- MIT Press
- R. Sutton and A. Barto. Reinforcement Learning: an Introduction. MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.²

35
- 80052851862
- Learning to pour with a robot arm combining goal and shape learning for dynamic movement primitives
- Minija Tamosiunaite, Bojan Nemec, Ales Ude, and Florentin Wörgötter. Learning to pour with a robot arm combining goal and shape learning for dynamic movement primitives. Robotics and Autonomous Systems, 59(11):910-922, 2011.
- (2011) Robotics and Autonomous Systems , vol.59 , Issue.11 , pp. 910-922
- Tamosiunaite, M.¹ Nemec, B.² Ude, A.³ Wörgötter, F.⁴

36
- 79551503171
- A generalized path integral control approach to reinforcement learning
- Evangelos Theodorou, Jonas Buchli, and Stefan Schaal. A generalized path integral control approach to reinforcement learning. Journal of Machine Learning Research, 11:3137-3181, 2010.
- (2010) Journal of Machine Learning Research , vol.11 , pp. 3137-3181
- Theodorou, E.¹ Buchli, J.² Schaal, S.³

37
- 79958789196
- Ontogenetic and phylogenetic reinforcement learning
- Julian Togelius, Tom Schaul, Daan Wierstra, Christian Igel, Faustino Gomez, and Jürgen Schmidhuber. Ontogenetic and phylogenetic reinforcement learning. Zeitschrift Künstliche Intelligenz-Special Issue on Reinforcement Learning, pages 30-33, 2009.
- (2009) Zeitschrift Künstliche Intelligenz-Special Issue on Reinforcement Learning , pp. 30-33
- Togelius, J.¹ Schaul, T.² Wierstra, D.³ Igel, C.⁴ Gomez, F.⁵ Schmidhuber, J.⁶

38
- 77957706006
- Task-specific generalization of discrete and periodic dynamic movement primitives
- Ales Ude, Andrej Gams, Tamim Asfour, and Jun Morimoto. Task-specific generalization of discrete and periodic dynamic movement primitives. IEEE Transactions on Robotics, 26(5):800-815, 2010.
- (2010) IEEE Transactions on Robotics , vol.26 , Issue.5 , pp. 800-815
- Ude, A.¹ Gams, A.² Asfour, T.³ Morimoto, J.⁴

39
- 0002891388
- Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional spaces
- S. Vijayakumar and S. Schaal. Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional spaces. In Proceedings of the 17th International Conference on Machine Learning (ICML), pages 288-293, 2000.
- (2000) Proceedings of the 17th International Conference on Machine Learning (ICML) , pp. 288-293
- Vijayakumar, S.¹ Schaal, S.²

40
- 55749088183
- Natural evolution strategies
- Daan Wierstra, Tom Schaul, Jan Peters, and Juergen Schmidhuber. Natural evolution strategies. In Proceedings of IEEE Congress on Evolutionary Computation (CEC), 2008.
- (2008) Proceedings of IEEE Congress on Evolutionary Computation (CEC)
- Wierstra, D.¹ Schaul, T.² Peters, J.³ Schmidhuber, J.⁴

41
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8: 229-256, 1992.
- (1992) Machine Learning , vol.8 , pp. 229-256
- Williams, R.J.¹

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.