



Volume 12, 2012, Pages 579-610

Reinforcement learning in robotics: A survey

Author keywords

Entropy; Torque

Indexed keywords


EID: 84892593209     PISSN: 1867-4534     EISSN: 1867-4542     Source Type: Book Series
DOI: 10.1007/978-3-642-27645-3_18     Document Type: Chapter
Times cited: 175

References (132)
  • 6
    • Asada, M., Noda, S., Tawaratsumida, S., Hosoda, K.: Purposive behavior acquisition for a real robot by vision-based reinforcement learning. Machine Learning 23(2-3), 279–303 (1996)
  • 7
    • Atkeson, C., Moore, A., Schaal, S.: Locally weighted learning for control. AI Review 11, 75–113 (1997)
  • 8
    • Atkeson, C.G.: Using local trajectory optimizers to speed up global optimization in dynamic programming. In: Advances in Neural Information Processing Systems, NIPS (1994)
  • 14
    • Barto, A.G., Mahadevan, S.: Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems 13(4), 341–379 (2003)
  • 15
    • Bellman, R.E.: Dynamic Programming. Princeton University Press, Princeton (1957)
  • 18
    • Benbrahim, H., Franklin, J.A.: Biped dynamic walking using reinforcement learning. Robotics and Autonomous Systems 22(3-4), 283–302 (1997)
  • 21
    • Betts, J.T.: Practical methods for optimal control using nonlinear programming. In: Advances in Design and Control, vol. 3. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2001)
  • 26
    • Coates, A., Abbeel, P., Ng, A.Y.: Apprenticeship learning for helicopter control. Commun. ACM 52(7), 97–105 (2009)
  • 29
    • Dayan, P., Hinton, G.E.: Using expectation-maximization for reinforcement learning. Neural Computation 9(2), 271–278 (1997)
  • 34
    • Duan, Y., Cui, B., Yang, H.: Robot Navigation Based on Fuzzy RL Algorithm. In: Sun, F., Zhang, J., Tan, Y., Cao, J., Yu, W. (eds.) ISNN 2008, Part I. LNCS, vol. 5263, pp. 391–399. Springer, Heidelberg (2008)
  • 35
    • Endo, G., Morimoto, J., Matsubara, T., Nakanishi, J., Cheng, G.: Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot. I. J. Robotic Res. 27(2), 213–228 (2008)
  • 36
    • Erden, M.S., Leblebicioğlu, K.: Free gait generation with reinforcement learning for a six-legged robot. Robot. Auton. Syst. 56(3), 199–212 (2008)
  • 37
    • Fagg, A.H., Lotspeich, D.L., Hoff, J., Bekey, G.A.: Rapid reinforcement learning for reactive control policy design for autonomous robots. In: Artificial Life in Robotics (1998)
  • 40
    • Glynn, P.: Likelihood ratio gradient estimation: an overview. In: Winter Simulation Conference, WSC (1987)
  • 43
    • Guenter, F., Hersch, M., Calinon, S., Billard, A.: Reinforcement learning for imitating constrained reaching movements. Advanced Robotics 21(13), 1521–1544 (2007)
  • 49
    • Huang, X., Weng, J.: Novelty and reinforcement learning in the value system of developmental robots. In: Lund University Cognitive Studies (2002)
  • 53
    • Kalmár, Z., Szepesvári, C., Lörincz, A.: Modular Reinforcement Learning: An Application to a Real Robot Task. In: Birk, A., Demiris, J. (eds.) EWLR 1997. LNCS (LNAI), vol. 1545, pp. 29–45. Springer, Heidelberg (1998)
  • 55
    • Katz, D., Pyuro, Y., Brock, O.: Learning to manipulate articulated objects in unstructured environments using a grounded relational representation. In: Robotics: Science and Systems, R:SS (2008)
  • 70
    • Kroemer, O., Detry, R., Piater, J., Peters, J.: Combining active learning and reactive control for robot grasping. Robotics and Autonomous Systems 58(9), 1105–1116 (2010)
  • 72
    • Latzke, T., Behnke, S., Bennewitz, M.: Imitative Reinforcement Learning for Soccer Playing Robots. In: Lakemeyer, G., Sklar, E., Sorrenti, D.G., Takahashi, T. (eds.) RoboCup 2006. LNCS (LNAI), vol. 4434, pp. 47–58. Springer, Heidelberg (2007)
  • 74
    • Mahadevan, S., Connell, J.: Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence 55(2-3), 311–365 (1992)
  • 77
    • Mataric, M.J.: Reinforcement learning in the multi-robot domain. Autonomous Robots 4, 73–83 (1997)
  • 81
    • Morimoto, J., Doya, K.: Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning. Robotics and Autonomous Systems 36(1), 37–51 (2001)
  • 88
    • Paletta, L., Fritz, G., Kintzler, F., Irran, J., Dorffner, G.: Perception and Developmental Learning of Affordances in Autonomous Robots. In: Hertzberg, J., Beetz, M., Englert, R. (eds.) KI 2007. LNCS (LNAI), vol. 4667, pp. 235–250. Springer, Heidelberg (2007)
  • 90
    • Pendrith, M.: Reinforcement learning in situated agents: Some theoretical problems and practical solutions. In: European Workshop on Learning Robots (EWLR) (1999)
  • 92
    • Peters, J., Schaal, S.: Natural actor-critic. Neurocomputing 71(7-9), 1180–1190 (2008b)
  • 93
    • Peters, J., Schaal, S.: Reinforcement learning of motor skills with policy gradients. Neural Networks 21(4), 682–697 (2008c)
  • 102
    • Rückstieß, T., Felder, M., Schmidhuber, J.: State-Dependent Exploration for Policy Gradient Methods. In: Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. LNCS (LNAI), vol. 5212, pp. 234–249. Springer, Heidelberg (2008)
  • 103
    • Sato, M.-A., Nakamura, Y., Ishii, S.: Reinforcement Learning for Biped Locomotion. In: Dorronsoro, J.R. (ed.) ICANN 2002. LNCS, vol. 2415, pp. 777–782. Springer, Heidelberg (2002)
  • 105
    • Schaal, S., Atkeson, C.G.: Robot juggling: An implementation of memory-based learning. Control Systems Magazine 14(1), 57–71 (1994)
  • 106
    • Schaal, S., Atkeson, C.G., Vijayakumar, S.: Scalable techniques from nonparametric statistics for real-time robot learning. Applied Intelligence 17(1), 49–60 (2002)
  • 107
    • Schaal, S., Mohajerian, P., Ijspeert, A.J.: Dynamics systems vs. optimal control - a unifying view. Progress in Brain Research 165(1), 425–445 (2007)
  • 113
    • Sutton, R.S.: Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In: International Machine Learning Conference (1990)
  • 116
  • 117
    • Tamei, T., Shibata, T.: Policy Gradient Learning of Cooperative Interaction with a Robot Using User's Biological Signals. In: Köppen, M., Kasabov, N., Coghill, G. (eds.) ICONIP 2008. LNCS, vol. 5507, pp. 1029–1037. Springer, Heidelberg (2009)
  • 122
    • Thrun, S.: An approach to learning mobile robot navigation. Robotics and Autonomous Systems 15, 301–319 (1995)
  • 124
    • Toussaint, M., Storkey, A., Harmeling, S.: Expectation-Maximization methods for solving (PO)MDPs and optimal control problems. In: Inference and Learning in Dynamic Models. Cambridge University Press (2010)
  • 126
    • Uchibe, E., Asada, M., Hosoda, K.: Cooperative behavior acquisition in multi mobile robots environment by reinforcement learning based on state vector estimation. In: IEEE International Conference on Robotics and Automation (ICRA) (1998)
  • 127
    • Vlassis, N., Toussaint, M., Kontes, G., Piperidis, S.: Learning model-free robot control by a Monte Carlo EM algorithm. Autonomous Robots 27(2), 123–130 (2009)
  • 130
    • Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229–256 (1992)
  • 131
    • Yasuda, T., Ohkura, K.: A Reinforcement Learning Technique with an Adaptive Action Generator for a Multi-Robot System. In: Asada, M., Hallam, J.C.T., Meyer, J.-A., Tani, J. (eds.) SAB 2008. LNCS (LNAI), vol. 5040, pp. 250–259. Springer, Heidelberg (2008)


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.