-
2
-
-
38649142135
-
Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot
-
G. Endo, J. Morimoto, T. Matsubara, J. Nakanishi, and G. Cheng, "Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot", International Journal of Robotics Research, vol. 27, no. 2, pp. 213-228, 2008.
-
(2008)
International Journal of Robotics Research
, vol.27
, Issue.2
, pp. 213-228
-
-
Endo, G.1
Morimoto, J.2
Matsubara, T.3
Nakanishi, J.4
Cheng, G.5
-
3
-
-
44949241322
-
Reinforcement learning of motor skills with policy gradients
-
J. Peters and S. Schaal, "Reinforcement learning of motor skills with policy gradients", Neural Networks, vol. 21, no. 4, pp. 682-697, 2008.
-
(2008)
Neural Networks
, vol.21
, Issue.4
, pp. 682-697
-
-
Peters, J.1
Schaal, S.2
-
4
-
-
85027996010
-
Reinforcement learning of motor skills in high dimensions
-
E. Theodorou, J. Buchli, and S. Schaal, "Reinforcement learning of motor skills in high dimensions", in ICRA, 2010.
-
(2010)
ICRA
-
-
Theodorou, E.1
Buchli, J.2
Schaal, S.3
-
6
-
-
84455188451
-
Learning force control policies for compliant manipulation
-
M. Kalakrishnan, L. Righetti, P. Pastor, and S. Schaal, "Learning force control policies for compliant manipulation", in International Conference on Intelligent Robots and Systems (IROS), 2011.
-
(2011)
International Conference on Intelligent Robots and Systems (IROS)
-
-
Kalakrishnan, M.1
Righetti, L.2
Pastor, P.3
Schaal, S.4
-
7
-
-
85158005713
-
An application of reinforcement learning to aerobatic helicopter flight
-
P. Abbeel, A. Coates, M. Quigley, and A. Ng, "An application of reinforcement learning to aerobatic helicopter flight", in Advances in Neural Information Processing Systems (NIPS), 2006.
-
(2006)
Advances in Neural Information Processing Systems (NIPS)
-
-
Abbeel, P.1
Coates, A.2
Quigley, M.3
Ng, A.4
-
8
-
-
84884276459
-
Reinforcement learning in robotics: A survey
-
J. Kober, J. A. Bagnell, and J. Peters, "Reinforcement learning in robotics: A survey", International Journal of Robotics Research, vol. 32, no. 11, pp. 1238-1274, 2013.
-
(2013)
International Journal of Robotics Research
, vol.32
, Issue.11
, pp. 1238-1274
-
-
Kober, J.1
Bagnell, J.A.2
Peters, J.3
-
9
-
-
85105191314
-
Learning and generalization of motor skills by learning from demonstration
-
P. Pastor, H. Hoffmann, T. Asfour, and S. Schaal, "Learning and generalization of motor skills by learning from demonstration", in ICRA, 2009.
-
(2009)
ICRA
-
-
Pastor, P.1
Hoffmann, H.2
Asfour, T.3
Schaal, S.4
-
10
-
-
85083953657
-
Continuous control with deep reinforcement learning
-
T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning", ICLR, 2016.
-
(2016)
ICLR
-
-
Lillicrap, T.P.1
Hunt, J.J.2
Pritzel, A.3
Heess, N.4
Erez, T.5
Tassa, Y.6
Silver, D.7
Wierstra, D.8
-
11
-
-
84998579328
-
Continuous deep Q-learning with model-based acceleration
-
S. Gu, T. Lillicrap, I. Sutskever, and S. Levine, "Continuous deep Q-learning with model-based acceleration", in ICML, 2016.
-
(2016)
ICML
-
-
Gu, S.1
Lillicrap, T.2
Sutskever, I.3
Levine, S.4
-
12
-
-
84903590417
-
A survey on policy search for robotics
-
M. Deisenroth, G. Neumann, and J. Peters, "A survey on policy search for robotics", Foundations and Trends in Robotics, vol. 2, no. 1-2, pp. 1-142, 2013.
-
(2013)
Foundations and Trends in Robotics
, vol.2
, Issue.1-2
, pp. 1-142
-
-
Deisenroth, M.1
Neumann, G.2
Peters, J.3
-
13
-
-
0026954775
-
Neural networks for control systems: A survey
-
Nov.
-
K. J. Hunt, D. Sbarbaro, R. Żbikowski, and P. J. Gawthrop, "Neural networks for control systems: A survey", Automatica, vol. 28, no. 6, pp. 1083-1112, Nov. 1992.
-
(1992)
Automatica
, vol.28
, Issue.6
, pp. 1083-1112
-
-
Hunt, K.J.1
Sbarbaro, D.2
Zbikowski, R.3
Gawthrop, P.J.4
-
14
-
-
33646398129
-
Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method
-
Springer
-
M. Riedmiller, "Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method", in European Conference on Machine Learning. Springer, 2005, pp. 317-328.
-
(2005)
European Conference on Machine Learning
, pp. 317-328
-
-
Riedmiller, M.1
-
15
-
-
36348930983
-
Neural reinforcement learning controllers for a real robot application
-
R. Hafner and M. Riedmiller, "Neural reinforcement learning controllers for a real robot application", in ICRA, 2007.
-
(2007)
ICRA
-
-
Hafner, R.1
Riedmiller, M.2
-
17
-
-
84883060087
-
Evolving large-scale neural networks for vision-based reinforcement learning
-
J. Koutník, G. Cuccu, J. Schmidhuber, and F. Gomez, "Evolving large-scale neural networks for vision-based reinforcement learning", in Conference on Genetic and Evolutionary Computation, ser. GECCO'13, 2013.
-
(2013)
Conference on Genetic and Evolutionary Computation, Ser. GECCO'13
-
-
Koutník, J.1
Cuccu, G.2
Schmidhuber, J.3
Gomez, F.4
-
18
-
-
84969963490
-
Trust region policy optimization
-
J. Schulman, S. Levine, P. Moritz, M. Jordan, and P. Abbeel, "Trust region policy optimization", in ICML, 2015.
-
(2015)
ICML
-
-
Schulman, J.1
Levine, S.2
Moritz, P.3
Jordan, M.4
Abbeel, P.5
-
19
-
-
84979924150
-
End-to-end training of deep visuomotor policies
-
S. Levine, C. Finn, T. Darrell, and P. Abbeel, "End-to-end training of deep visuomotor policies", Journal of Machine Learning Research (JMLR), vol. 17, 2016.
-
(2016)
Journal of Machine Learning Research (JMLR)
, vol.17
-
-
Levine, S.1
Finn, C.2
Darrell, T.3
Abbeel, P.4
-
20
-
-
80053441894
-
PILCO: A model-based and data-efficient approach to policy search
-
M. Deisenroth and C. Rasmussen, "PILCO: A model-based and data-efficient approach to policy search", in ICML, 2011.
-
(2011)
ICML
-
-
Deisenroth, M.1
Rasmussen, C.2
-
21
-
-
84938265627
-
Optimism-driven exploration for nonlinear systems
-
T. Moldovan, S. Levine, M. Jordan, and P. Abbeel, "Optimism-driven exploration for nonlinear systems", in ICRA, 2015.
-
(2015)
ICRA
-
-
Moldovan, T.1
Levine, S.2
Jordan, M.3
Abbeel, P.4
-
22
-
-
84908057666
-
Sample-based information-theoretic stochastic optimal control
-
R. Lioutikov, A. Paraschos, G. Neumann, and J. Peters, "Sample-based information-theoretic stochastic optimal control", in International Conference on Robotics and Automation, 2014.
-
(2014)
International Conference on Robotics and Automation
-
-
Lioutikov, R.1
Paraschos, A.2
Neumann, G.3
Peters, J.4
-
23
-
-
84903590417
-
A survey on policy search for robotics
-
M. P. Deisenroth, G. Neumann, J. Peters et al., "A survey on policy search for robotics", Foundations and Trends in Robotics, vol. 2, no. 1-2, pp. 1-142, 2013.
-
(2013)
Foundations and Trends in Robotics
, vol.2
, Issue.1-2
, pp. 1-142
-
-
Deisenroth, M.P.1
Neumann, G.2
Peters, J.3
-
24
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
May
-
R. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning", Machine Learning, vol. 8, no. 3-4, pp. 229-256, May 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 229-256
-
-
Williams, R.1
-
25
-
-
34249833101
-
Q-learning
-
C. J. Watkins and P. Dayan, "Q-learning", Machine Learning, vol. 8, no. 3-4, pp. 279-292, 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 279-292
-
-
Watkins, C.J.1
Dayan, P.2
-
26
-
-
33750244274
-
Policy gradient methods for reinforcement learning with function approximation
-
R. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation", in Advances in Neural Information Processing Systems (NIPS), 1999.
-
(1999)
Advances in Neural Information Processing Systems (NIPS)
-
-
Sutton, R.1
McAllester, D.2
Singh, S.3
Mansour, Y.4
-
27
-
-
84883060087
-
Evolving large-scale neural networks for vision-based reinforcement learning
-
ACM
-
J. Koutník, G. Cuccu, J. Schmidhuber, and F. Gomez, "Evolving large-scale neural networks for vision-based reinforcement learning", in Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation. ACM, 2013, pp. 1061-1068.
-
(2013)
Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation
, pp. 1061-1068
-
-
Koutník, J.1
Cuccu, G.2
Schmidhuber, J.3
Gomez, F.4
-
28
-
-
84924051598
-
Human-level control through deep reinforcement learning
-
V. Mnih et al., "Human-level control through deep reinforcement learning", Nature, vol. 518, no. 7540, pp. 529-533, 2015.
-
(2015)
Nature
, vol.518
, Issue.7540
, pp. 529-533
-
-
Mnih, V.1
-
29
-
-
84999036937
-
Asynchronous methods for deep reinforcement learning
-
V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu, "Asynchronous methods for deep reinforcement learning", in ICML, 2016, pp. 1928-1937.
-
(2016)
ICML
, pp. 1928-1937
-
-
Mnih, V.1
Badia, A.P.2
Mirza, M.3
Graves, A.4
Lillicrap, T.5
Harley, T.6
Silver, D.7
Kavukcuoglu, K.8
-
30
-
-
79958779459
-
Reinforcement learning in feedback control
-
R. Hafner and M. Riedmiller, "Reinforcement learning in feedback control", Machine Learning, vol. 84, no. 1-2, pp. 137-169, 2011.
-
(2011)
Machine Learning
, vol.84
, Issue.1-2
, pp. 137-169
-
-
Hafner, R.1
Riedmiller, M.2
-
31
-
-
0034292720
-
A platform for robotics research based on the remote-brained robot approach
-
M. Inaba, S. Kagami, F. Kanehiro, and Y. Hoshino, "A platform for robotics research based on the remote-brained robot approach", International Journal of Robotics Research, vol. 19, no. 10, 2000.
-
(2000)
International Journal of Robotics Research
, vol.19
, Issue.10
-
-
Inaba, M.1
Kagami, S.2
Kanehiro, F.3
Hoshino, Y.4
-
33
-
-
84887309933
-
Cloud-based robot grasping with the Google object recognition engine
-
B. Kehoe, A. Matsukawa, S. Candido, J. Kuffner, and K. Goldberg, "Cloud-based robot grasping with the Google object recognition engine", in ICRA, 2013.
-
(2013)
ICRA
-
-
Kehoe, B.1
Matsukawa, A.2
Candido, S.3
Kuffner, J.4
Goldberg, K.5
-
34
-
-
84924680020
-
A survey of research on cloud robotics and automation
-
April
-
B. Kehoe, S. Patil, P. Abbeel, and K. Goldberg, "A survey of research on cloud robotics and automation", IEEE Transactions on Automation Science and Engineering, vol. 12, no. 2, April 2015.
-
(2015)
IEEE Transactions on Automation Science and Engineering
, vol.12
, Issue.2
-
-
Kehoe, B.1
Patil, S.2
Abbeel, P.3
Goldberg, K.4
-
35
-
-
85028021982
-
-
arXiv preprint arXiv:1610.00673
-
A. Yahya, A. Li, M. Kalakrishnan, Y. Chebotar, and S. Levine, "Collective robot reinforcement learning with distributed asynchronous guided policy search", arXiv preprint arXiv:1610.00673, 2016.
-
(2016)
Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search
-
-
Yahya, A.1
Li, A.2
Kalakrishnan, M.3
Chebotar, Y.4
Levine, S.5
-
36
-
-
15744398067
-
Cahra: Collision avoidance system for humanoid robot arms with potential field
-
A. Sahara, M. Imai, and Y. Anzai, "Cahra: Collision avoidance system for humanoid robot arms with potential field", in IEEE International Conference on Systems, Man and Cybernetics, 2004.
-
(2004)
IEEE International Conference on Systems, Man and Cybernetics
-
-
Sahara, A.1
Imai, M.2
Anzai, Y.3
-
37
-
-
84872292044
-
MuJoCo: A physics engine for model-based control
-
IEEE
-
E. Todorov, T. Erez, and Y. Tassa, "MuJoCo: A physics engine for model-based control", in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2012, pp. 5026-5033.
-
(2012)
2012 IEEE/RSJ International Conference on Intelligent Robots and Systems
, pp. 5026-5033
-
-
Todorov, E.1
Erez, T.2
Tassa, Y.3
-
38
-
-
85083951076
-
Adam: A method for stochastic optimization
-
D. Kingma and J. Ba, "Adam: A method for stochastic optimization", ICLR, 2015.
-
(2015)
ICLR
-
-
Kingma, D.1
Ba, J.2
-
39
-
-
85060321083
-
Learning motor primitives for robotics
-
J. Kober and J. Peters, "Learning motor primitives for robotics", in ICRA, 2009.
-
(2009)
ICRA
-
-
Kober, J.1
Peters, J.2
-
41
-
-
84969584486
-
Batch normalization: Accelerating deep network training by reducing internal covariate shift
-
S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift", ICML, 2015.
-
(2015)
ICML
-
-
Ioffe, S.1
Szegedy, C.2
-
42
-
-
85083952240
-
Policy distillation
-
A. Rusu, S. Colmenarejo, C. Gulcehre, G. Desjardins, J. Kirkpatrick, R. Pascanu, V. Mnih, K. Kavukcuoglu, and R. Hadsell, "Policy distillation", in ICLR, 2016.
-
(2016)
ICLR
-
-
Rusu, A.1
Colmenarejo, S.2
Gulcehre, C.3
Desjardins, G.4
Kirkpatrick, J.5
Pascanu, R.6
Mnih, V.7
Kavukcuoglu, K.8
Hadsell, R.9