SCOPUS Information Search Platform

Proceedings - IEEE International Conference on Robotics and Automation

Volume , Issue , 2017, Pages 3389-3396

Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

(4) Gu, Shixiang a,b,c; Holly, Ethan a; Lillicrap, Timothy d; Levine, Sergey a,e

a GOOGLE INC (United States)

b UNIVERSITY OF CAMBRIDGE (United Kingdom)

c Autonomous Vision Group (United States)

d DEEPMIND (United Kingdom)

e UNIVERSITY OF CALIFORNIA (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPLEX NETWORKS; DEEP LEARNING; DEEP NEURAL NETWORKS; PERSONNEL TRAINING; REINFORCEMENT LEARNING; ROBOTICS; ROBOTS;

3D MANIPULATION TASKS; EXPERIMENTAL EVALUATION; HUMAN INTERVENTION; LEARNING PROCESS; REAL PHYSICAL SYSTEMS; ROBOTIC APPLICATIONS; ROBOTIC MANIPULATION; SAMPLE COMPLEXITY;

LEARNING ALGORITHMS;

EID: 85027967014 PISSN: 10504729 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICRA.2017.7989385 Document Type: Conference Paper

Times cited : (1658)

References (42)

1
- 3042534761
- Policy gradient reinforcement learning for fast quadrupedal locomotion
- N. Kohl and P. Stone, "Policy gradient reinforcement learning for fast quadrupedal locomotion", in International Conference on Robotics and Automation (ICRA), 2004.
- (2004) International Conference on Robotics and Automation (ICRA)
- Kohl, N.¹ Stone, P.²

2
- 38649142135
- Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot
- G. Endo, J. Morimoto, T. Matsubara, J. Nakanishi, and G. Cheng, "Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot", International Journal of Robotics Research, vol. 27, no. 2, pp. 213-228, 2008.
- (2008) International Journal of Robotics Research , vol.27 , Issue.2 , pp. 213-228
- Endo, G.¹ Morimoto, J.² Matsubara, T.³ Nakanishi, J.⁴ Cheng, G.⁵

3
- 44949241322
- Reinforcement learning of motor skills with policy gradients
- J. Peters and S. Schaal, "Reinforcement learning of motor skills with policy gradients", Neural Networks, vol. 21, no. 4, pp. 682-697, 2008.
- (2008) Neural Networks , vol.21 , Issue.4 , pp. 682-697
- Peters, J.¹ Schaal, S.²

4
- 85027996010
- Reinforcement learning of motor skills in high dimensions
- E. Theodorou, J. Buchli, and S. Schaal, "Reinforcement learning of motor skills in high dimensions", in ICRA, 2010.
- (2010) ICRA
- Theodorou, E.¹ Buchli, J.² Schaal, S.³

5
- 77958569725
- Relative entropy policy search
- J. Peters, K. Mülling, and Y. Altün, "Relative entropy policy search", in AAAI Conference on Artificial Intelligence, 2010.
- (2010) AAAI Conference on Artificial Intelligence
- Peters, J.¹ Mülling, K.² Altün, Y.³

6
- 84455188451
- Learning force control policies for compliant manipulation
- M. Kalakrishnan, L. Righetti, P. Pastor, and S. Schaal, "Learning force control policies for compliant manipulation", in International Conference on Intelligent Robots and Systems (IROS), 2011.
- (2011) International Conference on Intelligent Robots and Systems (IROS)
- Kalakrishnan, M.¹ Righetti, L.² Pastor, P.³ Schaal, S.⁴

7
- 85158005713
- An application of reinforcement learning to aerobatic helicopter flight
- P. Abbeel, A. Coates, M. Quigley, and A. Ng, "An application of reinforcement learning to aerobatic helicopter flight", in Advances in Neural Information Processing Systems (NIPS), 2006.
- (2006) Advances in Neural Information Processing Systems (NIPS)
- Abbeel, P.¹ Coates, A.² Quigley, M.³ Ng, A.⁴

8
- 84884276459
- Reinforcement learning in robotics: A survey
- J. Kober, J. A. Bagnell, and J. Peters, "Reinforcement learning in robotics: A survey", International Journal of Robotics Research, vol. 32, no. 11, pp. 1238-1274, 2013.
- (2013) International Journal of Robotics Research , vol.32 , Issue.11 , pp. 1238-1274
- Kober, J.¹ Bagnell, J.A.² Peters, J.³

9
- 85105191314
- Learning and generalization of motor skills by learning from demonstration
- P. Pastor, H. Hoffmann, T. Asfour, and S. Schaal, "Learning and generalization of motor skills by learning from demonstration", in ICRA, 2009.
- (2009) ICRA
- Pastor, P.¹ Hoffmann, H.² Asfour, T.³ Schaal, S.⁴

10
- 85083953657
- Continuous control with deep reinforcement learning
- T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning", ICLR, 2016.
- (2016) ICLR
- Lillicrap, T.P.¹ Hunt, J.J.² Pritzel, A.³ Heess, N.⁴ Erez, T.⁵ Tassa, Y.⁶ Silver, D.⁷ Wierstra, D.⁸

11
- 84998579328
- Continuous deep Q-learning with model-based acceleration
- S. Gu, T. Lillicrap, I. Sutskever, and S. Levine, "Continuous deep Q-learning with model-based acceleration", in ICML, 2016.
- (2016) ICML
- Gu, S.¹ Lillicrap, T.² Sutskever, I.³ Levine, S.⁴

12
- 84903590417
- A survey on policy search for robotics
- M. Deisenroth, G. Neumann, and J. Peters, "A survey on policy search for robotics", Foundations and Trends in Robotics, vol. 2, no. 1-2, pp. 1-142, 2013.
- (2013) Foundations and Trends in Robotics , vol.2 , Issue.1-2 , pp. 1-142
- Deisenroth, M.¹ Neumann, G.² Peters, J.³

13
- 0026954775
- Neural networks for control systems: A survey
- K. J. Hunt, D. Sbarbaro, R. Żbikowski, and P. J. Gawthrop, "Neural networks for control systems: A survey", Automatica, vol. 28, no. 6, pp. 1083-1112, Nov. 1992.
- (1992) Automatica , vol.28 , Issue.6 , pp. 1083-1112
- Hunt, K.J.¹ Sbarbaro, D.² Zbikowski, R.³ Gawthrop, P.J.⁴

14
- 33646398129
- Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method
- M. Riedmiller, "Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method", in European Conference on Machine Learning. Springer, 2005, pp. 317-328.
- (2005) European Conference on Machine Learning , pp. 317-328
- Riedmiller, M.¹

15
- 36348930983
- Neural reinforcement learning controllers for a real robot application
- R. Hafner and M. Riedmiller, "Neural reinforcement learning controllers for a real robot application", in ICRA, 2007.
- (2007) ICRA
- Hafner, R.¹ Riedmiller, M.²

16
- 84865083902
- Autonomous reinforcement learning on raw visual input data in a real world application
- M. Riedmiller, S. Lange, and A. Voigtlaender, "Autonomous reinforcement learning on raw visual input data in a real world application", in International Joint Conference on Neural Networks, 2012.
- (2012) International Joint Conference on Neural Networks
- Riedmiller, M.¹ Lange, S.² Voigtlaender, A.³

17
- 84883060087
- Evolving large-scale neural networks for vision-based reinforcement learning
- J. Koutník, G. Cuccu, J. Schmidhuber, and F. Gomez, "Evolving large-scale neural networks for vision-based reinforcement learning", in Conference on Genetic and Evolutionary Computation, ser. GECCO'13, 2013.
- (2013) Conference on Genetic and Evolutionary Computation, Ser. GECCO'13
- Koutník, J.¹ Cuccu, G.² Schmidhuber, J.³ Gomez, F.⁴

18
- 84969963490
- Trust region policy optimization
- J. Schulman, S. Levine, P. Moritz, M. Jordan, and P. Abbeel, "Trust region policy optimization", in ICML, 2015.
- (2015) ICML
- Schulman, J.¹ Levine, S.² Moritz, P.³ Jordan, M.⁴ Abbeel, P.⁵

19
- 84979924150
- End-to-end training of deep visuomotor policies
- S. Levine, C. Finn, T. Darrell, and P. Abbeel, "End-to-end training of deep visuomotor policies", Journal of Machine Learning Research (JMLR), vol. 17, 2016.
- (2016) Journal of Machine Learning Research (JMLR) , vol.17
- Levine, S.¹ Finn, C.² Darrell, T.³ Abbeel, P.⁴

20
- 80053441894
- PILCO: A model-based and data-efficient approach to policy search
- M. Deisenroth and C. Rasmussen, "PILCO: A model-based and data-efficient approach to policy search", in ICML, 2011.
- (2011) ICML
- Deisenroth, M.¹ Rasmussen, C.²

21
- 84938265627
- Optimism-driven exploration for nonlinear systems
- T. Moldovan, S. Levine, M. Jordan, and P. Abbeel, "Optimism-driven exploration for nonlinear systems", in ICRA, 2015.
- (2015) ICRA
- Moldovan, T.¹ Levine, S.² Jordan, M.³ Abbeel, P.⁴

22
- 84908057666
- Sample-based information-theoretic stochastic optimal control
- R. Lioutikov, A. Paraschos, G. Neumann, and J. Peters, "Sample-based information-theoretic stochastic optimal control", in International Conference on Robotics and Automation, 2014.
- (2014) International Conference on Robotics and Automation
- Lioutikov, R.¹ Paraschos, A.² Neumann, G.³ Peters, J.⁴

23
- 84903590417
- A survey on policy search for robotics
- M. P. Deisenroth, G. Neumann, J. Peters et al., "A survey on policy search for robotics." Foundations and Trends in Robotics, vol. 2, no. 1-2, pp. 1-142, 2013.
- (2013) Foundations and Trends in Robotics , vol.2 , Issue.1-2 , pp. 1-142
- Deisenroth, M.P.¹ Neumann, G.² Peters, J.³

24
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- R. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning", Machine Learning, vol. 8, no. 3-4, pp. 229-256, May 1992.
- (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 229-256
- Williams, R.¹

25
- 34249833101
- Q-learning
- C. J. Watkins and P. Dayan, "Q-learning", Machine learning, vol. 8, no. 3-4, pp. 279-292, 1992.
- (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 279-292
- Watkins, C.J.¹ Dayan, P.²

26
- 33750244274
- Policy gradient methods for reinforcement learning with function approximation
- R. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation", in Advances in Neural Information Processing Systems (NIPS), 1999.
- (1999) Advances in Neural Information Processing Systems (NIPS)
- Sutton, R.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

27
- 84883060087
- Evolving large-scale neural networks for vision-based reinforcement learning
- J. Koutník, G. Cuccu, J. Schmidhuber, and F. Gomez, "Evolving large-scale neural networks for vision-based reinforcement learning", in Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013, pp. 1061-1068.
- (2013) Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation , pp. 1061-1068
- Koutník, J.¹ Cuccu, G.² Schmidhuber, J.³ Gomez, F.⁴

28
- 84924051598
- Human-level control through deep reinforcement learning
- V. Mnih et al., "Human-level control through deep reinforcement learning", Nature, vol. 518, no. 7540, pp. 529-533, 2015.
- (2015) Nature , vol.518 , Issue.7540 , pp. 529-533
- Mnih, V.¹

29
- 84999036937
- Asynchronous methods for deep reinforcement learning
- V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu, "Asynchronous methods for deep reinforcement learning", in ICML, 2016, pp. 1928-1937.
- (2016) ICML , pp. 1928-1937
- Mnih, V.¹ Badia, A.P.² Mirza, M.³ Graves, A.⁴ Lillicrap, T.⁵ Harley, T.⁶ Silver, D.⁷ Kavukcuoglu, K.⁸

30
- 79958779459
- Reinforcement learning in feedback control
- R. Hafner and M. Riedmiller, "Reinforcement learning in feedback control", Machine learning, vol. 84, no. 1-2, pp. 137-169, 2011.
- (2011) Machine Learning , vol.84 , Issue.1-2 , pp. 137-169
- Hafner, R.¹ Riedmiller, M.²

31
- 0034292720
- A platform for robotics research based on the remote-brained robot approach
- M. Inaba, S. Kagami, F. Kanehiro, and Y. Hoshino, "A platform for robotics research based on the remote-brained robot approach", International Journal of Robotics Research, vol. 19, no. 10, 2000.
- (2000) International Journal of Robotics Research , vol.19 , Issue.10
- Inaba, M.¹ Kagami, S.² Kanehiro, F.³ Hoshino, Y.⁴

32
- 84872554347
- Cloud-enabled humanoid robots
- J. Kuffner, "Cloud-enabled humanoid robots", in IEEE-RAS International Conference on Humanoid Robotics, 2010.
- (2010) IEEE-RAS International Conference on Humanoid Robotics
- Kuffner, J.¹

33
- 84887309933
- Cloud-based robot grasping with the google object recognition engine
- B. Kehoe, A. Matsukawa, S. Candido, J. Kuffner, and K. Goldberg, "Cloud-based robot grasping with the google object recognition engine", in ICRA, 2013.
- (2013) ICRA
- Kehoe, B.¹ Matsukawa, A.² Candido, S.³ Kuffner, J.⁴ Goldberg, K.⁵

34
- 84924680020
- A survey of research on cloud robotics and automation
- B. Kehoe, S. Patil, P. Abbeel, and K. Goldberg, "A survey of research on cloud robotics and automation", IEEE Transactions on Automation Science and Engineering, vol. 12, no. 2, April 2015.
- (2015) IEEE Transactions on Automation Science and Engineering , vol.12 , Issue.2
- Kehoe, B.¹ Patil, S.² Abbeel, P.³ Goldberg, K.⁴

35
- 85028021982
- Collective robot reinforcement learning with distributed asynchronous guided policy search
- A. Yahya, A. Li, M. Kalakrishnan, Y. Chebotar, and S. Levine, "Collective robot reinforcement learning with distributed asynchronous guided policy search", arXiv preprint arXiv:1610.00673, 2016.
- (2016) arXiv preprint arXiv:1610.00673
- Yahya, A.¹ Li, A.² Kalakrishnan, M.³ Chebotar, Y.⁴ Levine, S.⁵

36
- 15744398067
- Cahra: Collision avoidance system for humanoid robot arms with potential field
- A. Sahara, M. Imai, and Y. Anzai, "Cahra: Collision avoidance system for humanoid robot arms with potential field", in IEEE International Conference on Systems, Man and Cybernetics, 2004.
- (2004) IEEE International Conference on Systems, Man and Cybernetics
- Sahara, A.¹ Imai, M.² Anzai, Y.³

37
- 84872292044
- MuJoCo: A physics engine for model-based control
- E. Todorov, T. Erez, and Y. Tassa, "MuJoCo: A physics engine for model-based control", in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2012, pp. 5026-5033.
- (2012) 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems , pp. 5026-5033
- Todorov, E.¹ Erez, T.² Tassa, Y.³

38
- 85083951076
- Adam: A method for stochastic optimization
- D. Kingma and J. Ba, "Adam: A method for stochastic optimization", ICLR, 2015.
- (2015) ICLR
- Kingma, D.¹ Ba, J.²

39
- 85060321083
- Learning motor primitives for robotics
- J. Kober and J. Peters, "Learning motor primitives for robotics", in ICRA, 2009.
- (2009) ICRA
- Kober, J.¹ Peters, J.²

40
- 85027992370
- Learning to walk in 20 minutes
- R. Tedrake, T. W. Zhang, and H. S. Seung, "Learning to walk in 20 minutes."
- Tedrake, R.¹ Zhang, T.W.² Seung, H.S.³

41
- 84969584486
- Batch normalization: Accelerating deep network training by reducing internal covariate shift
- S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift", ICML, 2015.
- (2015) ICML
- Ioffe, S.¹ Szegedy, C.²

42
- 85083952240
- Policy distillation
- A. Rusu, S. Colmenarejo, C. Gulcehre, G. Desjardins, J. Kirkpatrick, R. Pascanu, V. Mnih, K. Kavukcuoglu, and R. Hadsell, "Policy distillation", in ICLR, 2016.
- (2016) ICLR
- Rusu, A.¹ Colmenarejo, S.² Gulcehre, C.³ Desjardins, G.⁴ Kirkpatrick, J.⁵ Pascanu, R.⁶ Mnih, V.⁷ Kavukcuoglu, K.⁸ Hadsell, R.⁹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.