2017, Pages 3389-3396

Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

Author keywords

[No Author keywords available]

Indexed keywords

COMPLEX NETWORKS; DEEP LEARNING; DEEP NEURAL NETWORKS; PERSONNEL TRAINING; REINFORCEMENT LEARNING; ROBOTICS; ROBOTS

EID: 85027967014     PISSN: 10504729     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICRA.2017.7989385     Document Type: Conference Paper
Times cited : (1658)

References (42)
  • 2
    • 38649142135
    • Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot
    • G. Endo, J. Morimoto, T. Matsubara, J. Nakanishi, and G. Cheng, "Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot", International Journal of Robotics Research, vol. 27, no. 2, pp. 213-228, 2008.
    • (2008) International Journal of Robotics Research , vol.27 , Issue.2 , pp. 213-228
    • Endo, G.1    Morimoto, J.2    Matsubara, T.3    Nakanishi, J.4    Cheng, G.5
  • 3
    • 44949241322
    • Reinforcement learning of motor skills with policy gradients
    • J. Peters and S. Schaal, "Reinforcement learning of motor skills with policy gradients", Neural Networks, vol. 21, no. 4, pp. 682-697, 2008.
    • (2008) Neural Networks , vol.21 , Issue.4 , pp. 682-697
    • Peters, J.1    Schaal, S.2
  • 4
    • 85027996010
    • Reinforcement learning of motor skills in high dimensions
    • E. Theodorou, J. Buchli, and S. Schaal, "Reinforcement learning of motor skills in high dimensions", in ICRA, 2010.
    • (2010) ICRA
    • Theodorou, E.1    Buchli, J.2    Schaal, S.3
  • 9
    • 85105191314
    • Learning and generalization of motor skills by learning from demonstration
    • P. Pastor, H. Hoffmann, T. Asfour, and S. Schaal, "Learning and generalization of motor skills by learning from demonstration", in ICRA, 2009.
    • (2009) ICRA
    • Pastor, P.1    Hoffmann, H.2    Asfour, T.3    Schaal, S.4
  • 11
    • 84998579328
    • Continuous deep Q-learning with model-based acceleration
    • S. Gu, T. Lillicrap, I. Sutskever, and S. Levine, "Continuous deep Q-learning with model-based acceleration", in ICML, 2016.
    • (2016) ICML
    • Gu, S.1    Lillicrap, T.2    Sutskever, I.3    Levine, S.4
  • 13
    • 0026954775
    • Neural networks for control systems: A survey
    • K. J. Hunt, D. Sbarbaro, R. Żbikowski, and P. J. Gawthrop, "Neural networks for control systems: A survey", Automatica, vol. 28, no. 6, pp. 1083-1112, Nov. 1992.
    • (1992) Automatica , vol.28 , Issue.6 , pp. 1083-1112
    • Hunt, K.J.1    Sbarbaro, D.2    Żbikowski, R.3    Gawthrop, P.J.4
  • 14
    • 33646398129
    • Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method
    • M. Riedmiller, "Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method", in European Conference on Machine Learning. Springer, 2005, pp. 317-328.
    • (2005) European Conference on Machine Learning , pp. 317-328
    • Riedmiller, M.1
  • 15
    • 36348930983
    • Neural reinforcement learning controllers for a real robot application
    • R. Hafner and M. Riedmiller, "Neural reinforcement learning controllers for a real robot application", in ICRA, 2007.
    • (2007) ICRA
    • Hafner, R.1    Riedmiller, M.2
  • 20
    • 80053441894
    • PILCO: A model-based and data-efficient approach to policy search
    • M. Deisenroth and C. Rasmussen, "PILCO: A model-based and data-efficient approach to policy search", in ICML, 2011.
    • (2011) ICML
    • Deisenroth, M.1    Rasmussen, C.2
  • 21
    • 84938265627
    • Optimism-driven exploration for nonlinear systems
    • T. Moldovan, S. Levine, M. Jordan, and P. Abbeel, "Optimism-driven exploration for nonlinear systems", in ICRA, 2015.
    • (2015) ICRA
    • Moldovan, T.1    Levine, S.2    Jordan, M.3    Abbeel, P.4
  • 24
    • 0000337576
    • Simple statistical gradient-following algorithms for connectionist reinforcement learning
    • R. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning", Machine Learning, vol. 8, no. 3-4, pp. 229-256, May 1992.
    • (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 229-256
    • Williams, R.1
  • 25
    • 34249833101
    • Q-learning
    • C. J. Watkins and P. Dayan, "Q-learning", Machine Learning, vol. 8, no. 3-4, pp. 279-292, 1992.
    • (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 279-292
    • Watkins, C.J.1    Dayan, P.2
  • 28
    • 84924051598
    • Human-level control through deep reinforcement learning
    • V. Mnih et al., "Human-level control through deep reinforcement learning", Nature, vol. 518, no. 7540, pp. 529-533, 2015.
    • (2015) Nature , vol.518 , Issue.7540 , pp. 529-533
    • Mnih, V.1
  • 30
    • 79958779459
    • Reinforcement learning in feedback control
    • R. Hafner and M. Riedmiller, "Reinforcement learning in feedback control", Machine Learning, vol. 84, no. 1-2, pp. 137-169, 2011.
    • (2011) Machine Learning , vol.84 , Issue.1-2 , pp. 137-169
    • Hafner, R.1    Riedmiller, M.2
  • 33
    • 84887309933
    • Cloud-based robot grasping with the Google object recognition engine
    • B. Kehoe, A. Matsukawa, S. Candido, J. Kuffner, and K. Goldberg, "Cloud-based robot grasping with the Google object recognition engine", in ICRA, 2013.
    • (2013) ICRA
    • Kehoe, B.1    Matsukawa, A.2    Candido, S.3    Kuffner, J.4    Goldberg, K.5
  • 38
    • 85083951076
    • Adam: A method for stochastic optimization
    • D. Kingma and J. Ba, "Adam: A method for stochastic optimization", ICLR, 2015.
    • (2015) ICLR
    • Kingma, D.1    Ba, J.2
  • 39
    • 85060321083
    • Learning motor primitives for robotics
    • J. Kober and J. Peters, "Learning motor primitives for robotics", in ICRA, 2009.
    • (2009) ICRA
    • Kober, J.1    Peters, J.2
  • 41
    • 84969584486
    • Batch normalization: Accelerating deep network training by reducing internal covariate shift
    • S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift", ICML, 2015.
    • (2015) ICML
    • Ioffe, S.1    Szegedy, C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.