



Volume 2005, 2005, Pages 4164-4169

Learning sensory feedback to CPG with policy gradient for biped locomotion

Author keywords

Biped locomotion; Central pattern generator; Policy gradient; Reinforcement learning

Indexed keywords

BIPED LOCOMOTION; COMPUTER SIMULATION; FEEDBACK CONTROL; GRADIENT METHODS; LEARNING ALGORITHMS; MOTION PLANNING; SENSORY FEEDBACK

EID: 33846174631     PISSN: 10504729     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ROBOT.2005.1570759     Document Type: Conference Paper
Times cited : (21)

References (18)
  • 1
    • 0022390346
    • Sustained oscillations generated by mutually inhibiting neurons with adaptation
    • K. Matsuoka, "Sustained oscillations generated by mutually inhibiting neurons with adaptation," Biological Cybernetics, vol. 52, pp. 367-376, 1985.
    • (1985) Biological Cybernetics , vol.52 , pp. 367-376
    • Matsuoka, K.1
  • 2
    • 0026045478
    • Self-organized control of bipedal locomotion by neural oscillators in unpredictable environment
    • G. Taga, Y. Yamaguchi, and H. Shimizu, "Self-organized control of bipedal locomotion by neural oscillators in unpredictable environment," Biological Cybernetics, vol. 65, pp. 147-159, 1991.
    • (1991) Biological Cybernetics , vol.65 , pp. 147-159
    • Taga, G.1    Yamaguchi, Y.2    Shimizu, H.3
  • 4
    • 0037645833
    • Adaptive dynamic walking of a quadruped robot on irregular terrain based on biological concepts
    • Y. Fukuoka, H. Kimura, and A. Cohen, "Adaptive dynamic walking of a quadruped robot on irregular terrain based on biological concepts," The International Journal of Robotics Research, vol. 22, no. 3-4, pp. 187-202, 2003.
    • (2003) The International Journal of Robotics Research , vol.22 , Issue.3-4 , pp. 187-202
    • Fukuoka, Y.1    Kimura, H.2    Cohen, A.3
  • 5
    • 0032251175
    • Computer simulation of the ontogeny of biped walking
    • K. Hase and N. Yamazaki, "Computer simulation of the ontogeny of biped walking," Anthropological Science, vol. 106(4), pp. 327-347, 1998.
    • (1998) Anthropological Science , vol.106 , Issue.4 , pp. 327-347
    • Hase, K.1    Yamazaki, N.2
  • 7
    • 0008336447
    • An analysis of actor/critic algorithms using eligibility traces: Reinforcement learning with imperfect value function
    • H. Kimura and S. Kobayashi, "An analysis of actor/critic algorithms using eligibility traces: Reinforcement learning with imperfect value function," International Conference on Machine Learning, pp. 278-286, 1998.
    • (1998) International Conference on Machine Learning , pp. 278-286
    • Kimura, H.1    Kobayashi, S.2
  • 10
    • 33846140666
    • Learning a dynamic policy by using policy gradient: Application to biped walking
    • T. Matsubara, J. Morimoto, J. Nakanishi, M. Sato, and K. Doya, "Learning a dynamic policy by using policy gradient: Application to biped walking," in The Institute of Electronics, Information and Communication Engineers, Technical Report of IEICE, no. 2003-128, 2004, pp. 53-58, in Japanese.
  • 12
    • 0033629916
    • Reinforcement learning in continuous time and space
    • K. Doya, "Reinforcement learning in continuous time and space," Neural Computation, vol. 12, pp. 219-245, 2000.
    • (2000) Neural Computation , vol.12 , pp. 219-245
    • Doya, K.1
  • 14
    • 0012003778
    • Autobalancer: An online dynamic balance compensation scheme for humanoid robots
    • S. Kagami, F. Kanehiro, Y. Tamiya, M. Inaba, and H. Inoue, "Autobalancer: An online dynamic balance compensation scheme for humanoid robots," in Algorithmic and Computational Robotics: New Directions, B. R. Donald, K. Lynch, and D. Rus, Eds. A K Peters, Ltd., 2001, pp. 329-340.
    • (2001) Algorithmic and Computational Robotics: New Directions , pp. 329-340
    • Kagami, S.1    Kanehiro, F.2    Tamiya, Y.3    Inaba, M.4    Inoue, H.5
  • 15
    • 0036168467
    • A fast dynamically equilibrated walking trajectory generation method of humanoid robot
    • S. Kagami, T. Kitagawa, K. Nishiwaki, T. Sugihara, and M. Inaba, "A fast dynamically equilibrated walking trajectory generation method of humanoid robot," Autonomous Robots, vol. 12, pp. 71-82, 2002.
  • 17
    • 33846156072
    • Reinforcement learning based on a policy gradient method for biped locomotion
    • T. Mori, Y. Nakamura, and S. Ishii, "Reinforcement learning based on a policy gradient method for biped locomotion," in The Institute of Electronics, Information and Communication Engineers, Technical Report of IEICE, no. 2003-206, 2004, pp. 73-78, in Japanese.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.