Volume 21, Issue 13, 2007, Pages 1521-1544

Reinforcement learning for imitating constrained reaching movements

Author keywords

DYNAMICAL SYSTEMS; GAUSSIAN MIXTURE MODEL; PROGRAMMING BY DEMONSTRATION; REINFORCEMENT LEARNING

Indexed keywords

ALGORITHMS; COMPUTER PROGRAMMING; DYNAMICAL SYSTEMS; ROBOTS;

EID: 34948857495     PISSN: 01691864     EISSN: 15685535     Source Type: Journal
DOI: 10.1163/156855307782148550     Document Type: Article
Times cited : (129)

References (30)
  • 3
    • Schoner, G., Dose, M., and Engels, C., 1995. Dynamics of behaviour: Theory and application for autonomous robot architecture. Robotics Autonomous Syst., 16:213–245.
  • 5
    • Iossifidis, I., and Schoner, G., 2004. Autonomous reaching and obstacle avoidance with anthropomorphic arm of a robotics assistant using the attractor dynamics approach. In Proc. IEEE Int. Conf. on Robotics and Automation, pp. 4295–4300, New Orleans, LA.
  • 6
    • Righetti, L., and Ijspeert, A., 2006. Programmable central pattern generators: An application to biped locomotion control. In Proc. IEEE Int. Conf. on Robotics and Automation, pp. 1585–1590, Orlando, FL.
  • 9
    • Morimoto, J., and Doya, K., 2001. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning. Robotics Autonomous Syst., 36:37–51.
  • 11
    • Bradtke, S. J., and Duff, M. O., 1994. Reinforcement learning methods for continuous-time Markov decision problems. In Proc. Neural Information Processing Systems Conf., pp. 393–400, Denver.
  • 12
    • Doya, K., 2000. Reinforcement learning in continuous time and space. Neural Comput., 12:219–245.
  • 17
    • Nedic, A., and Bertsekas, D., 2001. Least-squares policy evaluation algorithms with linear function approximation. LIDS Report LIDS-P-2537, Dec. 2001, Cambridge.
  • 18
    • Williams, R., 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learn., 8:229–256.
  • 21
    • Amari, S., 1998. Natural gradient works efficiently in learning. Neural Comput., 10:251–276.
  • 22
    • Konidaris, G., and Barto, A., 2006. Autonomous shaping: Knowledge transfer in reinforcement learning. In Proc. Int. Conf. on Machine Learning, pp. 497–504, Pittsburgh, PA.
  • 23
    • Abbeel, P., and Quigley, M., 2006. Using inaccurate models in reinforcement learning. In Proc. Int. Conf. on Machine Learning, pp. 9–16, Pittsburgh, PA.
  • 24
    • Simsek, O., and Barto, A., 2006. An intrinsic reward mechanism for efficient exploration. In Proc. Int. Conf. on Machine Learning, pp. 841–848, Pittsburgh, PA.
  • 28
    • Billard, A., Calinon, S., and Guenter, F., 2006. Discriminative and adaptive imitation in uni-manual and bi-manual tasks. Robotics Autonomous Syst., 54:370–384.
  • 29
    • Boyan, J. A., 2002. Technical update: Least-squares temporal difference learning. Machine Learn., 49:233–246.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.