SCOPUS Information Search Platform

IEEE Robotics and Automation Magazine

Volume 17, Issue 2, 2010, Pages 20-29

Learning control in robotics

(2) Schaal, Stefan a Atkeson, Christopher G b

a UNIVERSITY OF SOUTHERN CALIFORNIA (United States)

b CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

Learning control; Optimal control; Reinforcement learning; Robot learning

Indexed keywords

BAYESIAN; CONTROL POLICY; CONTROLLED SYSTEM; CONTROLLER PARAMETER; INTERNAL MODELS; INVERSE MODELS; LEARNING CONTROL; LEARNING CONTROL TECHNIQUES; LOCALLY WEIGHTED REGRESSION; OPTIMAL CONTROL; OPTIMAL CONTROLS; ROBOT ARMS; ROBUST CONTROLLERS; STATE VECTOR; TRAJECTORY-BASED;

CONTROL; INVERSE KINEMATICS; LEARNING ALGORITHMS; MATHEMATICAL MODELS; OPTIMIZATION; REINFORCEMENT; REINFORCEMENT LEARNING; ROBOT LEARNING; ROBOTICS; ROBOTS;

CONTROLLERS;

EID: 77953330028 PISSN: 10709932 EISSN: None Source Type: Journal
DOI: 10.1109/MRA.2010.936957 Document Type: Article

Times cited : (159)

References (83)

1
- 73549098121
- The new robotics-Towards human-centered machines
- S. Schaal, "The new robotics-Towards human-centered machines," HFSP J. Frontiers Interdisciplinary Res. Life Sci., vol.1, no.2, pp. 115-126, 2007.
- (2007) HFSP J. Frontiers Interdisciplinary Res. Life Sci. , vol.1 , Issue.2 , pp. 115-126
- Schaal, S.¹

2
- 0004255876
- Reading MA Addison-Wesley
- K. J. Åström and B. Wittenmark, Adaptive Control. Reading, MA: Addison-Wesley, 1989.
- (1989) Adaptive Control
- Åström, K.J.¹ Wittenmark, B.²

3
- 27744518715
- Cambridge MA MIT Press
- S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. Cambridge, MA: MIT Press, 2005.
- (2005) Probabilistic Robotics
- Thrun, S.¹ Burgard, W.² Fox, D.³

4
- 77953334417
- 1st ed. New York: Springer-Verlag
- M. Buehler, The DARPA Urban Challenge: Autonomous Vehicles in City Traffic, 1st ed. New York: Springer-Verlag, 2009.
- (2009) The DARPA Urban Challenge: Autonomous Vehicles in City Traffic
- Buehler, M.¹

5
- 65949096112
- New York: Springer-Verlag
- M. Buehler, K. Iagnemma, and S. Singh, The 2005 DARPA Grand Challenge: The Great Robot Race. New York: Springer-Verlag, 2007.
- (2007) The 2005 DARPA Grand Challenge: The Great Robot Race
- Buehler, M.¹ Iagnemma, K.² Singh, S.³

6
- 27344443125
- Finding approximate POMDP solutions through belief compression
- N. Roy, G. Gordon, and S. Thrun, "Finding approximate POMDP solutions through belief compression," J. Artif. Intell. Res., vol.23, pp. 1-40, 2005.
- (2005) J. Artif. Intell. Res. , vol.23 , pp. 1-40
- Roy, N.¹ Gordon, G.² Thrun, S.³

7
- 0004102479
- Cambridge MA: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

8
- 84921399937
- Hoboken NJ: IEEE Press/Wiley-Interscience
- J. Si, Handbook of Learning and Approximate Dynamic Programming. Hoboken, NJ: IEEE Press/Wiley-Interscience, 2004.
- (2004) Handbook of Learning and Approximate Dynamic Programming
- Si, J.¹

9
- 85012688561
- Princeton NJ Princeton Univ. Press
- R. Bellman, Dynamic Programming. Princeton, NJ: Princeton Univ. Press, 1957.
- (1957) Dynamic Programming
- Bellman, R.¹

10
- 0004276055
- New York: Academic
- P. Dyer and S. R. McReynolds, The Computation and Theory of Optimal Control. New York: Academic, 1970.
- (1970) The Computation and Theory of Optimal Control
- Dyer, P.¹ McReynolds, S.R.²

11
- 0004039554
- New York: Springer-Verlag
- L. Sciavicco and B. Siciliano, Modelling and Control of Robot Manipulators. New York: Springer-Verlag, 2000.
- (2000) Modelling and Control of Robot Manipulators
- Sciavicco, L.¹ Siciliano, B.²

12
- 44049116478
- Supervised learning with a distal teacher
- M. I. Jordan and D. E. Rumelhart, "Supervised learning with a distal teacher," Cogn. Sci., vol.16, pp. 307-354, 1992.
- (1992) Cogn. Sci. , vol.16 , pp. 307-354
- Jordan, M.I.¹ Rumelhart, D.E.²

13
- 0035559687
- Learning inverse kinematics
- Maui, HI, Oct. 29-Nov.
- A. D'Souza, S. Vijayakumar, and S. Schaal, "Learning inverse kinematics," in Proc. IEEE Int. Conf. Intelligent Robots and Systems (IROS 2001), Maui, HI, Oct. 29-Nov. 3, 2001, pp. 298-301.
- (2001) Proc. IEEE Int. Conf. Intelligent Robots and Systems (IROS 2001) , vol.3 , pp. 298-301
- D'Souza, A.¹ Vijayakumar, S.² Schaal, S.³

14
- 0027382368
- A self-organizing neural model of motor equivalent reaching and tool use by a multijoint arm
- D. Bullock, S. Grossberg, and F. H. Guenther, "A self-organizing neural model of motor equivalent reaching and tool use by a multijoint arm," J. Cogn. Neurosci., vol.5, no.4, pp. 408-435, 1993.
- (1993) J. Cogn. Neurosci. , vol.5 , Issue.4 , pp. 408-435
- Bullock, D.¹ Grossberg, S.² Guenther, F.H.³

15
- 38649095925
- Learning to control in operational space
- J. Peters and S. Schaal, "Learning to control in operational space," Int. J. Robot. Res., vol.27, pp. 197-212, 2008.
- (2008) Int. J. Robot. Res. , vol.27 , pp. 197-212
- Peters, J.¹ Schaal, S.²

16
- 0001551844
- Supervised learning from incomplete data via an EM approach
- J. D. Cowan, G. Tesauro, and J. Alspector, Eds. San Mateo, CA: Morgan Kaufmann
- Z. Ghahramani and M. I. Jordan, "Supervised learning from incomplete data via an EM approach," in Advances in Neural Information Processing Systems 6, J. D. Cowan, G. Tesauro, and J. Alspector, Eds. San Mateo, CA: Morgan Kaufmann, 1994, pp. 120-127.
- (1994) Advances in Neural Information Processing Systems , vol.6 , pp. 120-127
- Ghahramani, Z.¹ Jordan, M.I.²

17
- 0001108227
- Constructive incremental learning from only local information
- S. Schaal and C. G. Atkeson, "Constructive incremental learning from only local information," Neural Comput., vol.10, no.8, pp. 2047-2084, 1998.
- (1998) Neural Comput. , vol.10 , Issue.8 , pp. 2047-2084
- Schaal, S.¹ Atkeson, C.G.²

18
- 84936916896
- Robust locally weighted regression and smoothing scatterplots
- W. S. Cleveland, "Robust locally weighted regression and smoothing scatterplots," J. Amer. Statist. Assoc., vol.74, pp. 829-836, 1979.
- (1979) J. Amer. Statist. Assoc. , vol.74 , pp. 829-836
- Cleveland, W.S.¹

19
- 2342560362
- Using local models to control movement
- D. Touretzky, Ed. San Mateo, CA: Morgan Kaufmann
- C. G. Atkeson, "Using local models to control movement," in Advances in Neural Information Processing Systems 1, D. Touretzky, Ed. San Mateo, CA: Morgan Kaufmann, 1989, pp. 157-183.
- (1989) Advances in Neural Information Processing Systems , vol.1 , pp. 157-183
- Atkeson, C.G.¹

20
- 0031074521
- Locally weighted learning
- C. G. Atkeson, A. W. Moore, and S. Schaal, "Locally weighted learning," Artif. Intell. Rev., vol.11, no.1-5, pp. 11-73, 1997.
- (1997) Artif. Intell. Rev. , vol.11 , Issue.1-5 , pp. 11-73
- Atkeson, C.G.¹ Moore, A.W.² Schaal, S.³

21
- 0031073475
- Locally weighted learning for control
- C. G. Atkeson, A. W. Moore, and S. Schaal, "Locally weighted learning for control," Artif. Intell. Rev., vol.11, no.1-5, pp. 75-113, 1997.
- (1997) Artif. Intell. Rev. , vol.11 , Issue.1-5 , pp. 75-113
- Atkeson, C.G.¹ Moore, A.W.² Schaal, S.³

22
- 27144556425
- Incremental online learning in high dimensions
- S. Vijayakumar, A. D'Souza, and S. Schaal, "Incremental online learning in high dimensions," Neural Comput., vol.17, no.12, pp. 2602-2634, 2005.
- (2005) Neural Comput. , vol.17 , Issue.12 , pp. 2602-2634
- Vijayakumar, S.¹ D'Souza, A.² Schaal, S.³

23
- 51649101130
- A Bayesian approach to empirical local linearizations for robotics
- Pasadena, CA, May 19-23
- J.-A. Ting, A. D'Souza, S. Vijayakumar, and S. Schaal, "A Bayesian approach to empirical local linearizations for robotics," in Proc. Int. Conf. Robotics and Automation (ICRA2008), Pasadena, CA, May 19-23, 2008, pp. 2860-2865.
- (2008) Proc. Int. Conf. Robotics and Automation (ICRA2008) , pp. 2860-2865
- Ting, J.-A.¹ D'Souza, A.² Vijayakumar, S.³ Schaal, S.⁴

24
- 25444448065
- Cambridge MA: MIT Press
- C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press, 2006.
- (2006) Gaussian Processes for Machine Learning
- Rasmussen, C.E.¹ Williams, C.K.I.²

25
- 77953336006
- Local Gaussian process regression for real time online model learning and control
- D. Schuurmans, Y. Bengio, and D. Koller, Eds. Vancouver, BC, Dec. 8-11
- D. Nguyen-Tuong, M. Seeger, and J. Peters, "Local Gaussian process regression for real time online model learning and control," in Proc. Advances in Neural Information Processing Systems 21 (NIPS 2008), D. Schuurmans, Y. Bengio, and D. Koller, Eds. Vancouver, BC, Dec. 8-11, 2009, pp. 1193-1200.
- (2009) Proc. Advances in Neural Information Processing Systems 21 (NIPS 2008) , pp. 1193-1200
- Nguyen-Tuong, D.¹ Seeger, M.² Peters, J.³

26
- 61849173491
- Gaussian process dynamic programming
- M. P. Deisenroth, C. E. Rasmussen, and J. Peters, "Gaussian process dynamic programming," Neurocomputing, vol.72, no.7-9, pp. 1508-1524, 2009.
- (2009) Neurocomputing , vol.72 , Issue.7-9 , pp. 1508-1524
- Deisenroth, M.P.¹ Rasmussen, C.E.² Peters, J.³

27
- 84898947911
- Sparse representation for Gaussian process models
- Denver, CO
- L. Csató and M. Opper, "Sparse representation for Gaussian process models," in Proc. Advances in Neural Information Processing Systems 13 (NIPS 2000), Denver, CO, 2001, pp. 444-450.
- (2001) Proc. Advances in Neural Information Processing Systems 13 (NIPS 2000) , pp. 444-450
- Csató, L.¹ Opper, M.²

28
- 0032192424
- Multiple paired forward and inverse models for motor control
- D. M. Wolpert and M. Kawato, "Multiple paired forward and inverse models for motor control," Neural Netw., vol.11, no.7-8, pp. 1317-1329, 1998.
- (1998) Neural Netw. , vol.11 , Issue.7-8 , pp. 1317-1329
- Wolpert, D.M.¹ Kawato, M.²

29
- 0004291983
- New York American Elsevier
- D. H. Jacobson and D. Q. Mayne, Differential Dynamic Programming. New York: American Elsevier, 1970.
- (1970) Differential Dynamic Programming
- Jacobson, D.H.¹ Mayne, D.Q.²

30
- 0033629916
- Reinforcement learning in continuous time and space
- Jan.
- K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol.12, no.1, pp. 219-245, Jan. 2000.
- (2000) Neural Comput. , vol.12 , Issue.1 , pp. 219-245
- Doya, K.¹

31
- 77953332345
- submitted for publication
- E. Theodorou, J. Buchli, and S. Schaal, "Reinforcement learning in high dimensional state spaces: A path integral approach," submitted for publication.
- Reinforcement Learning in High Dimensional State Spaces: A Path Integral Approach
- Theodorou, E.¹ Buchli, J.² Schaal, S.³

32
- 0036832953
- Variable resolution discretization in optimal control
- R. Munos and A. Moore, "Variable resolution discretization in optimal control," Mach. Learn., vol. 49, no. 2/3, p. 33, 2002.
- (2002) Mach. Learn. , vol.49 , Issue.2-3 , pp. 33
- Munos, R.¹ Moore, A.²

33
- 49049094416
- Random sampling of states in dynamic programming
- C. G. Atkeson and B. J. Stephens, "Random sampling of states in dynamic programming," IEEE Trans. Syst., Man, Cybern. B, vol.38, no.4, pp. 924-929, 2008.
- (2008) IEEE Trans. Syst., Man, Cybern. B , vol.38 , Issue.4 , pp. 924-929
- Atkeson, C.G.¹ Stephens, B.J.²

34
- 34548784023
- Randomly sampling actions in dynamic programming
- ADPRL'07
- C. G. Atkeson, "Randomly sampling actions in dynamic programming," in Proc. IEEE Int. Symp. Approximate Dynamic Programming and Reinforcement Learning, 2007, ADPRL'07, pp. 185-192.
- (2007) Proc. IEEE Int. Symp. Approximate Dynamic Programming and Reinforcement Learning , pp. 185-192
- Atkeson, C.G.¹

35
- 77950579842
- Control of a walking biped using a combination of simple policies
- Paris, France, Dec. 7-10
- E. Whitman and C. G. Atkeson, "Control of a walking biped using a combination of simple policies," in Proc. IEEE/RAS Int. Conf. Humanoid Robotics, Paris, France, Dec. 7-10, 2009, pp. 520-527.
- (2009) Proc. IEEE/RAS Int. Conf. Humanoid Robotics , pp. 520-527
- Whitman, E.¹ Atkeson, C.G.²

36
- 64849106540
- Tomlab Optimization Inc. [Online]. Available
- Tomlab Optimization Inc. (2010). PROPT-Matlab optimal control software [Online]. Available: http://tomdyn.com/
- (2010) PROPT-Matlab Optimal Control Software

37
- 77953346346
- Technische Universität Darmstadt. [Online]. Available
- Technische Universität Darmstadt. (2010). DIRCOL: A direct collocation method for the numerical solution of optimal control problems [Online]. Available: http://www.sim.informatik.tu-darmstadt.de/sw/dircol
- (2010) DIRCOL: A Direct Collocation Method for the Numerical Solution of Optimal Control Problems

38
- 77953331240
- [Online]. Available
- Stanford Business Software Corporation. (2010). SNOPT; Software for large-scale nonlinear programming [Online]. Available: http://www.sbsisol-optimize.com/asp/sol-product-snopt.htm
- (2010) SNOPT; Software for Large-scale Nonlinear Programming

39
- 12844272111
- Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces
- A. Safonova, J. K. Hodgins, and N. S. Pollard, "Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces," ACM Trans. Graph. J. (SIGGRAPH 2004 Proc.), vol.23, no.3, pp. 514-521, 2004.
- (2004) ACM Trans. Graph. J.(SIGGRAPH 2004 Proc.) , vol.23 , Issue.3 , pp. 514-521
- Safonova, A.¹ Hodgins, J.K.² Pollard, N.S.³

40
- 76249100249
- Standing balance control using a trajectory library
- presented at the
- L. Chenggang and C. G. Atkeson, "Standing balance control using a trajectory library," presented at the IEEE/RSJ Int. Conf. Intelligent Robots and Systems (IROS 2009), 2009.
- (2009) IEEE/RSJ Int. Conf. Intelligent Robots and Systems (IROS 2009)
- Chenggang, L.¹ Atkeson, C.G.²

41
- 0141819580
- Pegasus: A policy search method for large MDPs and POMDPs
- presented at the
- A. Ng, "Pegasus: A policy search method for large MDPs and POMDPs," presented at the Uncertainty in Artificial Intelligence (UAI), 2000.
- (2000) Uncertainty in Artificial Intelligence (UAI)
- Ng, A.¹

42
- 0015615562
- Wide-sense adaptive dual control for nonlinear stochastic systems
- E. Tse, Y. Bar-Shalom, and L. Meier, III, "Wide-sense adaptive dual control for nonlinear stochastic systems," IEEE Trans. Automat. Contr., vol.18, no.2, pp. 98-108, 1973.
- (1973) IEEE Trans. Automat. Contr. , vol.18 , Issue.2 , pp. 98-108
- Tse, E.¹ Bar-Shalom, Y.² Meier III, L.³

43
- 0041443966
- Caution, probing and the value of information in the control of uncertain systems
- Y. Bar-Shalom and E. Tse, "Caution, probing and the value of information in the control of uncertain systems," Ann. Econ. Social Meas., vol.4, no.3, pp. 323-338, 1976.
- (1976) Ann. Econ. Social Meas. , vol.4 , Issue.3 , pp. 323-338
- Bar-Shalom, Y.¹ Tse, E.²

44
- 0002130986
- Robot learning from demonstration
- D. H. Fisher, Jr., Ed. Nashville, TN, July 8-12
- C. G. Atkeson and S. Schaal, "Robot learning from demonstration," in Proc. 14th Int. Conf. Machine Learning (ICML'97), D. H. Fisher, Jr., Ed. Nashville, TN, July 8-12, 1997, pp. 12-20.
- (1997) Proc. 14th Int. Conf. Machine Learning (ICML'97) , pp. 12-20
- Atkeson, C.G.¹ Schaal, S.²

45
- 33847202724
- Learning to predict by the methods of temporal differences
- R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol.3, no.1, pp. 9-44, 1988.
- (1988) Mach. Learn. , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.S.¹

46
- 0004049895
- Ph.D. thesis, Cambridge Univ., U.K.
- C. J. C. H. Watkins, "Learning with delayed rewards," Ph.D. thesis, Cambridge Univ., U.K., 1989.
- (1989) Learning with Delayed Rewards
- Watkins, C.J.C.H.¹

47
- 0035979437
- Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning
- J. Morimoto and K. Doya, "Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning," Robot. Auton. Syst., vol.36, no.1, pp. 37-51, 2001.
- (2001) Robot. Auton. Syst. , vol.36 , Issue.1 , pp. 37-51
- Morimoto, J.¹ Doya, K.²

48
- 0033151712
- Is imitation learning the route to humanoid robots?
- S. Schaal, "Is imitation learning the route to humanoid robots?" Trends Cogn. Sci., vol.3, no.6, pp. 233-242, 1999.
- (1999) Trends Cogn. Sci. , vol.3 , Issue.6 , pp. 233-242
- Schaal, S.¹

49
- 0037471828
- Computational approaches to motor learning by imitation
- S. Schaal, A. Ijspeert, and A. Billard, "Computational approaches to motor learning by imitation," Philos. Trans. R. Soc. London B, Biol. Sci., vol.358, no.1431, pp. 537-547, 2003.
- (2003) Philos. Trans. R. Soc. London B, Biol. Sci. , vol.358 , Issue.1431 , pp. 537-547
- Schaal, S.¹ Ijspeert, A.² Billard, A.³

50
- 0030652809
- Learning tasks from a single demonstration
- Albuquerque, NM, Apr. 20-25
- C. G. Atkeson and S. Schaal, "Learning tasks from a single demonstration," in Proc. IEEE Int. Conf. Robotics and Automation (ICRA'97), Albuquerque, NM, Apr. 20-25, 1997, pp. 1706-1712.
- (1997) Proc. IEEE Int. Conf. Robotics and Automation (ICRA'97) , pp. 1706-1712
- Atkeson, C.G.¹ Schaal, S.²

51
- 84898995067
- Learning from demonstration
- M. C. Mozer, M. Jordan, and T. Petsche, Eds. Cambridge, MA
- S. Schaal, "Learning from demonstration," in Proc. Advances in Neural Information Processing Systems 9, M. C. Mozer, M. Jordan, and T. Petsche, Eds. Cambridge, MA, 1997, pp. 1040-1046.
- (1997) Proc. Advances in Neural Information Processing Systems , vol.9 , pp. 1040-1046
- Schaal, S.¹

52
- 21844465127
- Tree-based batch mode reinforcement learning
- D. Ernst, P. Geurts, and L. Wehenkel, "Tree-based batch mode reinforcement learning," J. Mach. Learn. Res., vol.6, pp. 503-556, 2005.
- (2005) J. Mach. Learn. Res. , vol.6 , pp. 503-556
- Ernst, D.¹ Geurts, P.² Wehenkel, L.³

53
- 70049104729
- Fitted Q-iteration by advantage weighted regression
- D. Schuurmans, Y. Bengio, and D. Koller, Eds. Vancouver, BC, Dec. 8-11
- G. Neumann and J. Peters, "Fitted Q-iteration by advantage weighted regression," in Proc. Advances in Neural Information Processing Systems 21 (NIPS 2008), D. Schuurmans, Y. Bengio, and D. Koller, Eds. Vancouver, BC, Dec. 8-11, 2009, pp. 1177-1184.
- (2009) Proc. Advances in Neural Information Processing Systems 21 (NIPS 2008) , pp. 1177-1184
- Neumann, G.¹ Peters, J.²

54
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- S. A. Solla, T. K. Leen, and K.-R. Müller, Eds. Denver, CO
- R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," in Proc. Advances in Neural Information Processing Systems 12, S. A. Solla, T. K. Leen, and K.-R. Müller, Eds. Denver, CO, 2000.
- (2000) Proc. Advances in Neural Information Processing Systems , vol.12
- Sutton, R.S.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

55
- 44949241322
- Reinforcement learning of motor skills with policy gradients
- May
- J. Peters and S. Schaal, "Reinforcement learning of motor skills with policy gradients," Neural Netw., vol.21, no.4, pp. 682-697, May 2008.
- (2008) Neural Netw. , vol.21 , Issue.4 , pp. 682-697
- Peters, J.¹ Schaal, S.²

56
- 0030703463
- Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation
- presented at the
- P. Sadegh and J. Spall, "Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation," presented at the Proc. American Control Conf., 1997.
- (1997) Proc. American Control Conf.
- Sadegh, P.¹ Spall, J.²

57
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Mach. Learn., vol.8, no.3-4, pp. 229-256, 1992.
- (1992) Mach. Learn. , vol.8 , Issue.3-4 , pp. 229-256
- Williams, R.J.¹

58
- 0025600638
- A stochastic reinforcement learning algorithm for learning real-valued functions
- V. Gullapalli, "A stochastic reinforcement learning algorithm for learning real-valued functions," Neural Netw., vol.3, no.6, pp. 671-692, 1990.
- (1990) Neural Netw. , vol.3 , Issue.6 , pp. 671-692
- Gullapalli, V.¹

59
- 1942514241
- Scaling internal-state policy-gradient methods for POMDPs
- Sydney, Australia
- D. Aberdeen and J. Baxter, "Scaling internal-state policy-gradient methods for POMDPs," in Proc. 19th Int. Conf. Machine Learning (ICML-2002), Sydney, Australia, 2002, pp. 3-10.
- (2002) Proc. 19th Int. Conf. Machine Learning (ICML-2002) , pp. 3-10
- Aberdeen, D.¹ Baxter, J.²

60
- 40649106649
- Natural actor critic
- J. Peters and S. Schaal, "Natural actor critic," Neurocomputing, vol.71, no.7-9, pp. 1180-1190, 2008.
- (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
- Peters, J.¹ Schaal, S.²

61
- 0033570817
- Natural gradient learning for over- and under-complete bases in ICA
- Nov.
- S. Amari, "Natural gradient learning for over- and under-complete bases in ICA," Neural Comput., vol.11, no.8, pp. 1875-1883, Nov. 1999.
- (1999) Neural Comput. , vol.11 , Issue.8 , pp. 1875-1883
- Amari, S.¹

62
- 84898930479
- Natural policy gradient
- presented at the Vancouver, CA
- S. Kakade, "Natural policy gradient," presented at the Advances in Neural Information Processing Systems, Vancouver, CA, 2002.
- (2002) Advances in Neural Information Processing Systems
- Kakade, S.¹

63
- 56049089041
- State-dependent exploration for policy gradient methods
- presented at the Part II, LNAI
- T. Rückstieß, M. Felder, and J. Schmidhuber, "State-dependent exploration for policy gradient methods," presented at the European Conf. Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2008, Part II, LNAI 5212, 2008.
- (2008) European Conf. Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2008 , vol.5212
- Rückstieß, T.¹ Felder, M.² Schmidhuber, J.³

64
- 38649142135
- Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot
- G. Endo, J. Morimoto, T. Matsubara, J. Nakanishi, and G. Cheng, "Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot," Int. J. Robot. Res., vol.27, no.2, pp. 213-228, 2008.
- (2008) Int. J. Robot. Res. , vol.27 , Issue.2 , pp. 213-228
- Endo, G.¹ Morimoto, J.² Matsubara, T.³ Nakanishi, J.⁴ Cheng, G.⁵

65
- 14044262287
- Stochastic policy gradient reinforcement learning on a simple 3D biped
- Sendai, Japan, Oct.
- R. Tedrake, T.W. Zhang, and S. Seung, "Stochastic policy gradient reinforcement learning on a simple 3D biped," in Proc. Int. Conf. Intelligent Robots and Systems (IROS 2004), Sendai, Japan, Oct. 2004, pp. 2849-2854.
- (2004) Proc. Int. Conf. Intelligent Robots and Systems (IROS 2004) , pp. 2849-2854
- Tedrake, R.¹ Zhang, T.W.² Seung, S.³

66
- 34250635407
- Policy gradient methods for robotics
- Beijing, Oct. 9-15
- J. Peters and S. Schaal, "Policy gradient methods for robotics," in Proc. IEEE Int. Conf. Intelligent Robotics Systems (IROS 2006), Beijing, Oct. 9-15, 2006, pp. 2219-2225.
- (2006) Proc. IEEE Int. Conf. Intelligent Robotics Systems (IROS 2006) , pp. 2219-2225
- Peters, J.¹ Schaal, S.²

67
- 0346982426
- Using EM for reinforcement learning
- P. Dayan and G. Hinton, "Using EM for reinforcement learning," Neural Comput., vol.9, no.2, pp. 271-278, 1997.
- (1997) Neural Comput. , vol.9 , Issue.2 , pp. 271-278
- Dayan, P.¹ Hinton, G.²

68
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc. B, vol.39, no.1, pp. 1-38, 1977.
- (1977) J. R. Statist. Soc. B , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

69
- 78049446733
- Learning motor primitives in robotics
- D. Schuurmans, Y. Bengio, and D. Koller, Eds. Vancouver, BC, Dec. 8-11
- J. Kober and J. Peters, "Learning motor primitives in robotics," in Proc. Advances in Neural Information Processing Systems 21 (NIPS 2008), D. Schuurmans, Y. Bengio, and D. Koller, Eds. Vancouver, BC, Dec. 8-11, 2009, pp. 297-304.
- (2009) Proc. Advances in Neural Information Processing Systems 21 (NIPS 2008) , pp. 297-304
- Kober, J.¹ Peters, J.²

70
- 34250728061
- Probabilistic inference for solving discrete and continuous state Markov decision processes
- presented at the
- M. Toussaint and A. Storkey, "Probabilistic inference for solving discrete and continuous state Markov decision processes," presented at the 23rd Int. Conf. Machine Learning (ICML 2006), 2006.
- (2006) 23rd Int. Conf. Machine Learning (ICML 2006)
- Toussaint, M.¹ Storkey, A.²

71
- 70349327392
- Learning model-free control by a Monte-Carlo EM algorithm
- N. Vlassis, M. Toussaint, G. Kontes, and S. Piperidis, "Learning model-free control by a Monte-Carlo EM algorithm," Auton. Robots, vol.27, no.2, pp. 123-130, 2009.
- (2009) Auton. Robots , vol.27 , Issue.2 , pp. 123-130
- Vlassis, N.¹ Toussaint, M.² Kontes, G.³ Piperidis, S.⁴

72
- 28844435646
- Linear theory for control of nonlinear stochastic systems
- Nov.
- H. J. Kappen, "Linear theory for control of nonlinear stochastic systems," Phys. Rev. Lett., vol.95, no.20, pp. 200201-200204, Nov. 2005.
- (2005) Phys. Rev. Lett. , vol.95 , Issue.20 , pp. 200201-200204
- Kappen, H.J.¹

73
- 33947410345
- An introduction to stochastic control theory, path integrals and reinforcement learning
- J. Marro, P. L. Garrido, and J. J. Torres, Eds
- H. J. Kappen, "An introduction to stochastic control theory, path integrals and reinforcement learning," in Cooperative Behavior in Neural Systems, vol.887, J. Marro, P. L. Garrido, and J. J. Torres, Eds. 2007, pp. 149-181.
- (2007) Cooperative Behavior in Neural Systems , vol.887 , pp. 149-181
- Kappen, H.J.¹

74
- 67650458713
- Path integral stochastic optimal control for rigid body dynamics
- presented at the Nashville, TN, Mar. 30-Apr.
- E. Theodorou, J. Buchli, and S. Schaal, "Path integral stochastic optimal control for rigid body dynamics," presented at the IEEE Int. Symp. Approximate Dynamic Programming and Reinforcement Learning (ADPRL2009), Nashville, TN, Mar. 30-Apr. 2, 2009.
- (2009) IEEE Int. Symp. Approximate Dynamic Programming and Reinforcement Learning (ADPRL2009) , vol.2
- Theodorou, E.¹ Buchli, J.² Schaal, S.³

75
- 67650915125
- Efficient computation of optimal actions
- July
- E. Todorov, "Efficient computation of optimal actions," Proc. Nat. Acad. Sci. USA, vol.106, no.28, pp. 11478-11483, July 2009.
- (2009) Proc. Nat. Acad. Sci. USA , vol.106 , Issue.28 , pp. 11478-11483
- Todorov, E.¹

76
- 84899019754
- Learning attractor landscapes for learning motor primitives
- S. Becker, S. Thrun, and K. Obermayer, Eds.
- A. Ijspeert, J. Nakanishi, and S. Schaal, "Learning attractor landscapes for learning motor primitives," in Advances in Neural Information Processing Systems 15, S. Becker, S. Thrun, and K. Obermayer, Eds. 2003, pp. 1547-1554.
- (2003) Advances in Neural Information Processing Systems , vol.15 , pp. 1547-1554
- Ijspeert, A.¹ Nakanishi, J.² Schaal, S.³

77
- 33846516584
- New York Springer-Verlag
- C. M. Bishop, Pattern Recognition and Machine Learning. New York: Springer-Verlag, 2006.
- (2006) Pattern Recognition and Machine Learning
- Bishop, C.M.¹

78
- 58249141653
- Robot programming by demonstration
- B. Siciliano and O. Khatib, Eds. Cambridge, MA: MIT Press ch. 59
- A. Billard, S. Calinon, R. Dillmann, and S. Schaal, "Robot programming by demonstration," in Handbook of Robotics, vol.1, B. Siciliano and O. Khatib, Eds. Cambridge, MA: MIT Press, 2008, ch. 59.
- (2008) Handbook of Robotics , vol.1
- Billard, A.¹ Calinon, S.² Dillmann, R.³ Schaal, S.⁴

79
- 0027832075
- Trajectory formation of arm movement by a neural network with forward and inverse dynamics models
- Y. Wada and M. Kawato, "Trajectory formation of arm movement by a neural network with forward and inverse dynamics models," Syst. Comput. Jpn., vol.24, pp. 37-50, 1994.
- (1994) Syst. Comput. Jpn. , vol.24 , pp. 37-50
- Wada, Y.¹ Kawato, M.²

80
- 2442636320
- Embodied symbol emergence based on mimesis theory
- Apr.-May
- T. Inamura, I. Toshima, H. Tanie, and Y. Nakamura, "Embodied symbol emergence based on mimesis theory," Int. J. Robot. Res., vol.23, no.4-5, p. 363, Apr.-May 2004.
- (2004) Int. J. Robot. Res. , vol.23 , Issue.4-5 , pp. 363
- Inamura, T.¹ Toshima, I.² Tanie, H.³ Nakamura, Y.⁴

81
- 0042547347
- Algorithms for inverse reinforcement learning
- Stanford, CA
- A. Y. Ng and S. Russell, "Algorithms for inverse reinforcement learning," in Proc. 17th Int. Conf. Machine Learning (ICML 2000), Stanford, CA, 2000, pp. 663-670.
- (2000) Proc. 17th Int. Conf. Machine Learning (ICML 2000) , pp. 663-670
- Ng, A.Y.¹ Russell, S.²

82
- 14344251217
- Apprenticeship learning via inverse reinforcement learning
- P. Abbeel and A. Ng, "Apprenticeship learning via inverse reinforcement learning," in Proc. 21st Int. Conf. Machine Learning, 2004.
- (2004) Proc. 21st Int. Conf. Machine Learning
- Abbeel, P.¹ Ng, A.²

83
- 67650957592
- Learning to search: Functional gradient techniques for imitation learning
- N. Ratliff, D. Silver, and J. A. Bagnell, "Learning to search: Functional gradient techniques for imitation learning," Auton. Robots, vol.27, no.1, pp. 25-53, 2009.
- (2009) Auton. Robots , vol.27 , Issue.1 , pp. 25-53
- Ratliff, N.¹ Silver, D.² Bagnell, J.A.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.