Volume 29, Issue 2, 2010, Pages 169-200

Finding and transferring policies using stored behaviors

Author keywords

Behaviour based systems; Learning and adaptive systems; Learning from demonstration; Legged robots; Planning

Indexed keywords

BEHAVIOUR-BASED SYSTEMS; CONTROL APPROACH; CONTROL LAWS; FAST CONTROL; GLOBAL POLICIES; LEARNING AND ADAPTIVE SYSTEM; LEARNING FROM DEMONSTRATION; LEGGED ROBOTS; LOCAL FEATURE; PLANNING ALGORITHMS; POLICY ITERATION; QUADRUPED ROBOTS; ROUGH TERRAINS; SPEED-UPS; VALUE FUNCTIONS;

EID: 77955774980     PISSN: 09295593     EISSN: None     Source Type: Journal    
DOI: 10.1007/s10514-010-9191-2     Document Type: Article
Times cited : (16)

References (58)
  • 2
    • 0039816976
    • Using local trajectory optimizers to speed up global optimization in dynamic programming
    • J. D. Cowan, G. Tesauro, & J. Alspector (Eds.) San Mateo: Morgan Kaufmann
    • Atkeson, C. G. (1994). Using local trajectory optimizers to speed up global optimization in dynamic programming. In J. D. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in Neural Information Processing Systems (Vol. 6, pp. 663-670). San Mateo: Morgan Kaufmann. URL ftp://ftp.cc.gatech.edu/pub/people/cga/local.html.
    • (1994) Advances in Neural Information Processing Systems, vol. 6, pp. 663-670
    • Atkeson, C.G.
  • 3
    • 84898983672
    • Nonparametric representation of policies and value functions: A trajectory-based approach
    • S. Becker, S. Thrun, & K. Obermayer (Eds.) Cambridge: MIT Press
    • Atkeson, C. G., & Morimoto, J. (2003). Nonparametric representation of policies and value functions: a trajectory-based approach. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in Neural Information Processing Systems (Vol. 15, pp. 1611-1618). Cambridge: MIT Press. URL http://www-2.cs.cmu.edu/~cga/publications.html.
    • (2003) Advances in Neural Information Processing Systems, vol. 15, pp. 1611-1618
    • Atkeson, C.G.; Morimoto, J.
  • 5
    • 33845608830
    • PhD thesis, Georgia Institute of Technology
    • Bentivegna, D. C. (2004). Learning from observation using primitives. PhD thesis, Georgia Institute of Technology. URL http://etd.gatech.edu/theses/available/etd-06202004-213721/.
    • (2004) Learning from observation using primitives
    • Bentivegna, D.C.
  • 7
    • 0028447220
    • Deliberation scheduling for problem solving in time-constrained environments
    • doi:10.1016/0004-3702(94)90054-X
    • Boddy, M. S., & Dean, T. (1994). Deliberation scheduling for problem solving in time-constrained environments. Artificial Intelligence, 67(2), 245-285. doi:10.1016/0004-3702(94)90054-X.
    • (1994) Artificial Intelligence, vol. 67, issue 2, pp. 245-285
    • Boddy, M.S.; Dean, T.
  • 14
    • 0027684906
    • The application of harmonic potential functions to robotics
    • Connolly, C., & Grupen, R. (1993). The application of harmonic potential functions to robotics. Journal of Robotic Systems, 10(7), 931-946.
    • (1993) Journal of Robotic Systems, vol. 10, issue 7, pp. 931-946
    • Connolly, C.; Grupen, R.
  • 15
    • 0345062531
    • Multidimensional interpolation and triangulation for reinforcement learning
    • San Mateo: Morgan Kaufmann
    • Davies, S. (1997). Multidimensional interpolation and triangulation for reinforcement learning. In Advances in neural information processing systems (Vol. 9). San Mateo: Morgan Kaufmann. URL http://www.autonlab.org/autonweb/showPaper.jsp?ID=daviesmultidimensional.
    • (1997) Advances in Neural Information Processing Systems, vol. 9
    • Davies, S.
  • 17
    • 33744466799
    • Approximate policy iteration with a policy language bias: Solving relational Markov decision processes
    • Fern, A., Yoon, S., & Givan, R. (2006). Approximate policy iteration with a policy language bias: solving relational Markov decision processes. Journal of Artificial Intelligence Research, 25, 85-118.
    • (2006) Journal of Artificial Intelligence Research, vol. 25, pp. 85-118
    • Fern, A.; Yoon, S.; Givan, R.
  • 19
    • 0003973891
    • Robust hybrid control for autonomous vehicle motion planning
    • Massachusetts Institute of Technology, Cambridge, MA
    • Frazzoli, E. (2001). Robust hybrid control for autonomous vehicle motion planning. Department of aeronautics and astronautics, Massachusetts Institute of Technology, Cambridge, MA. URL http://rigoletto.seas.ucla.edu/papers/Year/2001.html.
    • (2001) Department of Aeronautics and Astronautics
    • Frazzoli, E.
  • 20
    • 84945709355
    • An algorithm for finding best matches in logarithmic expected time
    • doi:10.1145/355744.355745
    • Friedman, J. H., Bentley, J. L., & Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software, 3(3), 209-226. doi:10.1145/355744.355745, URL http://portal.acm.org/citation.cfm?doid=355744.355745.
    • (1977) ACM Transactions on Mathematical Software, vol. 3, issue 3, pp. 209-226
    • Friedman, J.H.; Bentley, J.L.; Finkel, R.A.
  • 24
    • 33846547167
    • Optimal rough terrain trajectory generation for wheeled mobile robots
    • Howard, T., & Kelly, A. (2007). Optimal rough terrain trajectory generation for wheeled mobile robots. International Journal of Robotics Research, 26(2), 141-166. URL http://www.ri.cmu.edu/pubs/pub-5739.html.
    • (2007) International Journal of Robotics Research, vol. 26, issue 2, pp. 141-166
    • Howard, T.; Kelly, A.
  • 25
    • 0000148778
    • A heuristic approach to the discovery of macrooperators
    • Iba, G. A. (1989). A heuristic approach to the discovery of macrooperators. Machine Learning, 3, 285-317.
    • (1989) Machine Learning, vol. 3, pp. 285-317
    • Iba, G.A.
  • 27
    • 0030212126
    • Probabilistic roadmaps for path planning in high-dimensional configuration spaces
    • doi:10.1109/70.508439, URL http://ai.stanford.edu/~latombe/pub.htm
    • Kavraki, L., Svestka, P., Latombe, J., & Overmars, M. (1996). Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4), 566-580. doi:10.1109/70.508439, URL http://ai.stanford.edu/~latombe/pub.htm.
    • (1996) IEEE Transactions on Robotics and Automation, vol. 12, issue 4, pp. 566-580
    • Kavraki, L.; Svestka, P.; Latombe, J.; Overmars, M.
  • 29
    • 85162069513
    • Hierarchical apprenticeship learning with application to quadruped locomotion
    • Kolter, J. Z., Abbeel, P., & Ng, A. Y. (2008). Hierarchical apprenticeship learning with application to quadruped locomotion. In Neural information processing systems 20.
    • (2008) Neural Information Processing Systems, vol. 20
    • Kolter, J.Z.; Abbeel, P.; Ng, A.Y.
  • 30
    • 0002982589
    • Chunking in Soar: The anatomy of a general learning mechanism
    • Laird, J., Rosenbloom, P., & Newell, A. (1986). Chunking in Soar: The anatomy of a general learning mechanism. Machine Learning, 1, 11-46.
    • (1986) Machine Learning, vol. 1, pp. 11-46
    • Laird, J.; Rosenbloom, P.; Newell, A.
  • 32
    • 77952010176
    • Cambridge: Cambridge University Press, to appear
    • LaValle, S. M. (2006). Planning algorithms. Cambridge: Cambridge University Press. URL http://msl.cs.uiuc.edu/planning/, to appear.
    • (2006) Planning algorithms
    • LaValle, S.M.
  • 33
    • 0035327156
    • Randomized kinodynamic planning
    • doi:10.1177/02783640122067453
    • LaValle, S. M., & Kuffner, J. J. Jr. (2001). Randomized kinodynamic planning. The International Journal of Robotics Research, 20(5), 378-400. doi:10.1177/02783640122067453, URL http://ijr.sagepub.com/cgi/content/abstract/20/5/378. (Pubitemid 32813681)
    • (2001) International Journal of Robotics Research, vol. 20, issue 5, pp. 378-400
    • LaValle, S.M.; Kuffner Jr., J.J.
  • 35
    • 84898982129
    • Predictive representations of state
    • San Mateo: Morgan Kaufmann
    • Littman, M. L., Sutton, R. S., & Singh, S. (2002). Predictive representations of state. In Advances in neural information processing systems (Vol. 14, pp. 1555-1561). San Mateo: Morgan Kaufmann. URL http://www.eecs.umich.edu/~baveja/PSRmainpage.html.
    • (2002) Advances in Neural Information Processing Systems, vol. 14, pp. 1555-1561
    • Littman, M.L.; Sutton, R.S.; Singh, S.
  • 36
    • 0348132949
    • Enhancing transfer in reinforcement learning by building stochastic models of robot actions
    • Mahadevan, S. (1992). Enhancing transfer in reinforcement learning by building stochastic models of robot actions. In Proceedings of the ninth international conference on machine learning (pp. 290-299). URL http://www.cs.umass.edu/~mahadeva/organized-pubs-by-year.html.
    • (1992) Proceedings of the Ninth International Conference on Machine Learning, pp. 290-299
    • Mahadevan, S.
  • 37
    • 14344250635
    • Dynamic abstraction in reinforcement learning via clustering
    • Mannor, S., Menache, I., Hoze, A., & Klein, U. (2004). Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the twenty-first international conference on machine learning.
    • McGovern, A. (2002). Autonomous discovery of temporal abstractions from interaction with an environment. PhD thesis, University of Massachusetts Amherst. URL http://www.cs.ou.edu/~amy/pubs.html.
    • (2004) Proceedings of the Twenty-First International Conference on Machine Learning
    • Mannor, S.; Menache, I.; Hoze, A.; Klein, U.
  • 38
    • 0036832953
    • Variable resolution discretization in optimal control
    • Munos, R., & Moore, A. (2002). Variable resolution discretization in optimal control. Machine Learning, 49(2/3), 291-323. URL http://www.autonlab.org/autonweb/showPaper.jsp?ID=munosvariable.
    • (2002) Machine Learning, vol. 49, issue 2-3, pp. 291-323
    • Munos, R.; Moore, A.
  • 43
    • 34548803402
    • Evolutionary gait-optimization using a fitness function based on proprioception
    • Röfer, T. (2005). Evolutionary gait-optimization using a fitness function based on proprioception. In Eighth international workshop on RoboCup 2004. URL http://www.informatik.uni-bremen.de/~roefer/public-e.htm.
    • (2005) Eighth International Workshop on RoboCup 2004
    • Röfer, T.
  • 46
    • 77955772383
    • Stolle, M. (2007). Images of mazes used. URL http://www.cs.cmu.edu/~mstoll/files/adprl2007-mazes.tar.gz.
    • (2007) Images of Mazes Used
    • Stolle, M.
  • 47
    • 77955769182
    • PhD thesis, Carnegie Mellon University, 5000 Forbes Ave Pittsburgh, PA 15213
    • Stolle, M. (2008). Finding and transferring policies using stored behaviors. PhD thesis, Carnegie Mellon University, 5000 Forbes Ave Pittsburgh, PA 15213. URL http://www.cs.cmu.edu/~mstoll/publications.shtml.
    • (2008) Finding and Transferring Policies Using Stored Behaviors
    • Stolle, M.
  • 51
    • 84912073624
    • Learning options in reinforcement learning
    • Berlin: Springer
    • Stolle, M., & Precup, D. (2002). Learning options in reinforcement learning. In Lecture notes in computer science (Vol. 2371, pp. 212-223). Berlin: Springer. URL http://www.cs.cmu.edu/~mstoll/publications.shtml.
    • (2002) Lecture Notes in Computer Science, vol. 2371, pp. 212-223
    • Stolle, M.; Precup, D.
  • 52
    • 77955767888
    • von Stryk, O. (2001). DIRCOL. URL http://www.sim.informatik.tu-darmstadt.de/sw/dircol.html.en.
    • (2001) DIRCOL
    • von Stryk, O.
  • 57
    • 2942729966
    • The sampling-based neighborhood graph: A framework for planning and executing feedback motion strategies
    • Yang, L., & LaValle, S. M. (2004). The sampling-based neighborhood graph: a framework for planning and executing feedback motion strategies. IEEE Transactions on Robotics and Automation, 20(3), 419-432.
    • (2004) IEEE Transactions on Robotics and Automation, vol. 20, issue 3, pp. 419-432
    • Yang, L.; LaValle, S.M.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.