Volume 4, Issue 1, 2012, Pages 70-86

Autonomous learning of high-level states and actions in continuous environments

Author keywords

Active learning; intrinsic motivation; qualitative reasoning; reinforcement learning; unsupervised learning

Indexed keywords

ACTIVE LEARNING; AUTONOMOUS LEARNING; CONTINUOUS VARIABLES; DISCRETE VARIABLE REPRESENTATION; DISCRETIZATIONS; INTRINSIC MOTIVATION; LEARNING AGENTS; LEARNING MODELS; PREDICTIVE MODELS; QUALITATIVE REASONING; QUALITATIVE REPRESENTATION; REALISTIC PHYSICS; SIMULATED ROBOT;

EID: 84858634841     PISSN: 1943-0604     EISSN: None     Source Type: Journal
DOI: 10.1109/TAMD.2011.2160943     Document Type: Article
Times cited: 89

References (56)
  • 3. S. Vijayakumar, A. D'souza, and S. Schaal, "Incremental online learning in high dimensions," Neural Comput., vol. 17, no. 12, pp. 2602-2634, 2005.
  • 4. C. M. Vigorito and A. G. Barto, "Intrinsically motivated hierarchical skill learning in structured environments," IEEE Trans. Autonom. Mental Develop., vol. 2, no. 2, Jun. 2010.
  • 6. P. R. Cohen, M. S. Atkin, T. Oates, and C. R. Beal, "Neo: Learning conceptual knowledge by sensorimotor interaction with an environment," in Proc. Agents '97, Marina del Rey, CA, 1997. ACM.
  • 10. J. Mugan and B. Kuipers, "Learning to predict the effects of actions: Synergy between rules and landmarks," in Proc. Int. Conf. Develop. Learn., London, U.K., 2007.
  • 11. J. Mugan and B. Kuipers, "Learning distinctions and rules in a continuous world through active exploration," in Proc. Int. Conf. Epigenet. Robot., Piscataway, NJ, 2007.
  • 12. J. Mugan and B. Kuipers, "Towards the application of reinforcement learning to undirected developmental learning," in Proc. Int. Conf. Epigenet. Robot., Brighton, U.K., 2008.
  • 13. J. Mugan and B. Kuipers, "A comparison of strategies for developmental action acquisition in QLAP," in Proc. Int. Conf. Epigenet. Robot., Venice, Italy, 2009.
  • 14. J. Mugan and B. Kuipers, "Autonomously learning an action hierarchy using a learned qualitative state representation," in Proc. Int. Joint Conf. Artif. Intell., Pasadena, CA, 2009.
  • 16. U. Fayyad and K. Irani, "On the handling of continuous-valued attributes in decision tree generation," Mach. Learn., vol. 8, no. 1, pp. 87-102, 1992.
  • 17. J. Mugan and B. Kuipers, "The qualitative learner of action and perception, QLAP," in Proc. AAAI Video Competition (AIVC 2010), Atlanta, GA, 2010 [Online]. Available: http://videolectures.net/aaai2010-mugan-qlap
  • 20. C. G. Atkeson, A. W. Moore, and S. Schaal, "Locally weighted learning for control," Artif. Intell. Rev., vol. 11, no. 1-5, pp. 75-113, 1997.
  • 21. S. Vijayakumar and S. Schaal, "Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional space," in Proc. 17th Int. Conf. Mach. Learn. (ICML 2000), Palo Alto, CA, 2000, vol. 1, pp. 288-293.
  • 22. M. Jordan and D. Rumelhart, "Forward models: Supervised learning with a distal teacher," Cogn. Sci., vol. 16, pp. 307-354, 1992.
  • 23. C. Rasmussen, "Gaussian processes in machine learning," Adv. Lectures Mach. Learn., pp. 63-71, 2006.
  • 25. T. Degris, O. Sigaud, and P. Wuillemin, "Learning the structure of factored Markov decision processes in reinforcement learning problems," in Proc. Int. Conf. Mach. Learn. (ICML), Pittsburgh, PA, 2006, pp. 257-264.
  • 27. A. Strehl, C. Diuk, and M. Littman, "Efficient structure learning in factored-state MDPs," in Proc. AAAI, Vancouver, BC, Canada, 2007, vol. 22, no. 1, p. 645.
  • 31. C. Boutilier, T. Dean, and S. Hanks, "Decision theoretic planning: Structural assumptions and computational leverage," J. Artif. Intell. Res., vol. 11, no. 1, p. 94, 1999.
  • 32. R. S. Sutton, D. Precup, and S. Singh, "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning," Artif. Intell., vol. 112, no. 1-2, pp. 181-211, 1999.
  • 34. X. Huang and J. Weng, "Novelty and reinforcement learning in the value system of developmental robots," in Proc. 2nd Int. Workshop Epigenet. Robot., Edinburgh, Scotland, 2002.
  • 36. R. Brafman and M. Tennenholtz, "R-max - A general polynomial time algorithm for near-optimal reinforcement learning," J. Mach. Learn. Res., vol. 3, pp. 213-231, 2003.
  • 37. J. Schmidhuber, "Curious model-building control systems," in Proc. Int. Joint Conf. Neural Netw., Seattle, WA, 1991, vol. 2, pp. 1458-1463.
  • 38. P. Oudeyer, F. Kaplan, and V. Hafner, "Intrinsic motivation systems for autonomous mental development," IEEE Trans. Evol. Comput., vol. 11, no. 2, pp. 265-286, 2007.
  • 41. G. Metta and P. Fitzpatrick, "Early integration of vision and manipulation," Adapt. Behav., vol. 11, no. 2, pp. 109-128, 2003.
  • 42. K. Gold and B. Scassellati, "Learning acceptable windows of contingency," Connect. Sci., vol. 18, no. 2, pp. 217-228, 2006.
  • 43. J. Klein, "Breve: A 3D environment for the simulation of decentralized systems and artificial life," in Proc. Int. Conf. Artif. Life, Sydney, Australia, 2003.
  • 44. R. Smith, Open Dynamics Engine v0.5 User Guide, 2004 [Online]. Available: http://ode.org/ode-latest-userguide.pdf
  • 46. A. Needham, T. Barrett, and K. Peterman, "A pick-me-up for infants' exploratory skills: Early simulated experiences reaching for objects using 'sticky mittens' enhances young infants' object exploration skills," Inf. Behav. Develop., vol. 25, no. 3, pp. 279-295, 2002.
  • 47. J. Provost, 2008 [Online]. Available: Sourceforge.net
  • 48. D. M. Pierce and B. J. Kuipers, "Map learning with uninterpreted sensors and effectors," Artif. Intell., vol. 92, pp. 169-227, 1997.
  • 50. A. Jonsson and A. Barto, "Causal graph based decomposition of factored MDPs," J. Mach. Learn. Res., vol. 7, pp. 2259-2301, 2006.
  • 53. T. Dietterich, "The MAXQ method for hierarchical reinforcement learning," in Proc. Int. Conf. Mach. Learn. (ICML), Madison, WI, 1998.
  • 54. B. Digney, "Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments," in Proc. 4th Int. Conf. Simul. Adapt. Behav., Cape Cod, MA, 1996, p. 363. MIT Press.
  • 55. B. Hengst, "Discovering hierarchy in reinforcement learning with HEXQ," in Proc. 19th Int. Conf. Mach. Learn., Sydney, Australia, 2002, pp. 243-250.
  • 56. T. Hester and P. Stone, "Generalized model learning for reinforcement learning in factored domains," in Proc. 8th Int. Conf. Autonom. Agents Multiagent Syst. - Volume 2, Budapest, Hungary, 2009, pp. 717-724. International Foundation for Autonomous Agents and Multiagent Systems.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.