Machine Learning, Volume 90, Issue 3, 2013, Pages 385-429

TEXPLORE: Real-time sample-efficient reinforcement learning for robots

Author keywords

MDP; Real time; Reinforcement learning; Robotics

Indexed keywords

ALGORITHM LEARNING; AUTONOMOUS VEHICLES; CONTINUOUS STATE; MARKOV DECISION PROCESSES; MDP; RANDOM FORESTS; REAL-TIME; ROBOTIC CONTROLS; SENSOR/ACTUATOR; SEQUENTIAL DECISION MAKING;

EID: 84874698101     PISSN: 08856125     EISSN: 15730565     Source Type: Journal    
DOI: 10.1007/s10994-012-5322-7     Document Type: Article
Times cited: 120

References (64)
  • 1
    Albus, J. S. (1975). A new approach to manipulator control: The cerebellar model articulation controller. Journal of Dynamic Systems, Measurement, and Control, 97(3), 220-227. doi:10.1115/1.3426922
  • 3
    Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2), 235-256. doi:10.1023/A:1013689704352
  • 4
    Barto, A. G., Bradtke, S. J., & Singh, S. P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1-2), 81-138. doi:10.1016/0004-3702(94)00011-O
  • 7
    Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. doi:10.1023/A:1010933404324
  • 26
    Katsikopoulos, K., & Engelbrecht, S. (2003). Markov decision processes with delays and asynchronous cost collection. IEEE Transactions on Automatic Control, 48(4), 568-574. doi:10.1109/TAC.2003.809799
  • 28
    Kober, J., & Peters, J. (2011). Policy search for motor primitives in robotics. Machine Learning, 84(1-2), 171-203. doi:10.1007/s10994-010-5223-6
  • 31
    Kolobov, A., Mausam, & Weld, D. (2012). LRTDP versus UCT for online probabilistic planning. In AAAI conference on artificial intelligence. https://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/view/4961/5334
  • 37
    Méhat, J., & Cazenave, T. (2011). A parallel general game player. KI. Künstliche Intelligenz, 25(1), 43-47. doi:10.1007/s13218-010-0083-6
  • 38
    Munos, R., & Moore, A. (2002). Variable resolution discretization in optimal control. Machine Learning, 49, 291-323. doi:10.1023/A:1017992615625
  • 41
  • 44
    Quinlan, R. (1986). Induction of decision trees. Machine Learning, 1, 81-106.
  • 53
    Sutton, R. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the seventh international conference on machine learning (ICML) (pp. 216-224).
  • 56
    Tanner, B., & White, A. (2009). RL-Glue: Language-independent software for reinforcement-learning experiments. Journal of Machine Learning Research, 10, 2133-2136.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.