2008, Pages 323-334

Adaptive aggregation for reinforcement learning with efficient exploration: Deterministic domains

Author keywords

[No Author keywords available]

Indexed keywords

ADAPTIVE AGGREGATION; COARSER RESOLUTION; CONTINUOUS STATE SPACE; DETERMINISTIC DOMAINS; EXPLORATION TECHNIQUES; ONLINE LEARNING; STATE AGGREGATION; UNCERTAINTY INTERVALS

EID: 84898060153     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 4

References (21)
  • 3
    • 78649714480
    • Master's thesis, Technion - Israel Institute of Technology
    • A. Bernstein. Adaptive state aggregation for reinforcement learning. Master's thesis, Technion - Israel Institute of Technology, 2007. URL: http://tx.technion.ac.il/~andreyb/MSc-Thesis-final.pdf.
    • (2007) Adaptive State Aggregation for Reinforcement Learning
    • Bernstein, A.1
  • 6
    • 0346942368
    • Decision-theoretic planning: Structural assumptions and computational leverage
    • C. Boutilier, T. Dean, and S. Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11: 1-94, 1999.
    • (1999) Journal of Artificial Intelligence Research , vol.11 , pp. 1-94
    • Boutilier, C.1    Dean, T.2    Hanks, S.3
  • 7
    • 0041965975
    • R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning
    • R. I. Brafman and M. Tennenholtz. R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3: 213-231, 2002.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
    • Brafman, R.I.1    Tennenholtz, M.2
  • 9
    • 0026206780
    • An optimal one-way multigrid algorithm for discrete-time stochastic control
    • C.-S. Chow and J. N. Tsitsiklis. An optimal one-way multigrid algorithm for discrete-time stochastic control. IEEE Transactions on Automatic Control, 36(8): 898-914, 1991.
    • (1991) IEEE Transactions on Automatic Control , vol.36 , Issue.8 , pp. 898-914
    • Chow, C.-S.1    Tsitsiklis, J.N.2
  • 11
    • 0742284358
    • Reinforcement learning with function approximation converges to a region
    • G. J. Gordon. Reinforcement learning with function approximation converges to a region. In Advances in Neural Information Processing Systems (NIPS) 12, pages 1040-1046, 2000.
    • (2000) Advances in Neural Information Processing Systems (NIPS) , vol.12 , pp. 1040-1046
    • Gordon, G.J.1
  • 12
    • 23244466805
    • PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK
    • S. M. Kakade. On the Sample Complexity of Reinforcement Learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK, 2003.
    • (2003) On the Sample Complexity of Reinforcement Learning
    • Kakade, S.M.1
  • 13
    • 0036832954
    • Near-optimal reinforcement learning in polynomial time
    • M. Kearns and S. P. Singh. Near-optimal reinforcement learning in polynomial time. Machine Learning, 49: 209-232, 2002.
    • (2002) Machine Learning , vol.49 , pp. 209-232
    • Kearns, M.1    Singh, S.P.2
  • 15
    • 0029514510
    • The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
    • A. W. Moore and C. G. Atkeson. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning, 21: 199-233, 1995.
    • (1995) Machine Learning , vol.21 , pp. 199-233
    • Moore, A.W.1    Atkeson, C.G.2
  • 16
    • 0036832953
    • Variable resolution discretization in optimal control
    • R. Munos and A. W. Moore. Variable resolution discretization in optimal control. Machine Learning, 49: 291-323, 2002.
    • (2002) Machine Learning , vol.49 , pp. 291-323
    • Munos, R.1    Moore, A.W.2
  • 21
    • 0017997986
    • Approximations of dynamic programs, I
    • W. Whitt. Approximations of dynamic programs, I. Mathematics of Operations Research, 3(3): 231-243, 1978.
    • (1978) Mathematics of Operations Research , vol.3 , Issue.3 , pp. 231-243
    • Whitt, W.1


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.