Volume , Issue , 2009, Pages 513-520

Near-Bayesian exploration in polynomial time

Author keywords

[No Author keywords available]

Indexed keywords

BAYESIAN; BAYESIAN APPROACHES; BAYESIAN SOLUTION; EXPLORATION STRATEGIES; HIGH PROBABILITY; MODEL-BASED; POLYNOMIAL-TIME; SAMPLE COMPLEXITY BOUNDS; SIMPLE ALGORITHM; TIME STEP;

EID: 71149109483     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 215

References (24)
  • 1. Asmuth, J., Li, L., Littman, M. L., Nouri, A., & Wingate, D. (2009). A Bayesian sampling approach to exploration in reinforcement learning. (Preprint).
  • 2. Auer, P., & Ortner, R. (2007). Logarithmic online regret bounds for undiscounted reinforcement learning. Neural Information Processing Systems (pp. 49-56).
  • 3. Brafman, R. I., & Tennenholtz, M. (2002). R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
  • 7. [Incomplete entry] vol. 21, pp. 1033-1039.
  • 8. [Incomplete entry] vol. 22, pp. 1-12.
  • 9. [Incomplete entry] vol. 22, pp. 109-121.
  • 13. Kakade, S. M. (2003). On the sample complexity of reinforcement learning. Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London.
  • 15. Kearns, M., & Singh, S. (2002). Near-optimal reinforcement learning in polynomial time. Machine Learning, 49, 209-232.
  • 19. Slud, E. V. (1977). Distribution inequalities for the binomial law. The Annals of Probability, 5, 404-412.
  • 21. Strehl, A. L., & Littman, M. L. (2008a). An analysis of model-based interval estimation for Markov decision processes. Journal of Computer and System Sciences, 74, 1309-1331.
  • 22. Strehl, A. L., & Littman, M. L. (2008b). Online linear regression and its application to model-based reinforcement learning. Neural Information Processing Systems (pp. 1417-1424).


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.