Volume, Issue, 2013, Pages 774-781

PAC optimal exploration in continuous space Markov decision processes

Author keywords

[No Author keywords available]

Indexed keywords

CONTINUOUS SPACES; EXPLORATION ALGORITHMS; FINITE SAMPLES; GREEDY EXPLORATION; HEURISTIC APPROACH; MARKOV DECISION PROCESSES; THEORETICAL GUARANTEES; THEORY AND PRACTICE;

EID: 84893414333     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (80)

References (17)
  • 6. Kakade, S.; Kearns, M. J.; and Langford, J. 2003. Exploration in metric state spaces. In ICML, pp. 306-312.
  • 7. Kakade, S. M. 2003. On the sample complexity of reinforcement learning. Ph.D. dissertation, Gatsby Computational Neuroscience Unit, University College London.
  • 8. Kearns, M. J., and Singh, S. P. 2002. Near-optimal reinforcement learning in polynomial time. Machine Learning 49(2-3):209-232.
  • 9. Kolter, J. Z., and Ng, A. Y. 2009. Near-Bayesian exploration in polynomial time. In ICML '09, pp. 513-520.
  • 11. Moldovan, T. M., and Abbeel, P. 2012. Safe exploration in Markov decision processes. CoRR abs/1205.4810.
  • 13. Strehl, A. L., and Littman, M. L. 2005. A theoretical analysis of model-based interval estimation. In ICML '05, pp. 856-863. New York, NY, USA: ACM.
  • 14. Strehl, A., and Littman, M. 2008. Online linear regression and its application to model-based reinforcement learning. Advances in Neural Information Processing Systems 20:1417-1424.
  • 16. Szita, I., and Szepesvári, C. 2010. Model-based reinforcement learning with nearly tight exploration complexity bounds. In ICML, pp. 1031-1038.
  • 17. Williams, R., and Baird, L. 1993. Tight performance bounds on greedy policies based on imperfect value functions. Technical report, Northeastern University, College of Computer Science.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.