Volume WS-06-11, 2006, Pages 50-56

PAC reinforcement learning bounds for RTDP and Rand-RTDP

Author keywords

[No Author keywords available]

Indexed keywords

CONVERGENCE OF NUMERICAL METHODS; DECISION MAKING; LEARNING ALGORITHMS; MARKOV PROCESSES; POLYNOMIAL APPROXIMATION; PROBABILITY DISTRIBUTIONS; REAL TIME SYSTEMS

EID: 33845972675     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 7

References (14)
  • 1
    • EID: 0029210635
    • Barto, A. G., Bradtke, S. J., & Singh, S. P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72, 81-138.
  • 4
    • EID: 0041965975
    • Brafman, R. I., & Tennenholtz, M. (2002). R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
  • 5
    • EID: 23244466805
    • Kakade, S. M. (2003). On the sample complexity of reinforcement learning. Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London.
  • 7
    • EID: 0036832954
    • Kearns, M. J., & Singh, S. P. (2002). Near-optimal reinforcement learning in polynomial time. Machine Learning, 49, 209-232.
  • 9
    • EID: 0025400088
    • Korf, R. E. (1990). Real-time heuristic search. Artificial Intelligence, 42, 189-211.
  • 13
    • EID: 0021518106
    • Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134-1142.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.