



Volume , Issue , 2008, Pages

Fitted Q-iteration in continuous action-space MDPs

Author keywords

[No Author keywords available]

Indexed keywords

REINFORCEMENT LEARNING;

EID: 85161978146     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (144)

References (16)
  • 2
    • 85162071116
    • Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
    • accepted
    • A. Antos, Cs. Szepesvári, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 2007. (accepted).
    • (2007) Machine Learning
    • Antos, A.1    Szepesvári, Cs.2    Munos, R.3
  • 8
    • 85153940465
    • Generalization in reinforcement learning: Safely approximating the value function
    • J.A. Boyan and A.W. Moore. Generalization in reinforcement learning: Safely approximating the value function. In NIPS-7, pages 369-376, 1995.
    • (1995) NIPS-7 , pp. 369-376
    • Boyan, J.A.1    Moore, A.W.2
  • 9
    • 0030165580
    • Fat-shattering and the learnability of real-valued functions
    • DOI 10.1006/jcss.1996.0033
    • P.L. Bartlett, P.M. Long, and R.C. Williamson. Fat-shattering and the learnability of real-valued functions. Journal of Computer and System Sciences, 52:434-452, 1996. (Pubitemid 126359770)
    • (1996) Journal of Computer and System Sciences , vol.52 , Issue.3 , pp. 434-452
    • Bartlett, P.L.1    Long, P.M.2    Williamson, R.C.3
  • 11
    • 40849114100
    • Finite time bounds for sampling based fitted value iteration
    • Computer and Automation Research Institute of the Hungarian Academy of Sciences, Kende u. 13-17, Budapest 1111, Hungary
    • R. Munos and Cs. Szepesvári. Finite time bounds for sampling based fitted value iteration. Technical report, Computer and Automation Research Institute of the Hungarian Academy of Sciences, Kende u. 13-17, Budapest 1111, Hungary, 2006.
    • (2006) Technical Report, Computer and Automation Research Institute of the Hungarian Academy of Sciences
    • Munos, R.1    Szepesvári, Cs.2
  • 13
    • 77955430645
    • Sample complexity of policy search with known dynamics
    • MIT Press
    • P.L. Bartlett and A. Tewari. Sample complexity of policy search with known dynamics. In NIPS-19. MIT Press, 2007.
    • (2007) NIPS-19
    • Bartlett, P.L.1    Tewari, A.2
  • 15
    • 33646398129
    • Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method
    • M. Riedmiller. Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In 16th European Conference on Machine Learning, pages 317-328, 2005.
    • (2005) 16th European Conference on Machine Learning , pp. 317-328
    • Riedmiller, M.1
  • 16
    • 60349130974
    • Batch reinforcement learning in a complex domain
    • S. Kalyanakrishnan and P. Stone. Batch reinforcement learning in a complex domain. In AAMAS-07, 2007.
    • (2007) AAMAS-07
    • Kalyanakrishnan, S.1    Stone, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.