Volume , Issue , 2010, Pages

LSTD with random projections

Author keywords

[No Author keywords available]

Indexed keywords

ITERATIVE METHODS; REINFORCEMENT LEARNING;

EID: 85162046948     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 47
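The title combines least-squares temporal difference learning (LSTD; see the Bradtke and Barto reference below) with random projections of the feature space. As a rough illustration of that combination only (not the paper's exact estimator or its analysis), here is a minimal sketch: the function name, parameters, and regularization choice are all assumptions introduced for this example.

```python
import numpy as np

def lstd_random_projections(phi, phi_next, rewards, gamma=0.99, d=10,
                            reg=1e-6, seed=0):
    """Sketch of LSTD on randomly projected features (illustrative only).

    phi, phi_next : (n, D) feature matrices for states s_t and s_{t+1}.
    rewards       : (n,) observed rewards r_t.
    The D-dimensional features are projected to d dimensions with a
    Gaussian random matrix, then the standard LSTD system is solved:
        A w = b,  A = sum_t psi_t (psi_t - gamma * psi_{t+1})^T,
                  b = sum_t psi_t * r_t.
    """
    rng = np.random.default_rng(seed)
    n, D = phi.shape
    # Gaussian projection, scaled so inner products are preserved
    # in expectation (Johnson-Lindenstrauss-style scaling).
    P = rng.standard_normal((D, d)) / np.sqrt(d)
    psi, psi_next = phi @ P, phi_next @ P
    # Small ridge term (an assumption here) keeps the solve well-posed.
    A = psi.T @ (psi - gamma * psi_next) + reg * np.eye(d)
    b = psi.T @ rewards
    return P, np.linalg.solve(A, b)
```

The projected value estimate for a new state s is then `phi(s) @ P @ w`; the point of projecting first is that the d-by-d solve is cheap even when the original feature dimension D is large.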

References (21)
  • 1
    • EID: 40849145988
    • Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
    • A. Antos, Cs. Szepesvari, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning Journal, 71:89-129, 2008.
    • (2008) Machine Learning Journal , vol.71 , pp. 89-129
    • Antos, A.1    Szepesvari, Cs.2    Munos, R.3
  • 3
    • EID: 0001771345
    • Linear least-squares algorithms for temporal difference learning
    • S. Bradtke and A. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22:33-57, 1996.
    • (1996) Machine Learning , vol.22 , pp. 33-57
    • Bradtke, S.1    Barto, A.2
  • 16
    • EID: 17444414191
    • Basis function adaptation in temporal difference reinforcement learning
    • I. Menache, S. Mannor, and N. Shimkin. Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134:215-238, 2005.
    • (2005) Annals of Operations Research , vol.134 , pp. 215-238
    • Menache, I.1    Mannor, S.2    Shimkin, N.3


* This record was extracted and analyzed by KISTI from Elsevier's SCOPUS database.