Volume, Issue, 2002, Pages

Model-free least squares policy iteration

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; ITERATIVE METHODS; REINFORCEMENT LEARNING;

EID: 84898963274     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 28

References (13)
  • 1
    • J. Baxter and P. Bartlett. Reinforcement learning in POMDP's via direct gradient ascent. In Proc. 17th International Conf. on Machine Learning, pages 41-48. Morgan Kaufmann, San Francisco, CA, 2000.
  • 3
    • Justin A. Boyan. Least-squares temporal difference learning. In I. Bratko and S. Dzeroski, editors, Machine Learning: Proceedings of the Sixteenth International Conference, pages 49-56. Morgan Kaufmann, San Francisco, CA, 1999.
  • 4
    • S. Bradtke and A. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(1-3):33-57, 1996.
  • 10
    • Andrew Y. Ng, Daishi Harada, and Stuart Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In Proc. 16th International Conf. on Machine Learning, pages 278-287. Morgan Kaufmann, San Francisco, CA, 1999.
  • 11
    • D. Ormoneit and S. Sen. Kernel-based reinforcement learning. To appear, Machine Learning, 2001.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.