



Volume 1, 2010, Pages 44-49

Using prior knowledge to accelerate online least-squares policy iteration

Author keywords

[No Author keywords available]

Indexed keywords

CONTROL POLICY; EMPIRICAL EVALUATIONS; IN-CONTROL; LEAST SQUARE; MONOTONICITY; ONLINE LEARNING; OPTIMAL CONTROLS; POLICY ITERATION; PRIOR KNOWLEDGE; SYSTEM STATE

EID: 77958522395     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/AQTR.2010.5520917     Document Type: Conference Paper
Times cited: 8

References (12)
  • 4
    • 0036832950
    • Technical update: Least-squares temporal difference learning
    • J. Boyan, "Technical update: Least-squares temporal difference learning," Machine Learning, vol. 49, pp. 233-246, 2002.
    • (2002) Machine Learning , vol.49 , pp. 233-246
    • Boyan, J.1
  • 5
    • 0037288398
    • Least-squares policy evaluation algorithms with linear function approximation
    • A. Nedić and D. P. Bertsekas, "Least-squares policy evaluation algorithms with linear function approximation," Discrete Event Dynamic Systems: Theory and Applications, vol. 13, no. 1-2, pp. 79-110, 2003.
    • (2003) Discrete Event Dynamic Systems: Theory and Applications , vol.13 , Issue.1-2 , pp. 79-110
    • Nedić, A.1    Bertsekas, D.P.2
  • 8
    • 67949109470
    • Convergence results for some temporal difference methods based on least squares
    • H. Yu and D. P. Bertsekas, "Convergence results for some temporal difference methods based on least squares," IEEE Transactions on Automatic Control, vol. 54, no. 7, pp. 1515-1531, 2009.
    • (2009) IEEE Transactions on Automatic Control , vol.54 , Issue.7 , pp. 1515-1531
    • Yu, H.1    Bertsekas, D.P.2
  • 9
    • 77957782880
    • Online least-squares policy iteration for reinforcement learning control
    • Baltimore, US, 30 June - 2 July, accepted for publication
    • L. Buşoniu, D. Ernst, B. De Schutter, and R. Babuška, "Online least-squares policy iteration for reinforcement learning control," in Proceedings 2010 American Control Conference (ACC-10), Baltimore, US, 30 June - 2 July 2010, accepted for publication.
    • (2010) Proceedings 2010 American Control Conference (ACC-10)
    • Buşoniu, L.1    Ernst, D.2    De Schutter, B.3    Babuška, R.4
  • 10
    • 33847202724
    • Learning to predict by the method of temporal differences
    • R. S. Sutton, "Learning to predict by the method of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.