Volume 3607 LNAI, 2005, Pages 321-331

Feature-discovering approximate value iteration methods

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION THEORY; CLASSIFICATION (OF INFORMATION); MARKOV PROCESSES; PROBLEM SOLVING; STATE SPACE METHODS;

EID: 26944495251     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/11527862_25     Document Type: Conference Paper
Times cited: 5

References (11)
  • 1
    • 84968468700
    • Polynomial approximation - A new computational technique in dynamic programming
    • R. Bellman, R. Kalaba, and B. Kotkin. Polynomial approximation - a new computational technique in dynamic programming. Math. Comp., 17(8):155-161, 1963.
    • (1963) Math. Comp. , vol.17 , Issue.8 , pp. 155-161
    • Bellman, R.1    Kalaba, R.2    Kotkin, B.3
  • 6
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. S. Sutton. Learning to predict by the methods of temporal differences. MLJ, 3:9-44, 1988.
    • (1988) MLJ , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 8
    • 0029276036
    • Temporal difference learning and TD-Gammon
    • G. Tesauro. Temporal difference learning and TD-Gammon. Comm. ACM, 38(3):58-68, 1995.
    • (1995) Comm. ACM , vol.38 , Issue.3 , pp. 58-68
    • Tesauro, G.1
  • 11
    • 0012252296
    • Tight performance bounds on greedy policies based on imperfect value functions
    • Northeastern University
    • R. J. Williams and L. C. Baird. Tight performance bounds on greedy policies based on imperfect value functions. Technical report, Northeastern University, 1993.
    • (1993) Technical Report
    • Williams, R.J.1    Baird, L.C.2


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS DB.