Volume, Issue, 2009, Pages 725-730

Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems

Author keywords

[No Author keywords available]

Indexed keywords

FINITE SAMPLES; GENERALIZATION BOUND; ITERATION ALGORITHMS; MACHINE LEARNING; MARKOVIAN DECISION PROBLEMS; NONLINEAR FUNCTIONS; PLANNING PROBLEM; REGULARIZATION PROCEDURE; SMALL SAMPLE SIZE; VALUE FUNCTIONS

EID: 70449644892     PISSN: 07431619     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ACC.2009.5160611     Document Type: Conference Paper
Times cited : (71)

References (20)
  • 1
    • A. Antos, R. Munos, and Cs. Szepesvári. Fitted Q-iteration in continuous action-space MDPs. In Advances in Neural Information Processing Systems, 2007. (accepted).
  • 2
    • A. Antos, Cs. Szepesvári, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71(1):89-129, April 2008. Published Online First: 14 Nov 2007, DOI: 10.1007/s10994-007-5038-2.
  • 11
    • T. Jung and D. Polani. Least squares SVM for least squares TD learning. In ECAI, pages 499-503, 2006.
  • 12
    • M.G. Lagoudakis and R. Parr. Reinforcement learning as classification: Leveraging modern classifiers. In ICML-03, pages 424-431, 2003.
  • 14
    • S. Mannor, I. Menache, and N. Shimkin. Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134:215-238, 2005.
  • 18
    • R. Parr, C. Painter-Wakefield, L. Li, and M.L. Littman. Analyzing feature generation for value-function approximation. In ICML, pages 737-744, 2007.
  • 20
    • D.-X. Zhou. Capacity of reproducing kernel spaces in learning theory. IEEE Transactions on Information Theory, 49:1743-1752, 2003.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.