Volume , Issue , 2011, Pages

A non-parametric approach to Dynamic Programming

Author keywords

[No Author keywords available]

Indexed keywords

DYNAMIC PROGRAMMING; LEAST SQUARES APPROXIMATIONS; NUMERICAL METHODS;

EID: 85162488584     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (17)

References (28)
  • 3
    • 79951481923
    • Convergent temporal-difference learning with arbitrary smooth function approximation
    • H. Maei, C. Szepesvari, S. Bhatnagar, D. Precup, D. Silver, and R. Sutton. Convergent temporal-difference learning with arbitrary smooth function approximation. In NIPS, pages 1204-1212, 2009.
    • (2009) NIPS , pp. 1204-1212
    • Maei, H.1    Szepesvari, C.2    Bhatnagar, S.3    Precup, D.4    Silver, D.5    Sutton, R.6
  • 7
    • 33646384929
    • Geometric variance reduction in Markov chains: Application to value function and gradient estimation
    • Rémi Munos. Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation. Journal of Machine Learning Research, 7:413-427, 2006.
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 413-427
    • Munos, R.1
  • 8
    • 1942482175
    • Optimality of reinforcement learning algorithms with linear function approximation
    • Ralf Schoknecht. Optimality of reinforcement learning algorithms with linear function approximation. In NIPS, pages 1555-1562, 2002.
    • (2002) NIPS , pp. 1555-1562
    • Schoknecht, R.1
  • 9
    • 85151728371
    • Residual algorithms: Reinforcement learning with function approximation
    • Leemon Baird. Residual algorithms: Reinforcement learning with function approximation. In ICML, 1995.
    • (1995) ICML
    • Baird, L.1
  • 10
    • 0030691430
    • A comparison of direct and model-based reinforcement learning
    • Christopher G. Atkeson and Juan C. Santamaria. A Comparison of Direct and Model-Based Reinforcement Learning. In ICRA, pages 3557-3564, 1997.
    • (1997) ICRA , pp. 3557-3564
    • Atkeson, C.G.1    Santamaria, J.C.2
  • 11
    • 0029746072
    • Three connectionist implementations of dynamic programming for optimal control: A preliminary comparative analysis
    • H. Bersini and V. Gorrini. Three connectionist implementations of dynamic programming for optimal control: A preliminary comparative analysis. In NICROSP, 1996.
    • (1996) NICROSP
    • Bersini, H.1    Gorrini, V.2
  • 13
    • 0001762424
    • Smooth regression analysis
    • G. Watson. Smooth regression analysis. Sankhya, Series A, 26:359-372, 1964.
    • (1964) Sankhya Series A , vol.26 , pp. 359-372
    • Watson, G.1
  • 14
    • 0038595396
    • Least-squares temporal difference learning
    • San Francisco, CA, USA, Morgan Kaufmann Publishers Inc
    • Justin A. Boyan. Least-squares temporal difference learning. In ICML, pages 49-56, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc.
    • (1999) ICML , pp. 49-56
    • Boyan, J.A.1
  • 15
    • 71149100225
    • Kernelized value function approximation for reinforcement learning
    • New York, NY, USA, ACM
    • Taylor, Gavin and Parr, Ronald. Kernelized value function approximation for reinforcement learning. In ICML, pages 1017-1024, New York, NY, USA, 2009. ACM.
    • (2009) ICML , pp. 1017-1024
    • Taylor, G.1    Parr, R.2
  • 17
    • 0001529784
    • Remarks on some nonparametric estimates of a density function
    • September
    • Murray Rosenblatt. Remarks on Some Nonparametric Estimates of a Density Function. The Annals of Mathematical Statistics, 27(3):832-837, September 1956.
    • (1956) The Annals of Mathematical Statistics , vol.27 , Issue.3 , pp. 832-837
    • Rosenblatt, M.1
  • 18
    • 0001473437
    • On estimation of a probability density function and mode
    • Emanuel Parzen. On Estimation of a Probability Density Function and Mode. The Annals of Mathematical Statistics, 33(3):1065-1076, 1962.
    • (1962) The Annals of Mathematical Statistics , vol.33 , Issue.3 , pp. 1065-1076
    • Parzen, E.1
  • 20
    • 1942516880
    • Error bounds for approximate policy iteration
    • Rémi Munos. Error bounds for approximate policy iteration. In ICML, pages 560-567, 2003.
    • (2003) ICML , pp. 560-567
    • Munos, R.1
  • 22
    • 85051703735
    • Consistency of the kernel density estimator: A survey
    • Dominik Wied and Rafael Weissbach. Consistency of the kernel density estimator: a survey. Statistical Papers, pages 1-21, 2010.
    • (2010) Statistical Papers , pp. 1-21
    • Wied, D.1    Weissbach, R.2
  • 23
    • 31844451013
    • Reinforcement learning with Gaussian processes
    • New York, NY, USA. ACM
    • Yaakov Engel, Shie Mannor, and Ron Meir. Reinforcement learning with Gaussian processes. In ICML, pages 201-208, New York, NY, USA, 2005. ACM.
    • (2005) ICML , pp. 201-208
    • Engel, Y.1    Mannor, S.2    Meir, R.3
  • 25
    • 71149121683
    • Regularization and feature selection in least-squares temporal difference learning
    • ACM
    • J. Zico Kolter and Andrew Y. Ng. Regularization and feature selection in least-squares temporal difference learning. In ICML, pages 521-528. ACM, 2009.
    • (2009) ICML , pp. 521-528
    • Kolter, J.Z.1    Ng, A.Y.2
  • 26
    • 84874668709
    • Model-based function approximation for reinforcement learning
    • May
    • Nicholas K. Jong and Peter Stone. Model-based function approximation for reinforcement learning. In AAMAS, May 2007.
    • (2007) AAMAS
    • Jong, N.K.1    Stone, P.2
  • 27
    • 0036832956
    • Kernel-Based reinforcement learning
    • November
    • Dirk Ormoneit and Saunak Sen. Kernel-Based reinforcement learning. Machine Learning, 49(2):161-178, November 2002.
    • (2002) Machine Learning , vol.49 , Issue.2 , pp. 161-178
    • Ormoneit, D.1    Sen, S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.