Volume , Issue , 2011, Pages

Action-gap phenomenon in reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

ITERATIVE METHODS

EID: 85162479771     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (57)

References (25)
  • 3
    • 0033234630
    • Smooth discrimination analysis
    • Enno Mammen and Alexander B. Tsybakov. Smooth discrimination analysis. The Annals of Statistics, 27(6):1808-1829, 1999.
    • (1999) The Annals of Statistics , vol.27 , Issue.6 , pp. 1808-1829
    • Mammen, E.1    Tsybakov, A.B.2
  • 4
    • 3142725508
    • Optimal aggregation of classifiers in statistical learning
    • Alexander B. Tsybakov. Optimal aggregation of classifiers in statistical learning. The Annals of Statistics, 32 (1):135-166, 2004.
    • (2004) The Annals of Statistics , vol.32 , Issue.1 , pp. 135-166
    • Tsybakov, A.B.1
  • 5
    • 34547706430
    • Fast learning rates for plug-in classifiers
    • Jean-Yves Audibert and Alexander B. Tsybakov. Fast learning rates for plug-in classifiers. The Annals of Statistics, 35(2):608-633, 2007.
    • (2007) The Annals of Statistics , vol.35 , Issue.2 , pp. 608-633
    • Audibert, J.-Y.1    Tsybakov, A.B.2
  • 6
    • 77957604813
    • Generalized density clustering
    • Alessandro Rinaldo and Larry Wasserman. Generalized density clustering. The Annals of Statistics, 38(5):2678-2722, 2010.
    • (2010) The Annals of Statistics , vol.38 , Issue.5 , pp. 2678-2722
    • Rinaldo, A.1    Wasserman, L.2
  • 9
    • 85162059109
    • A reduction from apprenticeship learning to classification
    • J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors
    • Umar Syed and Robert E. Schapire. A reduction from apprenticeship learning to classification. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems (NIPS - 23), pages 2253-2261, 2010.
    • (2010) Advances in Neural Information Processing Systems (NIPS - 23) , pp. 2253-2261
    • Syed, U.1    Schapire, R.E.2
  • 10
    • 40849145988
    • Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
    • András Antos, Csaba Szepesvári, and Rémi Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71:89-129, 2008.
    • (2008) Machine Learning , vol.71 , pp. 89-129
    • Antos, A.1    Szepesvári, C.2    Munos, R.3
  • 17
  • 19
    • 33646398129
    • Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method
    • Martin Riedmiller. Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In 16th European Conference on Machine Learning, pages 317-328, 2005.
    • (2005) 16th European Conference on Machine Learning , pp. 317-328
    • Riedmiller, M.1
  • 21
    • 0348090400
    • The linear programming approach to approximate dynamic programming
    • Daniela Pucci de Farias and Benjamin Van Roy. The linear programming approach to approximate dynamic programming. Operations Research, 51(6):850-865, 2003.
    • (2003) Operations Research , vol.51 , Issue.6 , pp. 850-865
    • De Farias, D.P.1    Van Roy, B.2
  • 25
    • 85162063395
    • Error propagation for approximate policy and value iteration
    • J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors
    • Amir-massoud Farahmand, Rémi Munos, and Csaba Szepesvári. Error propagation for approximate policy and value iteration. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems (NIPS - 23), pages 568-576, 2010.
    • (2010) Advances in Neural Information Processing Systems (NIPS - 23) , pp. 568-576
    • Farahmand, A.-M.1    Munos, R.2    Szepesvári, C.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.