Volume 5323 LNAI, 2008, Pages 55-68

Regularized fitted Q-Iteration: Application to planning

Author keywords

[No Author keywords available]

Indexed keywords

PROBABILITY DENSITY FUNCTION; REINFORCEMENT LEARNING;

EID: 58449110583     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series
DOI: 10.1007/978-3-540-89722-4_5     Document Type: Conference Paper
Times cited : (20)

References (23)
  • 1
    • Antos, A., Szepesvári, C., Munos, R.: Value-iteration based fitted policy iteration: learning with a single trajectory. In: IEEE ADPRL, pp. 330-337 (2007)
  • 2
    • Antos, A., Munos, R., Szepesvári, C.: Fitted Q-iteration in continuous action-space MDPs. In: Advances in Neural Information Processing Systems 20, NIPS 2007 (in print, 2008)
  • 3
    • Antos, A., Szepesvári, C., Munos, R.: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning 71, 89-129 (2008)
  • 11
    • Jung, T., Polani, D.: Least squares SVM for least squares TD learning. In: ECAI, pp. 499-503 (2006)
  • 12
    • Kearns, M., Mansour, Y., Ng, A.Y.: A sparse sampling algorithm for near-optimal planning in large Markovian decision processes. In: Proceedings of IJCAI 1999, pp. 1324-1331 (1999)
  • 13
    • Lagoudakis, M.G., Parr, R.: Reinforcement learning as classification: Leveraging modern classifiers. In: ICML 2003, pp. 424-431 (2003)
  • 15
    • Mannor, S., Menache, I., Shimkin, N.: Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research 134, 215-238 (2005)
  • 18
    • Ormoneit, D., Sen, S.: Kernel-based reinforcement learning. Machine Learning 49, 161-178 (2002)
  • 19
    • Parr, R., Painter-Wakefield, C., Li, L., Littman, M.L.: Analyzing feature generation for value-function approximation. In: ICML, pp. 737-744 (2007)
  • 21
    • Srebro, N., Ben-David, S.: Learning bounds for support vector machines with learned kernels. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS, vol. 4005, pp. 169-183. Springer, Heidelberg (2006)
  • 22
    • Xu, X., Hu, D., Lu, X.: Kernel-based least squares policy iteration for reinforcement learning. IEEE Trans. on Neural Networks 18, 973-992 (2007)
  • 23
    • Zhou, D.-X.: Capacity of reproducing kernel spaces in learning theory. IEEE Transactions on Information Theory 49, 1743-1752 (2003)


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.