Volume 5863 LNCS, Issue PART 1, 2009, Pages 502-511

Tracking in reinforcement learning

Author keywords

Kalman filtering; Reinforcement learning; Tracking; Value function approximation

Indexed keywords

ITERATIVE METHODS; SURFACE DISCHARGES;

EID: 76649127744     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-10677-4_57     Document Type: Conference Paper
Times cited: 16

References (18)
  • 2
    • 34547974097
    • Tracking Value Function Dynamics to Improve Reinforcement Learning with Piecewise Linear Function Approximation
    • Phua, C.W., Fitch, R.: Tracking Value Function Dynamics to Improve Reinforcement Learning with Piecewise Linear Function Approximation. In: International Conference on Machine Learning, ICML 2007 (2007)
    • (2007) International Conference on Machine Learning, ICML
    • Phua, C.W.; Fitch, R.
  • 5
    • 85024429815
    • Kalman, R.E.: A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME-Journal of Basic Engineering 82(Series D), 35-45 (1960)
  • 6
    • 21244437999
    • Unscented filtering and nonlinear estimation
    • Julier, S.J., Uhlmann, J.K.: Unscented filtering and nonlinear estimation. Proceedings of the IEEE 92(3), 401-422 (2004)
    • (2004) Proceedings of the IEEE, vol. 92, issue 3, pp. 401-422
    • Julier, S.J.; Uhlmann, J.K.
  • 8
    • 0001771345
    • Linear Least-Squares Algorithms for Temporal Difference Learning
    • Bradtke, S.J., Barto, A.G.: Linear Least-Squares Algorithms for Temporal Difference Learning. Machine Learning 22(1-3), 33-57 (1996)
    • (1996) Machine Learning, vol. 22, issue 1-3, pp. 33-57
    • Bradtke, S.J.; Barto, A.G.
  • 10
    • 40849145988
    • Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
    • Antos, A., Szepesvári, C., Munos, R.: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning 71(1), 89-129 (2008)
    • (2008) Machine Learning, vol. 71, issue 1, pp. 89-129
    • Antos, A.; Szepesvári, C.; Munos, R.
  • 11
    • 76649113839
    • Kakade, S.: A natural policy gradient. In: Advances in Neural Information Processing Systems 14 (NIPS 2001), Vancouver, British Columbia, Canada, pp. 1531-1538 (2001)
  • 12
    • 33646413135
    • Peters, J., Vijayakumar, S., Schaal, S.: Natural actor-critic. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 280-291. Springer, Heidelberg (2005)
  • 13
    • 0036832950
    • Technical Update: Least-Squares Temporal Difference Learning
    • Boyan, J.A.: Technical Update: Least-Squares Temporal Difference Learning. Machine Learning 49(2-3), 233-246 (1999)
    • (1999) Machine Learning, vol. 49, issue 2-3, pp. 233-246
    • Boyan, J.A.
  • 15
    • 20544433674
    • Consistent Normalized Least Mean Square Filtering with Noisy Data Matrix
    • Jo, S., Kim, S.W.: Consistent Normalized Least Mean Square Filtering with Noisy Data Matrix. IEEE Transactions on Signal Processing 53(6), 2112-2123 (2005)
    • (2005) IEEE Transactions on Signal Processing, vol. 53, issue 6, pp. 2112-2123
    • Jo, S.; Kim, S.W.
  • 18
    • 58449117448
    • Geist, M., Pietquin, O., Fricout, G.: Bayesian Reward Filtering. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds.) EWRL 2008. LNCS (LNAI), vol. 5323, pp. 96-109. Springer, Heidelberg (2008)


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.