2010, Pages 450-457

Statistically linearized least-squares temporal differences

Author keywords

Neural networks; Reinforcement learning; Statistical linearization; Value function approximation

Indexed keywords

ITERATIVE METHODS; LINEARIZATION; NEURAL NETWORKS; REINFORCEMENT LEARNING;

EID: 79951499926     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICUMT.2010.5676598     Document Type: Conference Paper
Times cited: 9

References (20)
  • 2
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Transactions on Automatic Control, vol. 42, pp. 674-690, 1997.
    • (1997) IEEE Transactions on Automatic Control , vol.42 , pp. 674-690
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 3
    • 0001771345
    • Linear Least-Squares algorithms for temporal difference learning
    • S. J. Bradtke and A. G. Barto, "Linear Least-Squares algorithms for temporal difference learning," Machine Learning, vol. 22, no. 1-3, pp. 33-57, 1996.
    • (1996) Machine Learning , vol.22 , Issue.1-3 , pp. 33-57
    • Bradtke, S.J.1    Barto, A.G.2
  • 5
    • 33646435300
    • A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning
    • D. Choi and B. Van Roy, "A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning," Discrete Event Dynamic Systems, vol. 16, pp. 207-239, 2006.
    • (2006) Discrete Event Dynamic Systems , vol.16 , pp. 207-239
    • Choi, D.1    Van Roy, B.2
  • 15
    • 21244437999
    • Unscented filtering and nonlinear estimation
    • S. J. Julier and J. K. Uhlmann, "Unscented filtering and nonlinear estimation," Proceedings of the IEEE, vol. 92, no. 3, pp. 401-422, 2004.
    • (2004) Proceedings of the IEEE , vol.92 , Issue.3 , pp. 401-422
    • Julier, S.J.1    Uhlmann, J.K.2
  • 16
    • 0034326226
    • New developments in state estimation for nonlinear systems
    • P. Nørgård, N. Poulsen, and O. Ravn, "New developments in state estimation for nonlinear systems," Automatica, vol. 36, no. 11, pp. 1627-1638, 2000.
    • (2000) Automatica , vol.36 , Issue.11 , pp. 1627-1638
    • Nørgård, P.1    Poulsen, N.2    Ravn, O.3
  • 19
    • 40849145988
    • Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
    • A. Antos, C. Szepesvári, and R. Munos, "Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path," Machine Learning, vol. 71, no. 1, pp. 89-129, 2008.
    • (2008) Machine Learning , vol.71 , Issue.1 , pp. 89-129
    • Antos, A.1    Szepesvári, C.2    Munos, R.3
  • 20
    • 33646398129
    • Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
    • M. Riedmiller, "Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method," in European Conference on Machine Learning, 2005, pp. 317-328.
    • European Conference on Machine Learning, 2005 , pp. 317-328
    • Riedmiller, M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.