Volume , Issue , 2009, Pages 74-81

Basis Function Adaptation Methods for Cost Approximation in MDP

Author keywords

[No Author keywords available]

Indexed keywords

ADAPTATION FRAMEWORK; ADAPTATION METHODS; ADAPTATION SCHEME; BASIS FUNCTIONS; FUNCTION APPROXIMATION; LOW ORDER; MARKOV DECISION PROCESS; NONLINEAR OPTIMAL; OBJECTIVE FUNCTIONS; POLICY GRADIENT METHODS; TD METHOD; TEMPORAL DIFFERENCES;

EID: 67650458822     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ADPRL.2009.4927528     Document Type: Conference Paper
Times cited: 39

References (17)
  • 1
    • 17444414191
    • Basis function adaptation in temporal difference reinforcement learning
    • DOI 10.1007/s10479-005-5732-z
    • I. Menache, S. Mannor, and N. Shimkin, "Basis function adaptation in temporal difference reinforcement learning," Ann. Oper. Res., Vol. 134, no. 1, pp. 215-238, 2005. (Pubitemid 40550047)
    • (2005) Annals of Operations Research , vol.134 , Issue.1 , pp. 215-238
    • Menache, I.1    Mannor, S.2    Shimkin, N.3
  • 2
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, Vol. 3, pp. 9-44, 1988.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 6
    • 4043069840
    • Actor-critic algorithms
    • V. R. Konda and J. N. Tsitsiklis, "Actor-critic algorithms," SIAM J. Control Optim., Vol. 42, no. 4, pp. 1143-1166, 2003.
    • (2003) SIAM J. Control Optim. , vol.42 , Issue.4 , pp. 1143-1166
    • Konda, V.R.1    Tsitsiklis, J.N.2
  • 9
    • 28544451799
    • Stochastic approximation with 'controlled Markov' noise
    • V. S. Borkar, "Stochastic approximation with 'controlled Markov' noise," Systems Control Lett., Vol. 55, pp. 139-145, 2006.
    • (2006) Systems Control Lett. , vol.55 , pp. 139-145
    • Borkar, V.S.1
  • 11
    • 67650362344
    • Projected equation methods for approximate solution of large linear systems
    • to appear
    • D. P. Bertsekas and H. Yu, "Projected equation methods for approximate solution of large linear systems," J. Comput. Sci. Appl. Math., 2008, to appear.
    • (2008) J. Comput. Sci. Appl. Math.
    • Bertsekas, D.P.1    Yu, H.2
  • 12
    • 0033351917
    • Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives
    • DOI 10.1109/9.793723
    • J. N. Tsitsiklis and B. Van Roy, "Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives," IEEE Trans. Automat. Contr., Vol. 44, no. 10, pp. 1840-1851, 1999. (Pubitemid 30546876)
    • (1999) IEEE Transactions on Automatic Control , vol.44 , Issue.10 , pp. 1840-1851
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 13
    • 33646435300
    • A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
    • D. S. Choi and B. Van Roy, "A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning," Discrete Event Dyn. Syst., Vol. 16, no. 2, pp. 207-239, 2006.
    • (2006) Discrete Event Dyn. Syst. , vol.16 , Issue.2 , pp. 207-239
    • Choi, D.S.1    Van Roy, B.2
  • 14
    • 58849124361
    • A least squares Q-learning algorithm for optimal stopping problems
    • H. Yu and D. P. Bertsekas, "A least squares Q-learning algorithm for optimal stopping problems," MIT, LIDS Tech. Report 2731, 2006.
    • (2006) MIT, LIDS Tech. Report , vol.2731
    • Yu, H.1    Bertsekas, D.P.2
  • 16
    • 0000516813
    • An implicit-function theorem for a class of nonsmooth functions
    • S. M. Robinson, "An implicit-function theorem for a class of nonsmooth functions," Math. Oper. Res., Vol. 16, no. 2, pp. 292-309, 1991.
    • (1991) Math. Oper. Res. , vol.16 , Issue.2 , pp. 292-309
    • Robinson, S.M.1
  • 17
    • 46749106339
    • Robinson's implicit function theorem and its extensions
    • A. L. Dontchev and R. T. Rockafellar, "Robinson's implicit function theorem and its extensions," Math. Program. Ser. B, Vol. 117, no. 1, pp. 129-147, 2008.
    • (2008) Math. Program. Ser. B , vol.117 , Issue.1 , pp. 129-147
    • Dontchev, A.L.1    Rockafellar, R.T.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.