



Volume 22, Issue 10, 2009, Pages 1399-1410

Adaptive importance sampling for value function approximation in off-policy reinforcement learning

Author keywords

Adaptive importance sampling; Efficient sample reuse; Importance weighted cross validation; Off policy reinforcement learning; Policy iteration; Value function approximation

Indexed keywords

ADAPTIVE IMPORTANCE SAMPLING; BIAS AND VARIANCE; CROSS VALIDATION; DATA SAMPLE; EFFICIENT SAMPLE REUSE; IMPORTANCE SAMPLING; VALUE FUNCTION APPROXIMATION; VALUE FUNCTIONS;

EID: 70549113878     PISSN: 08936080     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.neunet.2009.01.002     Document Type: Article
Times cited: 45

References (19)
  • 2
    • 84945307039
    • Non-linear swing-up and stabilizing control of an inverted pendulum system
    • Bugeja, M. (2003). Non-linear swing-up and stabilizing control of an inverted pendulum system. In Proceedings of IEEE Region 8 EUROCON (pp. 437-441)
    • (2003) Proceedings of IEEE Region 8 EUROCON , pp. 437-441
    • Bugeja, M.1
  • 9
    • 44949241322
    • Reinforcement learning of motor skills with policy gradients
    • Peters J., and Schaal S. Reinforcement learning of motor skills with policy gradients. Neural Networks 21 (2008) 682-697
    • (2008) Neural Networks , vol.21 , pp. 682-697
    • Peters, J.1    Schaal, S.2
  • 13
    • 84899025152
    • Optimality of reinforcement learning algorithms with linear function approximation
    • Schoknecht R. Optimality of reinforcement learning algorithms with linear function approximation. Neural Information Processing Systems 15 (2003) 1555-1562
    • (2003) Neural Information Processing Systems , vol.15 , pp. 1555-1562
    • Schoknecht, R.1
  • 15
    • 0037527188
    • Improving predictive inference under covariate shift by weighting the log-likelihood function
    • Shimodaira H. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference 90 (2000) 227-244
    • (2000) Journal of Statistical Planning and Inference , vol.90 , pp. 227-244
    • Shimodaira, H.1
  • 16
    • 1842733198
    • Trading variance reduction with unbiasedness: The regularized subspace information criterion for robust model selection in kernel regression
    • Sugiyama M., Kawanabe M., and Müller K.-R. Trading variance reduction with unbiasedness: The regularized subspace information criterion for robust model selection in kernel regression. Neural Computation 16 (2004) 1077-1104
    • (2004) Neural Computation , vol.16 , pp. 1077-1104
    • Sugiyama, M.1    Kawanabe, M.2    Müller, K.-R.3
  • 19
    • 0030082891
    • An approach to fuzzy control of nonlinear systems: Stability and design issues
    • Wang H.O., Tanaka K., and Griffin M.F. An approach to fuzzy control of nonlinear systems: Stability and design issues. IEEE Transactions on Fuzzy Systems (1996) 14-23
    • (1996) IEEE Transactions on Fuzzy Systems , pp. 14-23
    • Wang, H.O.1    Tanaka, K.2    Griffin, M.F.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.