Volume 3, 2008, Pages 1351-1356

Adaptive importance sampling with automatic model selection in value function approximation

Author keywords

[No Author keywords available]

Indexed keywords

BIONICS; COMMERCE; REINFORCEMENT;

EID: 57749096203     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 16

References (13)
  • 3
    • 84898930479
    • A natural policy gradient
    • Kakade, S. 2002. A natural policy gradient. In NIPS 14.
    • (2002) NIPS 14
    • Kakade, S.1
  • 4
    • 4644323293
    • Least-squares policy iteration
    • Lagoudakis, M. G., and Parr, R. 2003. Least-squares policy iteration. JMLR 4:1107-1149.
    • (2003) JMLR , vol.4 , pp. 1107-1149
    • Lagoudakis, M.G.1    Parr, R.2
  • 6
    • 4644328593
    • Off-policy temporal-difference learning with function approximation
    • Precup, D.; Sutton, R. S.; and Dasgupta, S. 2001. Off-policy temporal-difference learning with function approximation. In Proc. of ICML.
    • (2001) Proc. of ICML
    • Precup, D.1    Sutton, R.S.2    Dasgupta, S.3
  • 7
    • 0242393653
    • Eligibility traces for off-policy policy evaluation
    • Precup, D.; Sutton, R. S.; and Singh, S. 2000. Eligibility traces for off-policy policy evaluation. In Proc. of ICML.
    • (2000) Proc. of ICML
    • Precup, D.1    Sutton, R.S.2    Singh, S.3
  • 8
    • 84899025152
    • Optimality of reinforcement learning algorithms with linear function approximation
    • Schoknecht, R. 2003. Optimality of reinforcement learning algorithms with linear function approximation. In NIPS 15.
    • (2003) NIPS 15
    • Schoknecht, R.1
  • 9
    • 18544374225
    • Policy improvement for POMDPs using normalized importance sampling
    • Shelton, C. R. 2001. Policy improvement for POMDPs using normalized importance sampling. In Proc. of UAI.
    • (2001) Proc. of UAI
    • Shelton, C.R.1
  • 10
    • 0037527188
    • Improving predictive inference under covariate shift by weighting the log-likelihood function
    • Shimodaira, H. 2000. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference 90(2):227-244.
    • (2000) Journal of Statistical Planning and Inference , vol.90 , Issue.2 , pp. 227-244
    • Shimodaira, H.1
  • 11
    • 34249047899
    • Covariate shift adaptation by importance weighted cross validation
    • Sugiyama, M.; Krauledat, M.; and Müller, K.-R. 2007. Covariate shift adaptation by importance weighted cross validation. JMLR 8:985-1005.
    • (2007) JMLR , vol.8 , pp. 985-1005
    • Sugiyama, M.1    Krauledat, M.2    Müller, K.-R.3
  • 13
    • 0004049893
    • Learning from Delayed Rewards
    • Watkins, C. 1989. Learning from Delayed Rewards. Ph.D. Dissertation, King's College, University of Cambridge.
    • (1989) Ph.D. Dissertation, King's College, University of Cambridge
    • Watkins, C.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.