Volume 148, 2006, Pages 697-704

An analytic solution to discrete Bayesian reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; COMPUTER SIMULATION; MARKOV PROCESSES; ONLINE SYSTEMS; PARAMETERIZATION; PROBLEM SOLVING

EID: 34250730267     PISSN: None     EISSN: None     Source Type: Conference Proceeding
DOI: 10.1145/1143844.1143932     Document Type: Conference Paper
Times cited: 113

References (17)
  • 1. Boger, J., Poupart, P., Hoey, J., Boutilier, C., Fernie, G., & Mihailidis, A. (2005). A decision-theoretic approach to task assistance for persons with dementia. IJCAI (pp. 1293-1299).
  • 2. Crites, R. H., & Barto, A. G. (1996). Improving elevator performance using reinforcement learning. NIPS (pp. 1017-1023).
  • 3. Dearden, R., Friedman, N., & Andre, D. (1999). Model based Bayesian exploration. UAI (pp. 150-159).
  • 7. Duff, M. (2003). Design for an optimal probe. ICML (pp. 131-138).
  • 9. Meuleau, N., & Bourgine, P. (1999). Exploration of multi-state environments: Local measures and back-propagation of uncertainty. Machine Learning, 35, 117-154.
  • 10. Ng, A., Kim, H. J., Jordan, M., & Sastry, S. (2003). Autonomous helicopter flight via reinforcement learning. NIPS.
  • 12. Smallwood, R. D., & Sondik, E. J. (1973). The optimal control of partially observable Markov processes over a finite horizon. Operations Research, 21, 1071-1088.
  • 14. Strens, M. (2000). A Bayesian framework for reinforcement learning. ICML.
  • 16. Tesauro, G. J. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38, 58-68.
  • 17. Wang, T., Lizotte, D., Bowling, M., & Schuurmans, D. (2005). Bayesian sparse sampling for on-line reward optimization. ICML.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.