2008, Pages 256-263

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

Author keywords

[No Author keywords available]

Indexed keywords

EDUCATION; LEARNING SYSTEMS; PLANNING; REINFORCEMENT LEARNING; ROBOT LEARNING

EID: 56449086386     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 49

References (18)
  • 2
    • 56449088986
    • Efficient model learning for dialog management
    • Technical Report SS-07-07. AAAI Press
    • Doshi, F., & Roy, N. (2007). Efficient model learning for dialog management. Technical Report SS-07-07. AAAI Press.
    • (2007)
    • Doshi, F.1    Roy, N.2
  • 3
    • 84880715629
    • Reinforcement learning in POMDPs without resets
    • Even-Dar, E., Kakade, S. M., & Mansour, Y. (2005). Reinforcement learning in POMDPs without resets. IJCAI.
    • (2005) IJCAI
    • Even-Dar, E.1    Kakade, S.M.2    Mansour, Y.3
  • 5
    • 85138579181
    • Learning policies for partially observable environments: Scaling up
    • Littman, M. L., Cassandra, A. R., & Kaelbling, L. P. (1995). Learning policies for partially observable environments: scaling up. ICML.
    • (1995) ICML
    • Littman, M.L.1    Cassandra, A.R.2    Kaelbling, L.P.3
  • 6
    • 56449107989
    • The variational Bayesian EM algorithm for incomplete data: With application to scoring graphical model structures
    • Beal, M. J., & Ghahramani, Z. (2003). The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Bayesian Statistics, 7.
    • (2003) Bayesian Statistics , vol.7
    • Beal, M.J.1    Ghahramani, Z.2
  • 8
    • 0042547347
    • Algorithms for inverse reinforcement learning
    • Ng, A., & Russell, S. (2000). Algorithms for inverse reinforcement learning. ICML.
    • (2000) ICML
    • Ng, A.1    Russell, S.2
  • 9
    • 84880772945
    • Point-based value iteration: An anytime algorithm for POMDPs
    • Pineau, J., Gordon, G., & Thrun, S. (2003). Point-based value iteration: an anytime algorithm for POMDPs. IJCAI.
    • (2003) IJCAI
    • Pineau, J.1    Gordon, G.2    Thrun, S.3
  • 10
    • 77950356463
    • Model-based Bayesian reinforcement learning in partially observable domains
    • Poupart, P., & Vlassis, N. (2008). Model-based Bayesian reinforcement learning in partially observable domains. ISAIM.
    • (2008) ISAIM
    • Poupart, P.1    Vlassis, N.2
  • 11
    • 33749251297
    • An analytic solution to discrete Bayesian reinforcement learning
    • Poupart, P., Vlassis, N., Hoey, J., & Regan, K. (2006). An analytic solution to discrete Bayesian reinforcement learning. ICML.
    • (2006) ICML
    • Poupart, P.1    Vlassis, N.2    Hoey, J.3    Regan, K.4
  • 12
    • 84880707672
    • Spoken dialogue management using probabilistic reasoning
    • Hong Kong
    • Roy, N., Pineau, J., & Thrun, S. (2000). Spoken dialogue management using probabilistic reasoning. ACL. Hong Kong.
    • (2000) ACL
    • Roy, N.1    Pineau, J.2    Thrun, S.3
  • 14
    • 84880906197
    • Forward search value iteration for POMDPs
    • Shani, G., Brafman, R., & Shimony, S. (2007). Forward search value iteration for POMDPs. IJCAI.
    • (2007) IJCAI
    • Shani, G.1    Brafman, R.2    Shimony, S.3
  • 15
    • 34548745051
    • Incremental model-based learners with formal learning-time guarantees
    • Strehl, A. L., Li, L., & Littman, M. L. (2006). Incremental model-based learners with formal learning-time guarantees. UAI.
    • (2006) UAI
    • Strehl, A.L.1    Li, L.2    Littman, M.L.3
  • 16
    • 14344258433
    • A Bayesian framework for reinforcement learning
    • Strens, M. (2000). A Bayesian framework for reinforcement learning. ICML.
    • (2000) ICML
    • Strens, M.1
  • 18


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.