Volume , Issue , 2008, Pages

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

Author keywords

[No Author keywords available]

Indexed keywords

ACTIVE LEARNING; APPROXIMATE ALGORITHMS; BAYES RISK; DOMAIN KNOWLEDGE; HIDDEN STATE; MODEL PARAMETERS; PARTIALLY OBSERVABLE MARKOV DECISION PROCESS; PLANNING DOMAINS

EID: 84864606634     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (2)

References (15)
  • 3
    • EID: 84880715629
    • Even-Dar, E.; Kakade, S. M.; and Mansour, Y. 2005. Reinforcement learning in POMDPs without resets. In IJCAI, pp. 690-695.
  • 4
    • EID: 39649090194
    • Jaulmes, R.; Pineau, J.; and Precup, D. 2005. Learning in non-stationary partially observable Markov decision processes. In ECML Workshop.
  • 5
    • EID: 85138579181
    • Littman, M. L.; Cassandra, A. R.; and Kaelbling, L. P. 1995. Learning policies for partially observable environments: Scaling up. In ICML.
  • 6
    • EID: 0042547347
    • Ng, A., and Russell, S. 2000. Algorithms for inverse reinforcement learning. In Proceedings of ICML.
  • 7
    • EID: 84880772945
    • Pineau, J.; Gordon, G.; and Thrun, S. 2003. Point-based value iteration: An anytime algorithm for POMDPs. In IJCAI.
  • 8
    • EID: 34250730267
    • Poupart, P.; Vlassis, N.; Hoey, J.; and Regan, K. 2006. An analytic solution to discrete Bayesian reinforcement learning. In ICML, pp. 697-704. New York, NY, USA: ACM Press.
  • 9
    • EID: 0024610919
    • Rabiner, L. R. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2):257-286.
  • 13
    • EID: 33847202724
    • Sutton, R. 1988. Learning to predict by the methods of temporal differences. Machine Learning 3.
  • 15
    • EID: 33846220727
    • Williams, J., and Young, S. 2005. Scaling up POMDPs for dialogue management: The "summary POMDP" method. In Proceedings of the IEEE ASRU Workshop.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.