Volume , Issue , 2010, Pages

Nonparametric Bayesian policy priors for reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

BAYESIAN NETWORKS;

EID: 85162013390     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 26

References (22)
  • 1
    • Scopus EID: 39649090194
    • R. Jaulmes, J. Pineau, and D. Precup. Learning in non-stationary partially observable Markov decision processes. ECML Workshop, 2005.
  • 3
    • Scopus EID: 51649091499
    • Stephane Ross, Brahim Chaib-draa, and Joelle Pineau. Bayesian reinforcement learning in continuous POMDPs with application to robot navigation. In ICRA, 2008.
  • 4
    • Scopus EID: 56449086386
    • Finale Doshi, Joelle Pineau, and Nicholas Roy. Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs. In International Conference on Machine Learning, volume 25, 2008.
  • 7
    • Scopus EID: 77950356463
    • P. Poupart and N. Vlassis. Model-based Bayesian reinforcement learning in partially observable domains. In ISAIM, 2008.
  • 8
    • Scopus EID: 14344258433
    • M. Strens. A Bayesian framework for reinforcement learning. In ICML, 2000.
  • 12
    • Scopus EID: 77958539351
    • Finale Doshi-Velez. The infinite partially observable Markov decision process. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 477-485. 2009.
  • 13
  • 17
    • Scopus EID: 56449130659
    • J. van Gael, Y. Saatci, Y. W. Teh, and Z. Ghahramani. Beam sampling for the infinite hidden Markov model. In ICML, volume 25, 2008.
  • 18
    • Scopus EID: 84880772945
    • J. Pineau, G. Gordon, and S. Thrun. Point-based value iteration: An anytime algorithm for POMDPs. IJCAI, 2003.
  • 20
    • Scopus EID: 85138579181
    • M. L. Littman, A. R. Cassandra, and L. P. Kaelbling. Learning policies for partially observable environments: Scaling up. ICML, 1995.
  • 21
    • Scopus EID: 0026998041
    • Lonnie Chrisman. Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 183-188. AAAI Press, 1992.
  • 22
    • Scopus EID: 31144465830
    • T. Smith and R. Simmons. Heuristic search value iteration for POMDPs. In Proc. of UAI 2004, Banff, Alberta, 2004.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.