Volume , Issue , 2010, Pages

Nonparametric Bayesian policy priors for reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

BAYESIAN NETWORKS;

EID: 85162013390     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 26

References (22)
  • 1
    • Scopus EID: 39649090194
    • R. Jaulmes, J. Pineau, and D. Precup. Learning in non-stationary partially observable Markov decision processes. ECML Workshop, 2005.
  • 3
    • Scopus EID: 51649091499
    • Stephane Ross, Brahim Chaib-draa, and Joelle Pineau. Bayesian reinforcement learning in continuous POMDPs with application to robot navigation. In ICRA, 2008.
  • 4
    • Scopus EID: 56449086386
    • Finale Doshi, Joelle Pineau, and Nicholas Roy. Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs. In International Conference on Machine Learning, volume 25, 2008.
  • 7
    • Scopus EID: 77950356463
    • P. Poupart and N. Vlassis. Model-based Bayesian reinforcement learning in partially observable domains. In ISAIM, 2008.
  • 8
    • Scopus EID: 14344258433
    • M. Strens. A Bayesian framework for reinforcement learning. In ICML, 2000.
  • 12
    • Scopus EID: 77958539351
    • Finale Doshi-Velez. The infinite partially observable Markov decision process. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 477-485. 2009.
  • 13
  • 17
    • Scopus EID: 56449130659
    • J. van Gael, Y. Saatci, Y. W. Teh, and Z. Ghahramani. Beam sampling for the infinite hidden Markov model. In ICML, volume 25, 2008.
  • 18
    • Scopus EID: 84880772945
    • J. Pineau, G. Gordon, and S. Thrun. Point-based value iteration: An anytime algorithm for POMDPs. IJCAI, 2003.
  • 20
    • Scopus EID: 85138579181
    • M. L. Littman, A. R. Cassandra, and L. P. Kaelbling. Learning policies for partially observable environments: Scaling up. ICML, 1995.
  • 21
    • Scopus EID: 0026998041
    • Lonnie Chrisman. Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 183-188. AAAI Press, 1992.
  • 22
    • Scopus EID: 31144465830
    • T. Smith and R. Simmons. Heuristic search value iteration for POMDPs. In Proc. of UAI 2004, Banff, Alberta, 2004.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.