SCOPUS Information Retrieval Platform

Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, UAI 2011

Volume , Issue , 2011, Pages 19-26

Learning is planning: Near Bayes-optimal reinforcement learning via Monte-Carlo tree search

(2) Asmuth, John (a); Littman, Michael (a)

(a) Rutgers University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

MARKOV PROCESSES; MONTE CARLO METHODS;

BAYES-OPTIMAL; BELIEF SPACE; INFINITE STATE SPACE; MARKOV DECISION PROCESSES; MONTE CARLO TREE SEARCH (MCTS); MONTE-CARLO TREE SEARCHES; POLYNOMIAL NUMBER; SPARSE SAMPLING;

REINFORCEMENT LEARNING;

EID: 80053158617; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited : (33)

References (27)

1
- 78649507911
- A Bayesian sampling approach to exploration in reinforcement learning
- Asmuth, J., Li, L., Littman, M., Nouri, A., & Wingate, D. (2009). A Bayesian sampling approach to exploration in reinforcement learning. Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI-09).
- (2009) Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI-09)
- Asmuth, J.¹ Li, L.² Littman, M.³ Nouri, A.⁴ Wingate, D.⁵

2
- 80053137163
- Rutgers University Department of Computer Science
- Asmuth, J., & Littman, M. (2011). Appendix (Technical Report DCS-tr-687). Rutgers University Department of Computer Science.
- (2011) Appendix (Technical Report DCS-tr-687)
- Asmuth, J.¹ Littman, M.²

3
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- DOI 10.1023/A:1013689704352, Computational Learning Theory
- Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47, 235-256. (Pubitemid 34126111)
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

4
- 0041965975
- R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning
- Brafman, R. I., & Tennenholtz, M. (2002). R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
- Brafman, R.I.¹ Tennenholtz, M.²

5
- 1942421168
- Design for an optimal probe
- Duff, M. (2003). Design for an optimal probe. Proceedings of the 20th International Conference on Machine Learning.
- (2003) Proceedings of the 20th International Conference on Machine Learning
- Duff, M.¹

6
- 16244388049
- Local bandit approximation for optimal learning problems
- The MIT Press
- Duff, M. O., & Barto, A. G. (1997). Local bandit approximation for optimal learning problems. Advances in Neural Information Processing Systems (pp. 1019-1025). The MIT Press.
- (1997) Advances in Neural Information Processing Systems , pp. 1019-1025
- Duff, M.O.¹ Barto, A.G.²

7
- 57749091602
- Achieving master level play in 9×9 computer Go
- Chicago, Illinois: AAAI Press
- Gelly, S., & Silver, D. (2008). Achieving master level play in 9×9 computer Go. Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 3 (pp. 1537-1540). Chicago, Illinois: AAAI Press.
- (2008) Proceedings of the 23rd National Conference on Artificial Intelligence , vol.3 , pp. 1537-1540
- Gelly, S.¹ Silver, D.²

8
- 23244466805
- Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London
- Kakade, S. M. (2003). On the sample complexity of reinforcement learning. Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London.
- (2003) On the Sample Complexity of Reinforcement Learning
- Kakade, S.M.¹

9
- 84880649215
- A sparse sampling algorithm for near-optimal planning in large Markov decision processes
- Kearns, M., Mansour, Y., & Ng, A. Y. (1999). A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99) (pp. 1324-1331).
- (1999) Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99) , pp. 1324-1331
- Kearns, M.¹ Mansour, Y.² Ng, A.Y.³

10
- 33750293964
- Bandit based Monte-Carlo planning
- Machine Learning: ECML 2006 - 17th European Conference on Machine Learning, Proceedings
- Kocsis, L., & Szepesvari, C. (2006). Bandit based Monte-Carlo planning. Proceedings of the 17th European Conference on Machine Learning (ECML-06) (pp. 282-293). Springer Berlin / Heidelberg. (Pubitemid 44618839)
- (2006) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , vol.LNAI4212 , pp. 282-293
- Kocsis, L.¹ Szepesvari, C.²

11
- 71149109483
- Near-Bayesian exploration in polynomial time
- New York, NY, USA: ACM
- Kolter, J. Z., & Ng, A. Y. (2009). Near-Bayesian exploration in polynomial time. Proceedings of the 26th Annual International Conference on Machine Learning (pp. 513-520). New York, NY, USA: ACM.
- (2009) Proceedings of the 26th Annual International Conference on Machine Learning , pp. 513-520
- Kolter, J.Z.¹ Ng, A.Y.²

12
- 36349026477
- Efficient reinforcement learning with relocatable action models
- Leffler, B. R., Littman, M. L., & Edmunds, T. (2007). Efficient reinforcement learning with relocatable action models. Proceedings of the Twenty-Second Conference on Artificial Intelligence (AAAI-07).
- (2007) Proceedings of the Twenty-Second Conference on Artificial Intelligence (AAAI-07)
- Leffler, B.R.¹ Littman, M.L.² Edmunds, T.³

13
- 70349428076
- Doctoral dissertation, Rutgers University
- Li, L. (2009). A unifying framework for computational reinforcement learning theory (pp. 78-79). Doctoral dissertation, Rutgers University.
- (2009) A Unifying Framework for Computational Reinforcement Learning Theory , pp. 78-79
- Li, L.¹

14
- 0032221058
- Estimating mixture of Dirichlet process models
- MacEachern, S. N., & Muller, P. (1998). Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7, 223-238.
- (1998) Journal of Computational and Graphical Statistics , vol.7 , pp. 223-238
- MacEachern, S.N.¹ Muller, P.²

15
- 77950032550
- Markov chain sampling methods for Dirichlet process mixture models
- Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9, 249-265.
- (2000) Journal of Computational and Graphical Statistics , vol.9 , pp. 249-265
- Neal, R.M.¹

16
- 33749251297
- An analytic solution to discrete Bayesian reinforcement learning
- Poupart, P., Vlassis, N., Hoey, J., & Regan, K. (2006). An analytic solution to discrete Bayesian reinforcement learning. Proceedings of the 23rd International Conference on Machine Learning (pp. 697-704).
- (2006) Proceedings of the 23rd International Conference on Machine Learning , pp. 697-704
- Poupart, P.¹ Vlassis, N.² Hoey, J.³ Regan, K.⁴

17
- 85102627959
- New York, NY: John Wiley & Sons, Inc
- Puterman, M. L. (1994). Markov decision processes: Discrete stochastic dynamic programming. New York, NY: John Wiley & Sons, Inc.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

18
- 0003584577
- Englewood Cliffs, NJ: Prentice-Hall
- Russell, S. J., & Norvig, P. (1994). Artificial intelligence: A modern approach. Englewood Cliffs, NJ: Prentice-Hall.
- (1994) Artificial Intelligence: A Modern Approach
- Russell, S.J.¹ Norvig, P.²

19
- 80053165997
- Variance-based rewards for approximate Bayesian reinforcement learning
- Sorg, J., Singh, S., & Lewis, R. L. (2010). Variance-based rewards for approximate Bayesian reinforcement learning. Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI-10).
- (2010) Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI-10)
- Sorg, J.¹ Singh, S.² Lewis, R.L.³

20
- 36348930987
- Efficient structure learning in factored-state MDPs
- Strehl, A. L., Diuk, C., & Littman, M. L. (2007). Efficient structure learning in factored-state MDPs. Proceedings of the Twenty-Second National Conference on Artificial Intelligence (AAAI-07).
- (2007) Proceedings of the Twenty-Second National Conference on Artificial Intelligence (AAAI-07)
- Strehl, A.L.¹ Diuk, C.² Littman, M.L.³

21
- 55549110436
- An analysis of model-based interval estimation for Markov decision processes
- Special Issue on Learning Theory
- Strehl, A. L., & Littman, M. L. (2008). An analysis of model-based interval estimation for Markov decision processes. Journal of Computer and System Sciences, 74, 1309-1331. Special Issue on Learning Theory.
- (2008) Journal of Computer and System Sciences , vol.74 , pp. 1309-1331
- Strehl, A.L.¹ Littman, M.L.²

22
- 14344258433
- A Bayesian framework for reinforcement learning
- Strens, M. J. A. (2000). A Bayesian framework for reinforcement learning. Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000) (pp. 943-950).
- (2000) Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000) , pp. 943-950
- Strens, M.J.A.¹

23
- 0004102479
- The MIT Press
- Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. The MIT Press.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

24
- 77958578580
- Integrating sample-based planning and model-based reinforcement learning
- Walsh, T., Goschin, S., & Littman, M. (2010). Integrating sample-based planning and model-based reinforcement learning. Proceedings of the Association for the Advancement of Artificial Intelligence.
- (2010) Proceedings of the Association for the Advancement of Artificial Intelligence
- Walsh, T.¹ Goschin, S.² Littman, M.³

25
- 79958846996
- Exploring compact reinforcement-learning representations with linear regression
- Arlington, Virginia, United States: AUAI Press
- Walsh, T. J., Szita, I., Diuk, C., & Littman, M. L. (2009). Exploring compact reinforcement-learning representations with linear regression. Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (pp. 591-598). Arlington, Virginia, United States: AUAI Press.
- (2009) Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence , pp. 591-598
- Walsh, T.J.¹ Szita, I.² Diuk, C.³ Littman, M.L.⁴

26
- 31844436266
- Bayesian sparse sampling for on-line reward optimization
- New York, NY, USA: ACM
- Wang, T., Lizotte, D., Bowling, M., & Schuurmans, D. (2005). Bayesian sparse sampling for on-line reward optimization. ICML '05: Proceedings of the 22nd International Conference on Machine Learning (pp. 956-963). New York, NY, USA: ACM.
- (2005) ICML '05: Proceedings of the 22nd International Conference on Machine Learning , pp. 956-963
- Wang, T.¹ Lizotte, D.² Bowling, M.³ Schuurmans, D.⁴

27
- 34547994508
- Multi-task reinforcement learning: A hierarchical Bayesian approach
- Wilson, A., Fern, A., Ray, S., & Tadepalli, P. (2007). Multi-task reinforcement learning: A hierarchical Bayesian approach. Machine Learning, Proceedings of the Twenty-Fourth International Conference (ICML 2007) (pp. 1015-1022).
- (2007) Proceedings of the Twenty-Fourth International Conference (ICML 2007) , pp. 1015-1022
- Wilson, A.¹ Fern, A.² Ray, S.³ Tadepalli, P.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.