-
1
-
-
70049118714
-
A Bayesian sampling approach to exploration in reinforcement learning
-
Preprint
-
Asmuth, J., Li, L., Littman, M. L., Nouri, A., & Wingate, D. (2009). A Bayesian sampling approach to exploration in reinforcement learning. (Preprint).
-
(2009)
-
-
Asmuth, J.1
Li, L.2
Littman, M.L.3
Nouri, A.4
Wingate, D.5
-
2
-
-
56449090814
-
Logarithmic online regret bounds for undiscounted reinforcement learning
-
Auer, P., & Ortner, R. (2007). Logarithmic online regret bounds for undiscounted reinforcement learning. Neural Information Processing Systems (pp. 49-56).
-
(2007)
Neural Information Processing Systems
, pp. 49-56
-
-
Auer, P.1
Ortner, R.2
-
3
-
-
0041965975
-
R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning
-
Brafman, R. I., & Tennenholtz, M. (2002). R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
4
-
-
70049084399
-
CORL: A continuous-state offset-dynamics reinforcement learner
-
Brunskill, E., Leffler, B. R., Li, L., Littman, M. L., & Roy, N. (2008). CORL: A continuous-state offset-dynamics reinforcement learner. Proceedings of the International Conference on Uncertainty in Artificial Intelligence (pp. 53-61).
-
(2008)
Proceedings of the International Conference on Uncertainty in Artificial Intelligence
, pp. 53-61
-
-
Brunskill, E.1
Leffler, B.R.2
Li, L.3
Littman, M.L.4
Roy, N.5
-
7
-
-
71149084050
-
-
21 1033-1039
-
21 1033-1039,
-
-
-
-
8
-
-
71149097616
-
-
22 1-12
-
22 1-12,
-
-
-
-
9
-
-
71149104625
-
-
22 109-121
-
22 109-121.
-
-
-
-
13
-
-
23244466805
-
-
Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London
-
Kakade, S. M. (2003). On the sample complexity of reinforcement learning. Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London.
-
(2003)
On the sample complexity of reinforcement learning
-
-
Kakade, S.M.1
-
15
-
-
0036832954
-
Near-optimal reinforcement learning in polynomial time
-
Kearns, M., & Singh, S. (2002). Near-optimal reinforcement learning in polynomial time. Machine Learning, 49, 209-232.
-
(2002)
Machine Learning
, vol.49
, pp. 209-232
-
-
Kearns, M.1
Singh, S.2
-
17
-
-
33749251297
-
An analytic solution to discrete Bayesian reinforcement learning
-
Poupart, P., Vlassis, N., Hoey, J., & Regan, K. (2006). An analytic solution to discrete Bayesian reinforcement learning. Proceedings of the International Conference on Machine Learning (pp. 697-704).
-
(2006)
Proceedings of the International Conference on Machine Learning
, pp. 697-704
-
-
Poupart, P.1
Vlassis, N.2
Hoey, J.3
Regan, K.4
-
19
-
-
0010956944
-
Distribution inequalities for the binomial law
-
Slud, E. V. (1977). Distribution inequalities for the binomial law. The Annals of Probability, 5, 404-412.
-
(1977)
The Annals of Probability
, vol.5
, pp. 404-412
-
-
Slud, E.V.1
-
20
-
-
33749255382
-
PAC model-free reinforcement learning
-
Strehl, A. L., Li, L., Wiewiora, E., Langford, J., & Littman, M. L. (2006). PAC model-free reinforcement learning. Proceedings of the International Conference on Machine Learning (pp. 881-888).
-
(2006)
Proceedings of the International Conference on Machine Learning
, pp. 881-888
-
-
Strehl, A.L.1
Li, L.2
Wiewiora, E.3
Langford, J.4
Littman, M.L.5
-
21
-
-
55549110436
-
An analysis of model-based interval estimation for Markov decision processes
-
Strehl, A. L., & Littman, M. L. (2008a). An analysis of model-based interval estimation for Markov decision processes. Journal of Computer and System Sciences, 74, 1309-1331.
-
(2008)
Journal of Computer and System Sciences
, vol.74
, pp. 1309-1331
-
-
Strehl, A.L.1
Littman, M.L.2
-
22
-
-
85162058047
-
Online linear regression and its application to model-based reinforcement learning
-
Strehl, A. L., & Littman, M. L. (2008b). Online linear regression and its application to model-based reinforcement learning. Neural Information Processing Systems (pp. 1417-1424).
-
(2008)
Neural Information Processing Systems
, pp. 1417-1424
-
-
Strehl, A.L.1
Littman, M.L.2
-
24
-
-
31844436266
-
Bayesian sparse sampling for on-line reward optimization
-
Wang, T., Lizotte, D., Bowling, M., & Schuurmans, D. (2005). Bayesian sparse sampling for on-line reward optimization. Proceedings of the International Conference on Machine Learning (pp. 956-963).
-
(2005)
Proceedings of the International Conference on Machine Learning
, pp. 956-963
-
-
Wang, T.1
Lizotte, D.2
Bowling, M.3
Schuurmans, D.4