-
2
-
-
78649420293
-
Regret bounds and minimax policies under partial monitoring
-
Audibert, J.-Y., Bubeck, S.: Regret bounds and minimax policies under partial monitoring. Journal of Machine Learning Research 11, 2785-2836 (2010)
-
(2010)
Journal of Machine Learning Research
, vol.11
, pp. 2785-2836
-
-
Audibert, J.-Y.1
Bubeck, S.2
-
3
-
-
62949181077
-
Exploration-exploitation trade-off using variance estimates in multi-armed bandits
-
Audibert, J.-Y., Munos, R., Szepesvári, C.: Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science 410(19), 1876-1902 (2009)
-
(2009)
Theoretical Computer Science
, vol.410
, Issue.19
, pp. 1876-1902
-
-
Audibert, J.-Y.1
Munos, R.2
Szepesvári, C.3
-
4
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2), 235-256 (2002)
-
(2002)
Machine Learning
, vol.47
, Issue.2
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
5
-
-
85162416700
-
An empirical evaluation of Thompson sampling
-
Chapelle, O., Li, L.: An empirical evaluation of Thompson sampling. In: NIPS (2011)
-
(2011)
NIPS
-
-
Chapelle, O.1
Li, L.2
-
7
-
-
78549244167
-
Solving two-armed Bernoulli bandit problems using a Bayesian learning automaton
-
Granmo, O.C.: Solving two-armed Bernoulli bandit problems using a Bayesian learning automaton. International Journal of Intelligent Computing and Cybernetics 3(2), 207-234 (2010)
-
(2010)
International Journal of Intelligent Computing and Cybernetics
, vol.3
, Issue.2
, pp. 207-234
-
-
Granmo, O.C.1
-
9
-
-
84867888879
-
On Bayesian upper-confidence bounds for bandit problems
-
Kaufmann, E., Garivier, A., Cappé, O.: On Bayesian upper-confidence bounds for bandit problems. In: AISTATS (2012)
-
(2012)
AISTATS
-
-
Kaufmann, E.1
Garivier, A.2
Cappé, O.3
-
10
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6(1), 4-22 (1985)
-
(1985)
Advances in Applied Mathematics
, vol.6
, Issue.1
, pp. 4-22
-
-
Lai, T.L.1
Robbins, H.2
-
11
-
-
84874038864
-
A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
-
Maillard, O.-A., Munos, R., Stoltz, G.: A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. In: Conference on Learning Theory, COLT (2011)
-
Conference on Learning Theory, COLT (2011)
-
-
Maillard, O.-A.1
Munos, R.2
Stoltz, G.3
-
12
-
-
84864939787
-
Optimistic Bayesian sampling in contextual bandit problems
-
May, B.C., Korda, N., Lee, A., Leslie, D.: Optimistic Bayesian sampling in contextual bandit problems. Journal of Machine Learning Research 13, 2069-2106 (2012)
-
(2012)
Journal of Machine Learning Research
, vol.13
, pp. 2069-2106
-
-
May, B.C.1
Korda, N.2
Lee, A.3
Leslie, D.4
-
13
-
-
80054114465
-
Deviations of Stochastic Bandit Regret
-
Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds.) ALT 2011. Springer, Heidelberg
-
Salomon, A., Audibert, J.-Y.: Deviations of Stochastic Bandit Regret. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds.) ALT 2011. LNCS, vol. 6925, pp. 159-173. Springer, Heidelberg (2011)
-
(2011)
LNCS
, vol.6925
, pp. 159-173
-
-
Salomon, A.1
Audibert, J.-Y.2
-
14
-
-
0001395850
-
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
-
Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285-294 (1933)
-
(1933)
Biometrika
, vol.25
, pp. 285-294
-
-
Thompson, W.R.1