1. Agrawal, S., Goyal, N.: Analysis of Thompson sampling for the multi-armed bandit problem. In: COLT (2012)
2. Audibert, J.-Y., Bubeck, S., Munos, R.: Best arm identification in multi-armed bandits. In: COLT (2010)
4. Bubeck, S., Munos, R., Stoltz, G.: Pure exploration in multi-armed bandits problems. In: Gavaldà, R., Lugosi, G., Zeugmann, T., Zilles, S. (eds.) ALT 2009. LNCS, vol. 5809, pp. 23-37. Springer, Heidelberg (2009)
7. Even-Dar, E., Mannor, S., Mansour, Y.: Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research 7, 1079-1105 (2006)
8. Gabillon, V., Ghavamzadeh, M., Lazaric, A., Bubeck, S.: Multi-bandit best arm identification. In: Advances in Neural Information Processing Systems (2011)
10. Kalyanakrishnan, S., Tewari, A., Auer, P., Stone, P.: PAC subset selection in stochastic multi-armed bandits. In: International Conference on Machine Learning (2012)
11. Li, L., Chapelle, O.: Open problem: Regret bounds for Thompson sampling. In: COLT (2012)
13. Mannor, S., Tsitsiklis, J.N.: The sample complexity of exploration in the multiarmed bandit problem. Journal of Machine Learning Research 5, 623-648 (2004)
16. Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3-4), 285-294 (1933)