-
1
-
-
0001395850
-
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
-
W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25:285-294, 1933.
-
(1933)
Biometrika
, vol.25
, pp. 285-294
-
-
Thompson, W.R.1
-
3
-
-
34247981226
-
Play the winner rule and the controlled clinical trials
-
M. Zelen. Play the winner rule and the controlled clinical trials. Journal of the American Statistical Association, 64:131-146, 1969.
-
(1969)
Journal of the American Statistical Association
, vol.64
, pp. 131-146
-
-
Zelen, M.1
-
4
-
-
0030352286
-
Learning and strategic pricing
-
D. Bergemann and J. Valimaki. Learning and strategic pricing. Econometrica, 64:1125-1149, 1996.
-
(1996)
Econometrica
, vol.64
, pp. 1125-1149
-
-
Bergemann, D.1
Valimaki, J.2
-
5
-
-
33744719690
-
The financing of innovation: Learning and stopping
-
D. Bergemann and U. Hege. The financing of innovation: Learning and stopping. RAND Journal of Economics, 36(4):719-752, 2005.
-
(2005)
RAND Journal of Economics
, vol.36
, Issue.4
, pp. 719-752
-
-
Bergemann, D.1
Hege, U.2
-
8
-
-
33847255926
-
Dynamic assortment with demand learning for seasonal consumer goods
-
F. Caro and J. Gallien. Dynamic assortment with demand learning for seasonal consumer goods. Management Science, 53:276-292, 2007.
-
(2007)
Management Science
, vol.53
, pp. 276-292
-
-
Caro, F.1
Gallien, J.2
-
10
-
-
85050365667
-
Bandit problems: Sequential allocation of experiments
-
D. A. Berry and B. Fristedt. Bandit problems: sequential allocation of experiments. Chapman and Hall, 1985.
-
(1985)
Chapman and Hall
-
-
Berry, D.A.1
Fristedt, B.2
-
12
-
-
84926078662
-
-
Cambridge University Press, Cambridge, UK
-
N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, Cambridge, UK, 2006.
-
(2006)
Prediction, Learning, and Games
-
-
Cesa-Bianchi, N.1
Lugosi, G.2
-
13
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.L.1
Robbins, H.2
-
15
-
-
0000169010
-
Bandit processes and dynamic allocation indices (with discussion)
-
Series B
-
J. C. Gittins. Bandit processes and dynamic allocation indices (with discussion). Journal of the Royal Statistical Society, Series B, 41:148-177, 1979.
-
(1979)
Journal of the Royal Statistical Society
, vol.41
, pp. 148-177
-
-
Gittins, J.C.1
-
17
-
-
0001043843
-
Restless bandits: Activity allocation in a changing world
-
P. Whittle. Restless bandits: Activity allocation in a changing world. Journal of Applied Probability, 25A:287-298, 1988.
-
(1988)
Journal of Applied Probability
, vol.25 A
, pp. 287-298
-
-
Whittle, P.1
-
19
-
-
0343441515
-
Restless bandits, linear programming relaxations, and primal dual index heuristic
-
D. Bertsimas and J. Nino-Mora. Restless bandits, linear programming relaxations, and primal dual index heuristic. Operations Research, 48(1):80-90, 2000.
-
(2000)
Operations Research
, vol.48
, Issue.1
, pp. 80-90
-
-
Bertsimas, D.1
Nino-Mora, J.2
-
21
-
-
84867856114
-
Regret bounds for restless Markov bandits
-
Springer Berlin Heidelberg
-
R. Ortner, D. Ryabko, P. Auer, and R. Munos. Regret bounds for restless Markov bandits. In Algorithmic Learning Theory, pages 214-228. Springer Berlin Heidelberg, 2012.
-
(2012)
Algorithmic Learning Theory
, pp. 214-228
-
-
Ortner, R.1
Ryabko, D.2
Auer, P.3
Munos, R.4
-
23
-
-
84972545864
-
An analog of the minimax theorem for vector payoffs
-
D. Blackwell. An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics, 6:1-8, 1956.
-
(1956)
Pacific Journal of Mathematics
, vol.6
, pp. 1-8
-
-
Blackwell, D.1
-
24
-
-
0001976283
-
-
Princeton University Press, Princeton, NJ
-
J. Hannan. Approximation to Bayes risk in repeated plays, Contributions to the Theory of Games, Volume 3. Princeton University Press, Princeton, NJ, 1957.
-
(1957)
Approximation to Bayes Risk in Repeated Plays, Contributions to the Theory of Games
, vol.3
-
-
Hannan, J.1
-
27
-
-
80054097465
-
On upper-confidence bound policies for switching bandit problems
-
Springer Berlin Heidelberg
-
A. Garivier and E. Moulines. On upper-confidence bound policies for switching bandit problems. In Algorithmic Learning Theory, pages 174-188. Springer Berlin Heidelberg, 2011.
-
(2011)
Algorithmic Learning Theory
, pp. 174-188
-
-
Garivier, A.1
Moulines, E.2
-
28
-
-
0037709910
-
The non-stochastic multi-armed bandit problem
-
P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. The non-stochastic multi-armed bandit problem. SIAM Journal on Computing, 32:48-77, 2002.
-
(2002)
SIAM Journal on Computing
, vol.32
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
29
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47:235-256, 2002.
-
(2002)
Machine Learning
, vol.47
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
30
-
-
0031211090
-
A decision-theoretic generalization of on-line learning and an application to boosting
-
Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. System Sci., 55:119-139, 1997.
-
(1997)
J. Comput. System Sci.
, vol.55
, pp. 119-139
-
-
Freund, Y.1
Schapire, R.E.2
-
31
-
-
70449882757
-
Multi-armed bandit, dynamic environments and meta-bandits
-
Whistler, Canada
-
C. Hartland, S. Gelly, N. Baskiotis, O. Teytaud, and M. Sebag. Multi-armed bandit, dynamic environments and meta-bandits. NIPS-2006 workshop, Online trading between exploration and exploitation, Whistler, Canada, 2006.
-
(2006)
NIPS-2006 Workshop, Online Trading Between Exploration and Exploitation
-
-
Hartland, C.1
Gelly, S.2
Baskiotis, N.3
Teytaud, O.4
Sebag, M.5