-
1
-
-
0345224411
-
The continuum-armed bandit problem
-
Rajeev Agrawal. The continuum-armed bandit problem. SIAM J. Control and Optimization, 33(6):1926-1951, 1995.
-
(1995)
SIAM J. Control and Optimization
, vol.33
, Issue.6
, pp. 1926-1951
-
-
Agrawal, R.1
-
2
-
-
0041966002
-
Using confidence bounds for exploitation-exploration trade-offs
-
Preliminary version in 41st IEEE FOCS, 2000
-
Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learning Research, 3:397-422, 2002. Preliminary version in 41st IEEE FOCS, 2000.
-
(2002)
J. Machine Learning Research
, vol.3
, pp. 397-422
-
-
Auer, P.1
-
3
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
Preliminary version in 15th ICML, 1998
-
Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002. Preliminary version in 15th ICML, 1998.
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
4
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Preliminary version in 36th IEEE FOCS, 1995
-
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1):48-77, 2002. Preliminary version in 36th IEEE FOCS, 1995.
-
(2002)
SIAM J. Comput.
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
6
-
-
35448960376
-
Online linear optimization and adaptive routing
-
Preliminary version appeared in 36th ACM STOC, 2004
-
Baruch Awerbuch and Robert Kleinberg. Online linear optimization and adaptive routing. Journal of Computer and System Sciences, 74(1):97-114, February 2008. Preliminary version appeared in 36th ACM STOC, 2004.
-
(2008)
Journal of Computer and System Sciences
, vol.74
, Issue.1
, pp. 97-114
-
-
Awerbuch, B.1
Kleinberg, R.2
-
7
-
-
0000768035
-
Denumerable-armed bandits
-
Jeffrey Banks and Rangarajan Sundaram. Denumerable-armed bandits. Econometrica, 60(5):1071-1096, 1992.
-
(1992)
Econometrica
, vol.60
, Issue.5
, pp. 1071-1096
-
-
Banks, J.1
Sundaram, R.2
-
8
-
-
1242275243
-
Über unendliche, lineare Punktmannichfaltigkeiten, 4
-
G. Cantor. Über unendliche, lineare Punktmannichfaltigkeiten, 4. Mathematische Annalen, 21:51-58, 1883.
-
(1883)
Mathematische Annalen
, vol.21
, pp. 51-58
-
-
Cantor, G.1
-
10
-
-
0031140246
-
How to use expert advice
-
Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, and Manfred K. Warmuth. How to use expert advice. J. ACM, 44(3):427-485, 1997.
-
(1997)
J. ACM
, vol.44
, Issue.3
, pp. 427-485
-
-
Cesa-Bianchi, N.1
Freund, Y.2
Haussler, D.3
Helmbold, D.P.4
Schapire, R.E.5
Warmuth, M.K.6
-
14
-
-
33244456637
-
Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
-
Varsha Dani and Thomas P. Hayes. Robbing the bandit: less regret in online geometric optimization against an adaptive adversary. In 17th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 937-943, 2006.
-
(2006)
17th ACM-SIAM Symp. on Discrete Algorithms (SODA)
, pp. 937-943
-
-
Dani, V.1
Hayes, T.P.2
-
16
-
-
20744454447
-
Online Convex Optimization in the Bandit Setting: Gradient Descent, without a Gradient
-
Abraham Flaxman, Adam Kalai, and H. Brendan McMahan. Online Convex Optimization in the Bandit Setting: Gradient Descent, without a Gradient. In 16th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 385-394, 2005.
-
(2005)
16th ACM-SIAM Symp. on Discrete Algorithms (SODA)
, pp. 385-394
-
-
Flaxman, A.1
Kalai, A.2
McMahan, H.B.3
-
18
-
-
0002955623
-
A dynamic allocation index for the sequential design of experiments
-
J. Gani et al., editor, North-Holland
-
J. C. Gittins and D. M. Jones. A dynamic allocation index for the sequential design of experiments. In J. Gani et al., editor, Progress in Statistics, pages 241-266. North-Holland, 1974.
-
(1974)
Progress in Statistics
, pp. 241-266
-
-
Gittins, J.C.1
Jones, D.M.2
-
19
-
-
46749146164
-
Approximation algorithms for partial-information based stochastic control with Markovian rewards
-
Sudipta Guha and Kamesh Munagala. Approximation algorithms for partial-information based stochastic control with Markovian rewards. In 48th Symp. on Foundations of Computer Science (FOCS), pages 483-493, 2007.
-
(2007)
48th Symp. on Foundations of Computer Science (FOCS)
, pp. 483-493
-
-
Guha, S.1
Munagala, K.2
-
25
-
-
84898981061
-
Nearly tight bounds for the continuum-armed bandit problem
-
Full version appeared in the author's thesis (MIT, 2005)
-
Robert Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In 18th Advances in Neural Information Processing Systems (NIPS), 2004. Full version appeared in the author's thesis (MIT, 2005).
-
(2004)
18th Advances in Neural Information Processing Systems (NIPS)
-
-
Kleinberg, R.1
-
29
-
-
0002899547
-
Asymptotically efficient Adaptive Allocation Rules
-
T.L. Lai and Herbert Robbins. Asymptotically efficient Adaptive Allocation Rules. Advances in Applied Mathematics, 6:4-22, 1985.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.L.1
Robbins, H.2
-
30
-
-
0002365425
-
Contribution à la topologie des ensembles dénombrables
-
S. Mazurkiewicz and W. Sierpinski. Contribution à la topologie des ensembles dénombrables. Fund. Math., 1:17-27, 1920.
-
(1920)
Fund. Math.
, vol.1
, pp. 17-27
-
-
Mazurkiewicz, S.1
Sierpinski, W.2
-
31
-
-
9444257628
-
Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary
-
H. Brendan McMahan and Avrim Blum. Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary. In 17th Conference on Learning Theory (COLT), pages 109-123, 2004.
-
(2004)
17th Conference on Learning Theory (COLT)
, pp. 109-123
-
-
McMahan, H.B.1
Blum, A.2
-
33
-
-
0032047115
-
A game of prediction with expert advice
-
V. Vovk. A game of prediction with expert advice. J. Computer and System Sciences, 56(2):153-173, 1998.
-
(1998)
J. Computer and System Sciences
, vol.56
, Issue.2
, pp. 153-173
-
-
Vovk, V.1
-
34
-
-
0001043843
-
Restless bandits: Activity allocation in a changing world
-
P. Whittle. Restless bandits: Activity allocation in a changing world. J. of Appl. Prob., 25A:287-298, 1988.
-
(1988)
J. of Appl. Prob.
, vol.25A
, pp. 287-298
-
-
Whittle, P.1