-
1
-
-
0041966002
-
Using confidence bounds for exploitation-exploration trade-offs
-
Preliminary version in 41st IEEE FOCS, 2000
-
P. Auer. Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learning Research, 3: 397-422, 2002. Preliminary version in 41st IEEE FOCS, 2000.
-
(2002)
J. Machine Learning Research
, vol.3
, pp. 397-422
-
-
Auer, P.1
-
2
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
Preliminary version in 15th ICML, 1998
-
P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3): 235-256, 2002. Preliminary version in 15th ICML, 1998.
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
3
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Preliminary version in 36th IEEE FOCS, 1995
-
P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1): 48-77, 2002. Preliminary version in 36th IEEE FOCS, 1995.
-
(2002)
SIAM J. Comput.
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
4
-
-
4544345025
-
Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches
-
B. Awerbuch and R. D. Kleinberg. Adaptive routing with end-to-end feedback: distributed learning and geometric approaches. In 36th ACM Symp. on Theory of Computing (STOC), pages 45-53, 2004.
-
(2004)
36th ACM Symp. on Theory of Computing (STOC)
, pp. 45-53
-
-
Awerbuch, B.1
Kleinberg, R.D.2
-
6
-
-
0030134077
-
Conservation laws, extended polymatroids and multi-armed bandit problems: A unified polyhedral approach
-
D. Bertsimas and J. Nino-Mora. Conservation laws, extended polymatroids and multi-armed bandit problems: A unified polyhedral approach. Math. of Oper. Res., 21(2): 257-306, 1996.
-
(1996)
Math. of Oper. Res.
, vol.21
, Issue.2
, pp. 257-306
-
-
Bertsimas, D.1
Nino-Mora, J.2
-
7
-
-
0343441515
-
Restless bandits, linear programming relaxations, and a primal-dual index heuristic
-
D. Bertsimas and J. Nino-Mora. Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Operations Research, 48(1): 80-90, 2000.
-
(2000)
Operations Research
, vol.48
, Issue.1
, pp. 80-90
-
-
Bertsimas, D.1
Nino-Mora, J.2
-
9
-
-
33244456637
-
Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
-
V. Dani and T. P. Hayes. Robbing the bandit: less regret in online geometric optimization against an adaptive adversary. In 17th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 937-943, 2006.
-
(2006)
17th ACM-SIAM Symp. on Discrete Algorithms (SODA)
, pp. 937-943
-
-
Dani, V.1
Hayes, T.P.2
-
11
-
-
0000169010
-
Bandit processes and dynamic allocation indices (with discussion)
-
J. C. Gittins. Bandit processes and dynamic allocation indices (with discussion). J. Roy. Statist. Soc. Ser. B, 41: 148-177, 1979.
-
(1979)
J. Roy. Statist. Soc. Ser. B
, vol.41
, pp. 148-177
-
-
Gittins, J.C.1
-
13
-
-
0002955623
-
A dynamic allocation index for the sequential design of experiments
-
J. C. Gittins and D. M. Jones. A dynamic allocation index for the sequential design of experiments. In J. G. et al., editor, Progress in Statistics, pages 241-266. North-Holland, 1974.
-
(1974)
Progress in Statistics
, pp. 241-266
-
-
Gittins, J.C.1
Jones, D.M.2
-
14
-
-
46749146164
-
Approximation algorithms for partial-information based stochastic control with Markovian rewards
-
S. Guha and K. Munagala. Approximation algorithms for partial-information based stochastic control with Markovian rewards. In 48th Symp. on Foundations of Computer Science (FOCS), 2007.
-
(2007)
48th Symp. on Foundations of Computer Science (FOCS)
-
-
Guha, S.1
Munagala, K.2
-
16
-
-
49949119498
-
Reinforcement learning-based load shared sequential routing
-
F. Heidari, S. Mannor, and L. Mason. Reinforcement learning-based load shared sequential routing. In IFIP Networking, 2007.
-
(2007)
IFIP Networking
-
-
Heidari, F.1
Mannor, S.2
Mason, L.3
-
17
-
-
84898981061
-
Nearly tight bounds for the continuum-armed bandit problem
-
Full version appeared as Chapters 4-5 in [18]
-
R. D. Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In 18th Advances in Neural Information Processing Systems (NIPS), 2004. Full version appeared as Chapters 4-5 in [18].
-
(2004)
18th Advances in Neural Information Processing Systems (NIPS)
-
-
Kleinberg, R.D.1
-
19
-
-
33244473533
-
Anytime algorithms for multi-armed bandit problems
-
Full version appeared as Chapter 6 in [18]
-
R. D. Kleinberg. Anytime algorithms for multi-armed bandit problems. In 17th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 928-936, 2006. Full version appeared as Chapter 6 in [18].
-
(2006)
17th ACM-SIAM Symp. on Discrete Algorithms (SODA)
, pp. 928-936
-
-
Kleinberg, R.D.1
-
21
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
T. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6: 4-22, 1985.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.1
Robbins, H.2
-
22
-
-
9444257628
-
Online geometric optimization in the bandit setting against an adaptive adversary
-
H. B. McMahan and A. Blum. Online geometric optimization in the bandit setting against an adaptive adversary. In 17th Conference on Learning Theory (COLT), pages 109-123, 2004.
-
(2004)
17th Conference on Learning Theory (COLT)
, pp. 109-123
-
-
McMahan, H.B.1
Blum, A.2
-
23
-
-
27944479719
-
On the constant in the nonuniform version of the Berry-Esseen theorem
-
K. Neammanee. On the constant in the nonuniform version of the Berry-Esseen theorem. Intl. J. of Mathematics and Mathematical Sciences, 2005(12): 1951-1967, 2005.
-
(2005)
Intl. J. of Mathematics and Mathematical Sciences
, vol.2005
, Issue.12
, pp. 1951-1967
-
-
Neammanee, K.1
-
24
-
-
17744388964
-
Restless bandits, partial conservation laws and indexability
-
J. Nino-Mora. Restless bandits, partial conservation laws and indexability. Advances in Applied Probability, 33: 76-98, 2001.
-
(2001)
Advances in Applied Probability
, vol.33
, pp. 76-98
-
-
Nino-Mora, J.1
-
27
-
-
84966203785
-
Some aspects of the sequential design of experiments
-
H. Robbins. Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc., 58: 527-535, 1952.
-
(1952)
Bull. Amer. Math. Soc.
, vol.58
, pp. 527-535
-
-
Robbins, H.1
-
28
-
-
0030095077
-
Markov chain convergence: From finite to infinite
-
J. S. Rosenthal. Markov chain convergence: From finite to infinite. Stochastic Processes Appl., 62(1): 55-72, 1996.
-
(1996)
Stochastic Processes Appl.
, vol.62
, Issue.1
, pp. 55-72
-
-
Rosenthal, J.S.1
-
29
-
-
34548750873
-
Generalized bandit problems
-
First appeared as Working Paper, Stern School of Business, 2003
-
R. K. Sundaram. Generalized bandit problems. In D. Austen-Smith and J. Duggan, editors, Social Choice and Strategic Decisions: Essays in Honor of Jeffrey S. Banks (Studies in Choice and Welfare), pages 131-162. Springer, 2005. First appeared as Working Paper, Stern School of Business, 2003.
-
(2005)
Social Choice and Strategic Decisions: Essays in Honor of Jeffrey S. Banks (Studies in Choice and Welfare)
, pp. 131-162
-
-
Sundaram, R.K.1
-
30
-
-
0242590668
-
A short proof of the Gittins index theorem
-
J. N. Tsitsiklis. A short proof of the Gittins index theorem. Annals of Applied Probability, 4(1): 194-199, 1994.
-
(1994)
Annals of Applied Probability
, vol.4
, Issue.1
, pp. 194-199
-
-
Tsitsiklis, J.N.1
-
31
-
-
84975987963
-
Branching bandit processes
-
G. Weiss. Branching bandit processes. Probab. Engng. Inform. Sci., 2: 269-278, 1988.
-
(1988)
Probab. Engng. Inform. Sci.
, vol.2
, pp. 269-278
-
-
Weiss, G.1
-
32
-
-
0000595228
-
Arm acquiring bandits
-
P. Whittle. Arm acquiring bandits. Ann. Probab., 9: 284-292, 1981.
-
(1981)
Ann. Probab.
, vol.9
, pp. 284-292
-
-
Whittle, P.1
-
33
-
-
0001043843
-
Restless bandits: Activity allocation in a changing world
-
P. Whittle. Restless bandits: Activity allocation in a changing world. J. of Appl. Prob., 25A: 287-298, 1988.
-
(1988)
J. of Appl. Prob.
, vol.25A
, pp. 287-298
-
-
Whittle, P.1