-
1
-
-
0345224411
-
The continuum-armed bandit problem
-
R. Agrawal. The continuum-armed bandit problem. SIAM J. Control and Optimization, 33(6): 1926-1951, 1995.
-
(1995)
SIAM J. Control and Optimization
, vol.33
, Issue.6
, pp. 1926-1951
-
-
Agrawal, R.1
-
2
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002.
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
3
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1):48-77, 2002.
-
(2002)
SIAM J. Comput
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
6
-
-
0000768035
-
Denumerable-armed bandits
-
J. S. Banks and R. K. Sundaram. Denumerable-armed bandits. Econometrica, 60(5):1071-1096, 1992.
-
(1992)
Econometrica
, vol.60
, Issue.5
, pp. 1071-1096
-
-
Banks, J.S.1
Sundaram, R.K.2
-
9
-
-
57049145053
-
-
E. Cope. Regret and convergence bounds for immediate-reward reinforcement learning with continuous action spaces, 2004. Unpublished manuscript.
-
-
-
-
10
-
-
57049131250
-
The Price of Bandit Information for Online Optimization
-
Preprint
-
V. Dani, T. Hayes, and S. M. Kakade. The Price of Bandit Information for Online Optimization. Preprint, 2007.
-
(2007)
-
-
Dani, V.1
Hayes, T.2
Kakade, S.M.3
-
11
-
-
33244456637
-
Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
-
V. Dani and T. P. Hayes. Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary. In 16th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 937-943, 2006.
-
(2006)
16th ACM-SIAM Symp. on Discrete Algorithms (SODA)
, pp. 937-943
-
-
Dani, V.1
Hayes, T.P.2
-
13
-
-
57049154285
-
-
J. C. Gittins and D. M. Jones. A dynamic allocation index for the sequential design of experiments. In J. G. et al., editor, Progress in Statistics, pages 241-266. North-Holland, 1974.
-
-
-
-
15
-
-
84898981061
-
Nearly tight bounds for the continuum-armed bandit problem
-
Full version appeared as Chapters 4-5 in [16]
-
R. Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In 18th Advances in Neural Information Processing Systems (NIPS), 2004. Full version appeared as Chapters 4-5 in [16].
-
(2004)
18th Advances in Neural Information Processing Systems (NIPS)
-
-
Kleinberg, R.1
-
17
-
-
9444257628
-
Online geometric optimization in the bandit setting against an adaptive adversary
-
17th Annual Conference on Learning Theory (COLT), volume 3120 of LNCS, Springer Verlag
-
H. B. McMahan and A. Blum. Online geometric optimization in the bandit setting against an adaptive adversary. In 17th Annual Conference on Learning Theory (COLT), volume 3120 of LNCS, pages 109-123. Springer Verlag, 2004.
-
(2004)
LNCS
, vol.3120
, pp. 109-123
-
-
McMahan, H.B.1
Blum, A.2
-
18
-
-
27944479719
-
-
K. Neammanee. On the constant in the nonuniform version of the Berry-Esseen theorem. Intl. J. of Mathematics and Mathematical Sciences, 2005:12:1951-1967, 2005.
-
-
-