-
1
-
-
0345224411
-
The continuum-armed bandit problem
-
Rajeev Agrawal. The continuum-armed bandit problem. SIAM J. Control and Optimization, 33(6):1926-1951, 1995.
-
(1995)
SIAM J. Control and Optimization
, vol.33
, Issue.6
, pp. 1926-1951
-
-
Agrawal, R.1
-
2
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
Preliminary version in 15th ICML, 1998
-
Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002. Preliminary version in 15th ICML, 1998.
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
3
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Preliminary version in 36th IEEE FOCS, 1995
-
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1):48-77, 2002. Preliminary version in 36th IEEE FOCS, 1995.
-
(2002)
SIAM J. Comput.
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
4
-
-
38049040954
-
Improved rates for the stochastic continuum-armed bandit problem
-
Peter Auer, Ronald Ortner, and Csaba Szepesvári. Improved Rates for the Stochastic Continuum-Armed Bandit Problem. In 20th COLT, pages 454-468, 2007.
-
(2007)
20th COLT
, pp. 454-468
-
-
Auer, P.1
Ortner, R.2
Szepesvári, C.3
-
5
-
-
35448960376
-
Online linear optimization and adaptive routing
-
Preliminary version in 36th ACM STOC, 2004
-
Baruch Awerbuch and Robert Kleinberg. Online linear optimization and adaptive routing. J. of Computer and System Sciences, 74(1):97-114, February 2008. Preliminary version in 36th ACM STOC, 2004.
-
(2008)
J. of Computer and System Sciences
, vol.74
, Issue.1
, pp. 97-114
-
-
Awerbuch, B.1
Kleinberg, R.2
-
6
-
-
36448945038
-
A semantic approach to contextual advertising
-
Andrei Broder, Marcus Fontoura, Vanja Josifovski, and Lance Riedel. A semantic approach to contextual advertising. In 30th SIGIR, pages 559-566, 2007.
-
(2007)
30th SIGIR
, pp. 559-566
-
-
Broder, A.1
Fontoura, M.2
Josifovski, V.3
Riedel, L.4
-
7
-
-
84860634388
-
Online optimization in X-armed bandits
-
Preliminary version in NIPS 2008
-
Sébastien Bubeck, Rémi Munos, Gilles Stoltz, and Csaba Szepesvári. Online Optimization in X-Armed Bandits. J. of Machine Learning Research (JMLR), 12:1587-1627, 2011. Preliminary version in NIPS 2008.
-
(2011)
J. of Machine Learning Research (JMLR)
, vol.12
, pp. 1587-1627
-
-
Bubeck, S.1
Munos, R.2
Stoltz, G.3
Szepesvári, C.4
-
9
-
-
67649577204
-
Regret and convergence bounds for immediate-reward reinforcement learning with continuous action spaces
-
A manuscript from 2004
-
Eric Cope. Regret and convergence bounds for immediate-reward reinforcement learning with continuous action spaces. IEEE Trans. on Automatic Control, 54(6):1243-1253, 2009. A manuscript from 2004.
-
(2009)
IEEE Trans. on Automatic Control
, vol.54
, Issue.6
, pp. 1243-1253
-
-
Cope, E.1
-
10
-
-
33244456637
-
Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
-
Varsha Dani and Thomas P. Hayes. Robbing the bandit: less regret in online geometric optimization against an adaptive adversary. In 17th ACM-SIAM SODA, pages 937-943, 2006.
-
(2006)
17th ACM-SIAM SODA
, pp. 937-943
-
-
Dani, V.1
Hayes, T.P.2
-
11
-
-
70349295143
-
The price of bandit information for online optimization
-
Varsha Dani, Thomas P. Hayes, and Sham Kakade. The Price of Bandit Information for Online Optimization. In 20th NIPS, 2007.
-
(2007)
20th NIPS
-
-
Dani, V.1
Hayes, T.P.2
Kakade, S.3
-
12
-
-
20744454447
-
Online convex optimization in the bandit setting: Gradient descent without a gradient
-
Abraham Flaxman, Adam Kalai, and H. Brendan McMahan. Online Convex Optimization in the Bandit Setting: Gradient Descent without a Gradient. In 16th ACM-SIAM SODA, pages 385-394, 2005.
-
(2005)
16th ACM-SIAM SODA
, pp. 385-394
-
-
Flaxman, A.1
Kalai, A.2
McMahan, H.B.3
-
13
-
-
77958578450
-
Combining online and offline knowledge in UCT
-
Sylvain Gelly and David Silver. Combining online and offline knowledge in UCT. In 24th ICML, 2007.
-
(2007)
24th ICML
-
-
Gelly, S.1
Silver, D.2
-
14
-
-
70349295261
-
Achieving master level play in 9x9 computer go
-
Sylvain Gelly and David Silver. Achieving master level play in 9x9 computer go. In 23rd AAAI, 2008.
-
(2008)
23rd AAAI
-
-
Gelly, S.1
Silver, D.2
-
15
-
-
0344550482
-
Bounded geometries, fractals, and low-distortion embeddings
-
Anupam Gupta, Robert Krauthgamer, and James R. Lee. Bounded geometries, fractals, and low-distortion embeddings. In 44th IEEE FOCS, pages 534-543, 2003.
-
(2003)
44th IEEE FOCS
, pp. 534-543
-
-
Gupta, A.1
Krauthgamer, R.2
Lee, J.R.3
-
17
-
-
38049011420
-
Nearly tight bounds for the continuum-armed bandit problem
-
Robert Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In 18th NIPS, 2004.
-
(2004)
18th NIPS
-
-
Kleinberg, R.1
-
19
-
-
77951694424
-
Sharp dichotomies for regret minimization in metric spaces
-
Robert Kleinberg and Aleksandrs Slivkins. Sharp Dichotomies for Regret Minimization in Metric Spaces. In 21st ACM-SIAM SODA, 2010.
-
(2010)
21st ACM-SIAM SODA
-
-
Kleinberg, R.1
Slivkins, A.2
-
20
-
-
57049185311
-
Multi-armed bandits in metric spaces
-
Robert Kleinberg, Aleksandrs Slivkins, and Eli Upfal. Multi-Armed Bandits in Metric Spaces. In 40th ACM STOC, pages 681-690, 2008.
-
(2008)
40th ACM STOC
, pp. 681-690
-
-
Kleinberg, R.1
Slivkins, A.2
Upfal, E.3
-
21
-
-
33750293964
-
Bandit based monte-carlo planning
-
Levente Kocsis and Csaba Szepesvári. Bandit Based Monte-Carlo Planning. In 17th ECML, pages 282-293, 2006.
-
(2006)
17th ECML
, pp. 282-293
-
-
Kocsis, L.1
Szepesvári, C.2
-
22
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
T.L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules. Advances in Applied Mathematics, 6:4-22, 1985.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.L.1
Robbins, H.2
-
23
-
-
9444257628
-
Online geometric optimization in the bandit setting against an adaptive adversary
-
H. Brendan McMahan and Avrim Blum. Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary. In 17th COLT, pages 109-123, 2004.
-
(2004)
17th COLT
, pp. 109-123
-
-
McMahan, H.B.1
Blum, A.2
-
24
-
-
84860618045
-
Bandit algorithms for tree search
-
Rémi Munos and Pierre-Arnaud Coquelin. Bandit algorithms for tree search. In 23rd UAI, 2007.
-
(2007)
23rd UAI
-
-
Munos, R.1
Coquelin, P.-A.2
-
26
-
-
70350700875
-
Multi-armed bandit problems with dependent arms
-
Sandeep Pandey, Deepayan Chakrabarti, and Deepak Agarwal. Multi-armed Bandit Problems with Dependent Arms. In 24th ICML, 2007.
-
(2007)
24th ICML
-
-
Pandey, S.1
Chakrabarti, D.2
Agarwal, D.3
-
27
-
-
77954582369
-
Classification-enhanced ranking
-
Susan T. Dumais, Paul N. Bennett, and Krysta Marie Svore. Classification-enhanced ranking. In 19th WWW, pages 111-120, 2010.
-
(2010)
19th WWW
, pp. 111-120
-
-
Dumais, S.T.1
Bennett, P.N.2
Svore, K.M.3
-
28
-
-
56449088596
-
Learning diverse rankings with multi-armed bandits
-
Filip Radlinski, Robert Kleinberg, and Thorsten Joachims. Learning diverse rankings with multi-armed bandits. In 25th ICML, pages 784-791, 2008.
-
(2008)
25th ICML
, pp. 784-791
-
-
Radlinski, F.1
Kleinberg, R.2
Joachims, T.3
-
29
-
-
77956542736
-
Learning optimally diverse rankings over large document collections
-
Aleksandrs Slivkins, Filip Radlinski, and Sreenivas Gollapudi. Learning optimally diverse rankings over large document collections. In 27th ICML, pages 983-990, 2010.
-
(2010)
27th ICML
, pp. 983-990
-
-
Slivkins, A.1
Radlinski, F.2
Gollapudi, S.3