-
1
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
January
-
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32:48-77, January 2003.
-
(2003)
SIAM J. Comput.
, vol.32
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
2
-
-
84898063697
-
Competing in the dark: An efficient algorithm for bandit linear optimization
-
Jacob Abernethy, Elad Hazan, and Alexander Rakhlin. Competing in the dark: An efficient algorithm for bandit linear optimization. In COLT, pages 263-274, 2008.
-
(2008)
COLT
, pp. 263-274
-
-
Abernethy, J.1
Hazan, E.2
Rakhlin, A.3
-
3
-
-
35448960376
-
Online linear optimization and adaptive routing
-
Baruch Awerbuch and Robert Kleinberg. Online linear optimization and adaptive routing. J. Comput. Syst. Sci., 74(1):97-114, 2008.
-
(2008)
J. Comput. Syst. Sci.
, vol.74
, Issue.1
, pp. 97-114
-
-
Awerbuch, B.1
Kleinberg, R.2
-
4
-
-
84898768231
-
An efficient bandit algorithm for √T-regret in online multiclass prediction?
-
Jacob Abernethy and Alexander Rakhlin. An efficient bandit algorithm for √T-regret in online multiclass prediction? In COLT, 2009.
-
(2009)
COLT
-
-
Abernethy, J.1
Rakhlin, A.2
-
5
-
-
80053461043
-
Multiclass classification with bandit feedback using adaptive regularization
-
Koby Crammer and Claudio Gentile. Multiclass classification with bandit feedback using adaptive regularization. In ICML, 2011.
-
(2011)
ICML
-
-
Crammer, K.1
Gentile, C.2
-
6
-
-
33244456637
-
Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
-
Varsha Dani and Thomas P. Hayes. Robbing the bandit: less regret in online geometric optimization against an adaptive adversary. In SODA, pages 937-943, 2006.
-
(2006)
SODA
, pp. 937-943
-
-
Dani, V.1
Hayes, T.P.2
-
7
-
-
70349295143
-
The price of bandit information for online optimization
-
Varsha Dani, Thomas Hayes, and Sham Kakade. The price of bandit information for online optimization. In NIPS, 2007.
-
(2007)
NIPS
-
-
Dani, V.1
Hayes, T.2
Kakade, S.3
-
8
-
-
20744454447
-
Online convex optimization in the bandit setting: Gradient descent without a gradient
-
Abraham D. Flaxman, Adam Tauman Kalai, and H. Brendan McMahan. Online convex optimization in the bandit setting: gradient descent without a gradient. In SODA, pages 385-394, 2005.
-
(2005)
SODA
, pp. 385-394
-
-
Flaxman, A.D.1
Kalai, A.T.2
McMahan, H.B.3
-
9
-
-
35348918820
-
Logarithmic regret algorithms for online convex optimization
-
Elad Hazan, Amit Agarwal, and Satyen Kale. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69(2-3):169-192, 2007.
-
(2007)
Machine Learning
, vol.69
, Issue.2-3
, pp. 169-192
-
-
Hazan, E.1
Agarwal, A.2
Kale, S.3
-
11
-
-
56449104477
-
Efficient bandit algorithms for online multiclass prediction
-
Sham M. Kakade, Shai Shalev-Shwartz, and Ambuj Tewari. Efficient bandit algorithms for online multiclass prediction. In ICML'08, pages 440-447, 2008.
-
(2008)
ICML'08
, pp. 440-447
-
-
Kakade, S.M.1
Shalev-Shwartz, S.2
Tewari, A.3
-
12
-
-
77956144722
-
The epoch-greedy algorithm for multi-armed bandits with side information
-
John Langford and Tong Zhang. The epoch-greedy algorithm for multi-armed bandits with side information. In NIPS, 2007.
-
(2007)
NIPS
-
-
Langford, J.1
Zhang, T.2
-
13
-
-
9444257628
-
Online geometric optimization in the bandit setting against an adaptive adversary
-
H. Brendan McMahan and Avrim Blum. Online geometric optimization in the bandit setting against an adaptive adversary. In COLT, pages 109-123, 2004.
-
(2004)
COLT
, pp. 109-123
-
-
McMahan, H.B.1
Blum, A.2
-
14
-
-
84862517941
-
Closing the gap between bandit and full-information online optimization: High-probability regret bound
-
EECS Department, University of California, Berkeley, Aug
-
Alexander Rakhlin, Ambuj Tewari, and Peter Bartlett. Closing the gap between bandit and full-information online optimization: High-probability regret bound. Technical Report UCB/EECS-2007-109, EECS Department, University of California, Berkeley, Aug 2007.
-
(2007)
Technical Report UCB/EECS-2007-109
-
-
Rakhlin, A.1
Tewari, A.2
Bartlett, P.3