Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (2003). The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1), 48-77.
Crammer, K., Singer, Y., & Warmuth, K. (2003). Ultraconservative online algorithms for multiclass problems. JMLR, 3, 2003.
Even-Dar, E., Mannor, S., & Mansour, Y. (2002). PAC bounds for multi-armed bandit and Markov decision processes. In COLT '02, pages 255-270, London, UK. Springer-Verlag.
Even-Dar, E., Mannor, S., & Mansour, Y. (2006). Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. JMLR, 7, 1079-1105.
Grove, A. J., Littlestone, N., & Schuurmans, D. (2001). General convergence results for linear discriminant updates. Mach. Learn., 43(3), 173-210.
Kakade, S. M., Shalev-Shwartz, S., & Tewari, A. (2008). Efficient bandit algorithms for online multi-class prediction. In ICML '08, pages 440-447, New York, NY, USA. ACM.
Kivinen, J. & Warmuth, M. K. (1995). Additive versus exponentiated gradient updates for linear prediction. In STOC '95, pages 209-218, New York, NY, USA. ACM.
Langford, J. & Zhang, T. (2007). The epoch-greedy algorithm for contextual multi-armed bandits. In NIPS '07.
Lewis, D. D., Yang, Y., Rose, T., & Li, F. (2004). RCV1: A new benchmark collection for text categorization research. JMLR, 5, 361-397.
Littlestone, N. (1988). Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Mach. Learn., 2(4), 285-318.
Mannor, S. & Tsitsiklis, J. N. (2004). The sample complexity of exploration in multi-armed bandit problem. JMLR, 5, 623-648.
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.