-
1
-
-
84919804450
-
Taming the monster: A fast and simple algorithm for contextual bandits
-
abs/1402.0555
-
Agarwal, Alekh, Hsu, Daniel, Kale, Satyen, Langford, John, Li, Lihong, and Schapire, Robert E. Taming the monster: A fast and simple algorithm for contextual bandits. CoRR, abs/1402.0555, 2014.
-
(2014)
CoRR
-
-
Agarwal, A.1
Hsu, D.2
Kale, S.3
Langford, J.4
Li, L.5
Schapire, R.E.6
-
2
-
-
0041966002
-
Using confidence bounds for exploitation-exploration trade-offs
-
Auer, Peter. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 397-422
-
-
Auer, P.1
-
3
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Auer, Peter, Cesa-Bianchi, Nicolo, Freund, Yoav, and Schapire, Robert E. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002.
-
(2002)
SIAM Journal on Computing
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
4
-
-
70350664424
-
The offset tree for learning with partial labels
-
Beygelzimer, Alina and Langford, John. The offset tree for learning with partial labels. In KDD, 2009.
-
(2009)
KDD
-
-
Beygelzimer, A.1
Langford, J.2
-
5
-
-
80053144086
-
Contextual bandit algorithms with supervised learning guarantees
-
Beygelzimer, Alina, Langford, John, Li, Lihong, Reyzin, Lev, and Schapire, Robert E. Contextual bandit algorithms with supervised learning guarantees. In AISTATS, 2011.
-
(2011)
AISTATS
-
-
Beygelzimer, A.1
Langford, J.2
Li, L.3
Reyzin, L.4
Schapire, R.E.5
-
6
-
-
0033280893
-
Beating the holdout: Bounds for k-fold and progressive cross-validation
-
Blum, Avrim, Kalai, Adam, and Langford, John. Beating the holdout: Bounds for k-fold and progressive cross-validation. In COLT, 1999.
-
(1999)
COLT
-
-
Blum, A.1
Kalai, A.2
Langford, J.3
-
7
-
-
85162416700
-
An empirical evaluation of Thompson sampling
-
Chapelle, Olivier and Li, Lihong. An empirical evaluation of Thompson sampling. In NIPS, 2011.
-
(2011)
NIPS
-
-
Chapelle, O.1
Li, L.2
-
8
-
-
84860620518
-
Contextual bandits with linear payoff functions
-
Chu, Wei, Li, Lihong, Reyzin, Lev, and Schapire, Robert E. Contextual bandits with linear payoff functions. In AISTATS, 2011.
-
(2011)
AISTATS
-
-
Chu, W.1
Li, L.2
Reyzin, L.3
Schapire, R.E.4
-
9
-
-
80053154335
-
Efficient optimal learning for contextual bandits
-
Dudik, Miroslav, Hsu, Daniel, Kale, Satyen, Karampatziakis, Nikos, Langford, John, Reyzin, Lev, and Zhang, Tong. Efficient optimal learning for contextual bandits. In UAI, 2011a.
-
(2011)
UAI
-
-
Dudik, M.1
Hsu, D.2
Kale, S.3
Karampatziakis, N.4
Langford, J.5
Reyzin, L.6
Zhang, T.7
-
10
-
-
80053456223
-
Doubly robust policy evaluation and learning
-
Dudik, Miroslav, Langford, John, and Li, Lihong. Doubly robust policy evaluation and learning. In ICML, 2011b.
-
(2011)
ICML
-
-
Dudik, M.1
Langford, J.2
Li, L.3
-
11
-
-
0031122905
-
Predicting nearly as well as the best pruning of a decision tree
-
Helmbold, David P. and Schapire, Robert E. Predicting nearly as well as the best pruning of a decision tree. Machine Learning, 27(1):51-68, 1997.
-
(1997)
Machine Learning
, vol.27
, Issue.1
, pp. 51-68
-
-
Helmbold, D.P.1
Schapire, R.E.2
-
13
-
-
77956144722
-
The epoch-greedy algorithm for contextual multi-armed bandits
-
Langford, John and Zhang, Tong. The epoch-greedy algorithm for contextual multi-armed bandits. In NIPS, 2007.
-
(2007)
NIPS
-
-
Langford, J.1
Zhang, T.2
-
14
-
-
84876811202
-
RCV1: A new benchmark collection for text categorization research
-
Lewis, David D, Yang, Yiming, Rose, Tony G, and Li, Fan. RCV1: A new benchmark collection for text categorization research. The Journal of Machine Learning Research, 5:361-397, 2004.
-
(2004)
The Journal of Machine Learning Research
, vol.5
, pp. 361-397
-
-
Lewis, D.D.1
Yang, Y.2
Rose, T.G.3
Li, F.4
-
15
-
-
84919804446
-
Generalized Thompson sampling for contextual bandits
-
abs/1310.7163
-
Li, Lihong. Generalized Thompson sampling for contextual bandits. CoRR, abs/1310.7163, 2013.
-
(2013)
CoRR
-
-
Li, L.1
-
16
-
-
77954641643
-
A contextual-bandit approach to personalized news article recommendation
-
Li, Lihong, Chu, Wei, Langford, John, and Schapire, Robert E. A contextual-bandit approach to personalized news article recommendation. In WWW, 2010.
-
(2010)
WWW
-
-
Li, L.1
Chu, W.2
Langford, J.3
Schapire, R.E.4
-
17
-
-
84898068653
-
Tighter bounds for multi-armed bandits with expert advice
-
McMahan, H. Brendan and Streeter, Matthew. Tighter bounds for multi-armed bandits with expert advice. In COLT, 2009.
-
(2009)
COLT
-
-
McMahan, H.B.1
Streeter, M.2
-
19
-
-
0001395850
-
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
-
Thompson, William R. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3-4):285-294, 1933.
-
(1933)
Biometrika
, vol.25
, Issue.3-4
, pp. 285-294
-
-
Thompson, W.R.1