-
1
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
DOI 10.1023/A:1013689704352, Computational Learning Theory
-
Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002. (Pubitemid 34126111)
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
2
-
-
80052659095
-
An optimal high probability algorithm for the contextual bandit problem
-
abs/1002.4058
-
Alina Beygelzimer, John Langford, Lihong Li, Lev Reyzin, and Robert E. Schapire. An optimal high probability algorithm for the contextual bandit problem. Computing Research Repository, abs/1002.4058, 2010.
-
(2010)
Computing Research Repository
-
-
Beygelzimer, A.1
Langford, J.2
Li, L.3
Reyzin, L.4
Schapire, R.E.5
-
6
-
-
33745295134
-
Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
-
Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7:1079-1105, 2006. (Pubitemid 43938989)
-
(2006)
Journal of Machine Learning Research
, vol.7
, pp. 1079-1105
-
-
Even-Dar, E.1
Mannor, S.2
Mansour, Y.3
-
10
-
-
84876811202
-
Rcv1: A new benchmark collection for text categorization research
-
D. D. Lewis, Y. Yang, T. Rose, and F. Li. Rcv1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5:361-397, 2004.
-
(2004)
Journal of Machine Learning Research
, vol.5
, pp. 361-397
-
-
Lewis, D.D.1
Yang, Y.2
Rose, T.3
Li, F.4
-
11
-
-
77954641643
-
A contextual-bandit approach to personalized news article recommendation
-
New York, NY, USA, ACM
-
Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. A contextual-bandit approach to personalized news article recommendation. In WWW'10: Proceedings of the 19th international conference on World wide web, pages 661-670, New York, NY, USA, 2010. ACM.
-
(2010)
WWW'10: Proceedings of the 19th International Conference on World Wide Web
, pp. 661-670
-
-
Li, L.1
Chu, W.2
Langford, J.3
Schapire, R.E.4
-
12
-
-
77956210502
-
Exploitation and exploration in a performance based contextual advertising system
-
Wei Li, Xuerui Wang, Ruofei Zhang, Ying Cui, Jianchang Mao, and Rong Jin. Exploitation and exploration in a performance based contextual advertising system. In KDD 2010: Knowledge Discovery and Data Mining, pages 27-36, 2010.
-
(2010)
KDD 2010: Knowledge Discovery and Data Mining
, pp. 27-36
-
-
Li, W.1
Wang, X.2
Zhang, R.3
Cui, Y.4
Mao, J.5
Jin, R.6
-
13
-
-
30044441333
-
The sample complexity of exploration in the multi-armed bandit problem
-
Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, 5:623-648, 2004.
-
(2004)
Journal of Machine Learning Research
, vol.5
, pp. 623-648
-
-
Mannor, S.1
Tsitsiklis, J.N.2
-
14
-
-
84966203785
-
Some aspects of the sequential design of experiments
-
Herbert Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.
-
(1952)
Bulletin of the American Mathematical Society
, vol.58
, pp. 527-535
-
-
Robbins, H.1
-
15
-
-
84966203785
-
Some aspects of the sequential design of experiments
-
Herbert Robbins. Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc., 58(5):527-535, 1952.
-
(1952)
Bull. Amer. Math. Soc.
, vol.58
, Issue.5
, pp. 527-535
-
-
Robbins, H.1
-
16
-
-
11144273669
-
The perceptron: A probabilistic model for information storage and organization in the brain
-
F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386-408, 1958.
-
(1958)
Psychological Review
, vol.65
, pp. 386-408
-
-
Rosenblatt, F.1
-
17
-
-
33646406807
-
Multi-armed bandit algorithms and empirical evaluation
-
Springer
-
Joannès Vermorel and Mehryar Mohri. Multi-armed bandit algorithms and empirical evaluation. In European Conference on Machine Learning, pages 437-448. Springer, 2005.
-
(2005)
European Conference on Machine Learning
, pp. 437-448
-
-
Vermorel, J.1
Mohri, M.2