-
1
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
AUER, P., CESA-BIANCHI, N., FREUND, Y., AND SCHAPIRE, R. E. 2002. The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32, 1, 48-77.
-
(2002)
SIAM J. Comput.
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
2
-
-
0031140246
-
How to use expert advice
-
CESA-BIANCHI, N., FREUND, Y., HAUSSLER, D., HELMBOLD, D. P., SCHAPIRE, R. E., AND WARMUTH, M. K. 1997. How to use expert advice. J. ACM 44, 427-485.
-
(1997)
J. ACM
, vol.44
, pp. 427-485
-
-
Cesa-Bianchi, N.1
Freund, Y.2
Haussler, D.3
Helmbold, D.P.4
Schapire, R.E.5
Warmuth, M.K.6
-
3
-
-
0000182415
-
A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations
-
CHERNOFF, H. 1952. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23, 493-507.
-
(1952)
Ann. Math. Stat.
, vol.23
, pp. 493-507
-
-
Chernoff, H.1
-
6
-
-
0002095886
-
A randomization rule for selecting forecasts
-
FOSTER, D. P. AND VOHRA, R. V. 1993. A randomization rule for selecting forecasts. Oper. Res. 41, 704-709.
-
(1993)
Oper. Res.
, vol.41
, pp. 704-709
-
-
Foster, D.P.1
Vohra, R.V.2
-
7
-
-
0002476325
-
Regret and the on-line decision problem
-
FOSTER, D. AND VOHRA, R. 1999. Regret and the on-line decision problem. Games Econ. Behav. 29, 7-35.
-
(1999)
Games Econ. Behav.
, vol.29
, pp. 7-35
-
-
Foster, D.1
Vohra, R.2
-
8
-
-
84983110889
-
A decision-theoretic generalization of on-line learning and an application to boosting
-
(P. Vitányi, Ed.), Lecture Notes in Computer Science. Springer-Verlag, New York
-
FREUND, Y., AND SCHAPIRE, R. E. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Computational Learning Theory, (P. Vitányi, Ed.), Lecture Notes in Computer Science, vol. 904. Springer-Verlag, New York, 23-37.
-
(1995)
Computational Learning Theory
, vol.904
, pp. 23-37
-
-
Freund, Y.1
Schapire, R.E.2
-
9
-
-
0002267135
-
Adaptive game playing using multiplicative weights
-
FREUND, Y., AND SCHAPIRE, R. E. 1999. Adaptive game playing using multiplicative weights. Games Econ. Behav. 29, 79-103.
-
(1999)
Games Econ. Behav.
, vol.29
, pp. 79-103
-
-
Freund, Y.1
Schapire, R.E.2
-
11
-
-
84947403595
-
Probability inequalities for sums of bounded random variables
-
HOEFFDING, W. 1963. Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 13-30.
-
(1963)
J. Amer. Statist. Assoc.
, vol.58
, pp. 13-30
-
-
Hoeffding, W.1
-
12
-
-
23244466805
-
-
Ph.D. dissertation, Gatsby Computational Neuroscience Unit, University College London, London, England
-
KAKADE, S. 2003. On the sample complexity of reinforcement learning. Ph.D. dissertation, Gatsby Computational Neuroscience Unit, University College London, London, England.
-
(2003)
On the Sample Complexity of Reinforcement Learning
-
-
Kakade, S.1
-
14
-
-
0036832954
-
Near-optimal reinforcement learning in polynomial time
-
KEARNS, M., AND SINGH, S. 2002. Near-optimal reinforcement learning in polynomial time. Mach. Learn. 49, 2, 209-232.
-
(2002)
Mach. Learn.
, vol.49
, Issue.2
, pp. 209-232
-
-
Kearns, M.1
Singh, S.2
-
15
-
-
0029344133
-
Machine learning and nonparametric bandit theory
-
LAI, T.-L., AND YAKOWITZ, S. 1995. Machine learning and nonparametric bandit theory. IEEE Trans. Automat. Cont. 40, 7, 1199-1209.
-
(1995)
IEEE Trans. Automat. Cont.
, vol.40
, Issue.7
, pp. 1199-1209
-
-
Lai, T.-L.1
Yakowitz, S.2
-
16
-
-
35148838877
-
The weighted majority algorithm
-
LITTLESTONE, N., AND WARMUTH, M. 1994. The weighted majority algorithm. Inf. Comput. 108, 2, 212-261.
-
(1994)
Inf. Comput.
, vol.108
, Issue.2
, pp. 212-261
-
-
Littlestone, N.1
Warmuth, M.2
-
17
-
-
0032047115
-
A game of prediction with expert advice
-
VOVK, V. 1998. A game of prediction with expert advice. J. Comput. Syst. Sci. 56, 153-173.
-
(1998)
J. Comput. Syst. Sci.
, vol.56
, pp. 153-173
-
-
Vovk, V.1