-
1
-
-
0000616723
-
Sample mean based index policies with O(log n) regret for the multiarmed bandit problem
-
Agrawal, R.: Sample mean based index policies with O(log n) regret for the multiarmed bandit problem. Adv. in Appl. Probab. 27(4), 1054-1078 (1995)
-
(1995)
Adv. in Appl. Probab.
, vol.27
, Issue.4
, pp. 1054-1078
-
-
Agrawal, R.1
-
2
-
-
38149013086
-
Tuning bandit algorithms in stochastic environments
-
Hutter, M., Servedio, R.A., Takimoto, E. (eds.) ALT 2007. Springer, Heidelberg
-
Audibert, J.Y., Munos, R., Szepesvari, A.: Tuning bandit algorithms in stochastic environments. In: Hutter, M., Servedio, R.A., Takimoto, E. (eds.) ALT 2007. LNCS (LNAI), vol. 4754, pp. 150-165. Springer, Heidelberg (2007)
-
(2007)
LNCS (LNAI)
, vol.4754
, pp. 150-165
-
-
Audibert, J.Y.1
Munos, R.2
Szepesvari, A.3
-
3
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1), 48-77 (2002)
-
(2002)
SIAM J. Comput.
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
4
-
-
0041966002
-
Using confidence bounds for exploitation-exploration trade-offs
-
Spec. Issue Comput. Learn. Theory
-
Auer, P.: Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 3(Spec. Issue Comput. Learn. Theory), 397-422 (2002)
-
(2002)
J. Mach. Learn. Res.
, vol.3
, pp. 397-422
-
-
Auer, P.1
-
5
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2/3), 235-256 (2002)
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
6
-
-
84926078662
-
-
Cambridge University Press, New York
-
Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, New York (2006)
-
(2006)
Prediction, Learning, and Games
-
-
Cesa-Bianchi, N.1
Lugosi, G.2
-
7
-
-
0033234631
-
On prediction of individual sequences
-
Cesa-Bianchi, N., Lugosi, G.: On prediction of individual sequences. Ann. Statist. 27(6), 1865-1895 (1999)
-
(1999)
Ann. Statist.
, vol.27
, Issue.6
, pp. 1865-1895
-
-
Cesa-Bianchi, N.1
Lugosi, G.2
-
8
-
-
33748442333
-
Regret minimization under partial monitoring
-
Cesa-Bianchi, N., Lugosi, G., Stoltz, G.: Regret minimization under partial monitoring. Math. Oper. Res. 31(3), 562-580 (2006)
-
(2006)
Math. Oper. Res.
, vol.31
, Issue.3
, pp. 562-580
-
-
Cesa-Bianchi, N.1
Lugosi, G.2
Stoltz, G.3
-
10
-
-
0003336572
-
A probabilistic theory of pattern recognition
-
Springer, New York
-
Devroye, L., Györfi, L., Lugosi, G.: A probabilistic theory of pattern recognition. Applications of Mathematics, vol. 31. Springer, New York (1996)
-
(1996)
Applications of Mathematics
, vol.31
-
-
Devroye, L.1
Györfi, L.2
Lugosi, G.3
-
11
-
-
0031211090
-
A decision-theoretic generalization of on-line learning and an application to boosting
-
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. System Sci. 55(1, part 2), 119-139 (1997)
-
(1997)
J. Comput. System Sci.
, vol.55
, Issue.1 PART 2
, pp. 119-139
-
-
Freund, Y.1
Schapire, R.E.2
-
12
-
-
80054108357
-
EuroCOLT 1995
-
Springer, Heidelberg
-
In: Vitányi, P.M.B. (ed.) EuroCOLT 1995. LNCS, vol. 904. Springer, Heidelberg (1995)
-
(1995)
LNCS
, vol.904
-
-
Vitányi, P.M.B.1
-
13
-
-
24344490792
-
Asymptotic operating characteristics of an optimal change point detection in hidden Markov models
-
Fuh, C.D.: Asymptotic operating characteristics of an optimal change point detection in hidden Markov models. Ann. Statist. 32(5), 2305-2339 (2004)
-
(2004)
Ann. Statist.
, vol.32
, Issue.5
, pp. 2305-2339
-
-
Fuh, C.D.1
-
15
-
-
70449882757
-
Multi-armed bandit, dynamic environments and meta-bandits
-
Hartland, C., Gelly, S., Baskiotis, N., Teytaud, O., Sebag, M.: Multi-armed bandit, dynamic environments and meta-bandits. In: NIPS-2006 Workshop, Online Trading Between Exploration and Exploitation, Whistler, Canada (2006)
-
NIPS-2006 Workshop, Online Trading Between Exploration and Exploitation, Whistler, Canada (2006)
-
-
Hartland, C.1
Gelly, S.2
Baskiotis, N.3
Teytaud, O.4
Sebag, M.5
-
16
-
-
0032137328
-
Tracking the best expert
-
Herbster, M., Warmuth, M.: Tracking the best expert. Machine Learning 32(2), 151-178 (1998)
-
(1998)
Machine Learning
, vol.32
, Issue.2
, pp. 151-178
-
-
Herbster, M.1
Warmuth, M.2
-
19
-
-
38649118249
-
Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems
-
Koulouriotis, D.E., Xanthopoulos, A.: Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems. Applied Mathematics and Computation 196(2), 913-922 (2008)
-
(2008)
Applied Mathematics and Computation
, vol.196
, Issue.2
, pp. 913-922
-
-
Koulouriotis, D.E.1
Xanthopoulos, A.2
-
20
-
-
57849152837
-
-
Lai, L., El Gamal, H., Jiang, H., Poor, H.V.: Cognitive medium access: Exploration, exploitation and competition (2007)
-
(2007)
Cognitive Medium Access: Exploration, Exploitation and Competition
-
-
Lai, L.1
El Gamal, H.2
Jiang, H.3
Poor, H.V.4
-
21
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Adv. in Appl. Math. 6(1), 4-22 (1985)
-
(1985)
Adv. in Appl. Math.
, vol.6
, Issue.1
, pp. 4-22
-
-
Lai, T.L.1
Robbins, H.2
-
22
-
-
33744827418
-
Sequential change-point detection when unknown parameters are present in the pre-change distribution
-
Mei, Y.: Sequential change-point detection when unknown parameters are present in the pre-change distribution. Ann. Statist. 34(1), 92-122 (2006)
-
(2006)
Ann. Statist.
, vol.34
, Issue.1
, pp. 92-122
-
-
Mei, Y.1
-
24
-
-
0001043843
-
Restless bandits: Activity allocation in a changing world
-
a celebration of applied probability
-
Whittle, P.: Restless bandits: activity allocation in a changing world. J. Appl. Probab. Special 25A (A celebration of applied probability), 287-298 (1988)
-
(1988)
J. Appl. Probab. Special
, vol.25 A
, pp. 287-298
-
-
Whittle, P.1