-
1
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM J. Computing 32(1), 48-77 (2002)
-
(2002)
SIAM J. Computing
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
2
-
-
0002056057
-
Markets with a continuum of traders
-
Aumann, R.J.: Markets with a continuum of traders. Econometrica 32, 39-50 (1964)
-
(1964)
Econometrica
, vol.32
, pp. 39-50
-
-
Aumann, R.J.1
-
5
-
-
33750501028
-
Modified logarithmic Sobolev inequalities in discrete settings
-
Bobkov, S.G., Tetali, P.: Modified logarithmic Sobolev inequalities in discrete settings. Journal of Theoretical Probability 19(2), 289-336 (2006)
-
(2006)
Journal of Theoretical Probability
, vol.19
, Issue.2
, pp. 289-336
-
-
Bobkov, S.G.1
Tetali, P.2
-
6
-
-
0033876515
-
-
Borkar, V.S., Meyn, S.P.: The O.D.E. method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control and Optimization 38(2), 447-469 (2000)
-
-
-
-
7
-
-
0041965975
-
R-max - a general polynomial time algorithm for near-optimal reinforcement learning
-
Brafman, R.I., Tennenholtz, M.: R-max - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research 3, 213-231 (2003)
-
(2003)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
8
-
-
84926078662
-
-
Cambridge University Press, Cambridge
-
Cesa-Bianchi, N., Lugosi, G.: Prediction, learning, and games. Cambridge University Press, Cambridge (2006)
-
(2006)
Prediction, learning, and games
-
-
Cesa-Bianchi, N.1
Lugosi, G.2
-
10
-
-
23044525872
-
A nonstationary offered-load model for packet networks
-
Duffield, N.G., Massey, W.A., Whitt, W.: A nonstationary offered-load model for packet networks. Telecommunication Systems 16(3-4), 271-296 (2001)
-
(2001)
Telecommunication Systems
, vol.16
, Issue.3-4
, pp. 271-296
-
-
Duffield, N.G.1
Massey, W.A.2
Whitt, W.3
-
11
-
-
41649111187
-
Experts in a Markov decision process
-
Even-Dar, E., Kakade, S., Mansour, Y.: Experts in a Markov decision process. In: NIPS, pp. 401-408 (2004)
-
(2004)
NIPS
, pp. 401-408
-
-
Even-Dar, E.1
Kakade, S.2
Mansour, Y.3
-
13
-
-
0001976283
-
Approximation to Bayes risk in repeated play
-
Princeton University Press, Princeton
-
Hannan, J.: Approximation to Bayes risk in repeated play. In: Contributions to the Theory of Games, vol. 3, pp. 97-139. Princeton University Press, Princeton (1957)
-
(1957)
Contributions to the Theory of Games
, vol.3
, pp. 97-139
-
-
Hannan, J.1
-
14
-
-
0032137328
-
Tracking the best expert
-
Herbster, M., Warmuth, M.K.: Tracking the best expert. Machine Learning 32(2), 151-178 (1998)
-
(1998)
Machine Learning
, vol.32
, Issue.2
, pp. 151-178
-
-
Herbster, M.1
Warmuth, M.K.2
-
15
-
-
24644463787
-
Efficient algorithms for online decision problems
-
Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. Journal of Computer and System Sciences 71(3), 291-307 (2005)
-
(2005)
Journal of Computer and System Sciences
, vol.71
, Issue.3
, pp. 291-307
-
-
Kalai, A.1
Vempala, S.2
-
16
-
-
0038386340
-
The empirical Bayes envelope and regret minimization in competitive Markov decision processes
-
Mannor, S., Shimkin, N.: The empirical Bayes envelope and regret minimization in competitive Markov decision processes. Mathematics of Operations Research 28(2), 327-345 (2003)
-
(2003)
Mathematics of Operations Research
, vol.28
, Issue.2
, pp. 327-345
-
-
Mannor, S.1
Shimkin, N.2
-
17
-
-
0036649565
-
On sequential strategies for loss functions with memory
-
Merhav, N., Ordentlich, E., Seroussi, G., Weinberger, M.J.: On sequential strategies for loss functions with memory. IEEE Trans. Inf. Theory 48(7), 1947-1958 (2002)
-
(2002)
IEEE Trans. Inf. Theory
, vol.48
, Issue.7
, pp. 1947-1958
-
-
Merhav, N.1
Ordentlich, E.2
Seroussi, G.3
Weinberger, M.J.4
-
18
-
-
0000392613
-
Stochastic games
-
Shapley, L.: Stochastic games. PNAS 39(10), 1095-1100 (1953)
-
(1953)
PNAS
, vol.39
, Issue.10
, pp. 1095-1100
-
-
Shapley, L.1
-
19
-
-
84868886330
-
-
Preprint, 2008
-
Yu, J.Y., Mannor, S., Shimkin, N.: Markov decision processes with arbitrarily varying rewards (Preprint, 2008), http://www.cim.mcgill.ca/~jiayuan/mdp.pdf
-
Markov decision processes with arbitrarily varying rewards
-
-
Yu, J.Y.1
Mannor, S.2
Shimkin, N.3