[1] L. Shapley, "Stochastic games," PNAS, vol. 39, no. 10, pp. 1095-1100, 1953.
[2] E. Even-Dar, S. Kakade, and Y. Mansour, "Experts in a Markov decision process," in NIPS, 2004, pp. 401-408.
[3] S. Mannor and N. Shimkin, "The empirical Bayes envelope and regret minimization in competitive Markov decision processes," Mathematics of Operations Research, vol. 28, no. 2, pp. 327-345, 2003.
[5] S. Özekici, "Markov modulated Bernoulli process," Mathematical Methods of Operations Research, vol. 45, no. 3, pp. 311-324, 1997.
[6] A. Nilim and L. El Ghaoui, "Robust control of Markov decision processes with uncertain transition matrices," Operations Research, vol. 53, no. 5, pp. 780-798, 2005.
[7] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, "The nonstochastic multiarmed bandit problem," SIAM J. Comput., vol. 32, no. 1, pp. 48-77, 2002.
[8] R. I. Brafman and M. Tennenholtz, "R-max - a general polynomial time algorithm for near-optimal reinforcement learning," Journal of Machine Learning Research, vol. 3, pp. 213-231, 2003.
[9] J. Y. Yu, S. Mannor, and N. Shimkin, "Markov decision processes with arbitrary rewards," Math. Oper. Res., 2009, to appear.
[10] J. Y. Yu and S. Mannor, "Online learning in Markov decision processes with arbitrarily changing rewards and transitions," in GameNets, 2009.
[13] D. L. Donoho and P. B. Stark, "Uncertainty principles and signal recovery," SIAM J. Appl. Math., vol. 49, no. 3, pp. 906-931, 1989.
[14] C. D. Fuh, "Asymptotic operating characteristics of an optimal change point detection in hidden Markov models," Ann. Statist., pp. 2305-2339, 2004.
[16] H. Xu and S. Mannor, "The robustness-performance tradeoff in Markov decision processes," in NIPS, 2006, pp. 1537-1544.
[17] J. Hannan, "Approximation to Bayes risk in repeated play," in Contributions to the Theory of Games. Princeton University Press, 1957, vol. 3, pp. 97-139.
[18] N. Littlestone and M. Warmuth, "The weighted majority algorithm," Information and Computation, vol. 108, no. 2, pp. 212-261, 1994.
[21] A. Kalai and S. Vempala, "Efficient algorithms for online decision problems," Journal of Computer and System Sciences, vol. 71, no. 3, pp. 291-307, 2005.
[23] V. S. Borkar and S. P. Meyn, "The O.D.E. method for convergence of stochastic approximation and reinforcement learning," SIAM J. Control Optim., vol. 38, no. 2, pp. 447-469, 2000.
[24] P. J. Schweitzer, "Perturbation theory and finite Markov chains," Journal of Applied Probability, vol. 5, pp. 401-413, 1968.