-
1
-
-
0019037868
-
Optimal infinite-horizon undiscounted control of finite probabilistic systems
-
L. K. Platzman, "Optimal infinite-horizon undiscounted control of finite probabilistic systems," SIAM J. Control Optim., vol. 18, pp. 362-380, 1980.
-
(1980)
SIAM J. Control Optim.
, vol.18
, pp. 362-380
-
-
Platzman, L.K.1
-
2
-
-
28544443262
-
On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion
-
DOI 10.1016/j.sysconle.2005.06.009, PII S016769110500109X
-
S. P. Hsu, D. M. Chuang, and A. Arapostathis, "On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion," Systems and Control Letters, vol. 55, pp. 165-173, 2006. (Pubitemid 41745435)
-
(2006)
Systems and Control Letters
, vol.55
, Issue.2
, pp. 165-173
-
-
Hsu, S.-P.1
Chuang, D.-M.2
Arapostathis, A.3
-
3
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
T. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Advances in Applied Mathematics, vol. 6, pp. 4-22, 1985.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.1
Robbins, H.2
-
4
-
-
0000616723
-
Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
-
December
-
R. Agrawal, "Sample mean based index policies with O(log n) regret for the multi-armed bandit problem," Advances in Applied Probability, vol. 27, no. 4, pp. 1054-1078, December 1995.
-
(1995)
Advances in Applied Probability
, vol.27
, Issue.4
, pp. 1054-1078
-
-
Agrawal, R.1
-
5
-
-
0023453059
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards
-
November
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards," IEEE Trans. Automat. Contr., pp. 968-975, November 1987.
-
(1987)
IEEE Trans. Automat. Contr.
, pp. 968-975
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
6
-
-
0023450663
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards," IEEE Trans. Automat. Contr., pp. 977-982, November 1987.
-
(1987)
IEEE Trans. Automat. Contr.
, Issue.NOVEMBER
, pp. 977-982
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
7
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Machine Learning, vol. 47, pp. 235-256, 2002.
-
(2002)
Machine Learning
, vol.47
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
8
-
-
79952397795
-
Online algorithms for the multi-armed bandit problem with Markovian rewards
-
C. Tekin and M. Liu, "Online algorithms for the multi-armed bandit problem with Markovian rewards," in Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computation, September.
-
Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computation, September
-
-
Tekin, C.1
Liu, M.2
-
14
-
-
0031070051
-
Optimal adaptive policies for Markov decision processes
-
A. N. Burnetas and M. N. Katehakis, "Optimal adaptive policies for Markov decision processes," Mathematics of Operations Research, vol. 22, no. 1, pp. 222-255, 1997.
-
(1997)
Mathematics of Operations Research
, vol.22
, Issue.1
, pp. 222-255
-
-
Burnetas, A.N.1
Katehakis, M.N.2
-
15
-
-
85162041468
-
Optimistic linear programming gives logarithmic regret for irreducible MDPs
-
A. Tewari and P. Bartlett, "Optimistic linear programming gives logarithmic regret for irreducible MDPs," Advances in Neural Information Processing Systems, vol. 20, pp. 1505-1512, 2008.
-
(2008)
Advances in Neural Information Processing Systems
, vol.20
, pp. 1505-1512
-
-
Tewari, A.1
Bartlett, P.2
-
17
-
-
27944497396
-
Sensitivity and convergence of uniformly ergodic Markov chains
-
A. Y. Mitrophanov, "Sensitivity and convergence of uniformly ergodic Markov chains," J. Appl. Prob., vol. 42, pp. 1003-1014, 2005.
-
(2005)
J. Appl. Prob.
, vol.42
, pp. 1003-1014
-
-
Mitrophanov, A.Y.1
-
18
-
-
0032628612
-
The complexity of optimal queuing network control
-
C. H. Papadimitriou and J. N. Tsitsiklis, "The complexity of optimal queuing network control," Math. Oper. Res., vol. 24, no. 2, pp. 293-305, 1999.
-
(1999)
Math. Oper. Res.
, vol.24
, Issue.2
, pp. 293-305
-
-
Papadimitriou, C.H.1
Tsitsiklis, J.N.2
-
19
-
-
0001043843
-
Restless bandits
-
P. Whittle, "Restless bandits," J. Appl. Prob., pp. 301-313, 1988.
-
(1988)
J. Appl. Prob.
, pp. 301-313
-
-
Whittle, P.1
-
20
-
-
69449097218
-
Approximation algorithms for restless bandit problems
-
S. Guha, K. Munagala, and P. Shi, "Approximation algorithms for restless bandit problems," 20th ACM-SIAM Symp. on Discrete Algorithms (SODA), pp. 28-37, 2009.
-
(2009)
20th ACM-SIAM Symp. on Discrete Algorithms (SODA)
, pp. 28-37
-
-
Guha, S.1
Munagala, K.2
Shi, P.3
-
21
-
-
69449100462
-
Optimality of myopic sensing in multi-channel opportunistic access
-
September
-
S. H. A. Ahmad, M. Liu, T. Javidi, Q. Zhao, and B. Krishnamachari, "Optimality of myopic sensing in multi-channel opportunistic access," IEEE Transactions on Information Theory, vol. 55, no. 9, pp. 4040-4050, September 2009.
-
(2009)
IEEE Transactions on Information Theory
, vol.55
, Issue.9
, pp. 4040-4050
-
-
Ahmad, S.H.A.1
Liu, M.2
Javidi, T.3
Zhao, Q.4
Krishnamachari, B.5