-
1
-
-
0032628612
-
The complexity of optimal queuing network control
-
C. H. Papadimitriou and J. N. Tsitsiklis, "The complexity of optimal queuing network control," Math. Oper. Res., vol. 24, no. 2, pp. 293-305, 1999.
-
(1999)
Math. Oper. Res.
, vol.24
, Issue.2
, pp. 293-305
-
-
Papadimitriou, C.H.1
Tsitsiklis, J.N.2
-
2
-
-
0001043843
-
Restless bandits
-
P. Whittle, "Restless bandits," J. Appl. Prob., pp. 301-313, 1988.
-
(1988)
J. Appl. Prob.
, pp. 301-313
-
-
Whittle, P.1
-
3
-
-
78650720102
-
Approximation algorithms for restless bandit problems
-
December
-
S. Guha, K. Munagala, and P. Shi, "Approximation algorithms for restless bandit problems," Journal of the ACM, vol. 58, December 2010.
-
(2010)
Journal of the ACM
, vol.58
-
-
Guha, S.1
Munagala, K.2
Shi, P.3
-
4
-
-
69449100462
-
Optimality of myopic sensing in multi-channel opportunistic access
-
September
-
S. H. A. Ahmad, M. Liu, T. Javidi, Q. Zhao, and B. Krishnamachari, "Optimality of myopic sensing in multi-channel opportunistic access," IEEE Transactions on Information Theory, vol. 55, no. 9, pp. 4040-4050, September 2009.
-
(2009)
IEEE Transactions on Information Theory
, vol.55
, Issue.9
, pp. 4040-4050
-
-
Ahmad, S.H.A.1
Liu, M.2
Javidi, T.3
Zhao, Q.4
Krishnamachari, B.5
-
5
-
-
84966203785
-
Some aspects of the sequential design of experiments
-
H. Robbins, "Some aspects of the sequential design of experiments," Bull. Amer. Math. Soc., vol. 55, pp. 527-535, 1952.
-
(1952)
Bull. Amer. Math. Soc.
, vol.55
, pp. 527-535
-
-
Robbins, H.1
-
6
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
T. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Advances in Applied Mathematics, vol. 6, pp. 4-22, 1985.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.1
Robbins, H.2
-
7
-
-
0000616723
-
Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
-
December
-
R. Agrawal, "Sample mean based index policies with O(log n) regret for the multi-armed bandit problem," Advances in Applied Probability, vol. 27, no. 4, pp. 1054-1078, December 1995.
-
(1995)
Advances in Applied Probability
, vol.27
, Issue.4
, pp. 1054-1078
-
-
Agrawal, R.1
-
8
-
-
0023453059
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards
-
November
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards," IEEE Trans. Automat. Contr., pp. 968-975, November 1987.
-
(1987)
IEEE Trans. Automat. Contr.
, pp. 968-975
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
9
-
-
0023450663
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards
-
November
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards," IEEE Trans. Automat. Contr., pp. 977-982, November 1987.
-
(1987)
IEEE Trans. Automat. Contr.
, pp. 977-982
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
10
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Machine Learning, vol. 47, pp. 235-256, 2002.
-
(2002)
Machine Learning
, vol.47
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
11
-
-
79952397795
-
Online algorithms for the multi-armed bandit problem with Markovian rewards
-
C. Tekin and M. Liu, "Online algorithms for the multi-armed bandit problem with Markovian rewards," in Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computation, September.
-
Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computation, September
-
-
Tekin, C.1
Liu, M.2
-
15
-
-
79953827701
-
Distributed learning in multi-armed bandit with multiple players
-
November
-
K. Liu and Q. Zhao, "Distributed learning in multi-armed bandit with multiple players," IEEE Transactions on Signal Processing, vol. 58, pp. 5667-5681, November 2010.
-
(2010)
IEEE Transactions on Signal Processing
, vol.58
, pp. 5667-5681
-
-
Liu, K.1
Zhao, Q.2
-
17
-
-
0031070051
-
Optimal adaptive policies for Markov decision processes
-
A. N. Burnetas and M. N. Katehakis, "Optimal adaptive policies for Markov decision processes," Mathematics of Operations Research, vol. 22, no. 1, pp. 222-255, 1997.
-
(1997)
Mathematics of Operations Research
, vol.22
, Issue.1
, pp. 222-255
-
-
Burnetas, A.N.1
Katehakis, M.N.2
-
18
-
-
85162041468
-
Optimistic linear programming gives logarithmic regret for irreducible MDPs
-
A. Tewari and P. Bartlett, "Optimistic linear programming gives logarithmic regret for irreducible MDPs," Advances in Neural Information Processing Systems, vol. 20, pp. 1505-1512, 2008.
-
(2008)
Advances in Neural Information Processing Systems
, vol.20
, pp. 1505-1512
-
-
Tewari, A.1
Bartlett, P.2
-
19
-
-
84856091352
-
Adaptive learning of uncontrolled restless bandits with logarithmic regret
-
C. Tekin and M. Liu, "Adaptive learning of uncontrolled restless bandits with logarithmic regret," in Proc. Forty-Ninth Annual Allerton Conference on Communication, Control, and Computing, September 2011.
-
Proc. Forty-Ninth Annual Allerton Conference on Communication, Control, and Computing, September 2011
-
-
Tekin, C.1
Liu, M.2
-
20
-
-
27944497396
-
Sensitivity and convergence of uniformly ergodic Markov chains
-
A. Y. Mitrophanov, "Sensitivity and convergence of uniformly ergodic Markov chains," J. Appl. Prob., vol. 42, pp. 1003-1014, 2005.
-
(2005)
J. Appl. Prob.
, vol.42
, pp. 1003-1014
-
-
Mitrophanov, A.Y.1
-
21
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire, "The nonstochastic multiarmed bandit problem," SIAM Journal on Computing, vol. 32, pp. 48-77, 2002.
-
(2002)
SIAM Journal on Computing
, vol.32
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.4