-
1
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
T. Lai and H. Robbins, "Asymptotically Efficient Adaptive Allocation Rules," Advances in Applied Mathematics, Vol. 6, No. 1, pp. 4-22, 1985.
-
(1985)
Advances in Applied Mathematics
, vol.6
, Issue.1
-
-
Lai, T.1
Robbins, H.2
-
2
-
-
0023453059
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part I: I.I.D. rewards
-
V. Anantharam, P. Varaiya, J. Walrand, "Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part I: I.I.D. Rewards," IEEE Transactions on Automatic Control, Vol. AC-32, No. 11, pp. 968-976, Nov. 1987. (Pubitemid 18521625)
-
(1987)
IEEE Transactions on Automatic Control
, vol.AC-32
, Issue.11
, pp. 968-976
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
3
-
-
0023450663
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part II: Markovian rewards
-
V. Anantharam, P. Varaiya, J. Walrand, "Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part II: Markovian Rewards," IEEE Transactions on Automatic Control, Vol. AC-32, No. 11, pp. 977-982, Nov. 1987. (Pubitemid 18521626)
-
(1987)
IEEE Transactions on Automatic Control
, vol.AC-32
, Issue.11
, pp. 977-982
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
4
-
-
0000616723
-
Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
-
R. Agrawal, "Sample Mean Based Index Policies With O(log n) Regret for the Multi-armed Bandit Problem," Advances in Applied Probability, Vol. 27, pp. 1054-1078, 1995.
-
(1995)
Advances in Applied Probability
, vol.27
-
-
Agrawal, R.1
-
5
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
DOI 10.1023/A:1013689704352, Computational Learning Theory
-
P. Auer, N. Cesa-Bianchi, P. Fischer, "Finite-time Analysis of the Multiarmed Bandit Problem," Machine Learning, Vol. 47, pp. 235-256, 2002. (Pubitemid 34126111)
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
7
-
-
79953827701
-
Distributed learning in multi-armed bandit with multiple players
-
Nov.
-
K. Liu, Q. Zhao, "Distributed Learning in Multi-Armed Bandit with Multiple Players," IEEE Transactions on Signal Processing, Vol. 58, No. 11, pp. 5667-5681, Nov. 2010.
-
(2010)
IEEE Transactions on Signal Processing
, vol.58
, Issue.11
, pp. 5667-5681
-
-
Liu, K.1
Zhao, Q.2
-
9
-
-
80051636024
-
Logarithmic weak regret of non-bayesian restless multi-armed bandit
-
May
-
H. Liu, K. Liu, Q. Zhao, "Logarithmic Weak Regret of Non-Bayesian Restless Multi-Armed Bandit," Proc. of International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011.
-
(2011)
Proc. of International Conference on Acoustics, Speech and Signal Processing (ICASSP)
-
-
Liu, H.1
Liu, K.2
Zhao, Q.3
-
10
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
P. Auer, N. Cesa-Bianchi, Y. Freund, R.E. Schapire, "The Nonstochastic Multiarmed Bandit Problem," SIAM Journal on Computing, Vol. 32, pp. 48-77, 2002.
-
(2002)
SIAM Journal on Computing
, vol.32
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
11
-
-
0032628612
-
The complexity of optimal queuing network control
-
May
-
C. Papadimitriou, J. Tsitsiklis, "The Complexity of Optimal Queuing Network Control," Mathematics of Operations Research, Vol. 24, No. 2, pp. 293-305, May 1999.
-
(1999)
Mathematics of Operations Research
, vol.24
, Issue.2
, pp. 293-305
-
-
Papadimitriou, C.1
Tsitsiklis, J.2
-
13
-
-
80051623306
-
The non-bayesian restless multi-armed bandit: A case of near-logarithmic regret
-
May
-
W. Dai, Y. Gai, B. Krishnamachari, Q. Zhao, "The Non-Bayesian Restless Multi-armed Bandit: A Case of Near-Logarithmic Regret," Proc. of International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011.
-
(2011)
Proc. of International Conference on Acoustics, Speech and Signal Processing (ICASSP)
-
-
Dai, W.1
Gai, Y.2
Krishnamachari, B.3
Zhao, Q.4