-
1
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
DOI 10.1023/A:1013689704352, Computational Learning Theory
-
P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite time analysis of the multiarmed bandit problem," Mach. Learn., vol. 47, pp. 235-256, 2002. (Pubitemid 34126111)
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
2
-
-
0023453059
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part I: I.i.d. rewards
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part I: I.i.d. rewards," IEEE Trans. Autom. Control, vol. AC-32, no. 11, pp. 968-976, 1987a. (Pubitemid 18521625)
-
(1987)
IEEE Transactions on Automatic Control
, vol.AC-32
, Issue.11
, pp. 968-976
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
3
-
-
0023450663
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part II: Markovian rewards
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part II: Markovian rewards," IEEE Trans. Autom. Control, vol. AC-32, no. 11, pp. 977-982, 1987b. (Pubitemid 18521626)
-
(1987)
IEEE Transactions on Automatic Control
, vol.AC-32
, Issue.11
, pp. 977-982
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
5
-
-
0013218879
-
Covariate models for Bernoulli bandits
-
M. K. Clayton, "Covariate models for Bernoulli bandits," Sequential Anal., vol. 8, pp. 405-426, 1989.
-
(1989)
Sequential Anal.
, vol.8
, pp. 405-426
-
-
Clayton, M.K.1
-
7
-
-
70049095891
-
Woodroofe's one-armed bandit problem revisited
-
A. Goldenshluger and A. Zeevi, "Woodroofe's one-armed bandit problem revisited," Ann. Appl. Probab., vol. 19, pp. 1603-1633, 2009.
-
(2009)
Ann. Appl. Probab.
, vol.19
, pp. 1603-1633
-
-
Goldenshluger, A.1
Zeevi, A.2
-
8
-
-
0034171759
-
Finite time lower bounds for the two-armed bandit problem
-
S. Kulkarni and G. Lugosi, "Finite time lower bounds for the two-armed bandit problem," IEEE Trans. Autom. Control, vol. 45, pp. 711-714, 2000.
-
(2000)
IEEE Trans. Autom. Control
, vol.45
, pp. 711-714
-
-
Kulkarni, S.1
Lugosi, G.2
-
9
-
-
0001732282
-
Asymptotically optimal allocation of treatments in sequential experiments
-
New York: Dekker
-
T. L. Lai and H. Robbins, "Asymptotically optimal allocation of treatments in sequential experiments," in Design of Experiments. New York: Dekker, 1984, pp. 127-142.
-
(1984)
Design of Experiments
, pp. 127-142
-
-
Lai, T.L.1
Robbins, H.2
-
10
-
-
0002899547
-
Asymptotically efficient allocation rules
-
T. L. Lai and H. Robbins, "Asymptotically efficient allocation rules," Adv. Appl. Math., vol. 6, pp. 4-22, 1985.
-
(1985)
Adv. Appl. Math.
, vol.6
, pp. 4-22
-
-
Lai, T.L.1
Robbins, H.2
-
11
-
-
0000854435
-
Adaptive treatment allocation and the multiarmed bandit problem
-
T. L. Lai, "Adaptive treatment allocation and the multiarmed bandit problem," Ann. Statist., vol. 15, pp. 1091-1114, 1987.
-
(1987)
Ann. Statist.
, vol.15
, pp. 1091-1114
-
-
Lai, T.L.1
-
12
-
-
0029344133
-
Machine learning and nonparametric bandit theory
-
Jul.
-
T. L. Lai and S. Yakowitz, "Machine learning and nonparametric bandit theory," IEEE Trans. Autom. Control, vol. 40, no. 7, pp. 1199-1209, Jul. 1995.
-
(1995)
IEEE Trans. Autom. Control
, vol.40
, Issue.7
, pp. 1199-1209
-
-
Lai, T.L.1
Yakowitz, S.2
-
13
-
-
77956144722
-
The epoch-greedy algorithm for multiarmed bandits with side information
-
Cambridge, MA: MIT Press
-
J. Langford and T. Zhang, "The epoch-greedy algorithm for multiarmed bandits with side information," in Advances in Neural Information Processing Systems, J. C. Platt, D. Koller, Y. Singer, and S. Roweis, Eds. Cambridge, MA: MIT Press, 2008, vol. 20, pp. 817-824.
-
(2008)
Advances in Neural Information Processing Systems
, vol.20
, pp. 817-824
-
-
Langford, J.1
Zhang, T.2
Platt, J.C.3
Koller, D.4
Singer, Y.5
Roweis, S.6
-
14
-
-
84966203785
-
Some aspects of the sequential design of experiments
-
H. Robbins, "Some aspects of the sequential design of experiments," Bull. Amer. Math. Soc., vol. 55, pp. 527-535, 1952.
-
(1952)
Bull. Amer. Math. Soc.
, vol.55
, pp. 527-535
-
-
Robbins, H.1
-
15
-
-
0000017483
-
One-armed bandit problems with covariates
-
J. Sarkar, "One-armed bandit problems with covariates," Ann. Statist., vol. 19, pp. 1978-2002, 1991.
-
(1991)
Ann. Statist.
, vol.19
, pp. 1978-2002
-
-
Sarkar, J.1
-
17
-
-
15844389867
-
Bandit problems with side observations
-
DOI 10.1109/TAC.2005.844079
-
C.-C. Wang, S. R. Kulkarni, and H. V. Poor, "Bandit problems with side observations," IEEE Trans. Autom. Control, vol. 50, no. 3, pp. 338-355, Mar. 2005. (Pubitemid 40448585)
-
(2005)
IEEE Transactions on Automatic Control
, vol.50
, Issue.3
, pp. 338-355
-
-
Wang, C.-C.1
Kulkarni, S.R.2
Poor, H.V.3
-
18
-
-
0001631327
-
A one-armed bandit problem with a concomitant variable
-
M. Woodroofe, "A one-armed bandit problem with a concomitant variable," J. Amer. Statist. Assoc., vol. 74, pp. 799-806, 1979.
-
(1979)
J. Amer. Statist. Assoc.
, vol.74
, pp. 799-806
-
-
Woodroofe, M.1
-
19
-
-
0006030678
-
Sequential allocation with covariates
-
M. Woodroofe, "Sequential allocation with covariates," Sankhya Ser., vol. 44, pp. 403-414, 1982.
-
(1982)
Sankhya Ser.
, vol.44
, pp. 403-414
-
-
Woodroofe, M.1
-
20
-
-
0036108219
-
Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
-
DOI 10.1214/aos/1015362186
-
Y. Yang and D. Zhu, "Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates," Ann. Statist., vol. 30, no. 1, pp. 100-121, 2002. (Pubitemid 37095370)
-
(2002)
Annals of Statistics
, vol.30
, Issue.1
, pp. 100-121
-
-
Yang, Y.1
Zhu, D.2