-
1
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Jan.
-
P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire, "The nonstochastic multiarmed bandit problem," SIAM J. Comput., vol. 32, pp. 48-77, Jan. 2002.
-
(2002)
SIAM J. Comput.
, vol.32
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.4
-
2
-
-
84966203785
-
Some aspects of the sequential design of experiments
-
H. Robbins, "Some aspects of the sequential design of experiments," Bull. Amer. Math. Soc., vol. 55, pp. 527-535, 1952.
-
(1952)
Bull. Amer. Math. Soc.
, vol.55
, pp. 527-535
-
-
Robbins, H.1
-
3
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
DOI 10.1023/A:1013689704352, Computational Learning Theory
-
P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Mach. Learn., vol. 47, pp. 235-256, 2002. (Pubitemid 34126111)
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
4
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
T. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Adv. Appl. Math., vol. 6, pp. 4-22, 1985.
-
(1985)
Adv. Appl. Math.
, vol.6
, pp. 4-22
-
-
Lai, T.1
Robbins, H.2
-
5
-
-
0023453059
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part I: I.I.D. rewards
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards," IEEE Trans. Autom. Control, vol. 32, no. 11, pp. 968-976, Nov. 1987. (Pubitemid 18521625)
-
(1987)
IEEE Transactions on Automatic Control
, vol.AC-32
, Issue.11
, pp. 968-976
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
6
-
-
0023450663
-
Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part II: Markovian rewards
-
V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards," IEEE Trans. Autom. Control, vol. 32, no. 11, pp. 977-982, Nov. 1987. (Pubitemid 18521626)
-
(1987)
IEEE Transactions on Automatic Control
, vol.AC-32
, Issue.11
, pp. 977-982
-
-
Anantharam, V.1
Varaiya, P.2
Walrand, J.3
-
7
-
-
0000616723
-
Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
-
Dec.
-
R. Agrawal, "Sample mean based index policies with O(log n) regret for the multi-armed bandit problem," Adv. Appl. Probabil., vol. 27, no. 4, pp. 1054-1078, Dec. 1995.
-
(1995)
Adv. Appl. Probabil.
, vol.27
, Issue.4
, pp. 1054-1078
-
-
Agrawal, R.1
-
8
-
-
84898437076
-
The KL-UCB algorithm for bounded stochastic bandits and beyond
-
A. Garivier and O. Cappe, "The KL-UCB algorithm for bounded stochastic bandits and beyond," in Proc. JMLR Workshop Conf., 2011, vol. 19, pp. 359-376.
-
(2011)
Proc. JMLR Workshop Conf.
, vol.19
, pp. 359-376
-
-
Garivier, A.1
Cappe, O.2
-
9
-
-
84898072179
-
Stochastic linear optimization under bandit feedback
-
Jul.
-
V. Dani, T. P. Hayes, and S. M. Kakade, "Stochastic linear optimization under bandit feedback," in Proc. 21st Annu. Conf. Learn. Theory, Jul. 2008, pp. 355-366.
-
(2008)
Proc. 21st Annu. Conf. Learn. Theory
, pp. 355-366
-
-
Dani, V.1
Hayes, T.P.2
Kakade, S.M.3
-
10
-
-
79952397795
-
Online algorithms for the multi-armed bandit problem with Markovian rewards
-
Sep.
-
C. Tekin and M. Liu, "Online algorithms for the multi-armed bandit problem with Markovian rewards," in Proc. 48th Annu. Allerton Conf. Commun., Control, Comput., Sep. 2010, pp. 1675-1682.
-
(2010)
Proc. 48th Annu. Allerton Conf. Commun., Control, Comput.
, pp. 1675-1682
-
-
Tekin, C.1
Liu, M.2
-
11
-
-
79960884459
-
Online learning in opportunistic spectrum access: A restless bandit approach
-
Apr.
-
C. Tekin and M. Liu, "Online learning in opportunistic spectrum access: A restless bandit approach," in Proc. 30th Annu. IEEE Int. Conf. Comput. Commun., Apr. 2011, pp. 2462-2470.
-
(2011)
Proc. 30th Annu. IEEE Int. Conf. Comput. Commun.
, pp. 2462-2470
-
-
Tekin, C.1
Liu, M.2
-
13
-
-
79953827701
-
Distributed learning in multi-armed bandit with multiple players
-
Nov.
-
K. Liu and Q. Zhao, "Distributed learning in multi-armed bandit with multiple players," IEEE Trans. Signal Process., vol. 58, no. 11, pp. 5667-5681, Nov. 2010.
-
(2010)
IEEE Trans. Signal Process.
, vol.58
, Issue.11
, pp. 5667-5681
-
-
Liu, K.1
Zhao, Q.2
-
14
-
-
79953194834
-
Distributed algorithms for learning and cognitive medium access with logarithmic regret
-
Apr.
-
A. Anandkumar, N. Michael, A. Tang, and A. Swami, "Distributed algorithms for learning and cognitive medium access with logarithmic regret," IEEE J. Sel. Areas Commun., vol. 29, no. 4, pp. 731-745, Apr. 2011.
-
(2011)
IEEE J. Sel. Areas Commun.
, vol.29
, Issue.4
, pp. 731-745
-
-
Anandkumar, A.1
Michael, N.2
Tang, A.3
Swami, A.4
-
15
-
-
77953180719
-
Learning multiuser channel allocations in cognitive radio networks: A combinatorial multi-armed bandit formulation
-
Apr.
-
Y. Gai, B. Krishnamachari, and R. Jain, "Learning multiuser channel allocations in cognitive radio networks: A combinatorial multi-armed bandit formulation," in IEEE Symp. Dyn. Spectrum Access Netw. (DySPAN), Apr. 2010, pp. 1-9.
-
(2010)
IEEE Symp. Dyn. Spectrum Access Netw. (DySPAN)
, pp. 1-9
-
-
Gai, Y.1
Krishnamachari, B.2
Jain, R.3
-
16
-
-
0000169010
-
Bandit processes and dynamic allocation indices
-
J. Gittins, "Bandit processes and dynamic allocation indices," J. Roy. Statist. Soc., vol. 41, no. 2, pp. 148-177, 1979.
-
(1979)
J. Roy. Statist. Soc.
, vol.41
, Issue.2
, pp. 148-177
-
-
Gittins, J.1
-
17
-
-
0001043843
-
Restless bandits: Activity allocation in a changing world
-
Sheffield, U.K.: Applied Probability Trust
-
P. Whittle, "Restless bandits: Activity allocation in a changing world," in A Celebration of Applied Probability, J. Gani, Ed. Sheffield, U.K.: Applied Probability Trust, 1988, vol. 25A, pp. 287-298.
-
(1988)
A Celebration of Applied Probability
, vol.25A
, pp. 287-298
-
-
Whittle, P.1
Gani, J.2
-
18
-
-
69449100462
-
Optimality of myopic sensing in multi-channel opportunistic access
-
Sep.
-
S. H. A. Ahmad, M. Liu, T. Javidi, Q. Zhao, and B. Krishnamachari, "Optimality of myopic sensing in multi-channel opportunistic access," IEEE Trans. Inf. Theory, vol. 55, no. 9, pp. 4040-4050, Sep. 2009.
-
(2009)
IEEE Trans. Inf. Theory
, vol.55
, Issue.9
, pp. 4040-4050
-
-
Ahmad, S.H.A.1
Liu, M.2
Javidi, T.3
Zhao, Q.4
Krishnamachari, B.5
-
19
-
-
0032222170
-
Chernoff-type bound for finite Markov chains
-
P. Lezaud, "Chernoff-type bound for finite Markov chains," Ann. Appl. Probab., vol. 8, pp. 849-867, 1998.
-
(1998)
Ann. Appl. Probab.
, vol.8
, pp. 849-867
-
-
Lezaud, P.1
-
20
-
-
62949181077
-
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
-
J.-Y. Audibert, R. Munos, and C. Szepesvári, "Exploration-exploitation tradeoff using variance estimates in multi-armed bandits," Theoretical Comput. Sci., vol. 410, no. 19, pp. 1876-1902, 2009.
-
(2009)
Theoretical Comput. Sci.
, vol.410
, Issue.19
, pp. 1876-1902
-
-
Audibert, J.Y.1
Munos, R.2
Szepesvári, C.3
-
21
-
-
84856091352
-
Adaptive learning of uncontrolled restless bandits with logarithmic regret
-
Sep.
-
C. Tekin and M. Liu, "Adaptive learning of uncontrolled restless bandits with logarithmic regret," in Proc. 49th Annu. Allerton Conf. Commun., Control, Comput., Sep. 2011, pp. 983-990.
-
(2011)
Proc. 49th Annu. Allerton Conf. Commun., Control, Comput.
, pp. 983-990
-
-
Tekin, C.1
Liu, M.2
-
22
-
-
0032628612
-
The complexity of optimal queuing network control
-
May
-
C. Papadimitriou and J. Tsitsiklis, "The complexity of optimal queuing network control," Math. Oper. Res., vol. 24, no. 2, pp. 293-305, May 1999.
-
(1999)
Math. Oper. Res.
, vol.24
, Issue.2
, pp. 293-305
-
-
Papadimitriou, C.1
Tsitsiklis, J.2
-
23
-
-
80051623306
-
The non-Bayesian restless multi-armed bandit: A case of near-logarithmic regret
-
May
-
W. Dai, Y. Gai, B. Krishnamachari, and Q. Zhao, "The non-Bayesian restless multi-armed bandit: A case of near-logarithmic regret," in Proc. Int. Conf. Acoust., Speech Signal Process., May 2011, pp. 2940-2943.
-
(2011)
Proc. Int. Conf. Acoust., Speech Signal Process.
, pp. 2940-2943
-
-
Dai, W.1
Gai, Y.2
Krishnamachari, B.3
Zhao, Q.4
-
24
-
-
69449097218
-
Approximation algorithms for restless bandit problems
-
S. Guha, K. Munagala, and P. Shi, "Approximation algorithms for restless bandit problems," in Proc. 20th ACM-SIAM Symp. Discr. Algorithms, 2009, pp. 28-37.
-
(2009)
Proc. 20th ACM-SIAM Symp. Discr. Algorithms
, pp. 28-37
-
-
Guha, S.1
Munagala, K.2
Shi, P.3
-
25
-
-
84861588214
-
Approximately optimal adaptive learning in opportunistic spectrum access
-
Orlando, FL, Mar.
-
C. Tekin and M. Liu, "Approximately optimal adaptive learning in opportunistic spectrum access," presented at the IEEE INFOCOM, Orlando, FL, Mar. 2012.
-
(2012)
IEEE INFOCOM
-
-
Tekin, C.1
Liu, M.2
-
26
-
-
37349120464
-
On the expectation of the maximum of i.i.d. geometric random variables
-
B. Eisenberg, "On the expectation of the maximum of i.i.d. geometric random variables," Statist. Probab. Lett., vol. 78, pp. 135-143, 2008.
-
(2008)
Statist. Probab. Lett.
, vol.78
, pp. 135-143
-
-
Eisenberg, B.1