SCOPUS Information Retrieval Platform

IEEE Journal on Selected Topics in Signal Processing

Volume 7, Issue 5, 2013, Pages 759-767

Deterministic sequencing of exploration and exploitation for multi-armed bandit problems

Authors (3): Vakili, Sattar; Liu, Keqin; Zhao, Qing

Author keywords

combinatorial multi armed bandit; decentralized multi armed bandit; deterministic sequencing of exploration and exploitation; Multi armed bandit; regret; restless multi armed bandit

Indexed keywords

DOMINATING SET PROBLEMS; EXPLORATION AND EXPLOITATION; EXPLORATION SEQUENCES; MULTI ARMED BANDIT; MULTI-ARMED BANDIT PROBLEM; REGRET; RESTLESS MULTI-ARMED BANDIT; SELECTION POLICIES;

ELECTRICAL ENGINEERING; SIGNAL PROCESSING;

OPTIMIZATION;

EID: 84884549238
PISSN: 19324553
EISSN: None
Source Type: Journal
DOI: 10.1109/JSTSP.2013.2263494
Document Type: Article

Times cited: 117

References (28)

1
- 84966203785
- Some aspects of the sequential design of experiments
- H. Robbins, "Some aspects of the sequential design of experiments," Bull. Amer. Math. Soc., vol. 58, no. 5, pp. 527-535, 1952.
- (1952) Bull. Amer. Math. Soc , vol.58 , Issue.5 , pp. 527-535
- Robbins, H.¹

2
- 0039813594
- Boca Raton, FL, USA: CRC
- T. Santner and A. Tamhane, Design of Experiments: Ranking and Selection. Boca Raton, FL, USA: CRC, 1984.
- (1984) Design of Experiments: Ranking and Selection
- Santner, T.¹ Tamhane, A.²

3
- 61449109791
- Multi-armed bandit problems
- New York, NY, USA: Springer-Verlag
- A. Mahajan and D. Teneketzis, "Multi-armed bandit problems," in Foundations and Applications of Sensor Management, A. O. Hero, III, D. A. Castanon, D. Cochran, and K. Kastella, Eds. New York, NY, USA: Springer-Verlag, 2007.
- (2007) Foundations and Applications of Sensor Management
- Mahajan, A.¹ Teneketzis, D.² Hero III, A.O.³ Castanon, D.A.⁴ Cochran, D.⁵ Kastella, K.⁶ (Eds.)

4
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Adv. Appl. Math., vol. 6, no. 1, pp. 4-22, 1985.
- (1985) Adv. Appl. Math , vol.6 , Issue.1 , pp. 4-22
- Lai, T.¹ Robbins, H.²

5
- 0000616723
- Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
- R. Agrawal, "Sample mean based index policies with O(log n) regret for the multi-armed bandit problem," Adv. Appl. Probab., vol. 27, pp. 1054-1078, 1995.
- (1995) Adv. Appl. Probab , vol.27 , pp. 1054-1078
- Agrawal, R.¹

6
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- DOI 10.1023/A:1013689704352, Computational Learning Theory
- P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Mach. Learn., vol. 47, pp. 235-256, 2002. (Pubitemid 34126111)
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

7
- 84873932839
- Learning in a changing world: Restless multi-armed bandit with unknown dynamics
- Mar.
- H. Liu, K. Liu, and Q. Zhao, "Learning in a changing world: Restless multi-armed bandit with unknown dynamics," IEEE Trans. Inf. Theory, vol. 59, no. 3, pp. 1902-1916, Mar. 2013.
- (2013) IEEE Trans. Inf. Theory , vol.59 , Issue.3 , pp. 1902-1916
- Liu, H.¹ Liu, K.² Zhao, Q.³

8
- 84866935969
- Adaptive shortest-path routing under unknown and stochastically varying link states
- May
- K. Liu and Q. Zhao, "Adaptive shortest-path routing under unknown and stochastically varying link states," in Proc. 10th Int. Symp. Modeling Optimiz. Mobile, Ad Hoc, and Wireless Netw. (WiOpt), May 2012.
- (2012) Proc. 10th Int. Symp. Modeling Optimiz. Mobile, Ad Hoc, and Wireless Netw. (WiOpt)
- Liu, K.¹ Zhao, Q.²

9
- 84874028541
- Sep [Online]. Available: arXiv: 1209.1727 [stat.ML]
- S. Bubeck, N. Cesa-Bianchi, and G. Lugosi, "Bandits With Heavy Tail," Sep. 2012 [Online]. Available: arXiv:1209.1727 [stat.ML]
- (2012) Bandits with Heavy Tail
- Bubeck, S.¹ Cesa-Bianchi, N.² Lugosi, G.³

10
- 84856102138
- Multi-armed bandit problems with heavy tail reward distributions
- Sep
- K. Liu and Q. Zhao, "Multi-armed bandit problems with heavy tail reward distributions," in Proc. Allerton Conf. Commun., Control, Comput., Sep. 2011.
- (2011) Proc. Allerton Conf. Commun., Control, Comput
- Liu, K.¹ Zhao, Q.²

11
- 79953827701
- Distributed learning in multi-armed bandit with multiple players
- Nov.
- K. Liu and Q. Zhao, "Distributed learning in multi-armed bandit with multiple players," IEEE Trans. Signal Process., vol. 58, no. 11, pp. 5667-5681, Nov. 2010.
- (2010) IEEE Trans. Signal Process , vol.58 , Issue.11 , pp. 5667-5681
- Liu, K.¹ Zhao, Q.²

12
- 79953194834
- Distributed algorithms for learning and cognitive medium access with logarithmic regret
- Mar.
- A. Anandkumar, N. Michael, A. K. Tang, and A. Swami, "Distributed algorithms for learning and cognitive medium access with logarithmic regret," IEEE J. Sel. Areas Commun., vol. 29, no. 4, pp. 731-745, Mar. 2011.
- (2011) IEEE J. Sel. Areas Commun , vol.29 , Issue.4 , pp. 731-745
- Anandkumar, A.¹ Michael, N.² Tang, A.K.³ Swami, A.⁴

13
- 84857218599
- Decentralized online learning algorithms for opportunistic spectrum access
- Houston, TX, USA, Dec
- Y. Gai and B. Krishnamachari, "Decentralized online learning algorithms for opportunistic spectrum access," in Proc. IEEE Global Commun. Conf. (GLOBECOM '11), Houston, TX, USA, Dec. 2011.
- (2011) Proc. IEEE Global Commun. Conf. (GLOBECOM '11)
- Gai, Y.¹ Krishnamachari, B.²

14
- 84874338536
- Performance and convergence of multiuser online learning
- Apr
- C. Tekin and M. Liu, "Performance and convergence of multiuser online learning," in Proc. Int. Conf. Game Theory Netw. (GAMNETS), Apr. 2011.
- (2011) Proc. Int. Conf. Game Theory Netw. (GAMNETS)
- Tekin, C.¹ Liu, M.²

15
- 84874251645
- Decentralized learning for multiplayer multi-armed bandits
- Dec
- D. Kalathil, N. Nayyar, and R. Jain, "Decentralized learning for multiplayer multi-armed bandits," in Proc. IEEE Conf. Decision Control (CDC), Dec. 2012, pp. 3960-3965.
- (2012) Proc. IEEE Conf. Decision Control (CDC) , pp. 3960-3965
- Kalathil, D.¹ Nayyar, N.² Jain, R.³

16
- 84874251645
- Decentralized learning for multiplayer multi-armed bandits
- [Online]. Available submitted for publication
- D. Kalathil, N. Nayyar, and R. Jain, "Decentralized learning for multiplayer multi-armed bandits," IEEE Trans. Inf. Theory, Apr. 2012, submitted for publication [Online]. Available: http://arxiv.org/abs/1206.3582
- (2012) IEEE Trans. Inf. Theory Apr
- Kalathil, D.¹ Nayyar, N.² Jain, R.³

17
- 84863956678
- Online learning of rested and restless bandits
- Aug.
- C. Tekin and M. Liu, "Online learning of rested and restless bandits," IEEE Trans. Inf. Theory, vol. 58, no. 8, pp. 5588-5611, Aug. 2012.
- (2012) IEEE Trans. Inf. Theory , vol.58 , Issue.8 , pp. 5588-5611
- Tekin, C.¹ Liu, M.²

18
- 80051623306
- The non-Bayesian restless multi-armed bandit: A case of near-logarithmic regret
- May
- W. Dai, Y. Gai, B. Krishnamachari, and Q. Zhao, "The non-Bayesian restless multi-armed bandit: A case of near-logarithmic regret," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), May 2011, pp. 2940-2943.
- (2011) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 2940-2943
- Dai, W.¹ Gai, Y.² Krishnamachari, B.³ Zhao, Q.⁴

19
- 84861588214
- Approximately optimal adaptive learning in opportunistic spectrum access
- Orlando, FL, USA, Mar
- C. Tekin and M. Liu, "Approximately optimal adaptive learning in opportunistic spectrum access," in Proc. Int. Conf. Comput. Commun. (INFOCOM), Orlando, FL, USA, Mar. 2012.
- (2012) Proc. Int. Conf. Comput. Commun. (INFOCOM)
- Tekin, C.¹ Liu, M.²

20
- 84856091352
- Adaptive learning of uncontrolled restless bandits with logarithmic regret
- Sep
- C. Tekin and M. Liu, "Adaptive learning of uncontrolled restless bandits with logarithmic regret," in Proc. Allerton Conf. Commun., Control, Comput., Sep. 2011.
- (2011) Proc. Allerton Conf. Commun., Control, Comput
- Tekin, C.¹ Liu, M.²

21
- 84867858040
- Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations
- Oct.
- Y. Gai, B. Krishnamachari, and R. Jain, "Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations," IEEE/ACM Trans. Netw., vol. 20, no. 5, pp. 1466-1478, Oct. 2012.
- (2012) IEEE/ACM Trans. Netw , vol.20 , Issue.5 , pp. 1466-1478
- Gai, Y.¹ Krishnamachari, B.² Jain, R.³

22
- 35448960376
- Online linear optimization and adaptive routing
- DOI 10.1016/j.jcss.2007.04.016, PII S0022000007000621, Learning Theory 2004
- B. Awerbuch and R. Kleinberg, "Online linear optimization and adaptive routing," J. Comput. Syst. Sci., vol. 74, no. 1, pp. 97-114, 2008. (Pubitemid 47625408)
- (2008) Journal of Computer and System Sciences , vol.74 , Issue.1 , pp. 97-114
- Awerbuch, B.¹ Kleinberg, R.²

23
- 84867967913
- Online learning for combinatorial network optimization with restless Markovian rewards
- Y. Gai, B. Krishnamachari, and M. Liu, "Online learning for combinatorial network optimization with restless Markovian rewards," in Proc. 9th Annu. IEEE Conf. Sens., Mesh, Ad Hoc Commun. Networks (SECON), 2012.
- (2012) Proc. 9th Annu. IEEE Conf. Sens., Mesh, Ad Hoc Commun. Networks (SECON)
- Gai, Y.¹ Krishnamachari, B.² Liu, M.³

24
- 84860425852
- Locally sub-Gaussian random variable and the strong law of large numbers
- P. Chareka, O. Chareka, and S. Kennedy, "Locally sub-Gaussian random variable and the strong law of large numbers," Atlantic Electron. J. Math., vol. 1, no. 1, pp. 75-81, 2006.
- (2006) Atlantic Electron. J. Math , vol.1 , Issue.1 , pp. 75-81
- Chareka, P.¹ Chareka, O.² Kennedy, S.³

25
- 0345224411
- The continuum-armed bandit problem
- Nov
- R. Agrawal, "The continuum-armed bandit problem," SIAM J. Control Optimiz., vol. 33, no. 6, pp. 1926-1951, Nov. 1995.
- (1995) SIAM J. Control Optimiz , vol.33 , Issue.6 , pp. 1926-1951
- Agrawal, R.¹

26
- 84947403595
- Probability inequalities for sums of bounded random variables
- Mar
- W. Hoeffding, "Probability inequalities for sums of bounded random variables," J. Amer. Statist. Assoc., vol. 58, no. 301, pp. 13-30, Mar. 1963.
- (1963) J. Amer. Statist. Assoc , vol.58 , Issue.301 , pp. 13-30
- Hoeffding, W.¹

27
- 0012972085
- On the best constant in Marcinkiewicz-Zygmund inequality
- DOI 10.1016/S0167-7152(01)00015-3, PII S0167715201000153
- Y. Ren and H. Liang, "On the best constant in Marcinkiewicz-Zygmund inequality," Statist. Probab. Lett., vol. 53, pp. 227-233, Jun. 2001. (Pubitemid 33623382)
- (2001) Statistics and Probability Letters , vol.53 , Issue.3 , pp. 227-233
- Ren, Y.-F.¹ Liang, H.-Y.²

28
- 84856106270
- Tech. Rep., Mar. [Online]
- Y. Gai and B. Krishnamachari, "Decentralized online learning algorithms for opportunistic spectrum access," Tech. Rep., Mar. 2011 [Online]. Available: http://anrg.usc.edu/www/publications/papers/DMAB2011.pdf
- (2011) Decentralized Online Learning Algorithms for Opportunistic Spectrum Access
- Gai, Y.¹ Krishnamachari, B.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.