Volume , Issue , 2011, Pages 240-246

Learning and sharing in a changing world: Non-Bayesian restless bandit with multiple players

Author keywords

[No Author keywords available]

Indexed keywords

COMMUNICATION NETWORKS; DECENTRALIZED POLICIES; FINANCIAL INVESTMENTS; MARKOVIAN; RESTLESS BANDIT; RESTLESS MULTI-ARMED BANDIT;

EID: 79955764815     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ITA.2011.5743588     Document Type: Conference Paper
Times cited : (19)

References (14)
  • 1
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. Lai and H. Robbins, "Asymptotically Efficient Adaptive Allocation Rules," Advances in Applied Mathematics, Vol. 6, No. 1, pp. 4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , Issue.1
    • Lai, T.1    Robbins, H.2
  • 2
    • 0023453059
    • Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - PART I: I. I. D. Rewards
    • V. Anantharam, P. Varaiya, J. Walrand, "Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part I: I.I.D. Rewards," IEEE Transactions on Automatic Control, Vol. AC-32, No. 11, pp. 968-976, Nov. 1987. (Pubitemid 18521625)
    • (1987) IEEE Transactions on Automatic Control , vol.AC-32 , Issue.11 , pp. 968-976
    • Anantharam, V.1    Varaiya, P.2    Walrand, J.3
  • 3
    • 0023450663
    • Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part II: Markovian rewards
    • V. Anantharam, P. Varaiya, J. Walrand, "Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part II: Markovian Rewards," IEEE Transactions on Automatic Control, Vol. AC-32, No. 11, pp. 977-982, Nov. 1987. (Pubitemid 18521626)
    • (1987) IEEE Transactions on Automatic Control , vol.AC-32 , Issue.11 , pp. 977-982
    • Anantharam, V.1    Varaiya, P.2    Walrand, J.3
  • 4
    • 0000616723
    • Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
    • R. Agrawal, "Sample Mean Based Index Policies With O(log n) Regret for the Multi-armed Bandit Problem," Advances in Applied Probability, Vol. 27, pp. 1054-1078, 1995.
    • (1995) Advances in Applied Probability , vol.27
    • Agrawal, R.1
  • 5
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • DOI 10.1023/A:1013689704352, Computational Learning Theory
    • P. Auer, N. Cesa-Bianchi, P. Fischer, "Finite-time Analysis of the Multiarmed Bandit Problem," Machine Learning, Vol. 47, pp. 235-256, 2002. (Pubitemid 34126111)
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 7
    • 79953827701
    • Distributed learning in multi-armed bandit with multiple players
    • Nov.
    • K. Liu, Q. Zhao, "Distributed Learning in Multi-Armed Bandit with Multiple Players," IEEE Transactions on Signal Processing, Vol. 58, No. 11, pp. 5667-5681, Nov. 2010.
    • (2010) IEEE Transactions on Signal Processing , vol.58 , Issue.11 , pp. 5667-5681
    • Liu, K.1    Zhao, Q.2
  • 11
    • 0032628612
    • The complexity of optimal queuing network control
    • May
    • C. Papadimitriou, J. Tsitsiklis, "The Complexity of Optimal Queuing Network Control," Mathematics of Operations Research, Vol. 24, No. 2, pp. 293-305, May 1999.
    • (1999) Mathematics of Operations Research , vol.24 , Issue.2 , pp. 293-305
    • Papadimitriou, C.1    Tsitsiklis, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.