



Volume , Issue , 2011, Pages 1968-1971

Logarithmic weak regret of non-Bayesian restless multi-armed bandit

Author keywords

logarithmic order; non Bayesian formulation; regret; Restless multi armed bandit

Indexed keywords

AVERAGE REWARD; COGNITIVE RADIO NETWORK; FADING ENVIRONMENT; FINANCIAL INVESTMENTS; K-OUT-OF-N; LOGARITHMIC ORDER; MARKOVIAN; MARKOVIAN DYNAMICS; NON-BAYESIAN FORMULATION; OPPORTUNISTIC COMMUNICATIONS; POTENTIAL APPLICATIONS; REGRET; RESTLESS MULTI-ARMED BANDIT; UPPER CONFIDENCE BOUND;

EID: 80051636024     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2011.5946895     Document Type: Conference Paper
Times cited : (34)

References (17)
  • 1
    • 0002899547
    • Asymptotically Efficient Adaptive Allocation Rules
    • T. Lai and H. Robbins, "Asymptotically Efficient Adaptive Allocation Rules," Advances in Applied Mathematics, Vol. 6, No. 1, pp. 4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , Issue.1
    • Lai, T.1    Robbins, H.2
  • 2
    • 0023453059
    • Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part I: I.I.D. Rewards
    • Nov.
    • V. Anantharam, P. Varaiya, J. Walrand, "Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part I: I.I.D. Rewards," IEEE Transactions on Automatic Control, Vol. AC-32, No. 11, pp. 968-976, Nov. 1987.
    • (1987) IEEE Transactions on Automatic Control , vol.AC-32 , Issue.11 , pp. 968-976
    • Anantharam, V.1    Varaiya, P.2    Walrand, J.3
  • 3
    • 0023450663
    • Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part II: Markovian Rewards
    • Nov.
    • V. Anantharam, P. Varaiya, J. Walrand, "Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays-Part II: Markovian Rewards," IEEE Transactions on Automatic Control, Vol. AC-32, No. 11, pp. 977-982, Nov. 1987.
    • (1987) IEEE Transactions on Automatic Control , vol.AC-32 , Issue.11 , pp. 977-982
    • Anantharam, V.1    Varaiya, P.2    Walrand, J.3
  • 4
    • 0000616723
    • Sample Mean Based Index Policies with O(log n) Regret for the Multi-armed Bandit Problem
    • R. Agrawal, "Sample Mean Based Index Policies with O(log n) Regret for the Multi-armed Bandit Problem," Advances in Applied Probability, Vol. 27, pp. 1054-1078, 1995.
    • (1995) Advances in Applied Probability , vol.27
    • Agrawal, R.1
  • 5
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • DOI 10.1023/A:1013689704352, Computational Learning Theory
    • P. Auer, N. Cesa-Bianchi, P. Fischer, "Finite-time Analysis of the Multiarmed Bandit Problem," Machine Learning, 47, 235-256, 2002. (Pubitemid 34126111)
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 7
    • 0032628612
    • The Complexity of Optimal Queuing Network Control
    • May
    • C. Papadimitriou, J. Tsitsiklis, "The Complexity of Optimal Queuing Network Control," Mathematics of Operations Research, Vol. 24, No. 2, pp. 293-305, May 1999.
    • (1999) Mathematics of Operations Research , vol.24 , Issue.2 , pp. 293-305
    • Papadimitriou, C.1    Tsitsiklis, J.2
  • 9
    • 79953827701
    • Distributed Learning in Multi-Armed Bandit with Multiple Players
    • Nov.
    • K. Liu, Q. Zhao, "Distributed Learning in Multi-Armed Bandit with Multiple Players," IEEE Transactions on Signal Processing, Vol. 58, No. 11, pp. 5667-5681, Nov. 2010.
    • (2010) IEEE Transactions on Signal Processing , vol.58 , Issue.11 , pp. 5667-5681
    • Liu, K.1    Zhao, Q.2
  • 12
    • 0001043843
    • Restless Bandits: Activity Allocation in a Changing World
    • P. Whittle, "Restless Bandits: Activity Allocation in a Changing World," Journal of Applied Probability, Vol. 25, pp. 287-298, 1988.
    • (1988) Journal of Applied Probability , vol.25 , pp. 287-298
    • Whittle, P.1
  • 13
    • 0000169010
    • Bandit processes and dynamic allocation indices (with discussion)
    • J. Gittins, "Bandit processes and dynamic allocation indices (with discussion)," J. R. Statist. Soc., B, 41, pp. 148-177, 1979.
    • (1979) J. R. Statist. Soc., B , vol.41 , pp. 148-177
    • Gittins, J.1
  • 14
    • 0002327722
    • On an Index Policy for Restless Bandits
    • Sep.
    • R. Weber and G. Weiss, "On an Index Policy for Restless Bandits," Journal of Applied Probability, Vol. 27, No. 3, pp. 637-648, Sep., 1990.
    • (1990) Journal of Applied Probability , vol.27 , Issue.3 , pp. 637-648
    • Weber, R.1    Weiss, G.2
  • 15
    • 77958597180
    • Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access
    • Nov.
    • K. Liu and Q. Zhao, "Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access," IEEE Transactions on Information Theory, Vol. 56, No. 11, pp. 5547-5567, Nov. 2010.
    • (2010) IEEE Transactions on Information Theory , vol.56 , Issue.11 , pp. 5547-5567
    • Liu, K.1    Zhao, Q.2
  • 17
    • 0000646152
    • A Chernoff Bound for Random Walks on Expander Graphs
    • Proc. 34th IEEE Symp. on Foundations of Computer Science (FOCS93), vol.
    • D. Gillman, "A Chernoff Bound for Random Walks on Expander Graphs," Proc. 34th IEEE Symp. on Foundations of Computer Science (FOCS93), vol. SIAM J. Comp., Vol. 27, No. 4, 1998.
    • (1998) SIAM J. Comp. , vol.27 , Issue.4
    • Gillman, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.