2012, Pages 3960-3965

Decentralized learning for multi-player multi-armed bandits

Author keywords

Distributed adaptive control; multi agent systems; multiarmed bandits; online learning

Indexed keywords

ADAPTIVE CONTROL SYSTEMS; COGNITIVE RADIO; E-LEARNING; LEARNING SYSTEMS; ONLINE SYSTEMS; RADIO SYSTEMS;

EID: 84874251645     PISSN: 07431546     EISSN: 25762370     Source Type: Conference Proceeding    
DOI: 10.1109/CDC.2012.6426587     Document Type: Conference Paper
Times cited: 20

References (10)
  • 1
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Advances in Applied Mathematics, vol. 6, no. 1, pp. 4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
    • Lai, T.1    Robbins, H.2
  • 2
    • 0023450663
    • Asymptotically efficient allocation rules for the multi-armed bandit problem with multiple plays - Part II: Markovian rewards
    • V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multi-armed bandit problem with multiple plays - Part II: Markovian rewards," IEEE Transactions on Automatic Control, vol. 32, no. 11, pp. 977-982, 1987.
    • (1987) IEEE Transactions on Automatic Control , vol.32 , Issue.11 , pp. 977-982
    • Anantharam, V.1    Varaiya, P.2    Walrand, J.3
  • 3
    • 0000616723
    • Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
    • R. Agrawal, "Sample mean based index policies with O(log n) regret for the multi-armed bandit problem," Advances in Applied Probability, vol. 27, no. 4, pp. 1054-1078, 1995.
    • (1995) Advances in Applied Probability , vol.27 , Issue.4 , pp. 1054-1078
    • Agrawal, R.1
  • 4
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Machine Learning, vol. 47, no. 2, pp. 235-256, 2002.
    • (2002) Machine Learning , vol.47 , Issue.2 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 5
    • 84867858040
    • Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations
    • to appear
    • Y. Gai, B. Krishnamachari, and R. Jain, "Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations," IEEE/ACM Trans. on Networking, to appear, 2012.
    • (2012) IEEE/ACM Trans. on Networking
    • Gai, Y.1    Krishnamachari, B.2    Jain, R.3
  • 6
    • 79953827701
    • Distributed learning in multi-armed bandit with multiple players
    • November
    • K. Liu and Q. Zhao, "Distributed learning in multi-armed bandit with multiple players," IEEE Transactions on Signal Processing, vol. 58, pp. 5667-5681, November, 2010.
    • (2010) IEEE Transactions on Signal Processing , vol.58 , pp. 5667-5681
    • Liu, K.1    Zhao, Q.2
  • 8
    • 34249831790
    • Auction algorithms for network flow problems: A tutorial introduction
    • D. P. Bertsekas, "Auction algorithms for network flow problems: A tutorial introduction," Computational Optimization and Applications, vol. 1, pp. 7-66, 1992.
    • (1992) Computational Optimization and Applications , vol.1 , pp. 7-66
    • Bertsekas, D.P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.