SCOPUS Information Retrieval Platform

Proceedings of the IEEE Conference on Decision and Control

Volume , Issue , 2012, Pages 3960-3965

Decentralized learning for multi-player multi-armed bandits

(3) Kalathil, Dileepᵃ; Nayyar, Naumaanᵃ; Jain, Rahulᵃ

ᵃ University of Southern California* (United States)

Author keywords

Distributed adaptive control; multi agent systems; multiarmed bandits; online learning

Indexed keywords

ADAPTIVE CONTROL SYSTEMS; COGNITIVE RADIO; E-LEARNING; LEARNING SYSTEMS; ONLINE SYSTEMS; RADIO SYSTEMS;

COGNITIVE RADIO NETWORK; CONTROL CHANNELS; DECENTRALIZED LEARNING; DISTRIBUTED ADAPTIVE CONTROL; MULTI ARMED BANDIT; ONLINE LEARNING; OPPORTUNISTIC SPECTRUM ACCESS; WIRELESS CHANNEL;

MULTI AGENT SYSTEMS;

EID: 84874251645
PISSN: 07431546
EISSN: 25762370
Source Type: Conference Proceeding
DOI: 10.1109/CDC.2012.6426587
Document Type: Conference Paper

Times cited: 20

References (10)

1
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Advances in Applied Mathematics, vol. 6, no. 1, pp. 4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
- Lai, T.¹ Robbins, H.²

2
- 0023450663
- Asymptotically efficient allocation rules for the multi-armed bandit problem with multiple plays - Part II: Markovian rewards
- V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multi-armed bandit problem with multiple plays - Part II: Markovian rewards," IEEE Transactions on Automatic Control, vol. 32, no. 11, pp. 977-982, 1987.
- (1987) IEEE Transactions on Automatic Control , vol.32 , Issue.11 , pp. 977-982
- Anantharam, V.¹ Varaiya, P.² Walrand, J.³

3
- 0000616723
- Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
- R. Agrawal, "Sample mean based index policies with O(log n) regret for the multi-armed bandit problem," Advances in Applied Probability, vol. 27, no. 4, pp. 1054-1078, 1995.
- (1995) Advances in Applied Probability , vol.27 , Issue.4 , pp. 1054-1078
- Agrawal, R.¹

4
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Machine Learning, vol. 47, no. 2, pp. 235-256, 2002.
- (2002) Machine Learning , vol.47 , Issue.2 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

5
- 84867858040
- Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations
- Y. Gai, B. Krishnamachari, and R. Jain, "Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations," IEEE/ACM Trans. on Networking, to appear, 2012.
- (2012) IEEE/ACM Trans. on Networking
- Gai, Y.¹ Krishnamachari, B.² Jain, R.³

6
- 79953827701
- Distributed learning in multi-armed bandit with multiple players
- K. Liu and Q. Zhao, "Distributed learning in multi-armed bandit with multiple players," IEEE Transactions on Signal Processing, vol. 58, pp. 5667-5681, November, 2010.
- (2010) IEEE Transactions on Signal Processing , vol.58 , pp. 5667-5681
- Liu, K.¹ Zhao, Q.²

7
- 79953194834
- Distributed algorithms for learning and cognitive medium access with logarithmic regret
- A. Anandkumar, N. Michael, A. Tang, and A. Swami, "Distributed algorithms for learning and cognitive medium access with logarithmic regret," IEEE JSAC on Advances in Cognitive Radio Networking and Communications, April, 2011.
- (2011) IEEE JSAC on Advances in Cognitive Radio Networking and Communications
- Anandkumar, A.¹ Michael, N.² Tang, A.³ Swami, A.⁴

8
- 34249831790
- Auction algorithms for network flow problems: A tutorial introduction
- D. P. Bertsekas, "Auction algorithms for network flow problems: A tutorial introduction," Computational Optimization and Applications, vol. 1, pp. 7-66, 1992.
- (1992) Computational Optimization and Applications , vol.1 , pp. 7-66
- Bertsekas, D.P.¹

9
- 0003983125
- Springer
- D. Pollard, "Convergence of stochastic processes," Springer, 1984.
- (1984) Convergence of Stochastic Processes
- Pollard, D.¹

10
- 84874259376
- D. Kalathil, N. Nayyar, and R. Jain, "Decentralized learning for multi-player multi-armed bandits," Submitted, available at: http://arxiv.org/abs/1206.3582, June, 2012.
- (2012) Decentralized Learning for Multi-player Multi-armed Bandits
- Kalathil, D.¹ Nayyar, N.² Jain, R.³

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.