Volume 30, 2013, Pages 228-251

Information complexity in bandit subset selection

Author keywords

KL divergence; Stochastic multi armed bandits; Subset selection

Indexed keywords

ARTIFICIAL INTELLIGENCE; SOFTWARE ENGINEERING;

EID: 84898028877     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Conference Paper
Times cited: 155

References (20)
  • 2
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2): 235-256, 2002.
    • (2002) Machine Learning , vol.47 , Issue.2 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 7
    • 33745295134
    • Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
    • E. Even-Dar, S. Mannor, and Y. Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7: 1079-1105, 2006.
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 1079-1105
    • Even-Dar, E.1    Mannor, S.2    Mansour, Y.3
  • 15
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T.L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1): 4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
    • Lai, T.L.1    Robbins, H.2
  • 16
    • 84874038864
    • A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
    • O-A. Maillard, R. Munos, and G. Stoltz. A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. In Conference On Learning Theory (COLT), 2011.
    • (2011) Conference on Learning Theory (COLT)
    • Maillard, O.-A.1    Munos, R.2    Stoltz, G.3
  • 17
    • 30044441333
    • The sample complexity of exploration in the multi-armed bandit problem
    • S. Mannor and J. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, pages 623-648, 2004.
    • (2004) Journal of Machine Learning Research , pp. 623-648
    • Mannor, S.1    Tsitsiklis, J.2
  • 18
    • 0031069121
    • The racing algorithm: Model selection for lazy learners
    • O. Maron and A. Moore. The racing algorithm: Model selection for lazy learners. Artificial Intelligence Review, 11(1-5): 113-131, 1997.
    • (1997) Artificial Intelligence Review , vol.11 , Issue.1-5 , pp. 113-131
    • Maron, O.1    Moore, A.2
  • 20
    • 0001395850
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
    • W.R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25: 285-294, 1933.
    • (1933) Biometrika , vol.25 , pp. 285-294
    • Thompson, W.R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.