Volume 31, 2013, Pages 99-107

Further optimal regret bounds for Thompson sampling

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; ARTIFICIAL INTELLIGENCE; PROBABILITY; STATISTICS;

EID: 84898938874     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Conference Paper
Times cited: 379

References (24)
  • 2
    • 84886540275
    • Analysis of Thompson sampling for the multi-armed bandit problem
    • S. Agrawal and N. Goyal. Analysis of Thompson Sampling for the Multi-armed Bandit Problem. In COLT, 2012a.
    • (2012) COLT
    • Agrawal, S.1    Goyal, N.2
  • 4
    • 84898079018
    • Minimax policies for adversarial and stochastic bandits
    • J.-Y. Audibert and S. Bubeck. Minimax Policies for Adversarial and Stochastic Bandits. In COLT, 2009.
    • (2009) COLT
    • Audibert, J.-Y.1    Bubeck, S.2
  • 5
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3): 235-256, 2002.
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 6
    • 84943560912
    • Regret analysis of stochastic and nonstochastic multi-armed bandit problems
    • S. Bubeck and N. Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. CoRR, 2012.
    • (2012) CoRR
    • Bubeck, S.1    Cesa-Bianchi, N.2
  • 7
    • 85162416700
    • An empirical evaluation of Thompson sampling
    • O. Chapelle and L. Li. An Empirical Evaluation of Thompson Sampling. In NIPS, pages 2249-2257, 2011.
    • (2011) NIPS , pp. 2249-2257
    • Chapelle, O.1    Li, L.2
  • 8
    • 84897516898
    • Open problem: Regret bounds for Thompson sampling
    • O. Chapelle and L. Li. Open Problem: Regret Bounds for Thompson Sampling. In COLT, 2012.
    • (2012) COLT
    • Chapelle, O.1    Li, L.2
  • 11
    • 77956543367
    • Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine
    • T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-Scale Bayesian Click-Through rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine. In ICML, pages 13-20, 2010.
    • (2010) ICML , pp. 13-20
    • Graepel, T.1    Candela, J.Q.2    Borchert, T.3    Herbrich, R.4
  • 13
    • 3543140670
    • Dual weak pigeonhole principle, Boolean complexity, and derandomization
    • E. Jeřábek. Dual weak pigeonhole principle, Boolean complexity, and derandomization. Annals of Pure and Applied Logic, 129(1-3): 1-37, October 2004.
    • (2004) Annals of Pure and Applied Logic , vol.129 , Issue.1-3 , pp. 1-37
    • Jeřábek, E.1
  • 16
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6: 4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
    • Lai, T.L.1    Robbins, H.2
  • 17
    • 84874038864
    • Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
    • O.-A. Maillard, R. Munos, and G. Stoltz. Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. In Conference on Learning Theory (COLT), 2011.
    • (2011) Conference on Learning Theory (COLT)
    • Maillard, O.-A.1    Munos, R.2    Stoltz, G.3
  • 22
    • 14344258433
    • A Bayesian framework for reinforcement learning
    • M. J. A. Strens. A Bayesian Framework for Reinforcement Learning. In ICML, pages 943-950, 2000.
    • (2000) ICML , pp. 943-950
    • Strens, M.J.A.1
  • 23
    • 0001395850
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
    • W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3-4): 285-294, 1933.
    • (1933) Biometrika , vol.25 , Issue.3-4 , pp. 285-294
    • Thompson, W.R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.