



Volume 23, 2012

Analysis of Thompson sampling for the multi-armed bandit problem

Author keywords

Bayesian algorithm; Multi-armed bandit; Online learning; Thompson sampling

Indexed keywords

DECISION THEORY; LEARNING ALGORITHMS; OPTIMIZATION; PROBABILITY; STOCHASTIC SYSTEMS

EID: 84874084136     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Conference Paper
Times cited: 425

References (14)
  • 1
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002.
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 2
    • 85162416700
    • An empirical evaluation of Thompson sampling
    • O. Chapelle and L. Li. An empirical evaluation of Thompson sampling. In NIPS, 2011.
    • (2011) NIPS
    • Chapelle, O.1    Li, L.2
  • 4
    • 84891584370
    • Multi-armed Bandit Allocation Indices
    • J. C. Gittins. Multi-armed Bandit Allocation Indices. Wiley Interscience Series in Systems and Optimization. John Wiley and Sons, 1989.
    • (1989) Multi-armed Bandit Allocation Indices
    • Gittins, J.C.1
  • 5
    • 77956543367
    • Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine
    • T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. In ICML, pages 13-20, 2010.
    • (2010) ICML , pp. 13-20
    • Graepel, T.1    Candela, J.Q.2    Borchert, T.3    Herbrich, R.4
  • 7
    • 0011027964
    • Monotone convergence of binomial probabilities and a generalization of Ramanujan's equation
    • K. Jogdeo and S. M. Samuels. Monotone convergence of binomial probabilities and a generalization of Ramanujan's equation. The Annals of Mathematical Statistics, (4):1191-1195, 1968.
    • (1968) The Annals of Mathematical Statistics , Issue.4 , pp. 1191-1195
    • Jogdeo, K.1    Samuels, S.M.2
  • 9
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
    • Lai, T.L.1    Robbins, H.2
  • 10
    • 84874038864
    • Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
    • O.-A. Maillard, R. Munos, and G. Stoltz. Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. In Conference on Learning Theory (COLT), 2011.
    • (2011) Conference on Learning Theory (COLT)
    • Maillard, O.-A.1    Munos, R.2    Stoltz, G.3
  • 14
    • 0001395850
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
    • W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3-4):285-294, 1933.
    • (1933) Biometrika , vol.25 , Issue.3-4 , pp. 285-294
    • Thompson, W.R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.