Volume 26, Issue 6, 2010, Pages 639-658

A modern Bayesian look at the multi-armed bandit

Author keywords

Bayesian adaptive design; exploration vs exploitation; probability matching; sequential design

Indexed keywords

ADAPTIVE DESIGNS; BAYESIAN; BAYESIAN COMPUTATION; BAYESIAN POSTERIOR PROBABILITIES; EXPERIMENTAL DESIGN; EXPLORATION VS EXPLOITATION; MULTI ARMED BANDIT; PROBABILITY MATCHING; SEQUENTIAL DESIGN; UNKNOWN PARAMETERS

EID: 78650505735     PISSN: 15241904     EISSN: 15264025     Source Type: Journal    
DOI: 10.1002/asmb.874     Document Type: Article
Times cited: 370

References (27)
  • 1
    • 78650472752
    • Available from
    • Google. Available from:, 2010.
    • (2010) Google
  • 4
    • 0004870746
    • A problem in the sequential design of experiments
    • Bellman RE. A problem in the sequential design of experiments. Sankhya Series A 1956; 30: 221-252.
    • (1956) Sankhya Series A, vol. 30, pp. 221-252
    • Bellman, R.E.
  • 5
    • 0009943101
    • Incomplete learning from endogenous data in dynamic allocation
    • Brezzi M, Lai TL. Incomplete learning from endogenous data in dynamic allocation. Econometrica 2000; 68 (6): 1511-1516.
    • (2000) Econometrica, vol. 68, issue 6, pp. 1511-1516
    • Brezzi, M.; Lai, T.L.
  • 6
    • 0001395850
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
    • Thompson WR. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 1933; 25: 285-294.
    • (1933) Biometrika, vol. 25, pp. 285-294
    • Thompson, W.R.
  • 7
  • 9
    • 84972528615
    • Bayesian experimental design: A review
    • Chaloner K, Verdinelli I. Bayesian experimental design: a review. Statistical Science 1995; 10: 273-304.
    • (1995) Statistical Science, vol. 10, pp. 273-304
    • Chaloner, K.; Verdinelli, I.
  • 10
    • 0000576595
    • Markov chains for exploring posterior distributions (disc: P1728-1762)
    • Tierney L. Markov chains for exploring posterior distributions (disc: P1728-1762). The Annals of Statistics 1994; 22: 1701-1728.
    • (1994) The Annals of Statistics, vol. 22, pp. 1701-1728
    • Tierney, L.
  • 15
    • 21144463800
    • The learning component of dynamic allocation indices
    • Gittins J, Wang Y-G. The learning component of dynamic allocation indices. The Annals of Statistics 1992; 20: 1625-1636.
    • (1992) The Annals of Statistics, vol. 20, pp. 1625-1636
    • Gittins, J.; Wang, Y.-G.
  • 18
    • 0000854435
    • Adaptive treatment allocation and the multi-armed bandit problem
    • Lai T-L. Adaptive treatment allocation and the multi-armed bandit problem. The Annals of Statistics 1987; 15 (3): 1091-1114.
    • (1987) The Annals of Statistics, vol. 15, issue 3, pp. 1091-1114
    • Lai, T.-L.
  • 19
    • 0000616723
    • Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
    • Agrawal R. Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability 1995; 27: 1054-1078.
    • (1995) Advances in Applied Probability, vol. 27, pp. 1054-1078
    • Agrawal, R.
  • 20
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • Auer P, Cesa-Bianchi N, Fischer P. Finite-time analysis of the multiarmed bandit problem. Machine Learning 2002; 47: 235-256.
    • (2002) Machine Learning, vol. 47, pp. 235-256
    • Auer, P.; Cesa-Bianchi, N.; Fischer, P.
  • 23
    • 0036108219
    • Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
    • Yang Y, Zhu D. Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates. The Annals of Statistics 2002; 30: 100-121.
    • (2002) The Annals of Statistics, vol. 30, pp. 100-121
    • Yang, Y.; Zhu, D.
  • 25
    • 0001043843
    • Restless bandits: Activity allocation in a changing world
    • Whittle P. Restless bandits: activity allocation in a changing world. Journal of Applied Probability 1988; 25A: 287-298.
    • (1988) Journal of Applied Probability, vol. 25A, pp. 287-298
    • Whittle, P.
  • 27
    • 0000595228
    • Arm-acquiring bandits
    • Whittle P. Arm-acquiring bandits. The Annals of Probability 1981; 9 (2): 284-292.
    • (1981) The Annals of Probability, vol. 9, issue 2, pp. 284-292
    • Whittle, P.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.