SCOPUS Information Retrieval Platform

Journal of Machine Learning Research

Volume 22, 2012, Pages 592-600

On Bayesian upper confidence bounds for bandit problems

(3) Kaufmann, Emilie ᵃ; Cappé, Olivier ᵃ; Garivier, Aurélien ᵃ

ᵃ INSTITUT TELECOM (France)

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; BAYESIAN NETWORKS; STOCHASTIC SYSTEMS;

ASYMPTOTIC OPTIMALITY; BAYESIAN APPROACHES; MEASURE OF PERFORMANCE; MULTI ARMED BANDIT; POSTERIOR DISTRIBUTIONS; PRIOR DISTRIBUTION; SPARSITY CONSTRAINTS; UPPER CONFIDENCE BOUND;

PROBABILITY;

EID: 84954519509 PISSN: 15324435 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Conference Paper

Times cited: 271

References (18)

1
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, P. Fischer. Finite-time analysis of the multiarmed bandit problem Machine Learning 47, 235-256, 2002.
- (2002) Machine Learning , vol.47 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

2
- 84954468120
- J-Y Audibert, S. Bubeck. Regret Bounds and Minimax Policies under Partial Monitoring Journal of Machine Learning Research, 2010.
- (2010) Regret Bounds and Minimax Policies under Partial Monitoring Journal of Machine Learning Research
- Audibert, J.-Y.¹ Bubeck, S.²

3
- 0030159874
- Optimal adaptive policies for sequential allocation problems
- A.N. Burnetas and M.N. Katehakis. Optimal adaptive policies for sequential allocation problems. In Advances in Applied Mathematics, 17 (2): 122-142, 1996.
- (1996) Advances in Applied Mathematics , vol.17 , Issue.2 , pp. 122-142
- Burnetas, A.N.¹ Katehakis, M.N.²

4
- 84898072179
- Stochastic linear optimization under bandit feedback
- V. Dani, T.P. Hayes, S.M. Kakade. Stochastic linear optimization under bandit feedback. In Conference On Learning Theory COLT 2008.
- (2008) Conference on Learning Theory COLT
- Dani, V.¹ Hayes, T.P.² Kakade, S.M.³

5
- 84863920694
- The KL-UCB algorithm for bounded stochastic bandits and beyond
- A. Garivier, O. Cappé. The KL-UCB algorithm for bounded stochastic bandits and beyond In Conference On Learning Theory COLT , 2011.
- (2011) Conference on Learning Theory COLT
- Garivier, A.¹ Cappé, O.²

6
- 22544444317
- E. Frostig, G. Weiss, Four proofs of Gittins' multiarmed bandit theorem Preprint, 1999.
- (1999) Four Proofs of Gittins' Multiarmed Bandit Theorem Preprint
- Frostig, E.¹ Weiss, G.²

7
- 0000169010
- Bandit processes and dynamic allocation indices
- J.C. Gittins. Bandit processes and dynamic allocation indices. In Journal of the Royal Statistical Society Series B, 41 (2): 148-177, 1979.
- (1979) Journal of the Royal Statistical Society Series B , vol.41 , Issue.2 , pp. 148-177
- Gittins, J.C.¹

8
- 84891584370
- (2nd Edition) Wiley
- J. Gittins, K. Glazebrook and R. Weber. Multi-armed bandit allocation indices (2nd Edition) Wiley, 2011.
- (2011) Multi-armed Bandit Allocation Indices
- Gittins, J.¹ Glazebrook, K.² Weber, R.³

9
- 78549244167
- O.C. Granmo. Solving Two-Armed Bernoulli Bandit Problems Using a Bayesian Learning Automaton in International Journal of Intelligent Computing and Cybernetics (IJICC) 3 (2): 207-234, 2010.
- (2010) Solving Two-Armed Bernoulli Bandit Problems Using A Bayesian Learning Automaton in International Journal of Intelligent Computing and Cybernetics (IJICC) , vol.3 , Issue.2 , pp. 207-234
- Granmo, O.C.¹

10
- 84898077171
- An asymptotically optimal bandit algorithm for bounded support models
- T. Kalai and M. Mohri, editors
- J. Honda and A. Takemura. An asymptotically optimal bandit algorithm for bounded support models. In T. Kalai and M. Mohri, editors, Conference On Learning Theory COLT, 2010.
- (2010) Conference on Learning Theory COLT
- Honda, J.¹ Takemura, A.²

11
- 0000854435
- Adaptive treatment allocation and the multi-armed bandit problem
- T.L. Lai. Adaptive treatment allocation and the multi-armed bandit problem. In Annals of Statistics 15 (3): 1091-1114, 1987.
- (1987) Annals of Statistics , vol.15 , Issue.3 , pp. 1091-1114
- Lai, T.L.¹

12
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T.L. Lai, H. Robbins. Asymptotically efficient adaptive allocation rules. In Advances in Applied Mathematics 6 (1): 4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

13
- 79955755016
- J. Niño-Mora. Computing a classic index for Finite-Horizon bandits Journal on Computing 23 (2): 254-267, 2011
- (2011) Computing A Classic Index for Finite-Horizon Bandits Journal on Computing , vol.23 , Issue.2 , pp. 254-267
- Niño-Mora, J.¹

14
- 50149089442
- Simulation studies of multi-armed bandits with covariates
- Cambridge, UK
- N.G. Pavlidis, D.K. Tasoulis and D.J. Hand. Simulation studies of multi-armed bandits with covariates In Proc. 10th International Conference on Computer Modelling, Cambridge, UK, 2008.
- (2008) Proc. 10th International Conference on Computer Modelling
- Pavlidis, N.G.¹ Tasoulis, D.K.² Hand, D.J.³

15
- 84874038864
- A finite-time analysis of Multi-armed bandits problems with Kullback-Leibler Divergence
- O. Maillard, R. Munos, G. Stoltz. A finite-time analysis of Multi-armed bandits problems with Kullback-Leibler Divergence In Conference On Learning Theory COLT , 2011.
- (2011) Conference on Learning Theory COLT
- Maillard, O.¹ Munos, R.² Stoltz, G.³

16
- 77953111834
- Linearly parameterized bandits
- P. Rusmevichientong, J.N. Tsitsiklis. Linearly parameterized bandits In Mathematics of Operations Research 32 (2): 395-411, 2010.
- (2010) Mathematics of Operations Research , vol.32 , Issue.2 , pp. 395-411
- Rusmevichientong, P.¹ Tsitsiklis, J.N.²

17
- 77956501313
- Gaussian process optimization in the bandit setting: No regret and experimental design
- N. Srinivas, A. Krause, S. Kakade, and M. Seeger. Gaussian process optimization in the bandit setting: No regret and experimental design. In International Conference on Machine Learning ICML 10, 2010.
- (2010) International Conference on Machine Learning ICML , vol.10
- Srinivas, N.¹ Krause, A.² Kakade, S.³ Seeger, M.⁴

18
- 0001395850
- On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
- W.R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. In Biometrika 25: 285-294, 1933.
- (1933) Biometrika , vol.25 , pp. 285-294
- Thompson, W.R.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.