SCOPUS Information Retrieval Platform

Volume 19, 2011, Pages 497-514

A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences

a INRIA (France)

Author keywords

Finite time analysis; Kullback Leibler divergence; Multi armed bandit problem; Sanov's lemma

Indexed keywords

STATISTICS;

FINITE SUPPORTS; FINITE-TIME ANALYSIS; KULLBACK LEIBLER DIVERGENCE; LOWER BOUNDS; MULTI ARMED BANDIT; MULTI-ARMED BANDIT PROBLEM; SANOV'S LEMMA;

ALGORITHMS;

EID: 84874038864 PISSN: 15324435 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Conference Paper

Times cited : (90)

References (15)

1
- 62949181077
- Exploration-exploitation trade-off using variance estimates in multi-armed bandits
- J-Y. Audibert, R. Munos, and C. Szepesvari. Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science, 410:1876-1902, 2009.
- (2009) Theoretical Computer Science , vol.410 , pp. 1876-1902
- Audibert, J.-Y.¹ Munos, R.² Szepesvari, C.³

2
- 78649420293
- Regret bounds and minimax policies under partial monitoring
- J.Y. Audibert and S. Bubeck. Regret bounds and minimax policies under partial monitoring. Journal of Machine Learning Research, 11:2635-2686, 2010.
- (2010) Journal of Machine Learning Research , vol.11 , pp. 2635-2686
- Audibert, J.Y.¹ Bubeck, S.²

3
- 77957337199
- UCB revisited: Improved regret bounds for the stochastic multiarmed bandit problem
- P. Auer and R. Ortner. UCB revisited: Improved regret bounds for the stochastic multiarmed bandit problem. Periodica Mathematica Hungarica, 61(1-2):555, 2010.
- (2010) Periodica Mathematica Hungarica , vol.61 , Issue.1-2 , pp. 555
- Auer, P.¹ Ortner, R.²

6
- 0004269083
- Springer
- Y. Chow and H. Teicher. Probability Theory. Springer, 1988.
- (1988) Probability Theory
- Chow, Y.¹ Teicher, H.²

7
- 0040108303
- Mesures dominantes et théorème de Sanov
- I. H. Dinwoodie. Mesures dominantes et théorème de Sanov. Annales de l'Institut Henri Poincaré - Probabilités et Statistiques, 28(3):365-373, 1992.
- (1992) Annales de L'Institut Henri Poincaré - Probabilités et Statistiques , vol.28 , Issue.3 , pp. 365-373
- Dinwoodie, I.H.¹

8
- 84898437486
- PhD thesis, Télécom ParisTech
- S. Filippi. Stratégies optimistes en apprentissage par renforcement. PhD thesis, Télécom ParisTech, 2010.
- (2010) Stratégies Optimistes en Apprentissage Par Renforcement
- Filippi, S.¹

9
- 84863920694
- The KL-UCB algorithm for bounded stochastic bandits and beyond
- A. Garivier and O. Cappé. The KL-UCB algorithm for bounded stochastic bandits and beyond. In Proceedings of COLT, 2011.
- (2011) Proceedings of COLT
- Garivier, A.¹ Cappé, O.²

11
- 84898077171
- An asymptotically optimal bandit algorithm for bounded support models
- J. Honda and A. Takemura. An asymptotically optimal bandit algorithm for bounded support models. In Proceedings of COLT, pages 67-79, 2010a.
- (2010) Proceedings of COLT , pp. 67-79
- Honda, J.¹ Takemura, A.²

12
- 84898455666
- arXiv:0905.2776
- J. Honda and A. Takemura. An asymptotically optimal policy for finite support models in the multiarmed bandit problem. arXiv:0905.2776, 2010b.
- (2010) An Asymptotically Optimal Policy for Finite Support Models in the Multiarmed Bandit Problem
- Honda, J.¹ Takemura, A.²

13
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

14
- 84892903032
- O.-A. Maillard, R. Munos, and G. Stoltz. A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. 2011. URL http://hal.archives-ouvertes.fr/inria-00574987/.
- (2011) A Finite-time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
- Maillard, O.-A.¹ Munos, R.² Stoltz, G.³

15
- 84966203785
- Some aspects of the sequential design of experiments
- H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.
- (1952) Bulletin of the American Mathematical Society , vol.58 , pp. 527-535
- Robbins, H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.