SCOPUS Information Search Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 7568 LNAI, 2012, Pages 199-213

Thompson sampling: An asymptotically optimal finite-time analysis

(3) Kaufmann, Emilie (a); Korda, Nathaniel (b); Munos, Rémi (b)

a TELECOM PARISTECH (France)

b INRIA (France)

Author keywords

[No Author keywords available]

Indexed keywords

ASYMPTOTIC RATE; ASYMPTOTICALLY OPTIMAL; BERNOULLI; FINITE-TIME ANALYSIS; LOWER BOUNDS; MULTI-ARMED BANDIT PROBLEM; NUMERICAL COMPARISON; OPTIMAL POLICIES; OPTIMALITY; THOMPSON;

ARTIFICIAL INTELLIGENCE;

OPTIMIZATION;

EID: 84867888479 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-642-34106-9_18 Document Type: Conference Paper

Times cited: (486)

References (14)

1
- 84898052054
- Analysis of Thompson sampling for the multi-armed bandit problem
- Agrawal, S., Goyal, N.: Analysis of Thompson sampling for the multi-armed bandit problem. In: Conference on Learning Theory, COLT (2012)
- Conference on Learning Theory, COLT (2012)
- Agrawal, S.¹ Goyal, N.²

2
- 78649420293
- Regret bounds and minimax policies under partial monitoring
- Audibert, J.-Y., Bubeck, S.: Regret bounds and minimax policies under partial monitoring. Journal of Machine Learning Research 11, 2785-2836 (2010)
- (2010) Journal of Machine Learning Research , vol.11 , pp. 2785-2836
- Audibert, J.-Y.¹ Bubeck, S.²

3
- 62949181077
- Exploration-exploitation trade-off using variance estimates in multi-armed bandits
- Audibert, J.-Y., Munos, R., Szepesvári, C.: Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science 410(19), 1876-1902 (2009)
- (2009) Theoretical Computer Science , vol.410 , Issue.19 , pp. 1876-1902
- Audibert, J.-Y.¹ Munos, R.² Szepesvári, C.³

4
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2), 235-256 (2002)
- (2002) Machine Learning , vol.47 , Issue.2 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

5
- 85162416700
- An empirical evaluation of Thompson sampling
- Chapelle, O., Li, L.: An empirical evaluation of Thompson sampling. In: NIPS (2011)
- (2011) NIPS
- Chapelle, O.¹ Li, L.²

6
- 84863920694
- The KL-UCB algorithm for bounded stochastic bandits and beyond
- Garivier, A., Cappé, O.: The KL-UCB algorithm for bounded stochastic bandits and beyond. In: Conference on Learning Theory, COLT (2011)
- Conference on Learning Theory, COLT (2011)
- Garivier, A.¹ Cappé, O.²

7
- 78549244167
- Solving two-armed Bernoulli bandit problems using a Bayesian learning automaton
- Granmo, O.C.: Solving two-armed Bernoulli bandit problems using a Bayesian learning automaton. International Journal of Intelligent Computing and Cybernetics 3(2), 207-234 (2010)
- (2010) International Journal of Intelligent Computing and Cybernetics , vol.3 , Issue.2 , pp. 207-234
- Granmo, O.C.¹

8
- 84898077171
- An asymptotically optimal bandit algorithm for bounded support models
- Honda, J., Takemura, A.: An asymptotically optimal bandit algorithm for bounded support models. In: Conference on Learning Theory, COLT (2010)
- Conference on Learning Theory, COLT (2010)
- Honda, J.¹ Takemura, A.²

9
- 84867888879
- On Bayesian upper-confidence bounds for bandit problems
- Kaufmann, E., Garivier, A., Cappé, O.: On Bayesian upper-confidence bounds for bandit problems. In: AISTATS (2012)
- (2012) AISTATS
- Kaufmann, E.¹ Garivier, A.² Cappé, O.³

10
- 0002899547
- Asymptotically efficient adaptive allocation rules
- Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6(1), 4-22 (1985)
- (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

11
- 84874038864
- A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
- Maillard, O.-A., Munos, R., Stoltz, G.: A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. In: Conference on Learning Theory, COLT (2011)
- Conference on Learning Theory, COLT (2011)
- Maillard, O.-A.¹ Munos, R.² Stoltz, G.³

12
- 84864939787
- Optimistic Bayesian sampling in contextual bandit problems
- May, B.C., Korda, N., Lee, A., Leslie, D.: Optimistic Bayesian sampling in contextual bandit problems. Journal of Machine Learning Research 13, 2069-2106 (2012)
- (2012) Journal of Machine Learning Research , vol.13 , pp. 2069-2106
- May, B.C.¹ Korda, N.² Lee, A.³ Leslie, D.⁴

13
- 80054114465
- Deviations of Stochastic Bandit Regret
- Salomon, A., Audibert, J.-Y.: Deviations of Stochastic Bandit Regret. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds.) ALT 2011. LNCS, vol. 6925, pp. 159-173. Springer, Heidelberg (2011)
- (2011) LNCS , vol.6925 , pp. 159-173
- Salomon, A.¹ Audibert, J.-Y.²

14
- 0001395850
- On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
- Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285-294 (1933)
- (1933) Biometrika , vol.25 , pp. 285-294
- Thompson, W.R.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.