SCOPUS Information Search Platform

Journal of Machine Learning Research

Volume 31, 2013, Pages 99-107

Further optimal regret bounds for Thompson sampling

(2) Agrawal, Shipra (a); Goyal, Navin (a)

(a) MICROSOFT RESEARCH (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; ARTIFICIAL INTELLIGENCE; PROBABILITY; STATISTICS;

ANALYSIS TECHNIQUES; BETA DISTRIBUTIONS; EMPIRICAL PERFORMANCE; MULTI-ARMED BANDIT PROBLEM; OPTIMAL PROBLEMS; RANDOMIZED ALGORITHMS; STATE-OF-THE-ART METHODS; THOMPSON SAMPLINGS;

OPTIMIZATION;

EID: 84898938874; PISSN: 15324435; EISSN: 15337928; Source Type: Journal
DOI: None; Document Type: Conference Paper

Times cited: (379)

References (24)

1
- 0003851729
- Dover, New York
- M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York, 1964.
- (1964) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables.
- Abramowitz, M.¹ Stegun, I.A.²

2
- 84886540275
- Analysis of Thompson sampling for the multi-armed bandit problem
- S. Agrawal and N. Goyal. Analysis of Thompson Sampling for the Multi-armed Bandit Problem. In COLT, 2012a.
- (2012) COLT
- Agrawal, S.¹ Goyal, N.²

3
- 84979873896
- Manuscript
- S. Agrawal and N. Goyal. Thompson sampling for contextual bandits with linear payoffs. Manuscript, 2012b.
- (2012) Thompson Sampling for Contextual Bandits with Linear Payoffs.
- Agrawal, S.¹ Goyal, N.²

4
- 84898079018
- Minimax policies for adversarial and stochastic bandits
- J.-Y. Audibert and S. Bubeck. Minimax Policies for Adversarial and Stochastic Bandits. In COLT, 2009.
- (2009) COLT
- Audibert, J.-Y.¹ Bubeck, S.²

5
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3): 235-256, 2002.
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

6
- 84943560912
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- S. Bubeck and N. Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. CoRR, 2012.
- (2012) CoRR
- Bubeck, S.¹ Cesa-Bianchi, N.²

7
- 85162416700
- An empirical evaluation of Thompson sampling
- O. Chapelle and L. Li. An Empirical Evaluation of Thompson Sampling. In NIPS, pages 2249-2257, 2011.
- (2011) NIPS , pp. 2249-2257
- Chapelle, O.¹ Li, L.²

8
- 84897516898
- Open problem: Regret bounds for Thompson sampling
- O. Chapelle and L. Li. Open Problem: Regret Bounds for Thompson Sampling. In COLT, 2012.
- (2012) COLT
- Chapelle, O.¹ Li, L.²

9
- 84898437076
- The KL-UCB algorithm for bounded stochastic bandits and beyond
- A. Garivier and O. Cappé. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond. In Conference on Learning Theory (COLT), 2011.
- (2011) Conference on Learning Theory (COLT)
- Garivier, A.¹ Cappé, O.²

10
- 0003848944
- Multi-armed bandit allocation indices
- John Wiley and Sons
- J. C. Gittins. Multi-armed Bandit Allocation Indices. Wiley Interscience Series in Systems and Optimization. John Wiley and Sons, 1989.
- (1989) Wiley Interscience Series in Systems and Optimization
- Gittins, J.C.¹

11
- 77956543367
- Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine
- T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-Scale Bayesian Click-Through rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine. In ICML, pages 13-20, 2010.
- (2010) ICML , pp. 13-20
- Graepel, T.¹ Candela, J.Q.² Borchert, T.³ Herbrich, R.⁴

12
- 78549244167
- Solving two-armed Bernoulli bandit problems using a Bayesian learning automaton
- O.-C. Granmo. Solving Two-Armed Bernoulli Bandit Problems Using a Bayesian Learning Automaton. International Journal of Intelligent Computing and Cybernetics (IJICC), 3(2): 207-234, 2010.
- (2010) International Journal of Intelligent Computing and Cybernetics (IJICC) , vol.3 , Issue.2 , pp. 207-234
- Granmo, O.-C.¹

13
- 3543140670
- Dual weak pigeonhole principle, Boolean complexity, and derandomization
- October
- E. Jeřábek. Dual weak pigeonhole principle, Boolean complexity, and derandomization. Annals of Pure and Applied Logic, 129(1-3): 1-37, October 2004.
- (2004) Annals of Pure and Applied Logic , vol.129 , Issue.1-3 , pp. 1-37
- Jeřábek, E.¹

14
- 84867888879
- On Bayesian upper confidence bounds for bandit problems
- E. Kaufmann, O. Cappé, and A. Garivier. On Bayesian Upper Confidence Bounds for Bandit Problems. In Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2012a.
- (2012) Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS)
- Kaufmann, E.¹ Cappé, O.² Garivier, A.³

15
- 84887459202
- Thompson sampling: An optimal finite time analysis
- E. Kaufmann, N. Korda, and R. Munos. Thompson Sampling: An Optimal Finite Time Analysis. In International Conference on Algorithmic Learning Theory (ALT), 2012b.
- (2012) International Conference on Algorithmic Learning Theory (ALT)
- Kaufmann, E.¹ Korda, N.² Munos, R.³

16
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6: 4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

17
- 84874038864
- Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
- O.-A. Maillard, R. Munos, and G. Stoltz. Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences. In Conference on Learning Theory (COLT), 2011.
- (2011) Conference on Learning Theory (COLT)
- Maillard, O.-A.¹ Munos, R.² Stoltz, G.³

18
- 84860647553
- Technical Report 11:02, Statistics Group, Department of Mathematics, University of Bristol
- B. C. May and D. S. Leslie. Simulation studies in optimistic Bayesian sampling in contextual-bandit problems. Technical Report 11: 02, Statistics Group, Department of Mathematics, University of Bristol, 2011.
- (2011) Simulation Studies in Optimistic Bayesian Sampling in Contextual-Bandit Problems
- May, B.C.¹ Leslie, D.S.²

19
- 84860620509
- Technical Report 11:01, Statistics Group, Department of Mathematics, University of Bristol
- B. C. May, N. Korda, A. Lee, and D. S. Leslie. Optimistic Bayesian sampling in contextual-bandit problems. Technical Report 11: 01, Statistics Group, Department of Mathematics, University of Bristol, 2011.
- (2011) Optimistic Bayesian Sampling in Contextual-Bandit Problems
- May, B.C.¹ Korda, N.² Lee, A.³ Leslie, D.S.⁴

20
- 77957883511
- Linearly parametrized bandits
- P. A. Ortega and D. A. Braun. Linearly parametrized bandits. Journal of Artificial Intelligence Research, 38: 475-511, 2010.
- (2010) Journal of Artificial Intelligence Research , vol.38 , pp. 475-511
- Ortega, P.A.¹ Braun, D.A.²

21
- 78650505735
- A modern Bayesian look at the multi-armed bandit
- S. Scott. A modern Bayesian look at the multi-armed bandit. Applied Stochastic Models in Business and Industry, 26: 639-658, 2010.
- (2010) Applied Stochastic Models in Business and Industry , vol.26 , pp. 639-658
- Scott, S.¹

22
- 14344258433
- A Bayesian framework for reinforcement learning
- M. J. A. Strens. A Bayesian Framework for Reinforcement Learning. In ICML, pages 943-950, 2000.
- (2000) ICML , pp. 943-950
- Strens, M.J.A.¹

23
- 0001395850
- On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
- W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3-4): 285-294, 1933.
- (1933) Biometrika , vol.25 , Issue.3-4 , pp. 285-294
- Thompson, W.R.¹

24
- 0008954974
- PhD thesis, Department of Artificial Intelligence, University of Edinburgh
- J. Wyatt. Exploration and Inference in Learning from Reinforcement. PhD thesis, Department of Artificial Intelligence, University of Edinburgh, 1997.
- (1997) Exploration and Inference in Learning from Reinforcement.
- Wyatt, J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.