SCOPUS Information Retrieval Platform

Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011

Volume , Issue , 2011, Pages

Improved algorithms for linear stochastic bandits

(3) Abbasi Yadkori, Yasin (a); Pál, Dávid (a); Szepesvári, Csaba (a)

(a) UNIVERSITY OF ALBERTA (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; PROBABILITY;

BANDIT PROBLEMS; CONFIDENCE SETS; EMPIRICAL PERFORMANCE; HIGH PROBABILITY; IMPROVED * ALGORITHM; MULTIARMED BANDIT PROBLEMS (MABP); PERFORMANCE OF ALGORITHM; REGRET BOUNDS; SIMPLE MODIFICATIONS; STOCHASTICS;

STOCHASTIC SYSTEMS;

EID: 85162561761 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (1966)

References (27)

1
- 84860615664
- Forced-exploration based algorithms for playing in stochastic linear bandits
- Y. Abbasi-Yadkori, A. Antos, and Cs. Szepesvári. Forced-exploration based algorithms for playing in stochastic linear bandits. In COLT Workshop on On-line Learning with Limited Feedback, 2009.
- (2009) COLT Workshop on On-line Learning with Limited Feedback
- Abbasi-Yadkori, Y.¹ Antos, A.² Szepesvári, Cs.³

2
- 0344118814
- Reinforcement learning with immediate rewards and linear hypotheses
- N. Abe, A. W. Biermann, and P. M. Long. Reinforcement learning with immediate rewards and linear hypotheses. Algorithmica, 37:263-293, 2003.
- (2003) Algorithmica , vol.37 , pp. 263-293
- Abe, N.¹ Biermann, A.W.² Long, P.M.³

3
- 77953287501
- Active learning in heteroscedastic noise
- A. Antos, V. Grover, and Cs. Szepesvári. Active learning in heteroscedastic noise. Theoretical Computer Science, 411(29-30):2712-2728, 2010.
- (2010) Theoretical Computer Science , vol.411 , Issue.29-30 , pp. 2712-2728
- Antos, A.¹ Grover, V.² Szepesvári, Cs.³

4
- 62949181077
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- J.-Y. Audibert, R. Munos, and Csaba Szepesvári. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theoretical Computer Science, 410(19):1876-1902, 2009.
- (2009) Theoretical Computer Science , vol.410 , Issue.19 , pp. 1876-1902
- Audibert, J.-Y.¹ Munos, R.² Szepesvári, C.³

5
- 0034497786
- Using upper confidence bounds for online learning
- P. Auer. Using upper confidence bounds for online learning. In FOCS, pages 270-279, 2000.
- (2000) FOCS , pp. 270-279
- Auer, P.¹

6
- 84860601617
- Using confidence bounds for exploitation-exploration trade-offs
- P. Auer. Using confidence bounds for exploitation-exploration trade-offs. JMLR, 2002.
- (2002) JMLR
- Auer, P.¹

7
- 0036568025
- Finite time analysis of the multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002.
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

8
- 77952027689
- Online optimization in X-armed bandits
- S. Bubeck, R. Munos, G. Stoltz, and Cs. Szepesvári. Online optimization in X-armed bandits. In NIPS, pages 201-208, 2008.
- (2008) NIPS , pp. 201-208
- Bubeck, S.¹ Munos, R.² Stoltz, G.³ Szepesvári, C.⁴

9
- 84926078662
- N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

10
- 84888139304
- Contextual bandits with linear payoff functions
- W. Chu, L. Li, L. Reyzin, and R. E. Schapire. Contextual bandits with linear payoff functions. In AISTATS, 2011.
- (2011) AISTATS
- Chu, W.¹ Li, L.² Reyzin, L.³ Schapire, R.E.⁴

11
- 70349275222
- Bandit algorithms for tree search
- P.-A. Coquelin and R. Munos. Bandit algorithms for tree search. In UAI, 2007.
- (2007) UAI
- Coquelin, P.-A.¹ Munos, R.²

12
- 84898072179
- Stochastic linear optimization under bandit feedback
- Rocco Servedio and Tong Zhang, editors
- V. Dani, T. P. Hayes, and S. M. Kakade. Stochastic linear optimization under bandit feedback. In Rocco Servedio and Tong Zhang, editors, COLT, pages 355-366, 2008.
- (2008) COLT , pp. 355-366
- Dani, V.¹ Hayes, T.P.² Kakade, S.M.³

13
- 4544277579
- Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws
- V. H. de la Peña, M. J. Klass, and T. L. Lai. Self-normalized processes: exponential inequalities, moment bounds and iterated logarithm laws. Annals of Probability, 32(3):1902-1933, 2004.
- (2004) Annals of Probability , vol.32 , Issue.3 , pp. 1902-1933
- De La Peña, V.H.¹ Klass, M.J.² Lai, T.L.³

14
- 70349472792
- Springer
- V. H. de la Peña, T. L. Lai, and Q.-M. Shao. Self-normalized processes: Limit theory and Statistical Applications. Springer, 2009.
- (2009) Self-normalized Processes: Limit Theory and Statistical Applications
- De La Peña, V.H.¹ Lai, T.L.² Shao, Q.-M.³

15
- 84875634609
- Robust selective sampling from single and multiple teachers
- O. Dekel, C. Gentile, and K. Sridharan. Robust selective sampling from single and multiple teachers. In COLT, 2010.
- (2010) COLT
- Dekel, O.¹ Gentile, C.² Sridharan, K.³

16
- 0002384441
- On tail probabilities for martingales
- D. A. Freedman. On tail probabilities for martingales. The Annals of Probability, 3(1):100-118, 1975.
- (1975) The Annals of Probability , vol.3 , Issue.1 , pp. 100-118
- Freedman, D.A.¹

17
- 70049104217
- On upper-confidence bound policies for non-stationary bandit problems
- A. Garivier and E. Moulines. On upper-confidence bound policies for non-stationary bandit problems. Technical report, LTCI, 2008.
- (2008) Technical Report LTCI
- Garivier, A.¹ Moulines, E.²

18
- 84856454906
- Regret bounds for sleeping experts and bandits
- R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. Regret bounds for sleeping experts and bandits. Machine learning, pages 1-28, 2008.
- (2008) Machine Learning , pp. 1-28
- Kleinberg, R.¹ Niculescu-Mizil, A.² Sharma, Y.³

19
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

20
- 0000258837
- Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems
- T. L. Lai and C. Z. Wei. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. The Annals of Statistics, 10(1):154-166, 1982.
- (1982) The Annals of Statistics , vol.10 , Issue.1 , pp. 154-166
- Lai, T.L.¹ Wei, C.Z.²

21
- 0018190841
- Strong consistency of least squares estimates in multiple regression
- T. L. Lai, H. Robbins, and C. Z. Wei. Strong consistency of least squares estimates in multiple regression. Proceedings of the National Academy of Sciences, 75(7):3034-3036, 1979.
- (1979) Proceedings of the National Academy of Sciences , vol.75 , Issue.7 , pp. 3034-3036
- Lai, T.L.¹ Robbins, H.² Wei, C.Z.³

22
- 77954641643
- A contextual-bandit approach to personalized news article recommendation
- ACM
- L. Li, W. Chu, J. Langford, and R. E. Schapire. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International Conference on World Wide Web (WWW 2010), pages 661-670. ACM, 2010.
- (2010) Proceedings of the 19th International Conference on World Wide Web (WWW 2010) , pp. 661-670
- Li, L.¹ Chu, W.² Langford, J.³ Schapire, R.E.⁴

23
- 56449108844
- V. Mnih, Cs. Szepesvári, and J.-Y. Audibert. Empirical Bernstein stopping. pages 672-679, 2008.
- (2008) Empirical Bernstein Stopping , pp. 672-679
- Mnih, V.¹ Szepesvári, Cs.² Audibert, J.-Y.³

24
- 84966203785
- Some aspects of the sequential design of experiments
- H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.
- (1952) Bulletin of the American Mathematical Society , vol.58 , pp. 527-535
- Robbins, H.¹

25
- 77953111834
- Linearly parameterized bandits
- P. Rusmevichientong and J. N. Tsitsiklis. Linearly parameterized bandits. Mathematics of Operations Research, 35(2):395-411, 2010.
- (2010) Mathematics of Operations Research , vol.35 , Issue.2 , pp. 395-411
- Rusmevichientong, P.¹ Tsitsiklis, J.N.²

26
- 0004245243
- Academic Press
- G. W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, 1990.
- (1990) Matrix Perturbation Theory
- Stewart, G.W.¹ Sun, J.-G.²

27
- 79958846996
- Exploring compact reinforcement-learning representations with linear regression
- AUAI Press
- T. J. Walsh, I. Szita, C. Diuk, and M. L. Littman. Exploring compact reinforcement-learning representations with linear regression. In UAI, pages 591-598. AUAI Press, 2009.
- (2009) UAI , pp. 591-598
- Walsh, T.J.¹ Szita, I.² Diuk, C.³ Littman, M.L.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.