SCOPUS Information Retrieval Platform

Journal of Machine Learning Research

Volume 22, 2012, Pages 1-9

Online-to-confidence-set conversions and application to sparse stochastic bandits

(3) Abbasi-Yadkori, Yasin a; Pál, Dávid b; Szepesvári, Csaba a

a GOOGLE INC (United States)

b UNIVERSITY OF ALBERTA (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; ARTIFICIAL INTELLIGENCE; LEARNING ALGORITHMS; STOCHASTIC SYSTEMS;

EXPONENTIATED GRADIENT ALGORITHMS; LINEAR COMBINATIONS; LINEAR PREDICTION; LINEAR PREDICTORS; NOVEL TECHNIQUES; ON-LINE ALGORITHMS; ONLINE LEARNING ALGORITHMS; ONLINE PREDICTION;

E-LEARNING;

EID: 84908661477 PISSN: 15324435 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Conference Paper

Times cited : (153)

References (36)

1
- 84860615664
- Forced-exploration based algorithms for playing in stochastic linear bandits
- Yasin Abbasi-Yadkori, Andras Antos, and Csaba Szepesvari. Forced-exploration based algorithms for playing in stochastic linear bandits. In COLT Workshop on On-line Learning with Limited Feedback, 2009.
- (2009) COLT Workshop on On-line Learning with Limited Feedback
- Abbasi-Yadkori, Y.¹ Antos, A.² Szepesvari, C.³

2
- 85162561761
- Improved algorithms for linear stochastic bandits
- Yasin Abbasi-Yadkori, David Pal, and Csaba Szepesvári. Improved algorithms for linear stochastic bandits. In Advances in Neural Information Processing Systems, 2011a.
- (2011) Advances in Neural Information Processing Systems
- Abbasi-Yadkori, Y.¹ Pal, D.² Szepesvári, C.³

3
- 84954256963
- Online least squares estimation with self-normalized processes: An application to bandit problems
- Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvari. Online least squares estimation with self-normalized processes: An application to bandit problems. arXiv preprint http://arxiv.org/abs/1102.2670, 2011b.
- (2011) arXiv Preprint
- Abbasi-Yadkori, Y.¹ Pál, D.² Szepesvari, C.³

4
- 84947554090
- András Antos and Csaba Szepesvári. Personal Communication, 2009.
- (2009) Personal Communication
- Antos, A.¹ Szepesvári, C.²

5
- 62949181077
- Exploration-exploitation trade-off using variance estimates in multi-armed bandits
- Jean-Yves Audibert, Remi Munos, and Csaba Szepesvari. Exploration-exploitation trade-off using variance estimates in multi-armed bandits. In Theoretical Computer Science-2008, 2008.
- (2008) Theoretical Computer Science-2008
- Audibert, J.Y.¹ Munos, R.² Szepesvari, C.³

6
- 62949181077
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Jean-Yves Audibert, Rémi Munos, and Csaba Szepesvari. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theoretical Computer Science, 410(19):1876-1902, 2009.
- (2009) Theoretical Computer Science , vol.410 , Issue.19 , pp. 1876-1902
- Audibert, J.¹ Munos, R.² Szepesvari, C.³

7
- 0041966002
- Using confidence bounds for exploitation-exploration trade-offs
- Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
- Auer, P.¹

8
- 0036568025
- Finite time analysis of the multiarmed bandit problem
- Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. Finite time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002a.
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

9
- 0036477185
- Adaptive and self-confident on-line learning algorithms
- Peter Auer, Nicolo Cesa-Bianchi, and Claudio Gentile. Adaptive and self-confident on-line learning algorithms. Journal of Computer and System Sciences, 64(1):48-75, 2002b.
- (2002) Journal of Computer and System Sciences , vol.64 , Issue.1 , pp. 48-75
- Auer, P.¹ Cesa-Bianchi, N.² Gentile, C.³

10
- 80053161827
- REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs
- Peter L. Bartlett and Ambuj Tewari. REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs. In Proceedings of the 25th Annual Conference on Uncertainty in Artificial Intelligence, 2009.
- (2009) Proceedings of the 25th Annual Conference on Uncertainty in Artificial Intelligence
- Bartlett, P.L.¹ Tewari, A.²

11
- 84878104490
- Compressive sampling
- Emmanuel J. Candès. Compressive sampling. In Proceedings of the International Congress of Mathematicians, volume 3, pages 1433-1452, 2006.
- (2006) Proceedings of the International Congress of Mathematicians , vol.3 , pp. 1433-1452
- Candès, E.J.¹

12
- 84954336966
- Bandit theory meets compressed sensing for high dimensional stochastic linear bandit
- Alexandra Carpentier and Remi Munos. Bandit theory meets compressed sensing for high dimensional stochastic linear bandit. In Proceedings of fifteenth international conference on Artificial Intelligence and Statistics, 2012.
- (2012) Proceedings of Fifteenth International Conference on Artificial Intelligence and Statistics
- Carpentier, A.¹ Munos, R.²

13
- 84926078662
- Cambridge University Press, New York, NY, USA
- Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, New York, NY, USA, 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

14
- 4544304381
- On the generalization ability of on-line learning algorithms
- Nicolò Cesa-Bianchi, Alex Conconi, and Claudio Gentile. On the generalization ability of on-line learning algorithms. IEEE Transactions on Information Theory, 50(9):2050-2057, 2004.
- (2004) IEEE Transactions on Information Theory , vol.50 , Issue.9 , pp. 2050-2057
- Cesa-Bianchi, N.¹ Conconi, A.² Gentile, C.³

15
- 38049046503
- Aggregation by exponential weighting and sharp oracle inequalities
- Arnak S. Dalalyan and Alexandre B. Tsybakov. Aggregation by exponential weighting and sharp oracle inequalities. In Proceedings of the 20th Annual Conference on Learning Theory, pages 97-111, 2007.
- (2007) Proceedings of the 20th Annual Conference on Learning Theory , pp. 97-111
- Dalalyan, A.S.¹ Tsybakov, A.B.²

16
- 84898072179
- Stochastic linear optimization under bandit feedback
- Rocco Servedio and Tong Zhang, editors
- Varsha Dani, Thomas P. Hayes, and Sham M. Kakade. Stochastic linear optimization under bandit feedback. In Rocco Servedio and Tong Zhang, editors, Proceedings of the 21st Annual Conference on Learning Theory (COLT 2008), pages 355-366, 2008.
- (2008) Proceedings of the 21st Annual Conference on Learning Theory (COLT 2008) , pp. 355-366
- Dani, V.¹ Hayes, T.P.² Kakade, S.M.³

17
- 56449091064
- Data-driven online to batch conversions
- Ofer Dekel and Yoram Singer. Data-driven online to batch conversions. NIPS 2005, 18:267, 2006.
- (2006) NIPS 2005 , vol.18 , pp. 267
- Dekel, O.¹ Singer, Y.²

18
- 84937398609
- PAC bounds for multi-armed bandit and Markov decision processes
- Eyal Even-Dar, Shie Mannor, and Yishay Mansour. PAC bounds for multi-armed bandit and Markov decision processes. In Fifteenth Annual Conference on Computational Learning Theory (COLT), pages 255-270, 2002.
- (2002) Fifteenth Annual Conference on Computational Learning Theory (COLT) , pp. 255-270
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

19
- 0002384441
- On tail probabilities for martingales
- David A. Freedman. On tail probabilities for martingales. The Annals of Probability, 3(1):100-118, 1975.
- (1975) The Annals of Probability , vol.3 , Issue.1 , pp. 100-118
- Freedman, D.A.¹

20
- 0033280804
- The robustness of the p-norm algorithms
- New York, NY, USA, ACM
- Claudio Gentile and Nick Littlestone. The robustness of the p-norm algorithms. In Proceedings of the twelfth annual conference on Computational learning theory, pages 1-11, New York, NY, USA, 1999. ACM.
- (1999) Proceedings of the Twelfth Annual Conference on Computational Learning Theory , pp. 1-11
- Gentile, C.¹ Littlestone, N.²

21
- 84868131036
- Sparsity regret bounds for individual sequences in online linear regression
- Sebastien Gerchinovitz. Sparsity regret bounds for individual sequences in online linear regression. In Proceedings of the 24th Annual Conference on Learning Theory (COLT 2011), 2011.
- (2011) Proceedings of the 24th Annual Conference on Learning Theory (COLT 2011)
- Gerchinovitz, S.¹

22
- 0030661191
- General convergence results for linear discriminant updates
- ACM Press
- Adam J. Grove, Nick Littlestone, and Dale Schuurmans. General convergence results for linear discriminant updates. In Machine Learning, pages 171-183. ACM Press, 1997.
- (1997) Machine Learning , pp. 171-183
- Grove, A.J.¹ Littlestone, N.² Schuurmans, D.³

23
- 77951952841
- Near-optimal regret bounds for reinforcement learning
- Thomas Jaksch, Ronald Ortner, and Peter Auer. Near-optimal regret bounds for reinforcement learning. Journal of Machine Learning Research, 11:1563-1600, 2010.
- (2010) Journal of Machine Learning Research , vol.11 , pp. 1563-1600
- Jaksch, T.¹ Ortner, R.² Auer, P.³

24
- 0008815681
- Exponentiated gradient versus gradient descent for linear predictors
- January
- Jyrki Kivinen and Manfred K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1-63, January 1997.
- (1997) Information and Computation , vol.132 , Issue.1 , pp. 1-63
- Kivinen, J.¹ Warmuth, M.K.²

25
- 0018190841
- Strong consistency of least squares estimates in multiple regression
- Tze Leung Lai, Herbert Robbins, and Ching Zong Wei. Strong consistency of least squares estimates in multiple regression. Proceedings of the National Academy of Sciences, 75(7):3034-3036, 1979.
- (1979) Proceedings of the National Academy of Sciences , vol.75 , Issue.7 , pp. 3034-3036
- Leung Lai, T.¹ Robbins, H.² Zong Wei, C.³

26
- 84954256968
- John Langford. Machine learning reductions, 2011. http://hunch.net/~jl/projects/reductions/reductions.html.
- (2011) Machine Learning Reductions
- Langford, J.¹

27
- 77954641643
- A contextual-bandit approach to personalized news article recommendation
- ACM
- Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International Conference on World Wide Web (WWW 2010), pages 661-670. ACM, 2010.
- (2010) Proceedings of the 19th International Conference on World Wide Web (WWW 2010) , pp. 661-670
- Li, L.¹ Chu, W.² Langford, J.³ Schapire, R.E.⁴

28
- 34250091945
- Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm
- Nick Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2(4):285-318, 1988.
- (1988) Machine Learning , vol.2 , Issue.4 , pp. 285-318
- Littlestone, N.¹

29
- 85011913774
- From on-line to batch learning
- Association for Computing Machinery, Inc, One Astor Plaza, 1515 Broadway, New York, NY, 10036-5701, USA
- Nicolas Littlestone. From on-line to batch learning. In Annual Workshop on Computational Learning Theory: Proceedings of the second annual workshop on Computational learning theory (COLT 1989). Association for Computing Machinery, Inc, One Astor Plaza, 1515 Broadway, New York, NY, 10036-5701, USA, 1989.
- (1989) Annual Workshop on Computational Learning Theory: Proceedings of the Second Annual Workshop on Computational Learning Theory (COLT 1989)
- Littlestone, N.¹

30
- 30044441333
- The sample complexity of exploration in the multi-armed bandit problem
- Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, 5:623-648, 2004.
- (2004) Journal of Machine Learning Research , vol.5 , pp. 623-648
- Mannor, S.¹ Tsitsiklis, J.N.²

31
- 56449108844
- Empirical Bernstein stopping
- Andrew McCallum and Sam Roweis, editors
- Volodymyr Mnih, Csaba Szepesvari, and Jean-Yves Audibert. Empirical Bernstein stopping. In Andrew McCallum and Sam Roweis, editors, Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008), pages 672-679, 2008.
- (2008) Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008) , pp. 672-679
- Mnih, V.¹ Szepesvari, C.² Audibert, J.³

32
- 77953111834
- Linearly parameterized bandits
- Paat Rusmevichientong and John N. Tsitsiklis. Linearly parameterized bandits. Mathematics of Operations Research, 35(2):395-411, 2010.
- (2010) Mathematics of Operations Research , vol.35 , Issue.2 , pp. 395-411
- Rusmevichientong, P.¹ Tsitsiklis, J.N.²

33
- 85194972808
- Regression shrinkage and selection via the lasso
- Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1):267-288, 1996.
- (1996) Journal of the Royal Statistical Society. Series B (Methodological) , vol.58 , Issue.1 , pp. 267-288
- Tibshirani, R.¹

34
- 0035413537
- Competitive on-line statistics
- Vladimir Vovk. Competitive on-line statistics. International Statistical Review, 69:213-248, 2001.
- (2001) International Statistical Review , vol.69 , pp. 213-248
- Vovk, V.¹

35
- 79958846996
- Exploring compact reinforcement-learning representations with linear regression
- AUAI Press
- Thomas J. Walsh, István Szita, Carlos Diuk, and Michael L. Littman. Exploring compact reinforcement-learning representations with linear regression. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI 2009), pages 591-598. AUAI Press, 2009.
- (2009) Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI 2009) , pp. 591-598
- Walsh, T.J.¹ Szita, I.² Diuk, C.³ Littman, M.L.⁴

36
- 33749242004
- Springer
- Larry Wasserman. All of Nonparametric Statistics. Springer, 1998.
- (1998) All of Nonparametric Statistics
- Wasserman, L.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.