SCOPUS Information Search Platform

Volume 11, 2010, Pages 2785-2863

Regret bounds and minimax policies under partial monitoring

Authors (2): Audibert, Jean-Yves a,c; Bubeck, Sébastien b

Author keywords

Bandits (adversarial and stochastic); Label efficient; Minimax rate; Online learning; Prediction with limited feedback; Regret bound; Upper confidence bound (UCB) policy

Indexed keywords

BANDITS (ADVERSARIAL AND STOCHASTIC); LIMITED FEEDBACK; MINIMAX; ONLINE LEARNING; REGRET BOUND; UPPER CONFIDENCE BOUND;

E-LEARNING; OPTIMIZATION; STOCHASTIC SYSTEMS;

FORECASTING;

EID: 78649420293 PISSN: 15324435 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Article

Times cited: 198

References (16)

1
- 33750733956
- Hannan consistency in on-line learning in case of unbounded losses under partial monitoring
- Springer
- C. Allenberg, P. Auer, L. Györfi, and G. Ottucsák. Hannan consistency in on-line learning in case of unbounded losses under partial monitoring. In ALT, volume 4264 of Lecture Notes in Computer Science, pages 229-243. Springer, 2006.
- (2006) ALT, Volume 4264 of Lecture Notes in Computer Science , pp. 229-243
- Allenberg, C.¹ Auer, P.² Györfi, L.³ Ottucsák, G.⁴

2
- 62949181077
- Exploration-exploitation trade-off using variance estimates in multi-armed bandits
- J.-Y. Audibert, R. Munos, and Cs. Szepesvári. Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science, 410:1876-1902, 2009.
- (2009) Theoretical Computer Science , vol.410 , pp. 1876-1902
- Audibert, J.-Y.¹ Munos, R.² Szepesvári, Cs.³

3
- 0041966002
- Using confidence bounds for exploitation-exploration trade-offs
- P. Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
- Auer, P.¹

4
- 0029513526
- Gambling in a rigged casino: The adversarial multi-armed bandit problem
- IEEE Computer Society Press
- P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire. Gambling in a rigged casino: the adversarial multi-armed bandit problem. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pages 322-331. IEEE Computer Society Press, 1995.
- (1995) Proceedings of the 36th Annual Symposium on Foundations of Computer Science , pp. 322-331
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.⁴

5
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning Journal, 47(2-3):235-256, 2002a.
- (2002) Machine Learning Journal , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

6
- 0037709910
- The non-stochastic multi-armed bandit problem
- P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire. The non-stochastic multi-armed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002b.
- (2002) SIAM Journal on Computing , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.⁴

7
- 84972574511
- Weighted sums of certain dependent random variables
- K. Azuma. Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, 19:357-367, 1967.
- (1967) Tohoku Mathematical Journal , vol.19 , pp. 357-367
- Azuma, K.¹

8
- 0033285751
- Analysis of two gradient-based algorithms for on-line regression
- N. Cesa-Bianchi. Analysis of two gradient-based algorithms for on-line regression. Journal of Computer and System Sciences, 59(3):392-411, 1999.
- (1999) Journal of Computer and System Sciences , vol.59 , Issue.3 , pp. 392-411
- Cesa-Bianchi, N.¹

9
- 84926078662
- Cambridge University Press
- N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

10
- 0031140246
- How to use expert advice
- N. Cesa-Bianchi, Y. Freund, D. P. Helmbold, D. Haussler, R. E. Schapire, and M. K. Warmuth. How to use expert advice. Journal of the ACM, 44(3):427-485, 1997.
- (1997) Journal of the ACM , vol.44 , Issue.3 , pp. 427-485
- Cesa-Bianchi, N.¹ Freund, Y.² Helmbold, D.P.³ Haussler, D.⁴ Schapire, R.E.⁵ Warmuth, M.K.⁶

11
- 20544462399
- Minimizing regret with label efficient prediction
- N. Cesa-Bianchi, G. Lugosi, and G. Stoltz. Minimizing regret with label efficient prediction. IEEE Transactions on Information Theory, 51:2152-2162, 2005.
- (2005) IEEE Transactions on Information Theory , vol.51 , pp. 2152-2162
- Cesa-Bianchi, N.¹ Lugosi, G.² Stoltz, G.³

12
- 0002384441
- On tail probabilities for martingales
- D. A. Freedman. On tail probabilities for martingales. The Annals of Probability, 3:100-118, 1975.
- (1975) The Annals of Probability , vol.3 , pp. 100-118
- Freedman, D.A.¹

13
- 33644897321
- Adaptive routing using expert advice
- A. György and G. Ottucsák. Adaptive routing using expert advice. Computer Journal-Oxford, 49(2):180-189, 2006.
- (2006) Computer Journal-Oxford , vol.49 , Issue.2 , pp. 180-189
- György, A.¹ Ottucsák, G.²

14
- 0030707345
- Some label efficient learning results
- ACM New York, NY, USA
- D. Helmbold and S. Panizza. Some label efficient learning results. In Proceedings of the 10th annual conference on Computational learning theory, pages 218-230. ACM New York, NY, USA, 1997.
- (1997) Proceedings of the 10th Annual Conference on Computational Learning Theory , pp. 218-230
- Helmbold, D.¹ Panizza, S.²

15
- 84947403595
- Probability inequalities for sums of bounded random variables
- W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13-30, 1963.
- (1963) Journal of the American Statistical Association , vol.58 , pp. 13-30
- Hoeffding, W.¹

16
- 84966203785
- Some aspects of the sequential design of experiments
- H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.
- (1952) Bulletin of the American Mathematical Society , vol.58 , pp. 527-535
- Robbins, H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.