



Volume 11, 2010, Pages 2785-2863

Regret bounds and minimax policies under partial monitoring

Author keywords

Bandits (adversarial and stochastic); Label efficient; Minimax rate; Online learning; Prediction with limited feedback; Regret bound; Upper confidence bound (UCB) policy

Indexed keywords

BANDITS (ADVERSARIAL AND STOCHASTIC); LIMITED FEEDBACK; MINIMAX; ONLINE LEARNING; REGRET BOUND; UPPER CONFIDENCE BOUND

EID: 78649420293     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 198

References (16)
  • 2
    • 62949181077
    • Exploration-exploitation trade-off using variance estimates in multi-armed bandits
    • J.-Y. Audibert, R. Munos, and Cs. Szepesvári. Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science, 410:1876-1902, 2009.
    • (2009) Theoretical Computer Science , vol.410 , pp. 1876-1902
    • Audibert, J.-Y.1    Munos, R.2    Szepesvári, Cs.3
  • 3
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • P. Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
    • Auer, P.1
  • 5
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning Journal, 47(2-3):235-256, 2002a.
    • (2002) Machine Learning Journal , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 7
    • 84972574511
    • Weighted sums of certain dependent random variables
    • K. Azuma. Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, 19:357-367, 1967.
    • (1967) Tohoku Mathematical Journal , vol.19 , pp. 357-367
    • Azuma, K.1
  • 8
    • 0033285751
    • Analysis of two gradient-based algorithms for on-line regression
    • N. Cesa-Bianchi. Analysis of two gradient-based algorithms for on-line regression. Journal of Computer and System Sciences, 59(3):392-411, 1999.
    • (1999) Journal of Computer and System Sciences , vol.59 , Issue.3 , pp. 392-411
    • Cesa-Bianchi, N.1
  • 12
    • 0002384441
    • On tail probabilities for martingales
    • D. A. Freedman. On tail probabilities for martingales. The Annals of Probability, 3:100-118, 1975.
    • (1975) The Annals of Probability , vol.3 , pp. 100-118
    • Freedman, D.A.1
  • 13
    • 33644897321
    • Adaptive routing using expert advice
    • A. György and G. Ottucsák. Adaptive routing using expert advice. Computer Journal-Oxford, 49(2):180-189, 2006.
    • (2006) Computer Journal-Oxford , vol.49 , Issue.2 , pp. 180-189
    • György, A.1    Ottucsák, G.2
  • 15
  • 16


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.