



Volume 17, Issue 1 PART 1, 2008, Pages

Gap-free bounds for stochastic multi-armed bandit

Author keywords

Learning theory; Randomized methods; Stochastic control

Indexed keywords

GRADIENT TYPE; INSTANTANEOUS LOSS; LEARNING THEORY; MULTI ARMED BANDIT; MULTI-ARMED BANDIT PROBLEM; RANDOMIZED DECISIONS; RANDOMIZED METHODS; STOCHASTIC CONTROL

EID: 79961019787     PISSN: 14746670     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.3182/20080706-5-KR-1001.2585     Document Type: Conference Paper
Times cited: (22)

References (15)
  • 2
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47:235-256, 2002a.
    • (2002) Machine Learning , vol.47 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 7
    • 0008815681
    • Exponentiated gradient versus gradient descent for linear predictors
    • J. Kivinen and M. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1-63, 1997.
    • (1997) Information and Computation , vol.132 , Issue.1 , pp. 1-63
    • Kivinen, J.1    Warmuth, M.2
  • 15
    • 0038982800
    • An asymptotic minimax theorem for the two armed bandit problem
    • W. Vogel. An asymptotic minimax theorem for the two armed bandit problem. The Annals of Mathematical Statistics, 31:444-451, 1960.
    • (1960) The Annals of Mathematical Statistics , vol.31 , pp. 444-451
    • Vogel, W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.