Volume 22, 2012, Pages 1-9

Online-to-confidence-set conversions and application to sparse stochastic bandits

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; ARTIFICIAL INTELLIGENCE; LEARNING ALGORITHMS; STOCHASTIC SYSTEMS;

EID: 84908661477     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Conference Paper
Times cited : (153)

References (36)
  • 3
    • 84954256963
    • Online least squares estimation with self-normalized processes: An application to bandit problems
    • Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvári. Online least squares estimation with self-normalized processes: An application to bandit problems. arXiv preprint http://arxiv.org/abs/1102.2670, 2011b.
    • (2011) arXiv Preprint
    • Abbasi-Yadkori, Y.; Pál, D.; Szepesvári, C.
  • 6
    • 62949181077
    • Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
    • Jean-Yves Audibert, Rémi Munos, and Csaba Szepesvári. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theoretical Computer Science, 410(19):1876-1902, 2009.
    • (2009) Theoretical Computer Science, vol. 410, issue 19, pp. 1876-1902
    • Audibert, J.; Munos, R.; Szepesvári, C.
  • 7
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
    • (2002) Journal of Machine Learning Research, vol. 3, pp. 397-422
    • Auer, P.
  • 8
    • 0036568025
    • Finite time analysis of the multiarmed bandit problem
    • Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002a.
    • (2002) Machine Learning, vol. 47, issue 2-3, pp. 235-256
    • Auer, P.; Cesa-Bianchi, N.; Fischer, P.
  • 17
    • 56449091064
    • Data-driven online to batch conversions
    • Ofer Dekel and Yoram Singer. Data-driven online to batch conversions. NIPS 2005, 18:267, 2006.
    • (2006) NIPS 2005, vol. 18, p. 267
    • Dekel, O.; Singer, Y.
  • 19
    • 0002384441
    • On tail probabilities for martingales
    • David A. Freedman. On tail probabilities for martingales. The Annals of Probability, 3(1):100-118, 1975.
    • (1975) The Annals of Probability, vol. 3, issue 1, pp. 100-118
    • Freedman, D.A.
  • 22
    • 0030661191
    • General convergence results for linear discriminant updates
    • ACM Press
    • Adam J. Grove, Nick Littlestone, and Dale Schuurmans. General convergence results for linear discriminant updates. In Machine Learning, pages 171-183. ACM Press, 1997.
    • (1997) Machine Learning, pp. 171-183
    • Grove, A.J.; Littlestone, N.; Schuurmans, D.
  • 23
    • 77951952841
    • Near-optimal regret bounds for reinforcement learning
    • Thomas Jaksch, Ronald Ortner, and Peter Auer. Near-optimal regret bounds for reinforcement learning. Journal of Machine Learning Research, 11:1563-1600, 2010.
    • (2010) Journal of Machine Learning Research, vol. 11, pp. 1563-1600
    • Jaksch, T.; Ortner, R.; Auer, P.
  • 24
    • 0008815681
    • Exponentiated gradient versus gradient descent for linear predictors
    • January
    • Jyrki Kivinen and Manfred K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1-63, January 1997.
    • (1997) Information and Computation, vol. 132, issue 1, pp. 1-63
    • Kivinen, J.; Warmuth, M.K.
  • 28
    • 34250091945
    • Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm
    • Nick Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2(4):285-318, 1988.
    • (1988) Machine Learning, vol. 2, issue 4, pp. 285-318
    • Littlestone, N.
  • 30
    • 30044441333
    • The sample complexity of exploration in the multi-armed bandit problem
    • Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, 5:623-648, 2004.
    • (2004) Journal of Machine Learning Research, vol. 5, pp. 623-648
    • Mannor, S.; Tsitsiklis, J.N.
  • 34
    • 0035413537
    • Competitive on-line statistics
    • Vladimir Vovk. Competitive on-line statistics. International Statistical Review, 69:213-248, 2001.
    • (2001) International Statistical Review, vol. 69, pp. 213-248
    • Vovk, V.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.