Volume 78, Issue 5, 2012, Pages 1538-1556

The K-armed dueling bandits problem

Author keywords

Multi armed bandits; Online learning; Preference elicitation

Indexed keywords

INFORMATION THEORY

EID: 84861586270     PISSN: 00220000     EISSN: 10902724     Source Type: Journal    
DOI: 10.1016/j.jcss.2011.12.028     Document Type: Conference Paper
Times cited: 327

References (29)
  • 2
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2), 2002, pp. 235-256.
    • (2002) Mach. Learn., vol. 47, issue 2, pp. 235-256
    • Auer, P.; Cesa-Bianchi, N.; Fischer, P.
  • 6
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 3, 2003, pp. 397-422.
    • (2003) J. Mach. Learn. Res., vol. 3, pp. 397-422
    • Auer, P.
  • 9
    • 33748442333
    • Regret minimization under partial monitoring
    • Nicolò Cesa-Bianchi, Gábor Lugosi, and Gilles Stoltz. Regret minimization under partial monitoring. Math. Oper. Res. 31(3), 2006, pp. 562-580.
    • (2006) Math. Oper. Res., vol. 31, issue 3, pp. 562-580
    • Cesa-Bianchi, N.; Lugosi, G.; Stoltz, G.
  • 12
    • 33745295134
    • Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
    • Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. J. Mach. Learn. Res. 7, 2006, pp. 1079-1105.
    • (2006) J. Mach. Learn. Res., vol. 7, pp. 1079-1105
    • Even-Dar, E.; Mannor, S.; Mansour, Y.
  • 13
    • 4644367942
    • An efficient boosting algorithm for combining preferences
    • Yoav Freund, Raj Iyer, Robert Schapire, and Yoram Singer. An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 2003, pp. 933-969.
    • (2003) J. Mach. Learn. Res., vol. 4, pp. 933-969
    • Freund, Y.; Iyer, R.; Schapire, R.; Singer, Y.
  • 16
    • 84947403595
    • Probability inequalities for sums of bounded random variables
    • Wassily Hoeffding. Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 1963, pp. 13-30.
    • (1963) J. Amer. Statist. Assoc., vol. 58, pp. 13-30
    • Hoeffding, W.
  • 20
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T.L. Lai and Herbert Robbins. Asymptotically efficient adaptive allocation rules. Adv. in Appl. Math. 6, 1985, pp. 4-22.
    • (1985) Adv. in Appl. Math., vol. 6, pp. 4-22
    • Lai, T.L.; Robbins, H.
  • 24
    • 30044441333
    • The sample complexity of exploration in the multi-armed bandit problem
    • Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. J. Mach. Learn. Res. 5, 2004, pp. 623-648.
    • (2004) J. Mach. Learn. Res., vol. 5, pp. 623-648
    • Mannor, S.; Tsitsiklis, J.N.
  • 27
    • 84966203785
    • Some aspects of the sequential design of experiments
    • Herbert Robbins. Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. 58, 1952, pp. 527-535.
    • (1952) Bull. Amer. Math. Soc., vol. 58, pp. 527-535
    • Robbins, H.
  • 29


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.