



Volume , Issue , 2009, Pages

The K-armed dueling bandits problem

Author keywords

[No Author keywords available]

Indexed keywords

BINARY FEEDBACK; CONSTANT FACTORS; CONVENTIONAL APPROACH; OPTIMAL REGRET; PAIR-WISE COMPARISON; PRODUCT ATTRACTIVENESS;

EID: 84898077397; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper
Times cited: 28

References (28)
  • 1
    • EID: 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2): 235-256, 2002.
    • (2002) Machine Learning, vol. 47, issue 2, pp. 235-256
    • Auer, P.; Cesa-Bianchi, N.; Fischer, P.
  • 5
    • EID: 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research (JMLR), 3: 397-422, 2003.
    • (2003) Journal of Machine Learning Research (JMLR), vol. 3, pp. 397-422
    • Auer, P.
  • 7
  • 11
    • EID: 33745295134
    • Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
    • Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research (JMLR), 7: 1079-1105, 2006.
    • (2006) Journal of Machine Learning Research (JMLR), vol. 7, pp. 1079-1105
    • Even-Dar, E.; Mannor, S.; Mansour, Y.
  • 15
    • EID: 84947403595
    • Probability inequalities for sums of bounded random variables
    • Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58: 13-30, 1963.
    • (1963) Journal of the American Statistical Association, vol. 58, pp. 13-30
    • Hoeffding, W.
  • 20
    • EID: 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. L. Lai and Herbert Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6: 4-22, 1985.
    • (1985) Advances in Applied Mathematics, vol. 6, pp. 4-22
    • Lai, T.L.; Robbins, H.
  • 24
    • EID: 30044441333
    • The sample complexity of exploration in the multi-armed bandit problem
    • Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research (JMLR), 5: 623-648, 2004.
    • (2004) Journal of Machine Learning Research (JMLR), vol. 5, pp. 623-648
    • Mannor, S.; Tsitsiklis, J.N.
  • 27
    • EID: 84966203785
    • Some aspects of the sequential design of experiments
    • Herbert Robbins. Some Aspects of the Sequential Design of Experiments. Bull. Amer. Math. Soc., 58: 527-535, 1952.
    • (1952) Bull. Amer. Math. Soc., vol. 58, pp. 527-535
    • Robbins, H.
  • 28
    • EID: 71149114227
    • Interactively optimizing information retrieval systems as a dueling bandits problem
    • Yisong Yue and Thorsten Joachims. Interactively optimizing information retrieval systems as a dueling bandits problem. In International Conference on Machine Learning (ICML), 2009.
    • (2009) International Conference on Machine Learning (ICML)
    • Yue, Y.; Joachims, T.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.