



Volume , Issue PART 3, 2013, Pages 2275-2283

Almost optimal exploration in multi-armed bandits

Author keywords

[No Author keywords available]

Indexed keywords

LEARNING SYSTEMS;

EID: 84897478950     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (244)

References (17)
  • 2
    • 84864970677
    • Best arm identification in multi-armed bandits
    • Audibert, J.Y., Bubeck, S., and Munos, R. Best arm identification in multi-armed bandits. In COLT, pp. 41-53, 2010.
    • (2010) COLT , pp. 41-53
    • Audibert, J.Y.1    Bubeck, S.2    Munos, R.3
  • 3
    • 77957337199
    • UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem
    • Auer, P. and Ortner, R. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem. Periodica Mathematica Hungarica, 61(1-2):55-65, 2010.
    • (2010) Periodica Mathematica Hungarica , vol.61 , Issue.1-2 , pp. 55-65
    • Auer, P.1    Ortner, R.2
  • 4
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • DOI 10.1023/A:1013689704352, Computational Learning Theory
    • Auer, P., Cesa-Bianchi, N., and Fischer, P. Finite-time analysis of the multiarmed bandit problem. Machine learning, 47(2):235-256, 2002. (Pubitemid 34126111)
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 5
    • 77952070805
    • Pure exploration in multi-armed bandits problems
    • Springer
    • Bubeck, S., Munos, R., and Stoltz, G. Pure exploration in multi-armed bandits problems. In Algorithmic Learning Theory, pp. 23-37. Springer, 2009.
    • (2009) Algorithmic Learning Theory , pp. 23-37
    • Bubeck, S.1    Munos, R.2    Stoltz, G.3
  • 8
    • 9444277556
    • PAC bounds for multi-armed bandit and Markov decision processes
    • Springer
    • Even-Dar, E., Mannor, S., and Mansour, Y. PAC bounds for multi-armed bandit and Markov decision processes. In COLT, pp. 193-209. Springer, 2002.
    • (2002) COLT , pp. 193-209
    • Even-Dar, E.1    Mannor, S.2    Mansour, Y.3
  • 9
    • 33745295134
    • Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
    • Even-Dar, E., Mannor, S., and Mansour, Y. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. The Journal of Machine Learning Research, 7:1079-1105, 2006. (Pubitemid 43938989)
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 1079-1105
    • Even-Dar, E.1    Mannor, S.2    Mansour, Y.3
  • 14
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • Lai, Tze Leung and Robbins, Herbert. Asymptotically efficient adaptive allocation rules. Advances in applied mathematics, 6(1):4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , Issue.1 , pp. 4-22
    • Lai, T.L.1    Robbins, H.2
  • 15
    • 30044441333
    • The sample complexity of exploration in the multi-armed bandit problem
    • Mannor, S. and Tsitsiklis, J.N. The sample complexity of exploration in the multi-armed bandit problem. The Journal of Machine Learning Research, 5:623-648, 2004.
    • (2004) The Journal of Machine Learning Research , vol.5 , pp. 623-648
    • Mannor, S.1    Tsitsiklis, J.N.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.