



Volume , Issue , 2011, Pages 169-178

Efficient optimal learning for contextual bandits

Author keywords

[No Author keywords available]

Indexed keywords

CLASSIFICATION RULES; CONTEXTUAL BANDITS; COST SENSITIVE CLASSIFICATIONS; FEEDBACK DELAY; ON-LINE SETTING; OPTIMAL REGRET; RUNNING TIME

EID: 80053154335     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 204

References (16)
  • 1
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
    • (2002) Journal of Machine Learning Research, vol. 3, pp. 397-422
    • Auer, P.
  • 2
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • DOI 10.1023/A:1013689704352, Computational Learning Theory
    • Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002a. (Pubitemid 34126111)
    • (2002) Machine Learning, vol. 47, issue 2-3, pp. 235-256
    • Auer, P.; Cesa-Bianchi, N.; Fischer, P.
  • 7
    • 33745295134
    • Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
    • Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7:1079-1105, 2006. (Pubitemid 43938989)
    • (2006) Journal of Machine Learning Research, vol. 7, pp. 1079-1105
    • Even-Dar, E.; Mannor, S.; Mansour, Y.
  • 8
    • 0002384441
    • On tail probabilities for martingales
    • David A. Freedman. On tail probabilities for martingales. Annals of Probability, 3(1):100-118, 1975.
    • (1975) Annals of Probability, vol. 3, issue 1, pp. 100-118
    • Freedman, D.A.
  • 9
    • 0031211090
    • A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
    • Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119-139, 1997. (Pubitemid 127433398)
    • (1997) Journal of Computer and System Sciences, vol. 55, issue 1, pp. 119-139
    • Freund, Y.; Schapire, R.E.
  • 10
    • 84864059297
    • From batch to transductive online learning
    • Sham M. Kakade and Adam Kalai. From batch to transductive online learning. In NIPS, 2005.
    • (2005) NIPS
    • Kakade, S.M.; Kalai, A.
  • 11
    • 24644463787
    • Efficient algorithms for online decision problems
    • DOI 10.1016/j.jcss.2004.10.016, PII S0022000004001394
    • Adam Tauman Kalai and Santosh Vempala. Efficient algorithms for online decision problems. J. Comput. Syst. Sci., 71(3):291-307, 2005. (Pubitemid 41278182)
    • (2005) Journal of Computer and System Sciences, vol. 71, issue 3, pp. 291-307
    • Kalai, A.; Vempala, S.
  • 12
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • Tze Leung Lai and Herbert Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
    • (1985) Advances in Applied Mathematics, vol. 6, pp. 4-22
    • Lai, T.L.; Robbins, H.
  • 14
    • 77956144722
    • The epoch-greedy algorithm for contextual multi-armed bandits
    • John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits. In NIPS, 2007.
    • (2007) NIPS
    • Langford, J.; Zhang, T.
  • 15
    • 84972513554
    • On general minimax theorems
    • Maurice Sion. On general minimax theorems. Pacific J. Math., 8(1):171-176, 1958.
    • (1958) Pacific J. Math., vol. 8, issue 1, pp. 171-176
    • Sion, M.
  • 16
    • 77956501313
    • Gaussian process optimization in the bandit setting: No regret and experimental design
    • Niranjan Srinivas, Andreas Krause, Sham Kakade, and Matthias Seeger. Gaussian process optimization in the bandit setting: No regret and experimental design. In ICML, 2010.
    • (2010) ICML
    • Srinivas, N.; Krause, A.; Kakade, S.; Seeger, M.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.