Volume 5, 2014, Pages 3611-3619

Taming the monster: A fast and simple algorithm for contextual bandits

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; LEARNING SYSTEMS

EID: 84919787147     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 213

References (19)
  • 1
    • 84919804450
    • Taming the monster: A fast and simple algorithm for contextual bandits
    • abs/1402.0555
    • Agarwal, Alekh, Hsu, Daniel, Kale, Satyen, Langford, John, Li, Lihong, and Schapire, Robert E. Taming the monster: A fast and simple algorithm for contextual bandits. CoRR, abs/1402.0555, 2014.
    • (2014) CoRR
    • Agarwal, A.1    Hsu, D.2    Kale, S.3    Langford, J.4    Li, L.5    Schapire, R.E.6
  • 2
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Auer, Peter. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2002.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
    • Auer, P.1
  • 4
    • 70350664424
    • The offset tree for learning with partial labels
    • Beygelzimer, Alina and Langford, John. The offset tree for learning with partial labels. In KDD, 2009.
    • (2009) KDD
    • Beygelzimer, A.1    Langford, J.2
  • 5
  • 6
    • 0033280893
    • Beating the holdout: Bounds for k-fold and progressive cross-validation
    • Blum, Avrim, Kalai, Adam, and Langford, John. Beating the holdout: Bounds for k-fold and progressive cross-validation. In COLT, 1999.
    • (1999) COLT
    • Blum, A.1    Kalai, A.2    Langford, J.3
  • 7
    • 85162416700
    • An empirical evaluation of Thompson sampling
    • Chapelle, Olivier and Li, Lihong. An empirical evaluation of Thompson sampling. In NIPS, 2011.
    • (2011) NIPS
    • Chapelle, O.1    Li, L.2
  • 8
    • 84860620518
    • Contextual bandits with linear payoff functions
    • Chu, Wei, Li, Lihong, Reyzin, Lev, and Schapire, Robert E. Contextual bandits with linear payoff functions. In AISTATS, 2011.
    • (2011) AISTATS
    • Chu, W.1    Li, L.2    Reyzin, L.3    Schapire, R.E.4
  • 10
    • 80053456223
    • Doubly robust policy evaluation and learning
    • Dudik, Miroslav, Langford, John, and Li, Lihong. Doubly robust policy evaluation and learning. In ICML, 2011b.
    • (2011) ICML
    • Dudik, M.1    Langford, J.2    Li, L.3
  • 11
    • 0031122905
    • Predicting nearly as well as the best pruning of a decision tree
    • Helmbold, David P. and Schapire, Robert E. Predicting nearly as well as the best pruning of a decision tree. Machine Learning, 27(1):51-68, 1997.
    • (1997) Machine Learning , vol.27 , Issue.1 , pp. 51-68
    • Helmbold, D.P.1    Schapire, R.E.2
  • 13
    • 77956144722
    • The epoch-greedy algorithm for contextual multi-armed bandits
    • Langford, John and Zhang, Tong. The epoch-greedy algorithm for contextual multi-armed bandits. In NIPS, 2007.
    • (2007) NIPS
    • Langford, J.1    Zhang, T.2
  • 15
    • 84919804446
    • Generalized Thompson sampling for contextual bandits
    • abs/1310.7163
    • Li, Lihong. Generalized Thompson sampling for contextual bandits. CoRR, abs/1310.7163, 2013.
    • (2013) CoRR
    • Li, L.1
  • 16
    • 77954641643
    • A contextual-bandit approach to personalized news article recommendation
    • Li, Lihong, Chu, Wei, Langford, John, and Schapire, Robert E. A contextual-bandit approach to personalized news article recommendation. In WWW, 2010.
    • (2010) WWW
    • Li, L.1    Chu, W.2    Langford, J.3    Schapire, R.E.4
  • 17
    • 84898068653
    • Tighter bounds for multi-armed bandits with expert advice
    • McMahan, H. Brendan and Streeter, Matthew. Tighter bounds for multi-armed bandits with expert advice. In COLT, 2009.
    • (2009) COLT
    • McMahan, H.B.1    Streeter, M.2
  • 19
    • 0001395850
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
    • Thompson, William R. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3-4):285-294, 1933.
    • (1933) Biometrika , vol.25 , Issue.3-4 , pp. 285-294
    • Thompson, W.R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.