Volume , Issue , 2010, Pages

Learning from logged implicit exploration data

Author keywords

[No Author keywords available]

Indexed keywords

CONTEXTUAL BANDITTI; EXPLORATION DATA; EXPLORATION POLICY; GIVEN FEATURES; HISTORICAL DATA; LEARNING PROCESS; OFFLINE DATA; RANDOMISATION; REAL-WORLD

EID: 85162031443     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 208

References (11)
  • 3
    • 84898967749
    • Approximate planning in large POMDPs via reusable trajectories
    • Michael Kearns, Yishay Mansour, and Andrew Y. Ng. Approximate planning in large POMDPs via reusable trajectories. In NIPS, 2000.
    • (2000) NIPS
    • Kearns, M.1    Mansour, Y.2    Ng, A.Y.3
  • 4
    • 77953968105
    • More bang for their bucks: Assessing new features for online advertisers
    • Diane Lambert and Daryl Pregibon. More bang for their bucks: Assessing new features for online advertisers. In ADKDD 2007, 2007.
    • (2007) ADKDD 2007
    • Lambert, D.1    Pregibon, D.2
  • 6
    • 77956144722
    • The epoch-greedy algorithm for multi-armed bandits with side information
    • John Langford and Tong Zhang. The epoch-greedy algorithm for multi-armed bandits with side information. In Advances in Neural Information Processing Systems 20, pages 817-824, 2008.
    • (2008) Advances in Neural Information Processing Systems , vol.20 , pp. 817-824
    • Langford, J.1    Zhang, T.2
  • 11
    • 0242393653
    • Eligibility traces for off-policy policy evaluation
    • Doina Precup, Rich Sutton, and Satinder Singh. Eligibility traces for off-policy policy evaluation. In ICML, 2000.
    • (2000) ICML
    • Precup, D.1    Sutton, R.2    Singh, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.