Volume , Issue , 2012, Pages 93-101

Deterministic MDPs with adversarial rewards and bandit feedback

Author keywords

[No Author keywords available]

Indexed keywords

BANDIT FEEDBACKS; DECISION MAKERS; MARKOV DECISION PROCESSES; ON-LINE DECISION MAKINGS; STATE TRANSITION DYNAMICS; TRANSITION DYNAMICS

EID: 84886067084     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 28

References (21)
  • 7
    • Scopus EID: 65749318481
    • C. Carathéodory. Über den Variabilitätsbereich der Fourier'schen Konstanten von positiven harmonischen Funktionen. Rendiconti del Circolo Matematico di Palermo, 32:193-217, 1911.
  • 8
    • Scopus EID: 33845302015
    • D. P. de Farias and N. Megiddo. Combining expert advice in reactive environments. Journal of the ACM, 53(5):762-799, 2006.
  • 11
    • Scopus EID: 50249167647
    • E. A. Feinberg and F. Yang. On polynomial cases of the unichain classification problem for Markov decision processes. Operations Research Letters, 36(5):527-530, 2008.
  • 17
    • Scopus EID: 77953539718
    • R. Ortner. Online regret bounds for Markov decision processes with deterministic transitions. Theoretical Computer Science, 411(29-30):2684-2695, 2010.
  • 18
    • Scopus EID: 77949509398
    • D. Ryabko and M. Hutter. On the possibility of learning in reactive environments with arbitrary dependence. Theoretical Computer Science, 405(3):274-284, 2008.
  • 21
    • Scopus EID: 70349280578
    • J. Y. Yu, S. Mannor, and N. Shimkin. Markov decision processes with arbitrary reward processes. Mathematics of Operations Research, 34(3):737-757, 2009.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.