



Volume 5323 LNAI, 2008, Pages 268-281

Markov decision processes with arbitrary reward processes

Author keywords

[No Author keywords available]

Indexed keywords

CONTROL PROBLEMS; DECISION MAKER (DM); MARKOV DECISION PROCESS (MDP); MARKOV DECISION PROCESSES (MDPS); PERFORMANCE LOSSES; REWARD FUNCTIONS

EID: 58449132310     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-540-89722-4_21     Document Type: Conference Paper
Times cited: 3

References (19)
  • 2
    • 0002056057
    • Markets with a continuum of traders
    • Aumann, R.J.: Markets with a continuum of traders. Econometrica 32, 39-50 (1964)
    • (1964) Econometrica , vol.32 , pp. 39-50
    • Aumann, R.J.1
  • 5
    • 33750501028
    • Modified logarithmic Sobolev inequalities in discrete settings
    • Bobkov, S.G., Tetali, P.: Modified logarithmic Sobolev inequalities in discrete settings. Journal of Theoretical Probability 19(2), 289-336 (2006)
    • (2006) Journal of Theoretical Probability , vol.19 , Issue.2 , pp. 289-336
    • Bobkov, S.G.1    Tetali, P.2
  • 6
    • 0033876515
    • The O.D.E. method for convergence of stochastic approximation and reinforcement learning
    • Borkar, V.S., Meyn, S.P.: The O.D.E. method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control and Optimization 38(2), 447-469 (2000)
  • 7
    • 0041965975
    • R-max-a general polynomial time algorithm for near-optimal reinforcement learning
    • Brafman, R.I., Tennenholtz, M.: R-max-a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research 3, 213-231 (2003)
    • (2003) Journal of Machine Learning Research , vol.3 , pp. 213-231
    • Brafman, R.I.1    Tennenholtz, M.2
  • 10
    • 23044525872
    • A nonstationary offered-load model for packet networks
    • Duffield, N.G., Massey, W.A., Whitt, W.: A nonstationary offered-load model for packet networks. Telecommunication Systems 16(3-4), 271-296 (2001)
    • (2001) Telecommunication Systems , vol.16 , Issue.3-4 , pp. 271-296
    • Duffield, N.G.1    Massey, W.A.2    Whitt, W.3
  • 11
    • 41649111187
    • Experts in a Markov decision process
    • Even-Dar, E., Kakade, S., Mansour, Y.: Experts in a Markov decision process. In: NIPS, pp. 401-408 (2004)
    • (2004) NIPS , pp. 401-408
    • Even-Dar, E.1    Kakade, S.2    Mansour, Y.3
  • 13
    • 0001976283
    • Approximation to Bayes risk in repeated play
    • Princeton University Press, Princeton
    • Hannan, J.: Approximation to Bayes risk in repeated play. In: Contributions to the Theory of Games, vol. 3, pp. 97-139. Princeton University Press, Princeton (1957)
    • (1957) Contributions to the Theory of Games , vol.3 , pp. 97-139
    • Hannan, J.1
  • 14
    • 0032137328
    • Tracking the best expert
    • Herbster, M., Warmuth, M.K.: Tracking the best expert. Machine Learning 32(2), 151-178 (1998)
    • (1998) Machine Learning , vol.32 , Issue.2 , pp. 151-178
    • Herbster, M.1    Warmuth, M.K.2
  • 15
    • 24644463787
    • Efficient algorithms for online decision problems
    • Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. Journal of Computer and System Sciences 71(3), 291-307 (2005)
    • (2005) Journal of Computer and System Sciences , vol.71 , Issue.3 , pp. 291-307
    • Kalai, A.1    Vempala, S.2
  • 16
    • 0038386340
    • The empirical Bayes envelope and regret minimization in competitive Markov decision processes
    • Mannor, S., Shimkin, N.: The empirical Bayes envelope and regret minimization in competitive Markov decision processes. Mathematics of Operations Research 28(2), 327-345 (2003)
    • (2003) Mathematics of Operations Research , vol.28 , Issue.2 , pp. 327-345
    • Mannor, S.1    Shimkin, N.2
  • 18
    • 0000392613
    • Stochastic games
    • Shapley, L.: Stochastic games. PNAS 39(10), 1095-1100 (1953)
    • (1953) PNAS , vol.39 , Issue.10 , pp. 1095-1100
    • Shapley, L.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.