SCOPUS Information Retrieval Platform - View Paper

Proceedings of the 2009 International Conference on Game Theory for Networks, GameNets '09

2009, Pages 314-322

Online learning in Markov decision processes with arbitrarily changing rewards and transitions

(2) Yu, Jia Yuan (a); Mannor, Shie (a, b)

a MCGILL UNIVERSITY (Canada)

b TECHNION ISRAEL INSTITUTE OF TECHNOLOGY (Israel)

Author keywords

[No Author keywords available]

Indexed keywords

DECISION MAKERS; DECISION-MAKING PROBLEM; MARKOV DECISION PROCESSES; NONSTATIONARY; ONLINE LEARNING; TRANSITION PROBABILITIES;

COMPUTATIONAL COMPLEXITY; E-LEARNING; GAME THEORY; INTERNET; LEARNING ALGORITHMS; MARKOV PROCESSES; PROBABILITY; ROBUST CONTROL;

DECISION MAKING;

EID: 70349986740
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/GAMENETS.2009.5137416
Document Type: Conference Paper

Times cited: 50

References (21)

1
- 0003998452
- Wiley
- M. L. Puterman, Markov Decision Processes. Wiley, 1994.
- (1994) Markov Decision Processes
- Puterman, M.L.¹

2
- 0000392613
- Stochastic games
- L. Shapley, "Stochastic games," PNAS, vol. 39, no. 10, pp. 1095-1100, 1953.
- (1953) PNAS , vol.39 , Issue.10 , pp. 1095-1100
- Shapley, L.¹

3
- 41649111187
- Experts in a Markov decision process
- E. Even-Dar, S. Kakade, and Y. Mansour, "Experts in a Markov decision process," in NIPS, 2004, pp. 401-408.
- (2004) NIPS , pp. 401-408
- Even-Dar, E.¹ Kakade, S.² Mansour, Y.³

4
- 0038386340
- The empirical Bayes envelope and regret minimization in competitive Markov decision processes
- S. Mannor and N. Shimkin, "The empirical Bayes envelope and regret minimization in competitive Markov decision processes," Mathematics of Operations Research, vol. 28, no. 2, pp. 327-345, 2003.
- (2003) Mathematics of Operations Research , vol.28 , Issue.2 , pp. 327-345
- Mannor, S.¹ Shimkin, N.²

5
- 84926078662
- Cambridge University Press
- N. Cesa-Bianchi and G. Lugosi, Prediction, learning, and games. Cambridge University Press, 2006.
- (2006) Prediction, learning, and games
- Cesa-Bianchi, N.¹ Lugosi, G.²

6
- 14344250395
- Robust control of Markov decision processes with uncertain transition matrices
- A. Nilim and L. El Ghaoui, "Robust control of Markov decision processes with uncertain transition matrices," Operations Research, vol. 53, no. 5, pp. 780-798, 2005.
- (2005) Operations Research , vol.53 , Issue.5 , pp. 780-798
- Nilim, A.¹ El Ghaoui, L.²

7
- 0037709910
- The non-stochastic multiarmed bandit problem
- P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, "The non-stochastic multiarmed bandit problem," SIAM J. Computing, vol. 32, no. 1, pp. 48-77, 2002.
- (2002) SIAM J. Computing , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

8
- 0041965975
- R-max - a general polynomial time algorithm for near-optimal reinforcement learning
- R. I. Brafman and M. Tennenholtz, "R-max - a general polynomial time algorithm for near-optimal reinforcement learning," Journal of Machine Learning Research, vol. 3, pp. 213-231, 2003.
- (2003) Journal of Machine Learning Research , vol.3 , pp. 213-231
- Brafman, R.I.¹ Tennenholtz, M.²

9
- 58449132310
- J. Y. Yu, S. Mannor, and N. Shimkin, "Markov decision processes with arbitrary reward processes," in Lecture Notes in Computer Science, vol. 5323, 2009, http://www.cim.mcgill.ca/~jiayuan/mdp.pdf.

10
- 0003487482
- Athena Scientific
- D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

11
- 0001976283
- Approximation to Bayes risk in repeated play
- Princeton University Press
- J. Hannan, "Approximation to Bayes risk in repeated play," in Contributions to the Theory of Games. Princeton University Press, 1957, vol. 3, pp. 97-139.
- (1957) Contributions to the Theory of Games , vol.3 , pp. 97-139
- Hannan, J.¹

12
- 61449152333
- Applications of dynamic games in queues
- E. Altman, "Applications of dynamic games in queues," Advances in Dynamic Games, vol. 7, pp. 309-342, 2005.
- (2005) Advances in Dynamic Games , vol.7 , pp. 309-342
- Altman, E.¹

13
- 0032182921
- Reliable communication under channel uncertainty
- A. Lapidoth and P. Narayan, "Reliable communication under channel uncertainty," IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2148-2177, 1998.
- (1998) IEEE Trans. Inf. Theory , vol.44 , Issue.6 , pp. 2148-2177
- Lapidoth, A.¹ Narayan, P.²

14
- 35148838877
- The weighted majority algorithm
- N. Littlestone and M. Warmuth, "The weighted majority algorithm," Information and Computation, vol. 108, no. 2, pp. 212-261, 1994.
- (1994) Information and Computation , vol.108 , Issue.2 , pp. 212-261
- Littlestone, N.¹ Warmuth, M.²

15
- 37349042879
- The robustness-performance tradeoff in Markov decision processes
- H. Xu and S. Mannor, "The robustness-performance tradeoff in Markov decision processes," in NIPS, 2006, pp. 1537-1544.
- (2006) NIPS , pp. 1537-1544
- Xu, H.¹ Mannor, S.²

16
- 0003989209
- Springer-Verlag
- J. Filar and K. Vrieze, Competitive Markov Decision Processes. Springer-Verlag, 1996.
- (1996) Competitive Markov Decision Processes
- Filar, J.¹ Vrieze, K.²

17
- 24644463787
- Efficient algorithms for online decision problems
- A. Kalai and S. Vempala, "Efficient algorithms for online decision problems," Journal of Computer and System Sciences, vol. 71, no. 3, pp. 291-307, 2005.
- (2005) Journal of Computer and System Sciences , vol.71 , Issue.3 , pp. 291-307
- Kalai, A.¹ Vempala, S.²

18
- 0003565783
- 2nd ed. Athena Scientific
- D. P. Bertsekas, Dynamic Programming and Optimal Control, 2nd ed. Athena Scientific, 2001, vol. 2.
- (2001) Dynamic Programming and Optimal Control , vol.2
- Bertsekas, D.P.¹

19
- 0001296683
- Perturbation theory and finite Markov chains
- P. J. Schweitzer, "Perturbation theory and finite Markov chains," Journal of Applied Probability, vol. 5, pp. 401-413, 1968.
- (1968) Journal of Applied Probability , vol.5 , pp. 401-413
- Schweitzer, P.J.¹

20
- 70349991097
- On-line Markov decision processes
- preprint
- E. Even-Dar, S. Kakade, and Y. Mansour, "On-line Markov decision processes," preprint.
- Even-Dar, E.¹ Kakade, S.² Mansour, Y.³

21
- 0031074521
- Locally weighted learning
- C. G. Atkeson, A. W. Moore, and S. Schaal, "Locally weighted learning," Artificial Intelligence Review, 1997.
- (1997) Artificial Intelligence Review
- Atkeson, C.G.¹ Moore, A.W.² Schaal, S.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.