2009, Pages 2946-2953

Arbitrarily modulated Markov decision processes

Author keywords

[No Author keywords available]

Indexed keywords

BEHAVIORAL RESEARCH; COMPUTATIONAL EFFICIENCY; DECISION MAKING; MARKOV PROCESSES; REINFORCEMENT LEARNING;

EID: 77950787050     PISSN: 0743-1546     EISSN: 2576-2370     Source Type: Conference Proceeding
DOI: 10.1109/CDC.2009.5400662     Document Type: Conference Paper
Times cited: 51

References (24)
  • 1
    • L. Shapley, "Stochastic games," PNAS, vol.39, no.10, pp. 1095-1100, 1953.
  • 2
    • E. Even-Dar, S. Kakade, and Y. Mansour, "Experts in a Markov decision process," in NIPS, 2004, pp. 401-408.
  • 3
    • S. Mannor and N. Shimkin, "The empirical Bayes envelope and regret minimization in competitive Markov decision processes," Mathematics of Operations Research, vol.28, no.2, pp. 327-345, 2003.
  • 6
    • A. Nilim and L. E. Ghaoui, "Robust control of Markov decision processes with uncertain transition matrices," Operations Research, vol.53, no.5, pp. 780-798, 2005.
  • 8
    • R. I. Brafman and M. Tennenholtz, "R-max - a general polynomial time algorithm for near-optimal reinforcement learning," Journal of Machine Learning Research, vol.3, pp. 213-231, 2003.
  • 9
    • J. Y. Yu, S. Mannor, and N. Shimkin, "Markov decision processes with arbitrary rewards," Math. Oper. Res., 2009, to appear.
  • 10
    • J. Y. Yu and S. Mannor, "Online learning in Markov decision processes with arbitrarily changing rewards and transitions," in GameNets, 2009.
  • 13
    • D. L. Donoho and P. B. Stark, "Uncertainty principles and signal recovery," SIAM J. Appl. Math., vol.49, no.3, pp. 906-931, 1989.
  • 14
    • C. D. Fuh, "Asymptotic operating characteristics of an optimal change point detection in hidden Markov models," Ann. Statist., pp. 2305-2339, 2004.
  • 16
    • H. Xu and S. Mannor, "The robustness-performance tradeoff in Markov decision processes," in NIPS, 2006, pp. 1537-1544.
  • 17
    • J. Hannan, "Approximation to Bayes risk in repeated play," in Contributions to the Theory of Games, vol.3. Princeton University Press, 1957, pp. 97-139.
  • 18
  • 21
    • A. Kalai and S. Vempala, "Efficient algorithms for online decision problems," Journal of Computer and System Sciences, vol.71, no.3, pp. 291-307, 2005.
  • 23
    • V. S. Borkar and S. P. Meyn, "The O.D.E. method for convergence of stochastic approximation and reinforcement learning," SIAM J. Control Optim., vol.38, no.2, pp. 447-469, 2000.
  • 24
    • P. J. Schweitzer, "Perturbation theory and finite Markov chains," Journal of Applied Probability, vol.5, pp. 401-413, 1968.
* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.