SCOPUS Information Search Platform

Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

Volume , Issue , 2010, Pages

Online Markov decision processes under bandit feedback

Authors (4): Neu, Gergely (a, b); György, András (b); Szepesvári, Csaba (c); Antos, András (b)

a BUDAPEST UNIVERSITY OF TECHNOLOGY AND ECONOMICS (Hungary)

b Machine Learning Research Group (Hungary)

c UNIVERSITY OF ALBERTA (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

E-LEARNING; MARKOV PROCESSES; STOCHASTIC SYSTEMS;

BANDIT FEEDBACKS; LEARNING AGENTS; MARKOV DECISION PROCESSES; MARKOVIAN ENVIRONMENT; OBLIVIOUS ADVERSARIES; ONLINE LEARNING; REWARD FUNCTION; STATIONARY POLICY; STOCHASTICS; TIME STEP;

LEARNING ALGORITHMS;

EID: 85162052729
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited: 154

References (8)

1
- 0037709910
- The nonstochastic multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R. E. (2002). The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1):48-77.
- (2002) SIAM J. Comput , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

2
- 84899000904
- Experts in a Markov decision process
- Saul, L. K., Weiss, Y., and Bottou, L., editors
- Even-Dar, E., Kakade, S. M., and Mansour, Y. (2005). Experts in a Markov decision process. In Saul, L. K., Weiss, Y., and Bottou, L., editors, Advances in Neural Information Processing Systems 17, pages 401-408.
- (2005) Advances in Neural Information Processing Systems , vol.17 , pp. 401-408
- Even-Dar, E.¹ Kakade, S.M.² Mansour, Y.³

3
- 70349277420
- Online Markov decision processes
- Even-Dar, E., Kakade, S. M., and Mansour, Y. (2009). Online Markov decision processes. Mathematics of Operations Research, 34(3):726-736.
- (2009) Mathematics of Operations Research , vol.34 , Issue.3 , pp. 726-736
- Even-Dar, E.¹ Kakade, S.M.² Mansour, Y.³

4
- 84898073198
- The online loop-free stochastic shortest-path problem
- Neu, G., György, A., and Szepesvári, C. (2010). The online loop-free stochastic shortest-path problem. In COLT-10.
- (2010) COLT-10
- Neu, G.¹ György, A.² Szepesvári, C.³

5
- 85102627959
- Wiley-Interscience
- Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley-Interscience.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

6
- 77950787050
- Arbitrarily modulated Markov decision processes
- IEEE Press
- Yu, J. Y. and Mannor, S. (2009a). Arbitrarily modulated Markov decision processes. In Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference. IEEE Press.
- (2009) Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference
- Yu, J.Y.¹ Mannor, S.²

7
- 70349986740
- Online learning in Markov decision processes with arbitrarily changing rewards and transitions
- Piscataway, NJ, USA, IEEE Press
- Yu, J. Y. and Mannor, S. (2009b). Online learning in Markov decision processes with arbitrarily changing rewards and transitions. In GameNets'09: Proceedings of the First ICST international conference on Game Theory for Networks, pages 314-322, Piscataway, NJ, USA. IEEE Press.
- (2009) GameNets'09: Proceedings of the First ICST International Conference on Game Theory for Networks , pp. 314-322
- Yu, J.Y.¹ Mannor, S.²

8
- 70349280578
- Markov decision processes with arbitrary reward processes
- Yu, J. Y., Mannor, S., and Shimkin, N. (2009). Markov decision processes with arbitrary reward processes. Mathematics of Operations Research, 34(3):737-757.
- (2009) Mathematics of Operations Research , vol.34 , Issue.3 , pp. 737-757
- Yu, J.Y.¹ Mannor, S.² Shimkin, N.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.