SCOPUS Information Search Platform - View Paper

30th International Conference on Machine Learning, ICML 2013

Volume , Issue PART 3, 2013, Pages 1712-1720

Better rates for any adversarial deterministic MDP

(2) Dekel, Ofer a; Hazan, Elad b

a MICROSOFT RESEARCH (United States)

b TECHNION ISRAEL INSTITUTE OF TECHNOLOGY (Israel)

Author keywords

[No Author keywords available]

Indexed keywords

LEARNING ALGORITHMS; LEARNING SYSTEMS; MARKOV PROCESSES;

BANDIT FEEDBACKS; GRAPH TOPOLOGY; MARKOV DECISION PROCESSES; REGRET BOUNDS; REGRET MINIMIZATION; TWO WAYS;

TOPOLOGY;

EID: 84897554269 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (17)

References (17)

1
- 84862535425
- Interior-point methods for full-information and bandit online learning
- Abernethy, J., Hazan, E., and Rakhlin, A. Interior-point methods for full-information and bandit online learning. IEEE Transactions on Information Theory, 58(7):4164-4175, 2012.
- (2012) IEEE Transactions on Information Theory , vol.58 , Issue.7 , pp. 4164-4175
- Abernethy, J.¹ Hazan, E.² Rakhlin, A.³

2
- 84920492213
- Oxford University Press
- Ramírez Alfonsín, J. L. The Diophantine Frobenius problem. Oxford University Press, 2005.
- (2005) The Diophantine Frobenius Problem
- Ramírez Alfonsín, J.L.¹

3
- 84886067084
- Deterministic MDPs with adversarial rewards and bandit feedback
- Arora, R., Dekel, O., and Tewari, A. Deterministic MDPs with adversarial rewards and bandit feedback. In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence, pp. 93-101, 2012.
- (2012) Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence , pp. 93-101
- Arora, R.¹ Dekel, O.² Tewari, A.³

4
- 84897474760
- Minimax policies for combinatorial prediction games
- Audibert, Jean-Yves, Bubeck, Sébastien, and Lugosi, Gábor. Minimax policies for combinatorial prediction games. Journal of Machine Learning Research - Proceedings Track, 19:107-132, 2011.
- (2011) Journal of Machine Learning Research - Proceedings Track , vol.19 , pp. 107-132
- Audibert, J.-Y.¹ Bubeck, S.² Lugosi, G.³

5
- 0003565783
- Athena Scientific, Third edition
- Bertsekas, D. P. Dynamic Programming and Optimal Control. Athena Scientific, Third edition, 2005.
- (2005) Dynamic Programming and Optimal Control
- Bertsekas, D.P.¹

6
- 0003618624
- Springer
- Brémaud, P. Markov chains: Gibbs fields, Monte Carlo simulation and queues. Springer, 1999.
- (1999) Markov Chains: Gibbs Fields, Monte Carlo Simulation and Queues
- Brémaud, P.¹

7
- 85162050055
- The price of bandit information for online optimization
- Dani, Varsha, Hayes, Thomas P., and Kakade, Sham. The price of bandit information for online optimization. In Advances in Neural Information Processing Systems 20, 2007.
- (2007) Advances in Neural Information Processing Systems , vol.20
- Dani, V.¹ Hayes, T.P.² Kakade, S.³

8
- 0344560051
- Periods of connected networks and powers of nonnegative matrices
- Denardo, E. V. Periods of connected networks and powers of nonnegative matrices. Mathematics of Operations Research, 2(1):20-24, 1977.
- (1977) Mathematics of Operations Research , vol.2 , Issue.1 , pp. 20-24
- Denardo, E.V.¹

9
- 70349277420
- Online Markov decision processes
- Even-Dar, E., Kakade, S., and Mansour, Y. Online Markov decision processes. Mathematics of Operations Research, 34(3):726-736, 2009.
- (2009) Mathematics of Operations Research , vol.34 , Issue.3 , pp. 726-736
- Even-Dar, E.¹ Kakade, S.² Mansour, Y.³

10
- 77951573287
- Universal reinforcement learning
- Farias, V. F., Moallemi, C. C., Van Roy, B., and Weissman, T. Universal reinforcement learning. IEEE Transactions on Information Theory, 56(5):2441-2454, 2010.
- (2010) IEEE Transactions on Information Theory , vol.56 , Issue.5 , pp. 2441-2454
- Farias, V.F.¹ Moallemi, C.C.² Van Roy, B.³ Weissman, T.⁴

11
- 50249167647
- On polynomial cases of the unichain classification problem for Markov decision processes
- Feinberg, E. A. and Yang, F. On polynomial cases of the unichain classification problem for Markov decision processes. Operations Research Letters, 36(5):527-530, 2008.
- (2008) Operations Research Letters , vol.36 , Issue.5 , pp. 527-530
- Feinberg, E.A.¹ Yang, F.²

12
- 85162052729
- Online Markov decision processes under bandit feedback
- Neu, G., György, A., Szepesvári, C., and Antos, A. Online Markov decision processes under bandit feedback. In Advances in Neural Information Processing Systems 23, pp. 1804-1812, 2010.
- (2010) Advances in Neural Information Processing Systems , vol.23 , pp. 1804-1812
- Neu, G.¹ György, A.² Szepesvári, C.³ Antos, A.⁴

13
- 77953539718
- Online regret bounds for Markov decision processes with deterministic transitions
- Ortner, R. Online regret bounds for Markov decision processes with deterministic transitions. Theoretical Computer Science, 411(29-30):2684-2695, 2010.
- (2010) Theoretical Computer Science , vol.411 , Issue.29-30 , pp. 2684-2695
- Ortner, R.¹

14
- 77955790905
- Algorithms for reinforcement learning
- Szepesvári, C. Algorithms for reinforcement learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 4(1), 2010.
- (2010) Synthesis Lectures on Artificial Intelligence and Machine Learning , vol.4 , Issue.1
- Szepesvári, C.¹

15
- 3142657664
- Path kernels and multiplicative updates
- Takimoto, Eiji and Warmuth, Manfred K. Path kernels and multiplicative updates. Journal of Machine Learning Research, 4:773-818, 2003.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 773-818
- Takimoto, E.¹ Warmuth, M.K.²

16
- 70349280578
- Markov decision processes with arbitrary reward processes
- Yu, J. Y., Mannor, S., and Shimkin, N. Markov decision processes with arbitrary reward processes. Mathematics of Operations Research, 34(3):737-757, 2009.
- (2009) Mathematics of Operations Research , vol.34 , Issue.3 , pp. 737-757
- Yu, J.Y.¹ Mannor, S.² Shimkin, N.³

17
- 77950787050
- Arbitrarily modulated Markov decision processes
- Yu, Jia Yuan and Mannor, Shie. Arbitrarily modulated Markov decision processes. In Proceedings of the 48th IEEE Conference on Decision and Control, pp. 2946-2953, 2009.
- (2009) Proceedings of the 48th IEEE Conference on Decision and Control , pp. 2946-2953
- Yu, J.Y.¹ Mannor, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.