2009, Pages 177-184

A theoretical and empirical analysis of Expected Sarsa

Author keywords

[No Author keywords available]

Indexed keywords

BEHAVIOR POLICY; DIFFERENCE METHOD; EMPIRICAL ANALYSIS; HIGHER LEARNING; LEARNING RATES; MODEL FREE; MULTIPLE DOMAINS; Q-LEARNING; STOCHASTICITY; ZERO VARIANCE;

EID: 67650505307     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ADPRL.2009.4927542     Document Type: Conference Paper
Times cited : (230)
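For context, the Expected Sarsa update analyzed in the paper replaces the sampled next-action value in the Sarsa target with its expectation under the policy, which removes the variance contributed by that action sample. A minimal tabular sketch in Python follows; the function and variable names are illustrative and not taken from the paper.

    import numpy as np

    def epsilon_greedy_probs(q_values, epsilon):
        # Action probabilities of an epsilon-greedy policy over q_values.
        n = len(q_values)
        probs = np.full(n, epsilon / n)
        probs[np.argmax(q_values)] += 1.0 - epsilon
        return probs

    def expected_sarsa_update(Q, s, a, r, s_next, alpha, gamma, epsilon):
        # One tabular Expected Sarsa update: bootstrap on the expectation of
        # Q(s', .) under the (epsilon-greedy) policy rather than on a single
        # sampled next action as in Sarsa.
        probs = epsilon_greedy_probs(Q[s_next], epsilon)
        expected_q = float(np.dot(probs, Q[s_next]))
        Q[s, a] += alpha * (r + gamma * expected_q - Q[s, a])

With a greedy policy (epsilon = 0) the expectation reduces to the maximum, and the same update becomes Q-learning, one of the comparisons the indexed keywords point to.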

References (12)
  • 4. R. E. Bellman, Dynamic Programming. Princeton, NJ: Princeton University Press, 1957.
  • 5. R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
  • 7. J. A. Boyan and A. W. Moore, "Generalization in reinforcement learning: Safely approximating the value function," in Advances in Neural Information Processing Systems 7, G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press, 1995, pp. 369-376.
  • 8. G. Gordon, "Stable function approximation in dynamic programming," in Machine Learning: Proceedings of the Twelfth International Conference, A. Prieditis and S. Russell, Eds. San Francisco, CA: Morgan Kaufmann, 1995, pp. 261-268.
  • 9. L. Baird, "Residual algorithms: Reinforcement learning with function approximation," in Machine Learning: Proceedings of the Twelfth International Conference, A. Prieditis and S. Russell, Eds. San Francisco, CA: Morgan Kaufmann, 1995, pp. 30-37.
  • 11. R. Sutton, "Generalization in reinforcement learning: Successful examples using sparse coarse coding," in Advances in Neural Information Processing Systems 8, 1996, pp. 1038-1044.
  • 12. S. Singh, T. Jaakkola, M. L. Littman, and C. Szepesvári, "Convergence results for single-step on-policy reinforcement-learning algorithms," Machine Learning, vol. 38, no. 3, pp. 287-308, 2000. DOI: 10.1023/A:1007678930559


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.