SCOPUS Information Search Platform

Volume 1585, 1999, Pages 195-197

Reinforcement learning: Past, present and future?

Author keywords

[No Author keywords available]

Indexed keywords

EID: 72949089205
PISSN: 03029743
EISSN: 16113349
Source Type: Book Series
DOI: 10.1007/3-540-48873-1_26
Document Type: Conference Paper

Times cited: 26

References (13)

1
- 0002882372
- KnightCap: A chess program that learns by combining TD(λ) with game-tree search
- Baxter, J., Tridgell, A., Weaver, L. (1998). KnightCap: A chess program that learns by combining TD(λ) with game-tree search. Proceedings of the Fifteenth International Conference on Machine Learning, pp. 28-36.
- (1998) Proceedings of the Fifteenth International Conference on Machine Learning , pp. 28-36
- Baxter, J.¹ Tridgell, A.² Weaver, L.³

2
- 0003487482
- Athena Scientific, Belmont, MA
- Bertsekas, D. P., and Tsitsiklis, J. N. (1996). Neuro-Dynamic Programming. Athena Scientific, Belmont, MA.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

4
- 0003932121
- University of Rochester Ph.D. thesis
- McCallum, A. K. (1995). Reinforcement Learning with Selective Perception and Hidden State. University of Rochester Ph.D. thesis.
- (1995) Reinforcement Learning with Selective Perception and Hidden State
- McCallum, A.K.¹

5
- 84956885505
- CRL Report 334. Communications Research Laboratory, McMaster University, Hamilton, Ontario
- Nie, J., and Haykin, S. (1996). A dynamic channel assignment policy through Q-learning. CRL Report 334. Communications Research Laboratory, McMaster University, Hamilton, Ontario.
- (1996) A Dynamic Channel Assignment Policy through Q-Learning
- Nie, J.¹ Haykin, S.²

8
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

9
- 0004102479
- MIT Press, Cambridge, MA
- Sutton, R. S., and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

10
- 0003899594
- Technical Report 98-74, Department of Computer Science, University of Massachusetts
- Sutton, R. S., Precup, D., Singh, S. (1998). Between MDPs and semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales. Technical Report 98-74, Department of Computer Science, University of Massachusetts.
- (1998) Between MDPs and Semi-MDPs: Learning, Planning, and Representing Knowledge at Multiple Temporal Scales
- Sutton, R.S.¹ Precup, D.² Singh, S.³

11
- 0029276036
- Temporal difference learning and TD-Gammon
- Tesauro, G. J. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38:58-68.
- (1995) Communications of the ACM , vol.38 , pp. 58-68
- Tesauro, G.J.¹

12
- 0004049893
- Ph.D. thesis, Cambridge University
- Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. Ph.D. thesis, Cambridge University.
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.