Volume 30, Issue 5, 1999, Pages 341-363

An analysis of experience replay in temporal difference learning

Author keywords

[No Author keywords available]

Indexed keywords

ADAPTIVE SYSTEMS; LEARNING ALGORITHMS; NUMERICAL METHODS; SET THEORY; THEOREM PROVING;

EID: 0032649518     PISSN: 0196-9722     EISSN: 1087-6553     Source Type: Journal
DOI: 10.1080/019697299125127     Document Type: Article
Times cited: 21

References (18)
  • 2. Cichosz, P. 1995. Truncating temporal differences: On the efficient implementation of TD(λ) for reinforcement learning. Journal of Artificial Intelligence Research 2:287-318.
  • 3. Cichosz, P. 1997. Reinforcement Learning by Truncating Temporal Differences. Ph.D. thesis, Warsaw University of Technology, Department of Electronics and Information Technology.
  • 8. Mahadevan, S. and J. Connell. 1992. Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence 55:311-365.
  • 11. Singh, S. P. and R. S. Sutton. 1996. Reinforcement learning with replacing eligibility traces. Machine Learning 22:123-158.
  • 12. Sutton, R. S. 1984. Temporal Credit Assignment in Reinforcement Learning. Ph.D. thesis, University of Massachusetts, Department of Computer and Information Science, Amherst, MA.
  • 13. Sutton, R. S. 1988. Learning to predict by the methods of temporal differences. Machine Learning 3:9-44.
  • 14. Sutton, R. S. 1996. Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8, pp. 1038-1044. Cambridge, MA: MIT Press.
  • 16. Tesauro, G. 1992. Practical issues in temporal difference learning. Machine Learning 8:257-277.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.