Volume 3720 LNAI, 2005, Pages 317-328

Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method

Author keywords

[No Author keywords available]

Indexed keywords

CONTROL POLICIES; MULTI-LAYER PERCEPTRON; Q ITERATION; REINFORCEMENT LEARNING METHOD

EID: 33646398129     PISSN: 0302-9743     EISSN: 1611-3349     Source Type: Book Series
DOI: 10.1007/11564096_32     Document Type: Conference Paper
Times cited: 842

References (9)
  • 1   [BM95] J. A. Boyan and A. W. Moore. Generalization in reinforcement learning: Safely approximating the value function. In Advances in Neural Information Processing Systems 7. Morgan Kaufmann, 1995.
  • 3   [Gor95] G. J. Gordon. Stable function approximation in dynamic programming. In A. Prieditis and S. Russell, editors, Proceedings of the ICML, San Francisco, CA, 1995.
  • 4   [Lin92] L.-J. Lin. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8:293-321, 1992.
  • 6   [RB93] M. Riedmiller and H. Braun. A direct adaptive method for faster backpropagation learning: The RPROP algorithm. In H. Ruspini, editor, Proceedings of the IEEE International Conference on Neural Networks (ICNN), pages 586-591, San Francisco, 1993.
  • 7   [Rie00] M. Riedmiller. Concepts and facilities of a neural reinforcement learning control architecture for technical process control. Journal of Neural Computing and Application, 8:323-338, 2000.
  • 9   [Tes92] G. Tesauro. Practical issues in temporal difference learning. Machine Learning, 8:257-277, 1992.


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.