SCOPUS Information Search Platform

Volume 72, Issue 3, 2008, Pages 157-171

Rollout sampling approximate policy iteration

Author keywords

Approximate policy iteration; Bandit problems; Classification; Reinforcement learning; Rollouts; Sample complexity

Indexed keywords

BOOLEAN FUNCTIONS; CLASSIFICATION (OF INFORMATION); EDUCATION; REINFORCEMENT; STANDARDS;

COMPUTATIONAL EFFORTS; INVERTED PENDULUMS; LEARNING PROBLEMS; ORDER-OF MAGNITUDES; POLICY ITERATION; VALUE FUNCTIONS;

REINFORCEMENT LEARNING;

EID: 48349140736 PISSN: 08856125 EISSN: 15730565 Source Type: Journal
DOI: 10.1007/s10994-008-5069-3 Document Type: Conference Paper

Times cited : (51)

References (16)

5
- 22944468731
- Approximate policy iteration with a policy language bias
- Fern, A., Yoon, S., & Givan, R. (2004). Approximate policy iteration with a policy language bias. Advances in Neural Information Processing Systems, 16(3).
- (2004) Advances in Neural Information Processing Systems , vol.16 , Issue.3
- Fern, A.; Yoon, S.; Givan, R.

6
- 33744466799
- Approximate policy iteration with a policy language bias: Solving relational Markov decision processes
- Fern, A., Yoon, S., & Givan, R. (2006). Approximate policy iteration with a policy language bias: Solving relational Markov decision processes. Journal of Artificial Intelligence Research, 25, 75-118.
- (2006) Journal of Artificial Intelligence Research , vol.25 , pp. 75-118
- Fern, A.; Yoon, S.; Givan, R.

7
- 0003644124
- MIT Press Cambridge
- Howard, R. A. (1960). Dynamic programming and Markov processes. Cambridge: MIT Press.
- (1960) Dynamic Programming and Markov Processes
- Howard, R.A.

8
- 34547975806
- Bandit based Monte-Carlo planning
- Kocsis, L., & Szepesvári, C. (2006). Bandit based Monte-Carlo planning. In Proceedings of the European conference on machine learning.
- (2006) Proceedings of the European Conference on Machine Learning
- Kocsis, L.; Szepesvári, C.

9
- 48349105325
- PhD thesis, Department of Computer Science, Duke University
- Lagoudakis, M. G. (2003). Efficient approximate policy iteration methods for sequential decision making in reinforcement learning. PhD thesis, Department of Computer Science, Duke University.
- (2003) Efficient Approximate Policy Iteration Methods for Sequential Decision Making in Reinforcement Learning
- Lagoudakis, M.G.

14
- 33646398129
- Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method
- Riedmiller, M. (2005). Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In 16th European conference on machine learning (pp. 317-328).
- (2005) 16th European Conference on Machine Learning , pp. 317-328
- Riedmiller, M.

15
- 0004102479
- MIT Press Cambridge
- Sutton, R., & Barto, A. (1998). Reinforcement learning: an introduction. Cambridge: MIT Press.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.; Barto, A.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.