Volume, Issue, 2002, Pages

Model-free least squares policy iteration

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; ITERATIVE METHODS; REINFORCEMENT LEARNING;

EID: 84898963274     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 28

References (13)
  • 1
    • J. Baxter and P. Bartlett. Reinforcement learning in POMDP's via direct gradient ascent. In Proc. 17th International Conf. on Machine Learning, pages 41-48. Morgan Kaufmann, San Francisco, CA, 2000.
  • 3
    • Justin A. Boyan. Least-squares temporal difference learning. In I. Bratko and S. Dzeroski, editors, Machine Learning: Proceedings of the Sixteenth International Conference, pages 49-56. Morgan Kaufmann, San Francisco, CA, 1999.
  • 4
    • S. Bradtke and A. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22(1-3):33-57, 1996.
  • 10
    • Andrew Y. Ng, Daishi Harada, and Stuart Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In Proc. 16th International Conf. on Machine Learning, pages 278-287. Morgan Kaufmann, San Francisco, CA, 1999.
  • 11
    • D. Ormoneit and S. Sen. Kernel-based reinforcement learning. To appear, Machine Learning, 2001.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.