SCOPUS Information Retrieval Platform

Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

Volume , Issue , 2010, Pages

LSTD with random projections

(4) Ghavamzadeh, Mohammad (a); Lazaric, Alessandro (a); Maillard, Odalric Ambrym (a); Munos, Rémi (a)

(a) INRIA (France)

Author keywords

[No Author keywords available]

Indexed keywords

ITERATIVE METHODS; REINFORCEMENT LEARNING;

HIGH DIMENSIONAL SPACES; LEAST SQUARES POLICY ITERATIONS; LEAST-SQUARES TEMPORAL DIFFERENCES; NUMBER OF SAMPLES; PERFORMANCE BOUNDS; POLICY ITERATION ALGORITHMS; RANDOM PROJECTIONS; REINFORCEMENT LEARNINGS; TEMPORAL DIFFERENCE LEARNING;

LEARNING ALGORITHMS;

EID: 85162046948 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (47)

References (21)

1
- 40849145988
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- A. Antos, Cs. Szepesvari, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning Journal, 71:89-129, 2008.
- (2008) Machine Learning Journal , vol.71 , pp. 89-129
- Antos, A.¹ Szepesvari, Cs.² Munos, R.³

2
- 0038595396
- Least-squares temporal difference learning
- J. Boyan. Least-squares temporal difference learning. Proceedings of the 16th International Conference on Machine Learning, pages 49-56, 1999.
- (1999) Proceedings of the 16th International Conference on Machine Learning , pp. 49-56
- Boyan, J.¹

3
- 0001771345
- Linear least-squares algorithms for temporal difference learning
- S. Bradtke and A. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22:33-57, 1996.
- (1996) Machine Learning , vol.22 , pp. 33-57
- Bradtke, S.¹ Barto, A.²

4
- 70049096468
- Regularized policy iteration
- MIT Press
- A. M. Farahmand, M. Ghavamzadeh, Cs. Szepesvári, and S. Mannor. Regularized policy iteration. In Proceedings of Advances in Neural Information Processing Systems 21, pages 441-448. MIT Press, 2008.
- (2008) Proceedings of Advances in Neural Information Processing Systems , vol.21 , pp. 441-448
- Farahmand, A.M.¹ Ghavamzadeh, M.² Szepesvári, Cs.³ Mannor, S.⁴

5
- 70449644892
- Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems
- A. M. Farahmand, M. Ghavamzadeh, Cs. Szepesvári, and S. Mannor. Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems. In Proceedings of the American Control Conference, pages 725-730, 2009.
- (2009) Proceedings of the American Control Conference , pp. 725-730
- Farahmand, A.M.¹ Ghavamzadeh, M.² Szepesvári, Cs.³ Mannor, S.⁴

6
- 85161982279
- Technical Report inria-00530762, INRIA
- M. Ghavamzadeh, A. Lazaric, O. Maillard, and R. Munos. LSPI with random projections. Technical Report inria-00530762, INRIA, 2010.
- (2010) LSPI with Random Projections
- Ghavamzadeh, M.¹ Lazaric, A.² Maillard, O.³ Munos, R.⁴

7
- 34250706852
- Automatic basis function construction for approximate dynamic programming and reinforcement learning
- P. Keller, S. Mannor, and D. Precup. Automatic basis function construction for approximate dynamic programming and reinforcement learning. In Proceedings of the Twenty-Third International Conference on Machine Learning, pages 449-456, 2006.
- (2006) Proceedings of the Twenty-Third International Conference on Machine Learning , pp. 449-456
- Keller, P.¹ Mannor, S.² Precup, D.³

8
- 71149121683
- Regularization and feature selection in least-squares temporal difference learning
- Z. Kolter and A. Ng. Regularization and feature selection in least-squares temporal difference learning. In Proceedings of the Twenty-Sixth International Conference on Machine Learning, pages 521-528, 2009.
- (2009) Proceedings of the Twenty-Sixth International Conference on Machine Learning , pp. 521-528
- Kolter, Z.¹ Ng, A.²

9
- 4644323293
- Least-squares policy iteration
- M. Lagoudakis and R. Parr. Least-squares policy iteration. Journal of Machine Learning Research, 4:1107-1149, 2003.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.¹ Parr, R.²

10
- 80053439401
- Technical Report inria-00528596 INRIA
- A. Lazaric, M. Ghavamzadeh, and R. Munos. Finite-sample analysis of least-squares policy iteration. Technical Report inria-00528596, INRIA, 2010.
- (2010) Finite-sample Analysis of Least-squares Policy Iteration
- Lazaric, A.¹ Ghavamzadeh, M.² Munos, R.³

11
- 77956549349
- Finite-sample analysis of LSTD
- A. Lazaric, M. Ghavamzadeh, and R. Munos. Finite-sample analysis of LSTD. In Proceedings of the Twenty-Seventh International Conference on Machine Learning, pages 615-622, 2010.
- (2010) Proceedings of the Twenty-Seventh International Conference on Machine Learning , pp. 615-622
- Lazaric, A.¹ Ghavamzadeh, M.² Munos, R.³

12
- 34548803187
- Sparse temporal difference learning using lasso
- M. Loth, M. Davy, and P. Preux. Sparse temporal difference learning using lasso. In IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages 352-359, 2007.
- (2007) IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning , pp. 352-359
- Loth, M.¹ Davy, M.² Preux, P.³

13
- 34547966269
- Representation policy iteration
- S. Mahadevan. Representation policy iteration. In Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence, pages 372-379, 2005.
- (2005) Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence , pp. 372-379
- Mahadevan, S.¹

14
- 78249289201
- Compressed least-squares regression
- O. Maillard and R. Munos. Compressed least-squares regression. In Proceedings of Advances in Neural Information Processing Systems 22, pages 1213-1221, 2009.
- (2009) Proceedings of Advances in Neural Information Processing Systems , vol.22 , pp. 1213-1221
- Maillard, O.¹ Munos, R.²

15
- 85162011142
- Technical Report inria-00483014 INRIA
- O. Maillard and R. Munos. Brownian motions and scrambled wavelets for least-squares regression. Technical Report inria-00483014, INRIA, 2010.
- (2010) Brownian Motions and Scrambled Wavelets for Least-squares Regression
- Maillard, O.¹ Munos, R.²

16
- 17444414191
- Basis function adaptation in temporal difference reinforcement learning
- I. Menache, S. Mannor, and N. Shimkin. Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134:215-238, 2005.
- (2005) Annals of Operations Research , vol.134 , pp. 215-238
- Menache, I.¹ Mannor, S.² Shimkin, N.³

17
- 34547982545
- Analyzing feature generation for value-function approximation
- R. Parr, C. Painter-Wakefield, L. Li, and M. Littman. Analyzing feature generation for value-function approximation. In Proceedings of the Twenty-Fourth International Conference on Machine Learning, pages 737-744, 2007.
- (2007) Proceedings of the Twenty-Fourth International Conference on Machine Learning , pp. 737-744
- Parr, R.¹ Painter-Wakefield, C.² Li, L.³ Littman, M.⁴

18
- 77956538796
- Feature selection using regularization in approximate linear programs for Markov decision processes
- M. Petrik, G. Taylor, R. Parr, and S. Zilberstein. Feature selection using regularization in approximate linear programs for Markov decision processes. In Proceedings of the Twenty-Seventh International Conference on Machine Learning, pages 871-878, 2010.
- (2010) Proceedings of the Twenty-Seventh International Conference on Machine Learning , pp. 871-878
- Petrik, M.¹ Taylor, G.² Parr, R.³ Zilberstein, S.⁴

19
- 84877883600
- Non-asymptotic theory of random matrices: Extreme singular values
- M. Rudelson and R. Vershynin. Non-asymptotic theory of random matrices: extreme singular values. In Proceedings of the International Congress of Mathematicians, 2010.
- (2010) Proceedings of the International Congress of Mathematicians
- Rudelson, M.¹ Vershynin, R.²

20
- 0004102479
- MIT Press
- R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.²

21
- 14844315829
- American Mathematical Society
- S. Vempala. The Random Projection Method. American Mathematical Society, 2004.
- (2004) The Random Projection Method
- Vempala, S.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.