



Volume , Issue , 2011, Pages 481-488

Incremental basis construction from temporal difference error

Author keywords

[No Author keywords available]

Indexed keywords

BASIS FUNCTIONS; BELLMAN ERROR; DISCOUNT FACTORS; LINEAR COMBINATIONS; ONLINE VERSIONS; REWARD FUNCTION; TEMPORAL DIFFERENCE ERRORS; VALUE FUNCTIONS;

EID: 80053457849     PISSN: None     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 11

References (18)
  • 1
    • L. Baird. Residual algorithms: Reinforcement learning with function approximation. In ICML'95, 1995.
  • 2
    • B. Boots and G. J. Gordon. Predictive state temporal difference learning. In NIPS'10, 2010.
  • 3
    • J. A. Boyan. Technical update: Least-squares temporal difference learning. Machine Learning, 49:233-246, 2002. ISSN 0885-6125.
  • 4
    • S. J. Bradtke, A. G. Barto, and L. P. Kaelbling. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22:33-57, 1996.
  • 5
    • A. Geramifard, M. Bowling, and R. S. Sutton. Incremental least-squares temporal difference learning. In AAAI'06, 2006.
  • 7
    • P. W. Keller, S. Mannor, and D. Precup. Automatic basis function construction for approximate dynamic programming and reinforcement learning. In ICML'06, 2006.
  • 8
    • Z. J. Kolter and A. Y. Ng. Regularization and feature selection in least-squares temporal difference learning. In ICML'09, 2009.
  • 10
    • H. R. Maei and R. S. Sutton. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In AGI'10, 2010.
  • 11
    • S. Mahadevan and B. Liu. Basis construction from power series expansions of value functions. In NIPS'10, 2010.
  • 12
    • S. Mahadevan, M. Maggioni, and C. Guestrin. Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes. Journal of Machine Learning Research, 8, 2007.
  • 14
    • R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
  • 16
    • R. S. Sutton, C. Szepesvári, and H. R. Maei. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In NIPS'08, 2008.
  • 18
    • J.-H. Wu and R. Givan. Feature-discovering approximate value iteration methods. In SARA'05, pages 321-331, 2005.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.