Volume , Issue , 2010, Pages 615-622

Finite-sample analysis of LSTD

Author keywords

[No Author keywords available]

Indexed keywords

GENERALIZATION BOUND; LEAST SQUARE; MARKOV CHAIN; POLICY EVALUATION; SAMPLE ANALYSIS; STATIONARY DISTRIBUTION; VALUE FUNCTIONS;

EID: 77956549349     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 62
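For context on the indexed keywords (least squares, policy evaluation, value functions): LSTD estimates the value function of a fixed policy from a single trajectory by solving a linear system over features. Below is a minimal sketch, not the paper's algorithm as stated; the function name, feature matrices, and ridge term `reg` are illustrative assumptions.

```python
import numpy as np

def lstd(phi, phi_next, rewards, gamma=0.95, reg=1e-6):
    """Least-squares temporal difference (LSTD) policy evaluation.

    phi:      (n, d) features of visited states s_t
    phi_next: (n, d) features of successor states s_{t+1}
    rewards:  (n,)   observed rewards r_t
    Returns weights w such that V(s) is approximated by phi(s) @ w.
    """
    # Empirical LSTD system: A w = b with
    # A = sum_t phi(s_t) (phi(s_t) - gamma * phi(s_{t+1}))^T,
    # b = sum_t phi(s_t) r_t
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    # small ridge term keeps A invertible with few samples (an assumption,
    # echoing the regularization discussed in finite-sample analyses)
    return np.linalg.solve(A + reg * np.eye(A.shape[0]), b)
```

As a sanity check, a single absorbing-style feature with constant reward 1 and gamma = 0.9 recovers the discounted value 1 / (1 - 0.9) = 10.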

References (8)
  • 1
    • Antos, A., Szepesvári, Cs., and Munos, R. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning Journal, 71:89-129, 2008.
  • 2
    • Bertsekas, D. Dynamic Programming and Optimal Control. Athena Scientific, 2001.
  • 4
    • Bradtke, S. and Barto, A. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22:33-57, 1996.
  • 8
    • Tsitsiklis, J. and Van Roy, B. An analysis of temporal difference learning with function approximation. IEEE Transactions on Automatic Control, 42:674-690, 1997.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.