SCOPUS Information Retrieval Platform

Volume , Issue , 2007, Pages 330-337

Value-iteration based fitted policy iteration: Learning with a single trajectory

Author keywords

[No Author keywords available]

Indexed keywords

DECISION THEORY; ITERATIVE METHODS; MARKOV PROCESSES; POLYNOMIALS; PROBLEM SOLVING; PUBLIC POLICY;

APPROXIMATE VALUE ITERATION; INTERMEDIATE POLICIES; POLICY ITERATION; REINFORCEMENT LEARNING;

EID: 34548752490 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ADPRL.2007.368207 Document Type: Conference Paper

Times cited : (34)

References (17)

1
- 4644323293
- Least-squares policy iteration
- M. Lagoudakis and R. Parr. Least-squares policy iteration. Journal of Machine Learning Research, 4:1107-1149, 2003.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.¹ Parr, R.²

2
- 21844465127
- Tree-based batch mode reinforcement learning
- D. Ernst, P. Geurts, and L. Wehenkel. Tree-based batch mode reinforcement learning. Journal of Machine Learning Research, 6:503-556, 2005.
- (2005) Journal of Machine Learning Research , vol.6 , pp. 503-556
- Ernst, D.¹ Geurts, P.² Wehenkel, L.³

3
- 31844456754
- Finite time bounds for sampling based fitted value iteration
- Cs. Szepesvári and R. Munos. Finite time bounds for sampling based fitted value iteration. In ICML'2005, pages 881-886, 2005.
- (2005) ICML'2005 , pp. 881-886
- Szepesvári, C.¹ Munos, R.²

4
- 0003923091
- Academic Press, New York
- D. P. Bertsekas and S.E. Shreve. Stochastic Optimal Control (The Discrete Time Case). Academic Press, New York, 1978.
- (1978) Stochastic Optimal Control (The Discrete Time Case)
- Bertsekas, D.P.¹ Shreve, S.E.²

7
- 33746032553
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- A. Antos, Cs. Szepesvári, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. In COLT-19, pages 574-588, 2006.
- (2006) COLT-19 , pp. 574-588
- Antos, A.¹ Szepesvári, C.² Munos, R.³

8
- 0003924391
- Cambridge University Press
- M. Anthony and P. L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.
- (1999) Neural Network Learning: Theoretical Foundations
- Anthony, M.¹ Bartlett, P.L.²

10
- 0004234484
- McGraw-Hill, London, New York
- E.W. Cheney. Introduction to approximation theory. McGraw-Hill, London, New York, 1966.
- (1966) Introduction to approximation theory
- Cheney, E.W.¹

11
- 0003624357
- Springer-Verlag, New York
- L. Györfi, M. Kohler, A. Krzyżak, and H. Walk. A distribution-free theory of nonparametric regression. Springer-Verlag, New York, 2002.
- (2002) A distribution-free theory of nonparametric regression
- Györfi, L.¹ Kohler, M.² Krzyżak, A.³ Walk, H.⁴

12
- 0030489341
- Histogram regression estimation using data-dependent partitions
- A. Nobel. Histogram regression estimation using data-dependent partitions. Annals of Statistics, 24(3): 1084-1105, 1996.
- (1996) Annals of Statistics , vol.24 , Issue.3 , pp. 1084-1105
- Nobel, A.¹

13
- 0000996139
- Sphere packing numbers for subsets of the boolean n-cube with bounded Vapnik-Chervonenkis dimension
- D. Haussler. Sphere packing numbers for subsets of the boolean n-cube with bounded Vapnik-Chervonenkis dimension. Journal of Combinatorial Theory Series A, 69:217-232, 1995.
- (1995) Journal of Combinatorial Theory Series A , vol.69 , pp. 217-232
- Haussler, D.¹

14
- 1942516880
- Error bounds for approximate policy iteration
- R. Munos. Error bounds for approximate policy iteration. In ICML'2003, pages 560-567, 2003.
- (2003) ICML'2003 , pp. 560-567
- Munos, R.¹

16
- 0026206780
- An optimal multigrid algorithm for continuous state discrete time stochastic control
- C.S. Chow and J.N. Tsitsiklis. An optimal multigrid algorithm for continuous state discrete time stochastic control. IEEE Transactions on Automatic Control, 36(8):898-914, 1991.
- (1991) IEEE Transactions on Automatic Control , vol.36 , Issue.8 , pp. 898-914
- Chow, C.S.¹ Tsitsiklis, J.N.²

17
- 0001523794
- Strict stationarity of generalized autoregressive processes
- P. Bougerol and N. Picard. Strict stationarity of generalized autoregressive processes. Annals of Probability, 20:1714-1730, 1992.
- (1992) Annals of Probability , vol.20 , pp. 1714-1730
- Bougerol, P.¹ Picard, N.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.