SCOPUS Information Search Platform

Proceedings of the American Control Conference

Volume , Issue , 2009, Pages 725-730

Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems

(4) Farahmand, Amir Massoud (a); Ghavamzadeh, Mohammad (a); Szepesvári, Csaba (a); Mannor, Shie (b)

a UNIVERSITY OF ALBERTA (Canada)

b MCGILL UNIVERSITY (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

FINITE SAMPLES; GENERALIZATION BOUND; ITERATION ALGORITHMS; MACHINE-LEARNING; MARKOVIAN DECISION PROBLEMS; NONLINEAR FUNCTIONS; PLANNING PROBLEM; REGULARIZATION PROCEDURE; SMALL SAMPLE SIZE; VALUE FUNCTIONS;

ALGORITHMS; EDUCATION; FUNCTIONS; REINFORCEMENT LEARNING; SAMPLING;

REINFORCEMENT;

EID: 70449644892; PISSN: 07431619; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1109/ACC.2009.5160611; Document Type: Conference Paper

Times cited: 71

References (20)

1
- 70449695865
- A. Antos, R. Munos, and Cs. Szepesvári. Fitted Q-iteration in continuous action-space MDPs. In Advances in Neural Information Processing Systems, 2007. (accepted).

2
- 40849145988
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- April, Published Online First: 14 Nov, DOI: 10.1007/s10994-007-5038-2
- A. Antos, Cs. Szepesvári, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71(1):89-129, April 2008. Published Online First: 14 Nov, 2007, DOI: 10.1007/s10994-007-5038-2.
- (2008) Machine Learning , vol.71 , Issue.1 , pp. 89-129
- Antos, A.¹ Szepesvári, C.² Munos, R.³

3
- 0003923091
- Academic Press, New York
- D. P. Bertsekas and S.E. Shreve. Stochastic Optimal Control (The Discrete Time Case). Academic Press, New York, 1978.
- (1978) Stochastic Optimal Control (The Discrete Time Case)
- Bertsekas, D.P.¹ Shreve, S.E.²

4
- 50849114939
- Sparsity oracle inequalities for the lasso
- F. Bunea, A. Tsybakov, and M. Wegkamp. Sparsity oracle inequalities for the lasso. Electronic Journal of Statistics, 1:169-194, 2007.
- (2007) Electronic Journal of Statistics , vol.1 , pp. 169-194
- Bunea, F.¹ Tsybakov, A.² Wegkamp, M.³

5
- 33947374691
- Visual servo control, part I: Basic approaches
- December
- F. Chaumette and S. Hutchinson. Visual servo control, part I: Basic approaches. IEEE Robotics and Automation Magazine, 13(4):82-90, December 2006.
- (2006) IEEE Robotics and Automation Magazine , vol.13 , Issue.4 , pp. 82-90
- Chaumette, F.¹ Hutchinson, S.²

6
- 0030106027
- A robotics toolbox for MATLAB
- March
- P.I. Corke. A robotics toolbox for MATLAB. IEEE Robotics and Automation Magazine, 3(1):24-32, March 1996.
- (1996) IEEE Robotics and Automation Magazine , vol.3 , Issue.1 , pp. 24-32
- Corke, P.I.¹

7
- 31844451013
- Reinforcement learning with Gaussian processes
- New York, NY, USA, ACM
- Y. Engel, S. Mannor, and R. Meir. Reinforcement learning with Gaussian processes. In ICML '05: Proceedings of the 22nd international conference on Machine learning, pages 201-208, New York, NY, USA, 2005. ACM.
- (2005) ICML '05: Proceedings of the 22nd international conference on Machine learning , pp. 201-208
- Engel, Y.¹ Mannor, S.² Meir, R.³

8
- 21844465127
- Tree-based batch mode reinforcement learning
- D. Ernst, P. Geurts, and L. Wehenkel. Tree-based batch mode reinforcement learning. Journal of Machine Learning Research, 6:503-556, 2005.
- (2005) Journal of Machine Learning Research , vol.6 , pp. 503-556
- Ernst, D.¹ Geurts, P.² Wehenkel, L.³

9
- 70049096468
- Regularized policy iteration
- D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors
- A. M. Farahmand, M. Ghavamzadeh, Cs. Szepesvári, and Sh. Mannor. Regularized policy iteration. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 441-448. 2009.
- (2009) Advances in Neural Information Processing Systems 21 , pp. 441-448
- Farahmand, A.M.¹ Ghavamzadeh, M.² Szepesvári, C.³ Mannor, S.⁴

10
- 0003624357
- Springer-Verlag, New York
- L. Györfi, M. Kohler, A. Krzyżak, and H. Walk. A distribution-free theory of nonparametric regression. Springer-Verlag, New York, 2002.
- (2002) A distribution-free theory of nonparametric regression
- Györfi, L.¹ Kohler, M.² Krzyżak, A.³ Walk, H.⁴

11
- 84885993384
- Least squares SVM for least squares TD learning
- T. Jung and D. Polani. Least squares SVM for least squares TD learning. In ECAI, pages 499-503, 2006.
- (2006) ECAI , pp. 499-503
- Jung, T.¹ Polani, D.²

12
- 1942420814
- Reinforcement learning as classification: Leveraging modern classifiers
- M.G. Lagoudakis and R. Parr. Reinforcement learning as classification: Leveraging modern classifiers. In ICML-03, pages 424-431, 2003.
- (2003) ICML-03 , pp. 424-431
- Lagoudakis, M.G.¹ Parr, R.²

13
- 34548803187
- Sparse temporal difference learning using LASSO
- M. Loth, M. Davy, and P. Preux. Sparse temporal difference learning using LASSO. In IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007.
- (2007) IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
- Loth, M.¹ Davy, M.² Preux, P.³

14
- 17444414191
- Basis function adaptation in temporal difference reinforcement learning
- S. Mannor, I. Menache, and N. Shimkin. Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134:215-238, 2005.
- (2005) Annals of Operations Research , vol.134 , pp. 215-238
- Mannor, S.¹ Menache, I.² Shimkin, N.³

15
- 31344457788
- Egt: A toolbox for multiple view geometry and visual servoing
- December
- G.L. Mariottini and D. Prattichizzo. Egt: a toolbox for multiple view geometry and visual servoing. IEEE Robotics and Automation Magazine, 3(12), December 2005.
- (2005) IEEE Robotics and Automation Magazine , vol.3 , Issue.12
- Mariottini, G.L.¹ Prattichizzo, D.²

16
- 84925067999
- Cambridge
- S. P. Meyn. Control Techniques for Complex Networks. Cambridge, 2008.
- (2008) Control Techniques for Complex Networks
- Meyn, S.P.¹

17
- 56449108844
- Empirical Bernstein stopping
- V. Mnih, Cs. Szepesvári, and J.-Y. Audibert. Empirical Bernstein stopping. In Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008), pages 672-679, 2008.
- (2008) Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008) , pp. 672-679
- Mnih, V.¹ Szepesvári, C.² Audibert, J.-Y.³

18
- 34547982545
- Analyzing feature generation for value-function approximation
- R. Parr, C. Painter-Wakefield, L. Li, and M.L. Littman. Analyzing feature generation for value-function approximation. In ICML, pages 737-744, 2007.
- (2007) ICML , pp. 737-744
- Parr, R.¹ Painter-Wakefield, C.² Li, L.³ Littman, M.L.⁴

19
- 0004094721
- MIT Press, Cambridge, MA
- B. Schölkopf and A.J. Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002.
- (2002) Learning with Kernels
- Schölkopf, B.¹ Smola, A.J.²

20
- 0038105204
- Capacity of reproducing kernel spaces in learning theory
- D-X. Zhou. Capacity of reproducing kernel spaces in learning theory. IEEE Transactions on Information Theory, 49:1743-1752, 2003.
- (2003) IEEE Transactions on Information Theory , vol.49 , pp. 1743-1752
- Zhou, D.-X.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.