-
1
-
-
70449695865
-
-
A. Antos, R. Munos, and Cs. Szepesvári. Fitted Q-iteration in continuous action-space MDPs. In Advances in Neural Information Processing Systems, 2007. (accepted).
-
-
-
-
-
2
-
-
40849145988
-
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
-
April, Published Online First: 14 Nov, DOI: 10.1007/s10994-007-5038-2
-
A. Antos, Cs. Szepesvári, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71(1):89-129, April 2008. Published Online First: 14 Nov, 2007, DOI: 10.1007/s10994-007-5038-2.
-
(2007)
Machine Learning
, vol.71
, Issue.1
, pp. 89-129
-
-
Antos, A.1
Szepesvári, C.2
Munos, R.3
-
7
-
-
31844451013
-
Reinforcement learning with Gaussian processes
-
New York, NY, USA, ACM
-
Y. Engel, S. Mannor, and R. Meir. Reinforcement learning with Gaussian processes. In ICML '05: Proceedings of the 22nd international conference on Machine learning, pages 201-208, New York, NY, USA, 2005. ACM.
-
(2005)
ICML '05: Proceedings of the 22nd international conference on Machine learning
, pp. 201-208
-
-
Engel, Y.1
Mannor, S.2
Meir, R.3
-
9
-
-
70049096468
-
Regularized policy iteration
-
D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors
-
A. M. Farahmand, M. Ghavamzadeh, Cs. Szepesvári, and Sh. Mannor. Regularized policy iteration. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 441-448. 2009.
-
(2009)
Advances in Neural Information Processing Systems 21
, pp. 441-448
-
-
Farahmand, A.M.1
Ghavamzadeh, M.2
Szepesvári, C.3
Mannor, S.4
-
10
-
-
0003624357
-
-
Springer-Verlag, New York
-
L. Györfi, M. Kohler, A. Krzyżak, and H. Walk. A distribution-free theory of nonparametric regression. Springer-Verlag, New York, 2002.
-
(2002)
A distribution-free theory of nonparametric regression
-
-
Györfi, L.1
Kohler, M.2
Krzyżak, A.3
Walk, H.4
-
11
-
-
84885993384
-
Least squares SVM for least squares TD learning
-
T. Jung and D. Polani. Least squares SVM for least squares TD learning. In ECAI, pages 499-503, 2006.
-
(2006)
ECAI
, pp. 499-503
-
-
Jung, T.1
Polani, D.2
-
12
-
-
1942420814
-
Reinforcement learning as classification: Leveraging modern classifiers
-
M.G. Lagoudakis and R. Parr. Reinforcement learning as classification: Leveraging modern classifiers. In ICML-03, pages 424-431, 2003.
-
(2003)
ICML-03
, pp. 424-431
-
-
Lagoudakis, M.G.1
Parr, R.2
-
14
-
-
17444414191
-
Basis function adaptation in temporal difference reinforcement learning
-
S. Mannor, I. Menache, and N. Shimkin. Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134:215-238, 2005.
-
(2005)
Annals of Operations Research
, vol.134
, pp. 215-238
-
-
Mannor, S.1
Menache, I.2
Shimkin, N.3
-
18
-
-
34547982545
-
Analyzing feature generation for value-function approximation
-
R. Parr, C. Painter-Wakefield, L. Li, and M.L. Littman. Analyzing feature generation for value-function approximation. In ICML, pages 737-744, 2007.
-
(2007)
ICML
, pp. 737-744
-
-
Parr, R.1
Painter-Wakefield, C.2
Li, L.3
Littman, M.L.4
-
20
-
-
0038105204
-
Capacity of reproducing kernel spaces in learning theory
-
D-X. Zhou. Capacity of reproducing kernel spaces in learning theory. IEEE Transactions on Information Theory, 49:1743-1752, 2003.
-
(2003)
IEEE Transactions on Information Theory
, vol.49
, pp. 1743-1752
-
-
Zhou, D.-X.1