-
3
-
-
0000430514
-
The convergence of TD(λ) for general λ
-
P. Dayan. The convergence of TD(λ) for general λ. Machine Learning, 8(3-4), 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
-
-
Dayan, P.1
-
4
-
-
0000439891
-
On the convergence of stochastic iterative dynamic programming algorithms
-
T. Jaakkola, M. I. Jordan, and S. P. Singh. On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6), 1994.
-
(1994)
Neural Computation
, vol.6
, Issue.6
-
-
Jaakkola, T.1
Jordan, M.I.2
Singh, S.P.3
-
5
-
-
77956086700
-
Low-rank optimization on the cone of positive semidefinite matrices
-
M. Journee, F. Bach, P.A. Absil, and R. Sepulchre. Low-rank optimization on the cone of positive semidefinite matrices. SIAM Journal on Optimization, 20(5):2327-2351, 2010.
-
(2010)
SIAM Journal on Optimization
, vol.20
, Issue.5
, pp. 2327-2351
-
-
Journee, M.1
Bach, F.2
Absil, P.A.3
Sepulchre, R.4
-
13
-
-
85162310185
-
-
Personal communication
-
B. Scherrer. Personal communication, 2011.
-
(2011)
-
-
Scherrer, B.1
-
14
-
-
84860607818
-
-
minfunc
-
M. Schmidt. minfunc, 2005. Available at http://www.cs.ubc.ca/~schmidtm/Software/minFunc.html.
-
(2005)
M. Schmidt
-
-
-
15
-
-
71149099079
-
Fast gradient-descent methods for temporal-difference learning with linear function approximation
-
R.S. Sutton, H.R. Maei, D. Precup, S. Bhatnagar, D. Silver, Cs. Szepesvari, and E. Wiewiora. Fast gradient-descent methods for temporal-difference learning with linear function approximation. In Proceedings of the International Conference on Machine Learning, 2009.
-
(2009)
Proceedings of the International Conference on Machine Learning
-
-
Sutton, R.S.1
Maei, H.R.2
Precup, D.3
Bhatnagar, S.4
Silver, D.5
Szepesvari, Cs.6
Wiewiora, E.7
-
16
-
-
77956513316
-
A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
-
R.S. Sutton, Cs. Szepesvari, and H.R. Maei. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In Advances in Neural Information Processing Systems, 2008.
-
(2008)
Advances in Neural Information Processing Systems
-
-
Sutton, R.S.1
Szepesvari, Cs.2
Maei, H.R.3
-
17
-
-
0031143730
-
An analysis of temporal-difference learning with function approximation
-
J.N. Tsitsiklis and B. Van Roy. An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42:674-690, 1997.
-
(1997)
IEEE Transactions on Automatic Control
, vol.42
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
18
-
-
0033221519
-
Average cost temporal difference learning
-
J.N. Tsitsiklis and B. Van Roy. Average cost temporal difference learning. Automatica, 35(11):1799-1808, 1999.
-
(1999)
Automatica
, vol.35
, Issue.11
, pp. 1799-1808
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
19
-
-
77953119098
-
Error bounds for approximations from projected linear equations
-
H. Yu and D. P. Bertsekas. Error bounds for approximations from projected linear equations. Mathematics of Operations Research, 35:306-329, 2010.
-
(2010)
Mathematics of Operations Research
, vol.35
, pp. 306-329
-
-
Yu, H.1
Bertsekas, D.P.2