-
3
-
-
49949144765
-
The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming
-
L. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200-217, 1967.
-
(1967)
USSR Computational Mathematics and Mathematical Physics
, vol.7
, Issue.3
, pp. 200-217
-
-
Bregman, L.1
-
5
-
-
0037403111
-
Mirror descent and nonlinear projected subgradient methods for convex optimization
-
A. Beck and M. Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 2003.
-
(2003)
Operations Research Letters
-
-
Beck, A.1
Teboulle, M.2
-
8
-
-
0344875562
-
The robustness of the p-norm algorithms
-
December
-
Claudio Gentile. The robustness of the p-norm algorithms. Mach. Learn., 53:265-299, December 2003.
-
(2003)
Mach. Learn.
, vol.53
, pp. 265-299
-
-
Gentile, C.1
-
9
-
-
80053440025
-
Finite-sample analysis of Lasso-TD
-
Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos, and Matthew Hoffman. Finite-Sample Analysis of Lasso-TD. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), ICML '11, 2011.
-
(2011)
Proceedings of the 28th International Conference on Machine Learning (ICML-11), ICML '11
-
-
Ghavamzadeh, M.1
Lazaric, A.2
Munos, R.3
Hoffman, M.4
-
12
-
-
71149121683
-
Regularization and feature selection in least-squares temporal difference learning
-
New York, NY, USA
-
J. Zico Kolter and Andrew Y. Ng. Regularization and feature selection in least-squares temporal difference learning. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pages 521-528, New York, NY, USA, 2009. ACM.
-
(2009)
Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09
, pp. 521-528
-
-
Kolter, J.Z.1
Ng, A.Y.2
-
14
-
-
0008815681
-
Exponentiated gradient versus gradient descent for linear predictors
-
Jyrki Kivinen and Manfred K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132, 1995.
-
(1995)
Information and Computation
, vol.132
-
-
Kivinen, J.1
Warmuth, M.K.2
-
15
-
-
34250091945
-
Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm
-
Nick Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. In Machine Learning, pages 285-318, 1988.
-
(1988)
Machine Learning
, pp. 285-318
-
-
Littlestone, N.1
-
17
-
-
70349322784
-
Learning representation and control in Markov decision processes: New frontiers
-
S. Mahadevan. Learning Representation and Control in Markov Decision Processes: New Frontiers. Foundations and Trends in Machine Learning, 1(4):403-565, 2009.
-
(2009)
Foundations and Trends in Machine Learning
, vol.1
, Issue.4
, pp. 403-565
-
-
Mahadevan, S.1
-
18
-
-
35748957806
-
Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
-
S. Mahadevan and M. Maggioni. Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes. Journal of Machine Learning Research, 8:2169-2231, 2007.
-
(2007)
Journal of Machine Learning Research
, vol.8
, pp. 2169-2231
-
-
Mahadevan, S.1
Maggioni, M.2
-
19
-
-
65249121279
-
Primal-dual subgradient methods for convex problems
-
Jan
-
Y Nesterov. Primal-dual subgradient methods for convex problems. Mathematical Programming, Jan 2009.
-
(2009)
Mathematical Programming
-
-
Nesterov, Y.1
-
20
-
-
70450197241
-
Robust stochastic approximation approach to stochastic programming
-
A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574-1609, 2009.
-
(2009)
SIAM Journal on Optimization
, vol.19
, Issue.4
, pp. 1574-1609
-
-
Nemirovski, A.1
Juditsky, A.2
Lan, G.3
Shapiro, A.4
-
23
-
-
77956538796
-
Feature selection using regularization in approximate linear programs for Markov decision processes
-
M. Petrik, G. Taylor, R. Parr, and S. Zilberstein. Feature selection using regularization in approximate linear programs for Markov decision processes. In ICML, pages 871-878, 2010.
-
(2010)
ICML
, pp. 871-878
-
-
Petrik, M.1
Taylor, G.2
Parr, R.3
Zilberstein, S.4
-
24
-
-
71149099079
-
Fast gradient-descent methods for temporal-difference learning with linear function approximation
-
Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, and Eric Wiewiora. Fast gradient-descent methods for temporal-difference learning with linear function approximation. In Proceedings of the 26th International Conference on Machine Learning, 2009.
-
(2009)
Proceedings of the 26th International Conference on Machine Learning
-
-
Sutton, R.S.1
Maei, H.R.2
Precup, D.3
Bhatnagar, S.4
Silver, D.5
Szepesvári, C.6
Wiewiora, E.7
-
28
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
29
-
-
0008815095
-
On the worst-case analysis of temporal-difference learning algorithms
-
Morgan Kaufmann
-
Robert E. Schapire and Manfred K. Warmuth. On the worst-case analysis of temporal-difference learning algorithms. In Machine Learning, pages 266-274. Morgan Kaufmann, 1994.
-
(1994)
Machine Learning
, pp. 266-274
-
-
Schapire, R.E.1
Warmuth, M.K.2
-
30
-
-
0035273403
-
Online learning control by association and reinforcement
-
J. Si and Y.T. Wang. Online learning control by association and reinforcement. Neural Networks, IEEE Transactions on, 12(2):264-276, 2001.
-
(2001)
Neural Networks, IEEE Transactions on
, vol.12
, Issue.2
, pp. 264-276
-
-
Si, J.1
Wang, Y.T.2