1. Agarwal, A., Bartlett, P.L., Ravikumar, P., Wainwright, M.J.: Information-theoretic lower bounds on the oracle complexity of stochastic convex optimization. IEEE Trans. Inf. Theory 58(5), 3235–3249 (2012)
2. Bach, F., Moulines, E.: Non-asymptotic analysis of stochastic approximation algorithms for machine learning. Adv. Neural Inf. Process. Syst., pp. 773–781 (2013)
3. Bach, F., Moulines, E.: Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n). arXiv preprint (2013)
4. Bertsekas, D.P.: A new class of incremental gradient methods for least squares problems. SIAM J. Optim. 7(4), 913–926 (1997)
5. Blatt, D., Hero, A.O., Gauchman, H.: A convergent incremental gradient method with a constant step size. SIAM J. Optim. 18(1), 29–51 (2007)
6. Bordes, A., Bottou, L., Gallinari, P.: SGD-QN: careful quasi-Newton stochastic gradient descent. J. Mach. Learn. Res. 10, 1737–1754 (2009)
8. Carbonetto, P.: New probabilistic inference algorithms that harness the strengths of variational and Monte Carlo methods. Ph.D. thesis, University of British Columbia (2009)
9. Caruana, R., Joachims, T., Backstrom, L.: KDD-cup 2004: results and analysis. ACM SIGKDD Newsl. 6(2), 95–108 (2004)
11. Collins, M., Globerson, A., Koo, T., Carreras, X., Bartlett, P.: Exponentiated gradient algorithms for conditional random fields and max-margin Markov networks. J. Mach. Learn. Res. 9, 1775–1822 (2008)
12. Cormack, G.V., Lynam, T.R.: Spam corpus creation for TREC. In: Proceedings of 2nd Conference on Email and Anti-Spam (2005). http://plg.uwaterloo.ca/~gvcormac/treccorpus/
13. Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives. Adv. Neural Inf. Process. Syst., pp. 1646–1654 (2014)
14. Delyon, B., Juditsky, A.: Accelerated stochastic approximation. SIAM J. Optim. 3(4), 868–881 (1993)
15. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)
17. Friedlander, M.P., Schmidt, M.: Hybrid deterministic-stochastic methods for data fitting. SIAM J. Sci. Comput. 34(3), A1351–A1379 (2012)
20. Guyon, I.: Sido: a pharmacology dataset (2008). http://www.causality.inf.ethz.ch/data/SIDO.html
21. Hazan, E., Kale, S.: Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization. J. Mach. Learn. Res. Workshop Conf. Proc. 15, 2489–2512 (2011)
22. Hu, C., Kwok, J., Pan, W.: Accelerated gradient methods for stochastic optimization and online learning. Adv. Neural Inf. Process. Syst., pp. 781–789 (2009)
23. Johnson, R., Zhang, T.: Accelerating stochastic gradient descent using predictive variance reduction. Adv. Neural Inf. Process. Syst., pp. 315–323 (2013)
24. Keerthi, S., DeCoste, D.: A modified finite Newton method for fast solution of large scale linear SVMs. J. Mach. Learn. Res. 6, 341–361 (2005)
25. Kesten, H.: Accelerated stochastic approximation. Ann. Math. Stat. 29(1), 41–59 (1958)
29. Lacoste-Julien, S., Jaggi, M., Schmidt, M., Pletscher, P.: Block-coordinate Frank-Wolfe optimization for structural SVMs. Int. Conf. Mach. Learn., J. Mach. Learn. Res. Workshop Conf. Proc. 28, 53–61 (2013)
30. Le Roux, N., Schmidt, M., Bach, F.: A stochastic gradient method with an exponential convergence rate for strongly-convex optimization with finite training sets. Adv. Neural Inf. Process. Syst., pp. 2663–2671 (2012)
31. Lewis, D., Yang, Y., Rose, T., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)
35. Mairal, J.: Optimization with first-order surrogate functions. J. Mach. Learn. Res. Workshop Conf. Proc. 28, 783–791 (2013)
36. Martens, J.: Deep learning via Hessian-free optimization. Int. Conf. Mach. Learn., pp. 735–742 (2010)
37. Nedic, A., Bertsekas, D.: Convergence rate of incremental subgradient algorithms. In: Uryasev, S., Pardalos, P.M. (eds.) Stochastic Optimization: Algorithms and Applications, pp. 263–304. Kluwer Academic Publishers, Dordrecht (2000)
38. Needell, D., Srebro, N., Ward, R.: Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. Adv. Neural Inf. Process. Syst., pp. 1017–1025 (2014)
40. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)
43. Nesterov, Y.: Smooth minimization of non-smooth functions. Math. Program. 103(1), 127–152 (2005)
44. Nesterov, Y.: Primal-dual subgradient methods for convex problems. Math. Program. 120(1), 221–259 (2009)
45. Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. CORE Discussion Paper (2010)
47. Polyak, B.T., Juditsky, A.B.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)
48. Rakhlin, A., Shamir, O., Sridharan, K.: Making gradient descent optimal for strongly convex stochastic optimization. Int. Conf. Mach. Learn., pp. 449–456 (2012)
49. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)
51. Schmidt, M.: minFunc: unconstrained differentiable multivariate optimization in Matlab (2005). https://www.cs.ubc.ca/~schmidtm/Software/minFunc.html
53. Schmidt, M., Babanezhad, R., Ahmed, M., Clifton, A., Sarkar, A.: Non-uniform stochastic average gradient method for training conditional random fields. Int. Conf. Artif. Intell. Stat., J. Mach. Learn. Res. Workshop Conf. Proc. 38, 819–828 (2015)
54. Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss minimization. J. Mach. Learn. Res. 14, 567–599 (2013)
55. Shalev-Shwartz, S., Singer, Y., Srebro, N., Cotter, A.: Pegasos: primal estimated sub-gradient solver for SVM. Math. Program. 127(1), 3–30 (2011)
57. Solodov, M.: Incremental gradient algorithms with stepsizes bounded away from zero. Comput. Optim. Appl. 11(1), 23–35 (1998)
58. Srebro, N., Sridharan, K.: Theoretical basis for “more data less work”? NIPS Workshop on Computational Trade-offs in Statistical Learning (2011)
59. Strohmer, T., Vershynin, R.: A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl. 15(2), 262–278 (2009)
60. Sunehag, P., Trumpf, J., Vishwanathan, S., Schraudolph, N.: Variable metric stochastic approximation theory. J. Mach. Learn. Res. Workshop Conf. Proc. 5, 560–566 (2009)
61. Teo, C.H., Le, Q., Smola, A.J., Vishwanathan, S.V.N.: A scalable modular convex solver for regularized risk minimization. In: ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2007)
62. Tseng, P.: An incremental gradient(-projection) method with momentum term and adaptive stepsize rule. SIAM J. Optim. 8(2), 506–531 (1998)
63. Xiao, L.: Dual averaging methods for regularized stochastic learning and online optimization. J. Mach. Learn. Res. 11, 2543–2596 (2010)
64. Xiao, L., Zhang, T.: A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(2), 2057–2075 (2014)