[1] A. Agarwal, O. Chapelle, M. Dudík, and J. Langford. A reliable effective terascale linear learning system. The Journal of Machine Learning Research, 15(1):1111–1133, 2014.
[7] A. Defazio, F. Bach, and S. Lacoste-Julien. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. In Advances in Neural Information Processing Systems 27, pages 1646–1654, 2014.
[8] R. S. Dembo and T. Steihaug. Truncated-Newton algorithms for large-scale unconstrained optimization. Mathematical Programming, 26(2):190–212, 1983.
[9] R. S. Dembo, S. C. Eisenstat, and T. Steihaug. Inexact Newton methods. SIAM Journal on Numerical Analysis, 19(2):400–408, 1982.
[10] J. E. Dennis, Jr. and J. J. Moré. Quasi-Newton methods, motivation and theory. SIAM Review, 19(1):46–89, 1977.
[11] J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. The Journal of Machine Learning Research, 12:2121–2159, 2011.
[13] R. Johnson and T. Zhang. Accelerating stochastic gradient descent using predictive variance reduction. In Advances in Neural Information Processing Systems, pages 315–323, 2013.
[15] D. D. Lewis, Y. Yang, T. G. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. The Journal of Machine Learning Research, 5:361–397, 2004.
[16] D. C. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1-3):503–528, 1989.
[20] Y. Nesterov. Primal-dual subgradient methods for convex problems. Mathematical Programming, 120(1):221–259, 2009.
[22] B. A. Pearlmutter. Fast exact multiplication by the Hessian. Neural Computation, 6(1):147–160, 1994.
[23] B. Recht and C. Ré. Parallel stochastic gradient algorithms for large-scale matrix completion. Mathematical Programming Computation, 5(2):201–226, 2013.
[25] N. L. Roux, M. Schmidt, and F. R. Bach. A stochastic gradient method with an exponential convergence rate for finite training sets. In Advances in Neural Information Processing Systems, pages 2663–2671, 2012.
[27] S. Shalev-Shwartz and T. Zhang. Stochastic dual coordinate ascent methods for regularized loss. The Journal of Machine Learning Research, 14(1):567–599, 2013.
[29] I. Sutskever, J. Martens, G. Dahl, and G. Hinton. On the importance of initialization and momentum in deep learning. In International Conference on Machine Learning, pages 1139–1147, 2013.
[30] C. Wang, X. Chen, A. J. Smola, and E. P. Xing. Variance reduction for stochastic gradient optimization. In Advances in Neural Information Processing Systems, pages 181–189, 2013.