SCOPUS Information Retrieval Platform

Proceedings of the Annual ACM Symposium on Theory of Computing

Volume Part F128415, 2017, Pages 1195-1199

Finding approximate local minima faster than gradient descent

(5) Agarwal, Naman a; Allen-Zhu, Zeyuan b; Bullins, Brian a; Hazan, Elad a; Ma, Tengyu a

a PRINCETON UNIVERSITY (United States)

b INSTITUTE FOR ADVANCED STUDY (United States)

Author keywords

Cubic regularization; Deep learning; Non convex optimization; Second order optimization

Indexed keywords

COMPLEX NETWORKS; COMPUTATIONAL COMPLEXITY; CONVEX OPTIMIZATION; DEEP LEARNING; EDUCATION;

CONVEX OBJECTIVES; CUBIC REGULARIZATION; GRADIENT DESCENT; NONCONVEX OPTIMIZATION; OPTIMIZATION PROBLEMS; SECOND ORDER OPTIMIZATION; TIME COMPLEXITY; TRAINING EXAMPLE;

OPTIMIZATION;

EID: 85024401503 PISSN: 07378017 EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/3055399.3055464 Document Type: Conference Paper

Times cited : (268)

References (31)

1
- 85018906286
- arXiv preprint arXiv: 1602.03943
- Naman Agarwal, Brian Bullins, and Elad Hazan. Second order stochastic optimization for machine learning in linear time. arXiv preprint arXiv: 1602.03943, 2016.
- (2016) Second Order Stochastic Optimization for Machine Learning in Linear Time
- Agarwal, N.¹ Bullins, B.² Hazan, E.³

2
- 85049912950
- ArXiv e-prints, abs/1702.00763, February
- Zeyuan Allen-Zhu. Natasha: Faster Stochastic Non-Convex Optimization via Strongly Non-Convex Parameter. ArXiv e-prints, abs/1702.00763, February 2017.
- (2017) Natasha: Faster Stochastic Non-Convex Optimization Via Strongly Non-Convex Parameter
- Allen-Zhu, Z.¹

3
- 84999029527
- Variance reduction for faster non-convex optimization
- Zeyuan Allen-Zhu and Elad Hazan. Variance Reduction for Faster Non-Convex Optimization. In ICML, 2016.
- (2016) ICML
- Allen-Zhu, Z.¹ Hazan, E.²

4
- 84964393578
- arXiv preprint arXiv: 1602.04426
- Afonso S Bandeira, Nicolas Boumal, and Vladislav Voroninski. On the low-rank approach for semidefinite programs arising in synchronization and community detection. arXiv preprint arXiv: 1602.04426, 2016.
- (2016) On the Low-rank Approach for Semidefinite Programs Arising in Synchronization and Community Detection
- Bandeira, A.S.¹ Boumal, N.² Voroninski, V.³

5
- 85010290626
- ArXiv e-prints, May
- S. Bhojanapalli, B. Neyshabur, and N. Srebro. Global Optimality of Local Search for Low Rank Matrix Recovery. ArXiv e-prints, May 2016.
- (2016) Global Optimality of Local Search for Low Rank Matrix Recovery
- Bhojanapalli, S.¹ Neyshabur, B.² Srebro, N.³

6
- 85024396637
- arXiv preprint 1611.00756
- Yair Carmon, John C. Duchi, Oliver Hinder, and Aaron Sidford. Accelerated methods for non-convex optimization. arXiv preprint 1611.00756, 2016.
- (2016) Accelerated Methods for Non-convex Optimization
- Carmon, Y.¹ Duchi, J.C.² Hinder, O.³ Sidford, A.⁴

7
- 79952763936
- Adaptive cubic regularisation methods for unconstrained optimization. Part I: Motivation, convergence and numerical results
- Coralia Cartis, Nicholas I. M. Gould, and Philippe L. Toint. Adaptive cubic regularisation methods for unconstrained optimization. Part I: Motivation, convergence and numerical results. Mathematical Programming, 127(2): 245-295, 2011.
- (2011) Mathematical Programming , vol.127 , Issue.2 , pp. 245-295
- Cartis, C.¹ Gould, N.I.M.² Toint, P.L.³

8
- 81255179401
- Adaptive cubic regularisation methods for unconstrained optimization. Part II: Worst-case function-and derivative-evaluation complexity
- Coralia Cartis, Nicholas I. M. Gould, and Philippe L. Toint. Adaptive cubic regularisation methods for unconstrained optimization. Part II: Worst-case function- and derivative-evaluation complexity. Mathematical Programming, 130(2): 295-319, 2011.
- (2011) Mathematical Programming , vol.130 , Issue.2 , pp. 295-319
- Cartis, C.¹ Gould, N.I.M.² Toint, P.L.³

9
- 84965107578
- The loss surfaces of multilayer networks
- Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, and Yann LeCun. The loss surfaces of multilayer networks. In AISTATS, 2015.
- (2015) AISTATS
- Choromanska, A.¹ Henaff, M.² Mathieu, M.³ Arous, G.B.⁴ LeCun, Y.⁵

10
- 84928534967
- Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
- Yann N Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, and Yoshua Bengio. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In Advances in neural information processing systems, pages 2933-2941, 2014.
- (2014) Advances in Neural Information Processing Systems , pp. 2933-2941
- Dauphin, Y.N.¹ Pascanu, R.² Gulcehre, C.³ Cho, K.⁴ Ganguli, S.⁵ Bengio, Y.⁶

11
- 80052250414
- Adaptive subgradient methods for online learning and stochastic optimization
- John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. The Journal of Machine Learning Research, 12: 2121-2159, 2011.
- (2011) The Journal of Machine Learning Research , vol.12 , pp. 2121-2159
- Duchi, J.¹ Hazan, E.² Singer, Y.³

12
- 84986594668
- ArXiv e-prints, September
- Dan Garber and Elad Hazan. Fast and simple PCA via convex optimization. ArXiv e-prints, September 2015.
- (2015) Fast and Simple PCA Via Convex Optimization
- Garber, D.¹ Hazan, E.²

13
- 84998770000
- Robust shift-and-invert preconditioning: Faster and more sample efficient algorithms for eigenvector computation
- Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, and Aaron Sidford. Robust shift-and-invert preconditioning: Faster and more sample efficient algorithms for eigenvector computation. In ICML, 2016.
- (2016) ICML
- Garber, D.¹ Hazan, E.² Jin, C.³ Kakade, S.M.⁴ Musco, C.⁵ Netrapalli, P.⁶ Sidford, A.⁷

14
- 85007253891
- arXiv:1503.02101
- Rong Ge, Furong Huang, Chi Jin, and Yang Yuan. Escaping from saddle points-online stochastic gradient for tensor decomposition. arXiv:1503.02101, 2015.
- (2015) Escaping from Saddle Points-online Stochastic Gradient for Tensor Decomposition
- Ge, R.¹ Huang, F.² Jin, C.³ Yuan, Y.⁴

15
- 85007253891
- Escaping from saddle points-online stochastic gradient for tensor decomposition
- Rong Ge, Furong Huang, Chi Jin, and Yang Yuan. Escaping from saddle points-online stochastic gradient for tensor decomposition. In Proceedings of the 28th Annual Conference on Learning Theory, COLT 2015, 2015.
- (2015) Proceedings of the 28th Annual Conference on Learning Theory, COLT 2015
- Ge, R.¹ Huang, F.² Jin, C.³ Yuan, Y.⁴

16
- 84984704687
- Escaping from saddle points-online stochastic gradient for tensor decomposition
- Paris, France, July 3-6, 2015
- Rong Ge, Furong Huang, Chi Jin, and Yang Yuan. Escaping from saddle points-online stochastic gradient for tensor decomposition. In Proceedings of The 28th Conference on Learning Theory, COLT 2015, Paris, France, July 3-6, 2015, pages 797-842, 2015.
- (2015) Proceedings of the 28th Conference on Learning Theory, COLT 2015 , pp. 797-842
- Ge, R.¹ Huang, F.² Jin, C.³ Yuan, Y.⁴

17
- 85010398346
- ArXiv e-prints, May
- Rong Ge, Jason Lee, and Tengyu Ma. Matrix Completion has No Spurious Local Minimum. ArXiv e-prints, May 2016.
- (2016) Matrix Completion has no Spurious Local Minimum
- Ge, R.¹ Lee, J.² Ma, T.³

18
- 85024404638
- Rong Ge and Tengyu Ma. On the optimization landscape of tensor decompositions, 2016.
- (2016) On the Optimization Landscape of Tensor Decompositions
- Ge, R.¹ Ma, T.²

19
- 84962468318
- Accelerated gradient methods for nonconvex nonlinear and stochastic programming
- February
- Saeed Ghadimi and Guanghui Lan. Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Mathematical Programming, pages 1-26, February 2015.
- (2015) Mathematical Programming , pp. 1-26
- Ghadimi, S.¹ Lan, G.²

20
- 84973372740
- ArXiv e-prints, December
- I.J. Goodfellow, O. Vinyals, and A. M. Saxe. Qualitatively characterizing neural network optimization problems. ArXiv e-prints, December 2014.
- (2014) Qualitatively Characterizing Neural Network Optimization Problems
- Goodfellow, I.J.¹ Vinyals, O.² Saxe, A.M.³

21
- 85024380018
- A linear-time algorithm for trust region problems
- Elad Hazan and Tomer Koren. A linear-time algorithm for trust region problems. Mathematical Programming, pages 1-19, 2015.
- (2015) Mathematical Programming , pp. 1-19
- Hazan, E.¹ Koren, T.²

22
- 84891614753
- Most tensor problems are NP-hard
- Christopher J. Hillar and Lek-Heng Lim. Most tensor problems are NP-hard. J. ACM, 60(6): 45, 2013.
- (2013) J. ACM , vol.60 , Issue.6 , pp. 45
- Hillar, C.J.¹ Lim, L.-H.²

23
- 85072246985
- Gradient descent only converges to minimizers
- New York, USA, June 23-26, 2016
- Jason D. Lee, Max Simchowitz, Michael I. Jordan, and Benjamin Recht. Gradient descent only converges to minimizers. In Proceedings of the 29th Conference on Learning Theory, COLT 2016, New York, USA, June 23-26, 2016, pages 1246-1257, 2016.
- (2016) Proceedings of the 29th Conference on Learning Theory, COLT 2016 , pp. 1246-1257
- Lee, J.D.¹ Simchowitz, M.² Jordan, M.I.³ Recht, B.⁴

24
- 0023452095
- Some NP-complete problems in quadratic and nonlinear programming
- Katta G. Murty and Santosh N. Kabadi. Some NP-complete problems in quadratic and nonlinear programming. Mathematical Programming, 39(2): 117-129, 1987.
- (1987) Mathematical Programming , vol.39 , Issue.2 , pp. 117-129
- Murty, K.G.¹ Kabadi, S.N.²

25
- 34548480020
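- A method of solving a convex programming problem with convergence rate O(1/k^2)
- Yurii Nesterov. A method of solving a convex programming problem with convergence rate O(1/k^2). In Doklady AN SSSR (translated as Soviet Mathematics Doklady), Volume 269, pages 543-547, 1983.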
- (1983) Doklady an SSSR (translated as Soviet Mathematics Doklady) , vol.269 , pp. 543-547
- Nesterov, Y.¹

26
- 0003696537
- Kluwer Academic Publishers
- Yurii Nesterov. Introductory Lectures on Convex Programming Volume: A Basic course, Volume I. Kluwer Academic Publishers, 2004.
- (2004) Introductory Lectures on Convex Programming Volume: A Basic Course , vol.1
- Nesterov, Y.¹

27
- 33646730150
- Cubic regularization of Newton method and its global performance
- Yurii Nesterov and Boris T. Polyak. Cubic regularization of Newton method and its global performance. Mathematical Programming, 108(1): 177-205, 2006.
- (2006) Mathematical Programming , vol.108 , Issue.1 , pp. 177-205
- Nesterov, Y.¹ Polyak, B.T.²

28
- 0000255539
- Fast exact multiplication by the Hessian
- Barak A. Pearlmutter. Fast exact multiplication by the Hessian. Neural Computation, 6(1): 147-160, 1994.
- (1994) Neural Computation , vol.6 , Issue.1 , pp. 147-160
- Pearlmutter, B.A.¹

29
- 0000016172
- A stochastic approximation method
- Herbert Robbins and Sutton Monro. A stochastic approximation method. The Annals of Mathematical Statistics, pages 400-407, 1951.
- (1951) The Annals of Mathematical Statistics , pp. 400-407
- Robbins, H.¹ Monro, S.²

30
- 84899025130
- arXiv preprint arXiv: 1309.2388 Preliminary version appeared in NIPS 2012
- Mark Schmidt, Nicolas Le Roux, and Francis Bach. Minimizing finite sums with the stochastic average gradient. arXiv preprint arXiv: 1309.2388, pages 1-45, 2013. Preliminary version appeared in NIPS 2012.
- (2013) Minimizing Finite Sums with the Stochastic Average Gradient , pp. 1-45
- Schmidt, M.¹ Le Roux, N.² Bach, F.³

31
- 0003621102
- Jonathan Richard Shewchuk. An introduction to the conjugate gradient method without the agonizing pain, 1994.
- (1994) An Introduction to the Conjugate Gradient Method Without the Agonizing Pain
- Shewchuk, J.R.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.