SCOPUS Information Retrieval Platform

Advances in Neural Information Processing Systems

Volume 2015-January, 2015, Pages 2737-2745

Asynchronous parallel stochastic gradient for nonconvex optimization

Authors (4): Lian, Xiangru (a); Huang, Yijun (a); Li, Yuncheng (a); Liu, Ji (a)

(a) University of Rochester (United States)

Author keywords

[No Author keywords available]

Indexed keywords

INFORMATION SCIENCE; MECHANISMS;

ASYNCHRONOUS PARALLEL; CONVEX MINIMIZATION; DEEP NEURAL NETWORKS; ERGODIC CONVERGENCE; NONCONVEX OPTIMIZATION; NUMBER OF ITERATIONS; SHARED MEMORY SYSTEM; STOCHASTIC GRADIENT;

STOCHASTIC SYSTEMS;

EID: 84965099508
PISSN: 10495258
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited: 518

References (31)

1
- 85162387277
- Distributed delayed stochastic optimization
- A. Agarwal and J. C. Duchi. Distributed delayed stochastic optimization. NIPS, 2011.
- (2011) NIPS
- Agarwal, A.¹ Duchi, J.C.²

2
- 84906673146
- Revisiting asynchronous linear solvers: Provable convergence rate through randomization
- H. Avron, A. Druinsky, and A. Gupta. Revisiting asynchronous linear solvers: Provable convergence rate through randomization. IPDPS, 2014.
- (2014) IPDPS
- Avron, H.¹ Druinsky, A.² Gupta, A.³

3
- 0142166851
- A neural probabilistic language model
- Y. Bengio, R. Ducharme, P. Vincent, and C. Janvin. A neural probabilistic language model. The Journal of Machine Learning Research, 3:1137-1155, 2003.
- (2003) The Journal of Machine Learning Research , vol.3 , pp. 1137-1155
- Bengio, Y.¹ Ducharme, R.² Vincent, P.³ Janvin, C.⁴

4
- 0003636164
- Parallel and distributed computation: Numerical methods
- D. P. Bertsekas and J. N. Tsitsiklis. Parallel and distributed computation: Numerical methods, volume 23. Prentice Hall, Englewood Cliffs, NJ, 1989.
- (1989) Prentice Hall, Englewood Cliffs, NJ , vol.23
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 84877760312
- Large scale distributed deep networks
- J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le, et al. Large scale distributed deep networks. NIPS, 2012.
- (2012) NIPS
- Dean, J.¹ Corrado, G.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Mao, M.⁶ Senior, A.⁷ Tucker, P.⁸ Yang, K.⁹ Le, Q.V.¹⁰

6
- 84857527621
- Optimal distributed online prediction using mini-batches
- O. Dekel, R. Gilad-Bachrach, O. Shamir, and L. Xiao. Optimal distributed online prediction using mini-batches. Journal of Machine Learning Research, 13(1):165-202, 2012.
- (2012) Journal of Machine Learning Research , vol.13 , Issue.1 , pp. 165-202
- Dekel, O.¹ Gilad-Bachrach, R.² Shamir, O.³ Xiao, L.⁴

7
- 84912542181
- Accelerated, parallel and proximal coordinate descent
- O. Fercoq and P. Richtárik. Accelerated, parallel and proximal coordinate descent. arXiv preprint arXiv:1312.5799, 2013.
- (2013) arXiv preprint arXiv:1312.5799
- Fercoq, O.¹ Richtárik, P.²

8
- 84962029205
- An asynchronous mini-batch algorithm for regularized stochastic optimization
- H. R. Feyzmahdavian, A. Aytekin, and M. Johansson. An asynchronous mini-batch algorithm for regularized stochastic optimization. ArXiv e-prints, May 18 2015.
- (2015) ArXiv e-prints, May 18
- Feyzmahdavian, H.R.¹ Aytekin, A.² Johansson, M.³

9
- 84892854517
- Stochastic first- and zeroth-order methods for nonconvex stochastic programming
- S. Ghadimi and G. Lan. Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM Journal on Optimization, 23(4):2341-2368, 2013.
- (2013) SIAM Journal on Optimization , vol.23 , Issue.4 , pp. 2341-2368
- Ghadimi, S.¹ Lan, G.²

10
- 84964618398
- A distributed, asynchronous and incremental algorithm for nonconvex optimization: An ADMM based approach
- M. Hong. A distributed, asynchronous and incremental algorithm for nonconvex optimization: An ADMM based approach. arXiv preprint arXiv:1412.6058, 2014.
- (2014) arXiv preprint arXiv:1412.6058
- Hong, M.¹

11
- 84913555165
- Caffe: Convolutional architecture for fast feature embedding
- Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
- (2014) arXiv preprint arXiv:1408.5093
- Jia, Y.¹ Shelhamer, E.² Donahue, J.³ Karayev, S.⁴ Long, J.⁵ Girshick, R.⁶ Guadarrama, S.⁷ Darrell, T.⁸

12
- 77956002520
- Learning multiple layers of features from tiny images
- A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Computer Science Department, University of Toronto, Tech. Rep, 1(4):7, 2009.
- (2009) Tech. Rep., Computer Science Department, University of Toronto , vol.1 , Issue.4 , pp. 7
- Krizhevsky, A.¹ Hinton, G.²

13
- 84876231242
- ImageNet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. NIPS, pages 1097-1105, 2012.
- (2012) NIPS , pp. 1097-1105
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

14
- 84937960701
- Parameter server for distributed machine learning
- M. Li, L. Zhou, Z. Yang, A. Li, F. Xia, D. G. Andersen, and A. Smola. Parameter server for distributed machine learning. Big Learning NIPS Workshop, 2013.
- (2013) Big Learning NIPS Workshop
- Li, M.¹ Zhou, L.² Yang, Z.³ Li, A.⁴ Xia, F.⁵ Andersen, D.G.⁶ Smola, A.⁷

15
- 84937912100
- Scaling distributed machine learning with the parameter server
- M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B.-Y. Su. Scaling distributed machine learning with the parameter server. OSDI, 2014a.
- (2014) OSDI
- Li, M.¹ Andersen, D.G.² Park, J.W.³ Smola, A.J.⁴ Ahmed, A.⁵ Josifovski, V.⁶ Long, J.⁷ Shekita, E.J.⁸ Su, B.-Y.⁹

16
- 84937889303
- Communication efficient distributed machine learning with the parameter server
- M. Li, D. G. Andersen, A. J. Smola, and K. Yu. Communication efficient distributed machine learning with the parameter server. NIPS, 2014b.
- (2014) NIPS
- Li, M.¹ Andersen, D.G.² Smola, A.J.³ Yu, K.⁴

17
- 84912542180
- Asynchronous stochastic coordinate descent: Parallelism and convergence properties
- J. Liu and S. J. Wright. Asynchronous stochastic coordinate descent: Parallelism and convergence properties. arXiv preprint arXiv:1403.3862, 2014.
- (2014) arXiv preprint arXiv:1403.3862
- Liu, J.¹ Wright, S.J.²

18
- 84919932688
- An asynchronous parallel stochastic coordinate descent algorithm
- J. Liu, S. J. Wright, C. Ré, V. Bittorf, and S. Sridhar. An asynchronous parallel stochastic coordinate descent algorithm. ICML, 2014a.
- (2014) ICML
- Liu, J.¹ Wright, S.J.² Ré, C.³ Bittorf, V.⁴ Sridhar, S.⁵

19
- 84925401191
- An asynchronous parallel randomized Kaczmarz algorithm
- J. Liu, S. J. Wright, and S. Sridhar. An asynchronous parallel randomized Kaczmarz algorithm. arXiv preprint arXiv:1401.4780, 2014b.
- (2014) arXiv preprint arXiv:1401.4780
- Liu, J.¹ Wright, S.J.² Sridhar, S.³

20
- 84965135004
- Perturbed iterate analysis for asynchronous stochastic optimization
- H. Mania, X. Pan, D. Papailiopoulos, B. Recht, K. Ramchandran, and M. I. Jordan. Perturbed iterate analysis for asynchronous stochastic optimization. arXiv preprint arXiv:1507.06970, 2015.
- (2015) arXiv preprint arXiv:1507.06970
- Mania, H.¹ Pan, X.² Papailiopoulos, D.³ Recht, B.⁴ Ramchandran, K.⁵ Jordan, M.I.⁶

21
- 84937848805
- Distributed block coordinate descent for minimizing partially separable functions
- J. Mareček, P. Richtárik, and M. Takáč. Distributed block coordinate descent for minimizing partially separable functions. arXiv preprint arXiv:1406.0238, 2014.
- (2014) arXiv preprint arXiv:1406.0238
- Mareček, J.¹ Richtárik, P.² Takáč, M.³

22
- 70450197241
- Robust stochastic approximation approach to stochastic programming
- A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574-1609, 2009.
- (2009) SIAM Journal on Optimization , vol.19 , Issue.4 , pp. 1574-1609
- Nemirovski, A.¹ Juditsky, A.² Lan, G.³ Shapiro, A.⁴

23
- 85162467517
- Hogwild: A lock-free approach to parallelizing stochastic gradient descent
- F. Niu, B. Recht, C. Re, and S. Wright. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. NIPS, 2011.
- (2011) NIPS
- Niu, F.¹ Recht, B.² Re, C.³ Wright, S.⁴

24
- 84965122596
- GPU asynchronous stochastic gradient descent to speed up neural network training
- T. Paine, H. Jin, J. Yang, Z. Lin, and T. Huang. GPU asynchronous stochastic gradient descent to speed up neural network training. NIPS, 2013.
- (2013) NIPS
- Paine, T.¹ Jin, H.² Yang, J.³ Lin, Z.⁴ Huang, T.⁵

25
- 84908893558
- Gasgd: Stochastic gradient descent for distributed asynchronous matrix completion via graph partitioning
- F. Petroni and L. Querzoni. Gasgd: stochastic gradient descent for distributed asynchronous matrix completion via graph partitioning. ACM Conference on Recommender systems, 2014.
- (2014) ACM Conference on Recommender Systems
- Petroni, F.¹ Querzoni, L.²

26
- 84905092486
- An approximate, efficient LP solver for LP rounding
- S. Sridhar, S. Wright, C. Re, J. Liu, V. Bittorf, and C. Zhang. An approximate, efficient LP solver for LP rounding. NIPS, 2013.
- (2013) NIPS
- Sridhar, S.¹ Wright, S.² Re, C.³ Liu, J.⁴ Bittorf, V.⁵ Zhang, C.⁶

27
- 84947110026
- On the complexity of parallel coordinate descent
- R. Tappenden, M. Takáč, and P. Richtárik. On the complexity of parallel coordinate descent. arXiv preprint arXiv:1503.03033, 2015.
- (2015) arXiv preprint arXiv:1503.03033
- Tappenden, R.¹ Takáč, M.² Richtárik, P.³

28
- 84965182483
- Scaling up stochastic dual coordinate ascent
- K. Tran, S. Hosseini, L. Xiao, T. Finley, and M. Bilenko. Scaling up stochastic dual coordinate ascent. ICML, 2015.
- (2015) ICML
- Tran, K.¹ Hosseini, S.² Xiao, L.³ Finley, T.⁴ Bilenko, M.⁵

29
- 84965151095
- Nomad: Non-locking, stochastic multi-machine algorithm for asynchronous and decentralized matrix completion
- H. Yun, H.-F. Yu, C.-J. Hsieh, S. Vishwanathan, and I. Dhillon. Nomad: Non-locking, stochastic multi-machine algorithm for asynchronous and decentralized matrix completion. arXiv preprint arXiv:1312.0193, 2013.
- (2013) arXiv preprint arXiv:1312.0193
- Yun, H.¹ Yu, H.-F.² Hsieh, C.-J.³ Vishwanathan, S.⁴ Dhillon, I.⁵

30
- 84919796967
- Asynchronous distributed ADMM for consensus optimization
- R. Zhang and J. Kwok. Asynchronous distributed ADMM for consensus optimization. ICML, 2014.
- (2014) ICML
- Zhang, R.¹ Kwok, J.²

31
- 84929605040
- Deep learning with elastic averaging SGD
- S. Zhang, A. Choromanska, and Y. Le Cun. Deep learning with elastic averaging SGD. CoRR, abs/1412.6651, 2014.
- (2014) CoRR, abs/1412.6651
- Zhang, S.¹ Choromanska, A.² Le Cun, Y.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.