Volume 4, Issue January, 2014, Pages 2933-2941

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Author keywords

[No Author keywords available]

Indexed keywords

CONVEX OPTIMIZATION; ERRORS; GLOBAL OPTIMIZATION; INFORMATION SCIENCE; NEURAL NETWORKS; NEWTON-RAPHSON METHOD; RANDOM VARIABLES; RECURRENT NEURAL NETWORKS

EID: 84928534967     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 1146

References (27)
  • 1
    • 0024774330
    • Neural networks and principal component analysis: Learning from examples without local minima
    • Baldi, P. and Hornik, K. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2(1), 53-58.
    • (1989) Neural Networks , vol.2 , Issue.1 , pp. 53-58
    • Baldi, P.1    Hornik, K.2
  • 3
    • 0028392483
    • Learning long-term dependencies with gradient descent is difficult
    • Bengio, Y., Simard, P., and Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166. Special Issue on Recurrent Neural Networks, March 94.
    • (1994) IEEE Transactions on Neural Networks , vol.5 , Issue.2 , pp. 157-166
    • Bengio, Y.1    Simard, P.2    Frasconi, P.3
  • 6
    • 34147142335
    • Statistics of critical points of Gaussian fields on large-dimensional spaces
    • Bray, A. J. and Dean, D. S. (2007). Statistics of critical points of Gaussian fields on large-dimensional spaces. Physical Review Letters, 98, 150201.
    • (2007) Physical Review Letters , vol.98
    • Bray, A.J.1    Dean, D.S.2
  • 8
    • 36448994342
    • Replica symmetry breaking condition exposed by random matrix calculation of landscape complexity
    • Fyodorov, Y. V. and Williams, I. (2007). Replica symmetry breaking condition exposed by random matrix calculation of landscape complexity. Journal of Statistical Physics, 129(5-6), 1081-1116.
    • (2007) Journal of Statistical Physics , vol.129 , Issue.5-6 , pp. 1081-1116
    • Fyodorov, Y.V.1    Williams, I.2
  • 9
    • 0038323312
    • On-line learning theory of soft committee machines with correlated hidden units steepest gradient descent and natural gradient descent
    • Inoue, M., Park, H., and Okada, M. (2003). On-line learning theory of soft committee machines with correlated hidden units steepest gradient descent and natural gradient descent. Journal of the Physical Society of Japan, 72(4), 805-810.
    • (2003) Journal of the Physical Society of Japan , vol.72 , Issue.4 , pp. 805-810
    • Inoue, M.1    Park, H.2    Okada, M.3
  • 12
    • 85162023444
    • An analysis on negative curvature induced by singularity in multi-layer neural-network learning
    • Mizutani, E. and Dreyfus, S. (2010). An analysis on negative curvature induced by singularity in multi-layer neural-network learning. In Advances in Neural Information Processing Systems, pages 1669-1677.
    • (2010) Advances in Neural Information Processing Systems , pp. 1669-1677
    • Mizutani, E.1    Dreyfus, S.2
  • 17
    • 84897497795
    • On the difficulty of training recurrent neural networks
    • Pascanu, R., Mikolov, T., and Bengio, Y. (2013). On the difficulty of training recurrent neural networks. In ICML'2013.
    • (2013) ICML'2013
    • Pascanu, R.1    Mikolov, T.2    Bengio, Y.3
  • 20
    • 0000198852
    • Natural gradient descent for on-line learning
    • Rattray, M., Saad, D., and Amari, S. I. (1998). Natural Gradient Descent for On-Line Learning. Physical Review Letters, 81(24), 5461-5464.
    • (1998) Physical Review Letters , vol.81 , Issue.24 , pp. 5461-5464
    • Rattray, M.1    Saad, D.2    Amari, S.I.3
  • 21
    • 4243234689
    • On-line learning in soft committee machines
    • Saad, D. and Solla, S. A. (1995). On-line learning in soft committee machines. Physical Review E, 52, 4225-4243.
    • (1995) Physical Review E , vol.52 , pp. 4225-4243
    • Saad, D.1    Solla, S.A.2
  • 24
    • 84919904380
    • Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods
    • Sohl-Dickstein, J., Poole, B., and Ganguli, S. (2014). Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods. In ICML'2014.
    • (2014) ICML'2014
    • Sohl-Dickstein, J.1    Poole, B.2    Ganguli, S.3
  • 25
    • 84897510162
    • On the importance of initialization and momentum in deep learning
    • Sutskever, I., Martens, J., Dahl, G. E., and Hinton, G. E. (2013). On the importance of initialization and momentum in deep learning. In S. Dasgupta and D. McAllester, editors, Proceedings of the 30th International Conference on Machine Learning (ICML-13), Volume 28, pages 1139-1147. JMLR Workshop and Conference Proceedings.
    • (2013) Proceedings of the 30th International Conference on Machine Learning (ICML-13) , vol.28 , pp. 1139-1147
    • Sutskever, I.1    Martens, J.2    Dahl, G.E.3    Hinton, G.E.4
  • 26
    • 84867614640
    • Krylov subspace descent for deep learning
    • Vinyals, O. and Povey, D. (2012). Krylov Subspace Descent for Deep Learning. In AISTATS.
    • (2012) AISTATS
    • Vinyals, O.1    Povey, D.2
  • 27
    • 0001560594
    • On the distribution of the roots of certain symmetric matrices
    • Wigner, E. P. (1958). On the distribution of the roots of certain symmetric matrices. The Annals of Mathematics, 67(2), 325-327.
    • (1958) The Annals of Mathematics , vol.67 , Issue.2 , pp. 325-327
    • Wigner, E.P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.