Volume , Issue , 2017, Pages

Understanding deep learning requires rethinking generalization

Author keywords

[No Author keywords available]

Indexed keywords

CLASSIFICATION (OF INFORMATION); GRADIENT METHODS; NEURAL NETWORKS; SAMPLING; STOCHASTIC SYSTEMS;

EID: 85088231398     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 1692

References (32)
  • 2
    • Peter L. Bartlett. The sample complexity of pattern classification with neural networks - The size of the weights is more important than the size of the network. IEEE Trans. Information Theory, 1998.
  • 3
    • Peter L. Bartlett and Shahar Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463-482, March 2003.
  • 7
    • Nadav Cohen and Amnon Shashua. Convolutional rectifier networks as generalized tensor decompositions. In ICML, 2016.
  • 8
    • G. Cybenko. Approximation by superposition of sigmoidal functions. Mathematics of Control, Signals and Systems, 2(4):303-314, 1989.
  • 11
    • Ronen Eldan and Ohad Shamir. The power of depth for feedforward neural networks. In COLT, 2016.
  • 12
    • Moritz Hardt, Benjamin Recht, and Yoram Singer. Train faster, generalize better: Stability of stochastic gradient descent. In ICML, 2016.
  • 13
    • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016.
  • 14
    • Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
  • 17
    • Junhong Lin, Raffaello Camoriano, and Lorenzo Rosasco. Generalization properties and implicit regularization for multiple passes SGM. In ICML, 2016.
  • 20
    • Hrushikesh Narhar Mhaskar. Approximation properties of a multilayered feedforward artificial neural network. Advances in Computational Mathematics, 1(1):61-80, 1993.
  • 21
    • Sayan Mukherjee, Partha Niyogi, Tomaso Poggio, and Ryan Rifkin. Statistical learning: Stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. Technical Report AI Memo 2002-024, Massachusetts Institute of Technology, 2002.
  • 23
    • Behnam Neyshabur, Ryota Tomioka, and Nathan Srebro. Norm-based capacity control in neural networks. In COLT, pp. 1376-1401, 2015.
  • 24
    • Tomaso Poggio, Ryan Rifkin, Sayan Mukherjee, and Partha Niyogi. General conditions for predictivity in learning theory. Nature, 428(6981):419-422, 2004.
  • 29
    • Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision. In CVPR, pp. 2818-2826, 2016. doi: 10.1109/CVPR.2016.308.
  • 30
    • Matus Telgarsky. Benefits of depth in neural networks. In COLT, 2016.
  • 32
    • Yuan Yao, Lorenzo Rosasco, and Andrea Caponnetto. On early stopping in gradient descent learning. Constructive Approximation, 26(2):289-315, 2007.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.