Volume 9, 2010, Pages 249-256

Understanding the difficulty of training deep feedforward neural networks

Author keywords

[No Author keywords available]

Indexed keywords

FASTER CONVERGENCE; GRADIENT DESCENT; HIDDEN LAYERS; JACOBIANS; MEAN VALUES; NON-LINEAR ACTIVATION; NON-LINEARITY; SINGULAR VALUES

EID: 84862277874     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Conference Paper
Times cited: 17430

References (20)
  • 1
    • 69349090197
    • Learning deep architectures for AI
    • Also published as a book. Now Publishers, 2009
    • Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2, 1-127. Also published as a book. Now Publishers, 2009.
    • (2009) Foundations and Trends in Machine Learning , vol.2 , pp. 1-127
    • Bengio, Y.1
  • 2
    • 84864073449
    • Greedy layer-wise training of deep networks
    • MIT Press
    • Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy layer-wise training of deep networks. NIPS 19 (pp. 153-160). MIT Press.
    • (2007) NIPS , vol.19 , pp. 153-160
    • Bengio, Y.1    Lamblin, P.2    Popovici, D.3    Larochelle, H.4
  • 3
  • 5
    • 77953344311
    • Doctoral dissertation, The Robotics Institute, Carnegie Mellon University
    • Bradley, D. (2009). Learning in modular systems. Doctoral dissertation, The Robotics Institute, Carnegie Mellon University.
    • (2009) Learning in Modular Systems
    • Bradley, D.1
  • 6
    • 56449095373
    • A unified architecture for natural language processing: Deep neural networks with multitask learning
    • Collobert, R., & Weston, J. (2008). A unified architecture for natural language processing: Deep neural networks with multitask learning. ICML 2008.
    • (2008) ICML 2008
    • Collobert, R.1    Weston, J.2
  • 7
    • 79961226155
    • The difficulty of training deep architectures and the effect of unsupervised pre-training
    • Erhan, D., Manzagol, P.-A., Bengio, Y., Bengio, S., & Vincent, P. (2009). The difficulty of training deep architectures and the effect of unsupervised pre-training. AISTATS'2009 (pp. 153-160).
    • (2009) AISTATS'2009 , pp. 153-160
    • Erhan, D.1    Manzagol, P.-A.2    Bengio, Y.3    Bengio, S.4    Vincent, P.5
  • 8
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • Hinton, G. E., Osindero, S., & Teh, Y. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18, 1527-1554.
    • (2006) Neural Computation , vol.18 , pp. 1527-1554
    • Hinton, G.E.1    Osindero, S.2    Teh, Y.3
  • 11
    • 50249093806
    • An empirical evaluation of deep architectures on problems with many factors of variation
    • Larochelle, H., Erhan, D., Courville, A., Bergstra, J., & Bengio, Y. (2007). An empirical evaluation of deep architectures on problems with many factors of variation. ICML 2007.
    • (2007) ICML 2007
    • Larochelle, H.1    Erhan, D.2    Courville, A.3    Bergstra, J.4    Bengio, Y.5
  • 12
    • 0032203257
    • Gradient-based learning applied to document recognition
    • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998a). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278-2324.
    • (1998) Proceedings of the IEEE , vol.86 , pp. 2278-2324
    • LeCun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 14
    • 84858779990
    • A scalable hierarchical distributed language model
    • Mnih, A., & Hinton, G. E. (2009). A scalable hierarchical distributed language model. NIPS 21 (pp. 1081-1088).
    • (2009) NIPS , vol.21 , pp. 1081-1088
    • Mnih, A.1    Hinton, G.E.2
  • 15
    • 84864069017
    • Efficient learning of sparse representations with an energy-based model
    • Ranzato, M., Poultney, C., Chopra, S., & LeCun, Y. (2007). Efficient learning of sparse representations with an energy-based model. NIPS 19.
    • (2007) NIPS , vol.19
    • Ranzato, M.1    Poultney, C.2    Chopra, S.3    LeCun, Y.4
  • 16
    • 0022471098
    • Learning representations by back-propagating errors
    • Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533-536.
    • (1986) Nature , vol.323 , pp. 533-536
    • Rumelhart, D.E.1    Hinton, G.E.2    Williams, R.J.3
  • 17
    • 0001336749
    • Accelerated learning in layered neural networks
    • Solla, S. A., Levin, E., & Fleisher, M. (1988). Accelerated learning in layered neural networks. Complex Systems, 2, 625-639.
    • (1988) Complex Systems , vol.2 , pp. 625-639
    • Solla, S.A.1    Levin, E.2    Fleisher, M.3
  • 18
    • 56449089103
    • Extracting and composing robust features with denoising autoencoders
    • Vincent, P., Larochelle, H., Bengio, Y., & Manzagol, P.-A. (2008). Extracting and composing robust features with denoising autoencoders. ICML 2008.
    • (2008) ICML 2008
    • Vincent, P.1    Larochelle, H.2    Bengio, Y.3    Manzagol, P.-A.4
  • 19
    • 56449119888
    • Deep learning via semi-supervised embedding
    • New York, NY, USA: ACM
    • Weston, J., Ratle, F., & Collobert, R. (2008). Deep learning via semi-supervised embedding. ICML 2008 (pp. 1168-1175). New York, NY, USA: ACM.
    • (2008) ICML 2008 , pp. 1168-1175
    • Weston, J.1    Ratle, F.2    Collobert, R.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.