Volume 11, 2010, Pages 625-660

Why does unsupervised pre-training help deep learning?

Author keywords

Deep architectures; Deep belief networks; Non convex optimization; Stacked denoising auto encoders; Unsupervised pre training

Indexed keywords

BELIEF NETWORKS; DE-NOISING; NONCONVEX OPTIMIZATION; PRE-TRAINING; UNSUPERVISED PRE-TRAINING

EID: 77949522811     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 1983

References (51)
  • 3
    • 0001347323
    • Complexity regularization with application to artificial neural networks
    • G. Roussas, editor, Kluwer Academic Publishers
    • Andrew R. Barron. Complexity regularization with application to artificial neural networks. In G. Roussas, editor, Nonparametric Functional Estimation and Related Topics, pages 561-576. Kluwer Academic Publishers, 1991.
    • (1991) Nonparametric Functional Estimation and Related Topics , pp. 561-576
    • Barron, A.R.1
  • 4
    • 84880203756
    • Laplacian eigenmaps and spectral techniques for embedding and clustering
    • T.G. Dietterich, S. Becker, and Z. Ghahramani, editors, Cambridge, MA, MIT Press
    • Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In T.G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14 (NIPS'01), Cambridge, MA, 2002. MIT Press.
    • (2002) Advances in Neural Information Processing Systems 14 (NIPS'01)
    • Belkin, M.1    Niyogi, P.2
  • 5
    • 69349090197
    • Learning deep architectures for AI
    • Also published as a book. Now Publishers, 2009
    • Yoshua Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1-127, 2009. Also published as a book. Now Publishers, 2009.
    • (2009) Foundations and Trends in Machine Learning , vol.2 , Issue.1 , pp. 1-127
    • Bengio, Y.1
  • 6
    • 67651049775
    • Justifying and generalizing contrastive divergence
    • June
    • Yoshua Bengio and Olivier Delalleau. Justifying and generalizing contrastive divergence. Neural Computation, 21(6):1601-1621, June 2009.
    • (2009) Neural Computation , vol.21 , Issue.6 , pp. 1601-1621
    • Bengio, Y.1    Delalleau, O.2
  • 7
    • 34547975052
    • Scaling learning algorithms towards AI
    • L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, MIT Press
    • Yoshua Bengio and Yann LeCun. Scaling learning algorithms towards AI. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large Scale Kernel Machines, pages 321-360. MIT Press, 2007.
    • (2007) Large Scale Kernel Machines , pp. 321-360
    • Bengio, Y.1    LeCun, Y.2
  • 8
    • 77954662106
    • The curse of highly variable functions for local kernel machines
    • Y. Weiss, B. Schölkopf, and J. Platt, editors, MIT Press, Cambridge, MA
    • Yoshua Bengio, Olivier Delalleau, and Nicolas Le Roux. The curse of highly variable functions for local kernel machines. In Y. Weiss, B. Schölkopf, and J. Platt, editors, Advances in Neural Information Processing Systems 18 (NIPS'05), pages 107-114. MIT Press, Cambridge, MA, 2006.
    • (2006) Advances in Neural Information Processing Systems 18 (NIPS'05) , pp. 107-114
    • Bengio, Y.1    Delalleau, O.2    Le Roux, N.3
  • 11
    • 50649084677
    • Cluster kernels for semi-supervised learning
    • S. Becker, S. Thrun, and K. Obermayer, editors, Cambridge, MA, MIT Press
    • Olivier Chapelle, Jason Weston, and Bernhard Schölkopf. Cluster kernels for semi-supervised learning. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15 (NIPS'02), pages 585-592, Cambridge, MA, 2003. MIT Press.
    • (2003) Advances in Neural Information Processing Systems 15 (NIPS'02) , pp. 585-592
    • Chapelle, O.1    Weston, J.2    Schölkopf, B.3
  • 13
    • 56449095373
    • A unified architecture for natural language processing: Deep neural networks with multitask learning
    • William W. Cohen, Andrew McCallum, and Sam T. Roweis, editors, ACM
    • Ronan Collobert and Jason Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In William W. Cohen, Andrew McCallum, and Sam T. Roweis, editors, Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML'08), pages 160-167. ACM, 2008.
    • (2008) Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML'08) , pp. 160-167
    • Collobert, R.1    Weston, J.2
  • 14
    • 77949524387
    • Visualizing higher-layer features of a deep network
    • Université de Montréal
    • Dumitru Erhan, Yoshua Bengio, Aaron Courville, and Pascal Vincent. Visualizing higher-layer features of a deep network. Technical Report 1341, Université de Montréal, 2009.
    • (2009) Technical Report 1341
    • Erhan, D.1    Bengio, Y.2    Courville, A.3    Vincent, P.4
  • 16
    • 84860644702
    • Measuring invariances in deep networks
    • Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors
    • Ian Goodfellow, Quoc Le, Andrew Saxe, and Andrew Ng. Measuring invariances in deep networks. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 646-654. 2009.
    • (2009) Advances in Neural Information Processing Systems , vol.22 , pp. 646-654
    • Goodfellow, I.1    Le, Q.2    Saxe, A.3    Ng, A.4
  • 19
    • 0001295178
    • On the power of small-depth threshold circuits
    • Johan Håstad and Mikael Goldmann. On the power of small-depth threshold circuits. Computational Complexity, 1:113-129, 1991.
    • (1991) Computational Complexity , vol.1 , pp. 113-129
    • Håstad, J.1    Goldmann, M.2
  • 20
    • 0013344078
    • Training products of experts by minimizing contrastive divergence
    • Geoffrey E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation, 14:1771-1800, 2002.
    • (2002) Neural Computation , vol.14 , pp. 1771-1800
    • Hinton, G.E.1
  • 21
    • 56449117245
    • To recognize shapes, first learn to generate images
    • Paul Cisek, Trevor Drew, and John Kalaska, editors, Elsevier
    • Geoffrey E. Hinton. To recognize shapes, first learn to generate images. In Paul Cisek, Trevor Drew, and John Kalaska, editors, Computational Neuroscience: Theoretical Insights into Brain Function. Elsevier, 2007.
    • (2007) Computational Neuroscience: Theoretical Insights into Brain Function
    • Hinton, G.E.1
  • 22
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • July
    • Geoffrey E. Hinton and Ruslan Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507, July 2006.
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.E.1    Salakhutdinov, R.2
  • 23
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • Geoffrey E. Hinton, Simon Osindero, and Yee Whye Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554, 2006.
    • (2006) Neural Computation , vol.18 , pp. 1527-1554
    • Hinton, G.E.1    Osindero, S.2    Teh, Y.W.3
  • 24
    • 56449110012
    • Classification using discriminative restricted Boltzmann machines
    • William W. Cohen, Andrew McCallum, and Sam T. Roweis, editors, ACM
    • Hugo Larochelle and Yoshua Bengio. Classification using discriminative restricted Boltzmann machines. In William W. Cohen, Andrew McCallum, and Sam T. Roweis, editors, Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML'08), pages 536-543. ACM, 2008.
    • (2008) Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML'08) , pp. 536-543
    • Larochelle, H.1    Bengio, Y.2
  • 25
    • 34547967782
    • An empirical evaluation of deep architectures on problems with many factors of variation
    • Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Int. Conf. Mach. Learn., pages 473-480, 2007.
    • (2007) Int. Conf. Mach. Learn. , pp. 473-480
    • Larochelle, H.1    Erhan, D.2    Courville, A.3    Bergstra, J.4    Bengio, Y.5
  • 29
    • 0032203257
    • Gradient-based learning applied to document recognition
    • Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.
    • (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
    • LeCun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 30
    • 85161980001
    • Sparse deep belief net model for visual area V2
    • J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, MIT Press, Cambridge, MA
    • Honglak Lee, Chaitanya Ekanadham, and Andrew Ng. Sparse deep belief net model for visual area V2. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20 (NIPS'07), pages 873-880. MIT Press, Cambridge, MA, 2008.
    • (2008) Advances in Neural Information Processing Systems 20 (NIPS'07) , pp. 873-880
    • Lee, H.1    Ekanadham, C.2    Ng, A.3
  • 31
    • 71149119164
    • Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
    • Léon Bottou and Michael Littman, editors, ACM, Montreal (Qc), Canada
    • Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Léon Bottou and Michael Littman, editors, Proceedings of the Twenty-sixth International Conference on Machine Learning (ICML'09). ACM, Montreal (Qc), Canada, 2009.
    • (2009) Proceedings of the Twenty-sixth International Conference on Machine Learning (ICML'09)
    • Lee, H.1    Grosse, R.2    Ranganath, R.3    Ng, A.Y.4
  • 32
    • 57849102080
    • Training invariant support vector machines using selective sampling
    • Léon Bottou, Olivier Chapelle, Dennis DeCoste, and Jason Weston, editors, MIT Press, Cambridge, MA
    • Gaëlle Loosli, Stéphane Canu, and Léon Bottou. Training invariant support vector machines using selective sampling. In Léon Bottou, Olivier Chapelle, Dennis DeCoste, and Jason Weston, editors, Large Scale Kernel Machines, pages 301-320. MIT Press, Cambridge, MA, 2007.
    • (2007) Large Scale Kernel Machines , pp. 301-320
    • Loosli, G.1    Canu, S.2    Bottou, L.3
  • 33
    • 71149084945
    • Deep learning from temporal coherence in video
    • Léon Bottou and Michael Littman, editors, Montreal, June, Omnipress
    • Hossein Mobahi, Ronan Collobert, and Jason Weston. Deep learning from temporal coherence in video. In Léon Bottou and Michael Littman, editors, Proceedings of the 26th International Conference on Machine Learning, pages 737-744, Montreal, June 2009. Omnipress.
    • (2009) Proceedings of the 26th International Conference on Machine Learning , pp. 737-744
    • Mobahi, H.1    Collobert, R.2    Weston, J.3
  • 34
    • 59549087165
    • On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes
    • T.G. Dietterich, S. Becker, and Z. Ghahramani, editors
    • Andrew Y. Ng and Michael I. Jordan. On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In T.G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14 (NIPS'01), pages 841-848, 2002.
    • (2002) Advances in Neural Information Processing Systems 14 (NIPS'01) , pp. 841-848
    • Ng, A.Y.1    Jordan, M.I.2
  • 35
    • 85161976678
    • Modeling image patches with a directed hierarchy of Markov random fields
    • J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Cambridge, MA, MIT Press
    • Simon Osindero and Geoffrey E. Hinton. Modeling image patches with a directed hierarchy of Markov random fields. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20 (NIPS'07), pages 1121-1128, Cambridge, MA, 2008. MIT Press.
    • (2008) Advances in Neural Information Processing Systems 20 (NIPS'07) , pp. 1121-1128
    • Osindero, S.1    Hinton, G.E.2
  • 37
    • 84864069017
    • Efficient learning of sparse representations with an energy-based model
    • B. Schölkopf, J. Platt, and T. Hoffman, editors, MIT Press
    • Marc'Aurelio Ranzato, Christopher Poultney, Sumit Chopra, and Yann LeCun. Efficient learning of sparse representations with an energy-based model. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems 19 (NIPS'06), pages 1137-1144. MIT Press, 2007.
    • (2007) Advances in Neural Information Processing Systems 19 (NIPS'06) , pp. 1137-1144
    • Ranzato, M.A.1    Poultney, C.2    Chopra, S.3    LeCun, Y.4
  • 38
    • 85161966246
    • Sparse feature learning for deep belief networks
    • J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Cambridge, MA, MIT Press
    • Marc'Aurelio Ranzato, Y-Lan Boureau, and Yann LeCun. Sparse feature learning for deep belief networks. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20 (NIPS'07), pages 1185-1192, Cambridge, MA, 2008. MIT Press.
    • (2008) Advances in Neural Information Processing Systems 20 (NIPS'07) , pp. 1185-1192
    • Ranzato, M.A.1    Boureau, Y.-L.2    LeCun, Y.3
  • 39
    • 85162037149
    • Using deep belief nets to learn covariance kernels for Gaussian processes
    • J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Cambridge, MA, MIT Press
    • Ruslan Salakhutdinov and Geoffrey E. Hinton. Using deep belief nets to learn covariance kernels for Gaussian processes. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20 (NIPS'07), pages 1249-1256, Cambridge, MA, 2008. MIT Press.
    • (2008) Advances in Neural Information Processing Systems 20 (NIPS'07) , pp. 1249-1256
    • Salakhutdinov, R.1    Hinton, G.E.2
  • 42
    • 0000938157
    • Learning continuous attractors in recurrent networks
    • M.I. Jordan, M.J. Kearns, and S.A. Solla, editors, MIT Press
    • Sebastian H. Seung. Learning continuous attractors in recurrent networks. In M.I. Jordan, M.J. Kearns, and S.A. Solla, editors, Advances in Neural Information Processing Systems 10 (NIPS'97), pages 654-660. MIT Press, 1998.
    • (1998) Advances in Neural Information Processing Systems 10 (NIPS'97) , pp. 654-660
    • Seung, S.H.1
  • 43
    • 0029489722
    • Overtraining, regularization and searching for a minimum, with application to neural networks
    • Jonas Sjöberg and Lennart Ljung. Overtraining, regularization and searching for a minimum, with application to neural networks. International Journal of Control, 62(6):1391-1407, 1995.
    • (1995) International Journal of Control , vol.62 , Issue.6 , pp. 1391-1407
    • Sjöberg, J.1    Ljung, L.2
  • 45
    • 0034704229
    • A global geometric framework for nonlinear dimensionality reduction
    • December
    • Joshua Tenenbaum, Vin de Silva, and John C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319-2323, December 2000.
    • (2000) Science , vol.290 , Issue.5500 , pp. 2319-2323
    • Tenenbaum, J.1    De Silva, V.2    Langford, J.C.3
  • 48
    • 84899000641
    • Exponential family harmoniums with an application to information retrieval
    • L.K. Saul, Y. Weiss, and L. Bottou, editors, Cambridge, MA, MIT Press
    • Max Welling, Michal Rosen-Zvi, and Geoffrey E. Hinton. Exponential family harmoniums with an application to information retrieval. In L.K. Saul, Y. Weiss, and L. Bottou, editors, Advances in Neural Information Processing Systems 17 (NIPS'04), pages 1481-1488, Cambridge, MA, 2005. MIT Press.
    • (2005) Advances in Neural Information Processing Systems 17 (NIPS'04) , pp. 1481-1488
    • Welling, M.1    Rosen-Zvi, M.2    Hinton, G.E.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.