Volume 10, 2009, Pages 1-40

Exploring strategies for training deep neural networks

Author keywords

Artificial neural networks; Autoassociators; Deep belief networks; Restricted Boltzmann machines; Unsupervised learning

Indexed keywords

ALGORITHMS; BACKPROPAGATION; BAYESIAN NETWORKS; LEARNING ALGORITHMS; NETWORK LAYERS; OPTIMIZATION; UNSUPERVISED LEARNING;

EID: 59449087310     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 922

References (61)
  • 1
    • 27844439373
    • A framework for learning predictive structures from multiple tasks and unlabeled data
    • Rie Kubota Ando and Tong Zhang. A framework for learning predictive structures from multiple tasks and unlabeled data. Journal of Machine Learning Research, 6:1817-1853, 2005.
    • (2005) Journal of Machine Learning Research , vol.6 , pp. 1817-1853
    • Kubota Ando, R.1    Zhang, T.2
  • 2
    • 0005862230
    • Exponentially many local minima for single neurons
    • M. Mozer, D. S. Touretzky, and M. Perrone, editors, MIT Press, Cambridge, MA
    • Peter Auer, Mark Herbster, and Manfred K. Warmuth. Exponentially many local minima for single neurons. In M. Mozer, D. S. Touretzky, and M. Perrone, editors, Advances in Neural Information Processing Systems 8, pages 315-322. MIT Press, Cambridge, MA, 1996.
    • (1996) Advances in Neural Information Processing Systems 8 , pp. 315-322
    • Auer, P.1    Herbster, M.2    Warmuth, M.K.3
  • 3
    • 0024774330
    • Neural networks and principal component analysis: Learning from examples without local minima
    • Pierre Baldi and Kurt Hornik. Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2:53-58, 1989.
    • (1989) Neural Networks , vol.2 , pp. 53-58
    • Baldi, P.1    Hornik, K.2
  • 4
    • 0029411030
    • An information maximisation approach to blind separation and blind deconvolution
    • Anthony J. Bell and Terrence J. Sejnowski. An information maximisation approach to blind separation and blind deconvolution. Neural Computation, 7(6):1129-1159, 1995.
    • (1995) Neural Computation , vol.7 , Issue.6 , pp. 1129-1159
    • Bell, A.J.1    Sejnowski, T.J.2
  • 5
    • 59449098798
    • Yoshua Bengio. Learning deep architectures for AI. Technical Report 1312, Université de Montréal, dept. IRO, 2007.
  • 6
    • 56449121705
    • Justifying and generalizing contrastive divergence
    • Technical Report 1311, Dept. IRO, Université de Montréal
    • Yoshua Bengio and Olivier Delalleau. Justifying and generalizing contrastive divergence. Technical Report 1311, Dept. IRO, Université de Montréal, 2007.
    • (2007)
    • Bengio, Y.1    Delalleau, O.2
  • 7
    • 59449097579
    • Yoshua Bengio and Yann LeCun. Scaling learning algorithms towards AI. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large Scale Kernel Machines. MIT Press, 2007.
  • 8
    • 77954662106
    • Yoshua Bengio, Olivier Delalleau, and Nicolas Le Roux. The curse of highly variable functions for local kernel machines. In Y. Weiss, B. Schölkopf, and J. Platt, editors, Advances in Neural Information Processing Systems 18, pages 107-114. MIT Press, Cambridge, MA, 2006.
  • 10
    • 33646405050
    • The tradeoff between generative and discriminative classifiers
    • Prague, August, URL
    • Guillaume Bouchard and Bill Triggs. The tradeoff between generative and discriminative classifiers. In IASC International Symposium on Computational Statistics (COMPSTAT), pages 721-728, Prague, August 2004. URL http://lear.inrialpes.fr/pubs/2004/BT04.
    • (2004) IASC International Symposium on Computational Statistics (COMPSTAT) , pp. 721-728
    • Bouchard, G.1    Triggs, B.2
  • 12
    • 0037768682
    • A continuous restricted Boltzmann machine with an implementable training algorithm
    • Hsin Chen and Alan F. Murray. A continuous restricted Boltzmann machine with an implementable training algorithm. IEE Proceedings of Vision, Image and Signal Processing, 150(3):153-158, 2003.
    • (2003) IEE Proceedings of Vision, Image and Signal Processing , vol.150 , Issue.3 , pp. 153-158
    • Chen, H.1    Murray, A.F.2
  • 13
    • 56449095373
    • A unified architecture for natural language processing: Deep neural networks with multitask learning
    • URL
    • Ronan Collobert and Jason Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML 2008), pages 160-167, 2008. URL http://www.kyb.tuebingen.mpg.de/bs/people/weston/papers/unified_nlp.pdf.
    • (2008) Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML 2008) , pp. 160-167
    • Collobert, R.1    Weston, J.2
  • 14
    • 0028416938
    • Independent component analysis - a new concept?
    • Pierre Comon. Independent component analysis - a new concept? Signal Processing, 36:287-314, 1994.
    • (1994) Signal Processing , vol.36 , pp. 287-314
    • Comon, P.1
  • 15
    • 0001109377
    • Learning internal representations from grayscale images: An example of extensional programming
    • Seattle, Lawrence Erlbaum, Hillsdale
    • Garrison W. Cottrell, Paul Munro, and David Zipser. Learning internal representations from grayscale images: An example of extensional programming. In Ninth Annual Conference of the Cognitive Science Society, pages 462-473, Seattle 1987, 1987. Lawrence Erlbaum, Hillsdale.
    • (1987) Ninth Annual Conference of the Cognitive Science Society , pp. 462-473
    • Cottrell, G.W.1    Munro, P.2    Zipser, D.3
  • 17
    • 0000362092
    • Non-linear dimensionality reduction
    • C.L. Giles, S.J. Hanson, and J.D. Cowan, editors, San Mateo CA, Morgan Kaufmann
    • David DeMers and Garrison W. Cottrell. Non-linear dimensionality reduction. In C.L. Giles, S.J. Hanson, and J.D. Cowan, editors, Advances in Neural Information Processing Systems 5, pages 580-587, San Mateo CA, 1993. Morgan Kaufmann.
    • (1993) Advances in Neural Information Processing Systems 5 , pp. 580-587
    • DeMers, D.1    Cottrell, G.W.2
  • 18
    • 0000155950
    • The cascade-correlation learning architecture
    • D.S. Touretzky, editor, Denver, CO, Morgan Kaufmann, San Mateo
    • Scott E. Fahlman and Christian Lebiere. The cascade-correlation learning architecture. In D.S. Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 524-532, Denver, CO, 1990. Morgan Kaufmann, San Mateo.
    • (1990) Advances in Neural Information Processing Systems 2 , pp. 524-532
    • Fahlman, S.E.1    Lebiere, C.2
  • 19
    • 84952149204
    • A statistical view of some chemometrics regression tools
    • Ildiko E. Frank and Jerome H. Friedman. A statistical view of some chemometrics regression tools. Technometrics, 35(2):109-148, 1993.
    • (1993) Technometrics , vol.35 , Issue.2 , pp. 109-148
    • Frank, I.E.1    Friedman, J.H.2
  • 21
    • 0034018074
    • Local minima and plateaus in hierarchical structures of multilayer perceptrons
    • Kenji Fukumizu and Shun-ichi Amari. Local minima and plateaus in hierarchical structures of multilayer perceptrons. Neural Networks, 13(3):317-327, 2000.
    • (2000) Neural Networks , vol.13 , Issue.3 , pp. 317-327
    • Fukumizu, K.1    Amari, S.-I.2
  • 24
    • 0001295178
    • On the power of small-depth threshold circuits
    • Johan Hastad and M. Goldmann. On the power of small-depth threshold circuits. Computational Complexity, 1:113-129, 1991.
    • (1991) Computational Complexity , vol.1 , pp. 113-129
    • Hastad, J.1    Goldmann, M.2
  • 25
    • 0013344078
    • Training products of experts by minimizing contrastive divergence
    • Geoffrey E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation, 14:1771-1800, 2002.
    • (2002) Neural Computation , vol.14 , pp. 1771-1800
    • Hinton, G.E.1
  • 26
    • 0024732792
    • Connectionist learning procedures
    • Geoffrey E. Hinton. Connectionist learning procedures. Artificial Intelligence, 40:185-234, 1989.
    • (1989) Artificial Intelligence , vol.40 , pp. 185-234
    • Hinton, G.E.1
  • 27
    • 34547975984
    • To recognize shapes, first learn to generate images
    • TR 2006-003, University of Toronto
    • Geoffrey E. Hinton. To recognize shapes, first learn to generate images. Technical Report UTML TR 2006-003, University of Toronto, 2006.
    • (2006) Technical Report UTML
    • Hinton, G.E.1
  • 28
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • July
    • Geoffrey E. Hinton and Ruslan R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507, July 2006.
    • (2006) Science , vol.313 , Issue.5786 , pp. 504-507
    • Hinton, G.E.1    Salakhutdinov, R.R.2
  • 29
    • 0029652445
    • The wake-sleep algorithm for unsupervised neural networks
    • Geoffrey E. Hinton, Peter Dayan, Brendan J. Frey, and Radford M. Neal. The wake-sleep algorithm for unsupervised neural networks. Science, 268:1158-1161, 1995.
    • (1995) Science , vol.268 , pp. 1158-1161
    • Hinton, G.E.1    Dayan, P.2    Frey, B.J.3    Neal, R.M.4
  • 30
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554, 2006.
    • (2006) Neural Computation , vol.18 , pp. 1527-1554
    • Hinton, G.E.1    Osindero, S.2    Teh, Y.-W.3
  • 31
    • 33745181681
    • Alex Holub and Pietro Perona. A discriminative framework for modelling object classes. In CVPR '05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 1, pages 664-671, Washington, DC, USA, 2005. IEEE Computer Society. ISBN 0-7695-2372-2. doi: http://dx.doi.org/10.1109/CVPR.2005.25.
  • 32
    • 0024880831
    • Multilayer feedforward networks are universal approximators
    • Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 2:359-366, 1989.
    • (1989) Neural Networks , vol.2 , pp. 359-366
    • Hornik, K.1    Stinchcombe, M.2    White, H.3
  • 33
    • 84898982939
    • Exploiting generative models in discriminative classifiers
    • M.S. Kearns, S.A. Solla, and D.A. Cohn, editors, MIT Press, Cambridge, MA
    • Tommi S. Jaakkola and David Haussler. Exploiting generative models in discriminative classifiers. In M.S. Kearns, S.A. Solla, and D.A. Cohn, editors, Advances in Neural Information Processing Systems 11. MIT Press, Cambridge, MA, 1999.
    • (1999) Advances in Neural Information Processing Systems 11
    • Jaakkola, T.S.1    Haussler, D.2
  • 35
    • 0026191274
    • Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture
    • Christian Jutten and Jeanny Herault. Blind separation of sources, part I: an adaptive algorithm based on neuromimetic architecture. Signal Processing, 24:1-10, 1991.
    • (1991) Signal Processing , vol.24 , pp. 1-10
    • Jutten, C.1    Herault, J.2
  • 37
    • 34547967782
    • An empirical evaluation of deep architectures on problems with many factors of variation
    • Zoubin Ghahramani, editor, Omnipress, URL
    • Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Zoubin Ghahramani, editor, Twenty-fourth International Conference on Machine Learning (ICML 2007), pages 473-480. Omnipress, 2007. URL http://www.machinelearning.org/proceedings/icml2007/papers/331.pdf.
    • (2007) Twenty-fourth International Conference on Machine Learning (ICML 2007) , pp. 473-480
    • Larochelle, H.1    Erhan, D.2    Courville, A.3    Bergstra, J.4    Bengio, Y.5
  • 38
    • 33845597672
    • Julia A. Lasserre, Christopher M. Bishop, and Thomas P. Minka. Principled hybrids of generative and discriminative models. In CVPR '06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 87-94, Washington, DC, USA, 2006. IEEE Computer Society. ISBN 0-7695-2597-0. doi: http://dx.doi.org/10.1109/CVPR.2006.227.
  • 39
    • 0032203257
    • Gradient-based learning applied to document recognition
    • November
    • Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, November 1998.
    • (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
    • LeCun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 40
    • 0029972717
    • Training MLPs layer by layer using an objective function for internal representations
    • Régis Lengellé and Thierry Denoeux. Training MLPs layer by layer using an objective function for internal representations. Neural Networks, 9:83-97, 1996.
    • (1996) Neural Networks , vol.9 , pp. 83-97
    • Lengellé, R.1    Denoeux, T.2
  • 41
    • 0036634215
    • A monte-carlo EM approach for partially observable diffusion processes: Theory and applications to neural networks
    • Javier R. Movellan, Paul Mineiro, and R. J. Williams. A monte-carlo EM approach for partially observable diffusion processes: theory and applications to neural networks. Neural Computation, 14:1501-1544, 2002.
    • (2002) Neural Computation , vol.14 , pp. 1501-1544
    • Movellan, J.R.1    Mineiro, P.2    Williams, R.J.3
  • 42
    • 44049116681
    • Connectionist learning of belief networks
    • Radford M. Neal. Connectionist learning of belief networks. Artificial Intelligence, 56:71-113, 1992.
    • (1992) Artificial Intelligence , vol.56 , pp. 71-113
    • Neal, R.M.1
  • 43
    • 1942418620
    • On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes
    • Andrew Y. Ng and Michael I. Jordan. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In NIPS, pages 841-848, 2001.
    • (2001) NIPS , pp. 841-848
    • Ng, A.Y.1    Jordan, M.I.2
  • 45
    • 34948870900
    • Marc'Aurelio Ranzato, Fu-Jie Huang, Y-Lan Boureau, and Yann LeCun. Unsupervised learning of invariant feature hierarchies with applications to object recognition. In Proc. Computer Vision and Pattern Recognition Conference (CVPR'07). IEEE Press, 2007a.
  • 46
    • 84864069017
    • Efficient learning of sparse representations with an energy-based model
    • B. Schölkopf, J. Platt, and T. Hoffman, editors, MIT Press
    • Marc'Aurelio Ranzato, Christopher Poultney, Sumit Chopra, and Yann LeCun. Efficient learning of sparse representations with an energy-based model. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems 19. MIT Press, 2007b.
    • (2007) Advances in Neural Information Processing Systems 19
    • Ranzato, M.1    Poultney, C.2    Chopra, S.3    LeCun, Y.4
  • 47
    • 59449110338
    • Marc'Aurelio Ranzato, Y-Lan Boureau, and Yann LeCun. Sparse feature learning for deep belief networks. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 2008. URL http://www.cs.nyu.edu/~ranzato/publications/ranzato-nips07.pdf.
  • 48
    • 85162037149
    • Using deep belief nets to learn covariance kernels for gaussian processes
    • J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, MIT Press, Cambridge, MA, URL
    • Ruslan Salakhutdinov and Geoffrey Hinton. Using deep belief nets to learn covariance kernels for gaussian processes. In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 2008. URL http://www.csri.utoronto.ca/~hinton/absps/dbngp.pdf.
    • (2008) Advances in Neural Information Processing Systems 20
    • Salakhutdinov, R.1    Hinton, G.2
  • 50
    • 70049096835
    • Learning a nonlinear embedding by preserving class neighbourhood structure
    • San Juan, Puerto Rico, Omnipress
    • Ruslan Salakhutdinov and Geoffrey Hinton. Learning a nonlinear embedding by preserving class neighbourhood structure. In Proceedings of AISTATS 2007, San Juan, Puerto Rico, 2007b. Omnipress.
    • (2007) Proceedings of AISTATS 2007
    • Salakhutdinov, R.1    Hinton, G.2
  • 55
    • 0000329993
    • Information processing in dynamical systems: Foundations of harmony theory
    • D. E. Rumelhart and J. L. McClelland, editors, chapter 6, MIT Press, Cambridge
    • Paul Smolensky. Information processing in dynamical systems: Foundations of harmony theory. In D. E. Rumelhart and J. L. McClelland, editors, Parallel Distributed Processing, volume 1, chapter 6, pages 194-281. MIT Press, Cambridge, 1986.
    • (1986) Parallel Distributed Processing , vol.1 , pp. 194-281
    • Smolensky, P.1
  • 59
    • 84899000641
    • Exponential family harmoniums with an application to information retrieval
    • L.K. Saul, Y. Weiss, and L. Bottou, editors, MIT Press
    • Max Welling, Michal Rosen-Zvi, and Geoffrey E. Hinton. Exponential family harmoniums with an application to information retrieval. In L.K. Saul, Y. Weiss, and L. Bottou, editors, Advances in Neural Information Processing Systems 17. MIT Press, 2005.
    • (2005) Advances in Neural Information Processing Systems 17
    • Welling, M.1    Rosen-Zvi, M.2    Hinton, G.E.3


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.