SCOPUS Information Search Platform

IJCAI International Joint Conference on Artificial Intelligence

Volume 2015-January, 2015, Pages 3460-3468

Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves

(3) Domhan, Tobias; Springenberg, Jost Tobias; Hutter, Frank

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; BENCHMARKING; EXTRAPOLATION; LEARNING SYSTEMS; NETWORK ARCHITECTURE; NEURAL NETWORKS; OBJECT RECOGNITION; OPTIMIZATION; STOCHASTIC SYSTEMS;

COMPUTATIONAL RESOURCES; DEEP NEURAL NETWORKS; EARLY TERMINATION; HYPER-PARAMETER OPTIMIZATIONS; MACHINE LEARNING PROBLEM; PROBABILISTIC MODELING; STATE OF THE ART; STOCHASTIC GRADIENT DESCENT;

CURVE FITTING;

EID: 84949921865 PISSN: 10450823 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (602)

References (36)

1
- 0034241361
- Gradient-based optimization of hyperparameters
- Y. Bengio. Gradient-based optimization of hyperparameters. Neural Computation, 12(8):1889-1900, 2000.
- (2000) Neural Computation , vol.12 , Issue.8 , pp. 1889-1900
- Bengio, Y.¹

2
- 84857855190
- Random search for hyper-parameter optimization
- J. Bergstra and Y. Bengio. Random search for hyper-parameter optimization. JMLR, 13(1):281-305, 2012.
- (2012) JMLR , vol.13 , Issue.1 , pp. 281-305
- Bergstra, J.¹ Bengio, Y.²

3
- 85162384813
- Algorithms for hyper-parameter optimization
- J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. In Proc. of NIPS, pages 2546-2554, 2011.
- (2011) Proc. of NIPS , pp. 2546-2554
- Bergstra, J.¹ Bardenet, R.² Bengio, Y.³ Kégl, B.⁴

4
- 84897558007
- Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures
- J. Bergstra, D. Yamins, and D.D. Cox. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In Proc. of ICML, pages 115-123, 2013.
- (2013) Proc. of ICML , pp. 115-123
- Bergstra, J.¹ Yamins, D.² Cox, D.D.³

5
- 0035478854
- Random forests
- L. Breiman. Random forests. Machine learning, 45(1):5-32, 2001.
- (2001) Machine Learning , vol.45 , Issue.1 , pp. 5-32
- Breiman, L.¹

6
- 84869826137
- A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning
- E. Brochu, V. M. Cora, and N. de Freitas. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. CoRR, abs/1012.2599, 2010.
- (2010) CoRR, Abs/1012.2599
- Brochu, E.¹ Cora, V.M.² De Freitas, N.³

7
- 84862283411
- An analysis of single-layer networks in unsupervised feature learning
- A. Coates, A. Y. Ng, and H. Lee. An analysis of single-layer networks in unsupervised feature learning. In Proc. of AISTATS, pages 215-223, 2011.
- (2011) Proc. of AISTATS , pp. 215-223
- Coates, A.¹ Ng, A.Y.² Lee, H.³

8
- 84890527827
- Improving deep neural networks for LVCSR using rectified linear units and dropout
- IEEE
- G. Dahl, T. Sainath, and G. Hinton. Improving deep neural networks for LVCSR using rectified linear units and dropout. In Proc. of ICASSP, pages 8609-8613. IEEE, 2013.
- (2013) Proc. of ICASSP , pp. 8609-8613
- Dahl, G.¹ Sainath, T.² Hinton, G.³

9
- 84890526837
- New types of deep neural network learning for speech recognition and related applications: An overview
- L. Deng, G. Hinton, and B. Kingsbury. New types of deep neural network learning for speech recognition and related applications: An overview. In Proc. of ICASSP, 2013.
- (2013) Proc. of ICASSP
- Deng, L.¹ Hinton, G.² Kingsbury, B.³

10
- 84919881041
- Decaf: A deep convolutional activation feature for generic visual recognition
- J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. Decaf: A deep convolutional activation feature for generic visual recognition. In Proc. of ICML, 2014.
- (2014) Proc. of ICML
- Donahue, J.¹ Jia, Y.² Vinyals, O.³ Hoffman, J.⁴ Zhang, N.⁵ Tzeng, E.⁶ Darrell, T.⁷

11
- 84919931099
- Towards an empirical foundation for assessing Bayesian optimization of hyperparameters
- K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, and K. Leyton-Brown. Towards an empirical foundation for assessing Bayesian optimization of hyperparameters. In NIPS Workshop on Bayesian Optimization in Theory and Practice (BayesOpt'13), 2013.
- (2013) NIPS Workshop on Bayesian Optimization in Theory and Practice (BayesOpt'13)
- Eggensperger, K.¹ Feurer, M.² Hutter, F.³ Bergstra, J.⁴ Snoek, J.⁵ Hoos, H.⁶ Leyton-Brown, K.⁷

12
- 84875838326
- Emcee: The MCMC hammer
- D. Foreman-Mackey, D. W. Hogg, D. Lang, and J. Goodman. emcee: The MCMC Hammer. PASP, 125:306-312, 2013.
- (2013) PASP , vol.125 , pp. 306-312
- Foreman-Mackey, D.¹ Hogg, D.W.² Lang, D.³ Goodman, J.⁴

13
- 0012330992
- Modeling decision tree performance with the power law
- L. Frey and D. Fisher. Modeling decision tree performance with the power law. In Proc. of AISTATS, 1999.
- (1999) Proc. of AISTATS
- Frey, L.¹ Fisher, D.²

14
- 84862277874
- Understanding the difficulty of training deep feedforward neural networks
- X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proc. of AISTATS, pages 249-256, 2010.
- (2010) Proc. of AISTATS , pp. 249-256
- Glorot, X.¹ Bengio, Y.²

15
- 84897543523
- Maxout networks
- I. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. Maxout networks. In Proc. of ICML, 2013.
- (2013) Proc. of ICML
- Goodfellow, I.¹ Warde-Farley, D.² Mirza, M.³ Courville, A.⁴ Bengio, Y.⁵

16
- 84974711038
- Modelling classification performance for large data sets
- Springer
- B. Gu, F. Hu, and H. Liu. Modelling classification performance for large data sets. In Proc. of WAIM, pages 317-328. Springer, 2001.
- (2001) Proc. of WAIM , pp. 317-328
- Gu, B.¹ Hu, F.² Liu, H.³

17
- 84868554032
- Sequential model-based optimization for general algorithm configuration
- Springer
- F. Hutter, H. Hoos, and K. Leyton-Brown. Sequential model-based optimization for general algorithm configuration. In Proc. of LION, pages 507-523. Springer, 2011.
- (2011) Proc. of LION , pp. 507-523
- Hutter, F.¹ Hoos, H.² Leyton-Brown, K.³

18
- 84887848457
- Algorithm runtime prediction: Methods and evaluation
- F. Hutter, L. Xu, H. H. Hoos, and K. Leyton-Brown. Algorithm runtime prediction: Methods and evaluation. AIJ, 206(0):79-111, 2014.
- (2014) AIJ , vol.206 , pp. 79-111
- Hutter, F.¹ Xu, L.² Hoos, H.H.³ Leyton-Brown, K.⁴

19
- 77953183471
- What is the best multi-stage architecture for object recognition?
- K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. What is the best multi-stage architecture for object recognition? In Proc. of ICCV, 2009.
- (2009) Proc. of ICCV
- Jarrett, K.¹ Kavukcuoglu, K.² Ranzato, M.³ LeCun, Y.⁴

20
- 84949870156
- Caffe: Convolutional architecture for fast feature embedding
- Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
- (2014) ArXiv Preprint ArXiv:1408.5093
- Jia, Y.¹ Shelhamer, E.² Donahue, J.³ Karayev, S.⁴ Long, J.⁵ Girshick, R.⁶ Guadarrama, S.⁷ Darrell, T.⁸

21
- 0000561424
- Efficient global optimization of expensive black-box functions
- D. Jones, M. Schonlau, and W. Welch. Efficient global optimization of expensive black-box functions. Journal of Global optimization, 13(4):455-492, 1998.
- (1998) Journal of Global Optimization , vol.13 , Issue.4 , pp. 455-492
- Jones, D.¹ Schonlau, M.² Welch, W.³

22
- 84878208192
- Prediction of learning curves in machine translation
- P. Kolachina, N. Cancedda, M. Dymetman, and S. Venkatapathy. Prediction of learning curves in machine translation. In Proc. of ACL, pages 22-30, 2012.
- (2012) Proc. of ACL , pp. 22-30
- Kolachina, P.¹ Cancedda, N.² Dymetman, M.³ Venkatapathy, S.⁴

23
- 84876231242
- Imagenet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. In Proc. of NIPS, pages 1097-1105, 2012.
- (2012) Proc. of NIPS , pp. 1097-1105
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.³

24
- 77956002520
- Learning multiple layers of features from tiny images
- A. Krizhevsky. Learning multiple layers of features from tiny images. Master's thesis, University of Toronto, 2009.
- (2009) Master's Thesis, University of Toronto
- Krizhevsky, A.¹

25
- 0000359337
- Backpropagation applied to handwritten zip code recognition
- Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural computation, 1(4):541-551, 1989.
- (1989) Neural Computation , vol.1 , Issue.4 , pp. 541-551
- LeCun, Y.¹ Boser, B.² Denker, J.S.³ Henderson, D.⁴ Howard, R.E.⁵ Hubbard, W.⁶ Jackel, L.D.⁷

26
- 84943645147
- Deeply supervised nets
- C. Lee, S. Xie, P. Gallagher, Z. Zhang, and Z. Tu. Deeply supervised nets. In Deep Learning and Representation Learning Workshop, NIPS, 2014.
- (2014) Deep Learning and Representation Learning Workshop, NIPS
- Lee, C.¹ Xie, S.² Gallagher, P.³ Zhang, Z.⁴ Tu, Z.⁵

27
- 85083953135
- Network in network
- M. Lin, Q. Chen, and S. Yan. Network in network. In ICLR: Conference Track, 2014.
- (2014) ICLR: Conference Track
- Lin, M.¹ Chen, Q.² Yan, S.³

28
- 0001923944
- Hoeffding races: Accelerating model selection search for classification and function approximation
- O. Maron and A. Moore. Hoeffding races: Accelerating model selection search for classification and function approximation. In Proc. of NIPS, pages 59-66, 1994.
- (1994) Proc. of NIPS , pp. 59-66
- Maron, O.¹ Moore, A.²

29
- 84875298474
- Neural Networks: Tricks of the Trade - Second Edition
- G. Montavon, G. Orr, and K.-R. Müller, editors. Neural Networks: Tricks of the Trade - Second Edition, volume 7700 of LNCS. Springer, 2012.
- (2012) Neural Networks: Tricks of the Trade
- Montavon, G.¹ Orr, G.² Müller, K.-R.³

30
- 25444448065
- Gaussian Processes for Machine Learning
- C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. The MIT Press, 2006.
- (2006) Gaussian Processes for Machine Learning
- Rasmussen, C.E.¹ Williams, C.K.I.²

31
- 84869201485
- Practical Bayesian optimization of machine learning algorithms
- J. Snoek, H. Larochelle, and R.P. Adams. Practical Bayesian optimization of machine learning algorithms. In Proc. of NIPS, pages 2951-2959, 2012.
- (2012) Proc. of NIPS , pp. 2951-2959
- Snoek, J.¹ Larochelle, H.² Adams, R.P.³

32
- 85083954305
- Striving for simplicity: The all convolutional net
- J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806, 2015.
- (2015) Striving for Simplicity: The all Convolutional Net
- Springenberg, J.T.¹ Dosovitskiy, A.² Brox, T.³ Riedmiller, M.⁴

33
- 84904163933
- Dropout: A simple way to prevent neural networks from overfitting
- N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. JMLR, 15:1929-1958, 2014.
- (2014) JMLR , vol.15 , pp. 1929-1958
- Srivastava, N.¹ Hinton, G.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

34
- 85053528161
- Raiders of the lost architecture: Kernels for Bayesian optimization in conditional parameter spaces
- K. Swersky, D. Duvenaud, J. Snoek, F. Hutter, and M. Osborne. Raiders of the lost architecture: Kernels for Bayesian optimization in conditional parameter spaces. In NIPS Workshop on Bayesian Optimization in Theory and Practice (BayesOpt'13), 2013.
- (2013) NIPS Workshop on Bayesian Optimization in Theory and Practice (BayesOpt'13)
- Swersky, K.¹ Duvenaud, D.² Snoek, J.³ Hutter, F.⁴ Osborne, M.⁵

35
- 84938340353
- Freeze-thaw Bayesian optimization
- K. Swersky, J. Snoek, and R. P. Adams. Freeze-thaw Bayesian optimization. arXiv preprint arXiv:1406.3896, 2014.
- (2014) Freeze-thaw Bayesian Optimization
- Swersky, K.¹ Snoek, J.² Adams, R.P.³

36
- 34547435898
- On early stopping in gradient descent learning
- Y. Yao, L. Rosasco, and A. Caponnetto. On early stopping in gradient descent learning. Constructive Approximation, 26(2):289-315, 2007.
- (2007) Constructive Approximation , vol.26 , Issue.2 , pp. 289-315
- Yao, Y.¹ Rosasco, L.² Caponnetto, A.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.