SCOPUS Information Search Platform

32nd International Conference on Machine Learning, ICML 2015

Volume 1, 2015, Pages 448-456

Batch normalization: Accelerating deep network training by reducing internal covariate shift

(2) Ioffe, Sergey ᵃ Szegedy, Christian ᵃ

ᵃ GOOGLE INC (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; LEARNING SYSTEMS;

CLASSIFICATION MODELS; COVARIATE SHIFTS; DEEP NEURAL NETWORKS; HIGHER LEARNING; LEARNING RATES; MODEL ARCHITECTURE; NETWORK TRAINING; STATE OF THE ART;

IMAGE CLASSIFICATION;

EID: 84969584486 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (31101)

References (24)

1
- 84862277874
- Understanding the difficulty of training deep feedforward neural networks
- May
- Bengio, Yoshua and Glorot, Xavier. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of AISTATS 2010, volume 9, pp. 249-256, May 2010.
- (2010) Proceedings of AISTATS 2010 , vol.9 , pp. 249-256
- Bengio, Y.¹ Glorot, X.²

2
- 84877760312
- Large scale distributed deep networks
- Dean, Jeffrey, Corrado, Greg S., Monga, Rajat, Chen, Kai, Devin, Matthieu, Le, Quoc V., Mao, Mark Z., Ranzato, Marc'Aurelio, Senior, Andrew, Tucker, Paul, Yang, Ke, and Ng, Andrew Y. Large scale distributed deep networks. In NIPS, 2012.
- (2012) NIPS
- Dean, J.¹ Corrado, G.S.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Le, Q.V.⁶ Mao, M.Z.⁷ Ranzato, M.⁸ Senior, A.⁹ Tucker, P.¹⁰ Yang, K.¹¹ Ng, A.Y.¹²

3
- 84969553390
- (unpublished)
- Desjardins, Guillaume and Kavukcuoglu, Koray. Natural neural networks, (unpublished).
- Natural Neural Networks
- Desjardins, G.¹ Kavukcuoglu, K.²

4
- 80052250414
- Adaptive subgradient methods for online learning and stochastic optimization
- July
- Duchi, John, Hazan, Elad, and Singer, Yoram. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res., 12: 2121-2159, July 2011. ISSN 1532-4435.
- (2011) J. Mach. Learn. Res. , vol.12 , pp. 2121-2159
- Duchi, J.¹ Hazan, E.² Singer, Y.³

5
- 85083951034
- Knowledge matters: Importance of prior information for optimization
- abs/1301.4083
- Gülçehre, Caglar and Bengio, Yoshua. Knowledge matters: Importance of prior information for optimization. CoRR, abs/1301.4083, 2013.
- (2013) CoRR
- Gülçehre, C.¹ Bengio, Y.²

6
- 84937472647
- Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification
- February
- He, K., Zhang, X., Ren, S., and Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. ArXiv e-prints, February 2015.
- (2015) ArXiv E-prints
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

7
- 0042826822
- Independent component analysis: Algorithms and applications
- May
- Hyvarinen, A. and Oja, E. Independent component analysis: Algorithms and applications. Neural Netw., 13 (4-5): 411-430, May 2000.
- (2000) Neural Netw , vol.13 , Issue.4-5 , pp. 411-430
- Hyvarinen, A.¹ Oja, E.²

8
- 74549152521
- Jiang, Jing. A literature survey on domain adaptation of statistical classifiers, 2008.
- (2008) A Literature Survey on Domain Adaptation of Statistical Classifiers
- Jiang, J.¹

9
- 0032203257
- Gradient-based learning applied to document recognition
- November
- LeCun, Y, Bottou, L., Bengio, Y., and Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86 (11): 2278-2324, November 1998a.
- (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
- LeCun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

10
- 0001857994
- Efficient backprop
- Orr, G. and Muller, K. (eds.) Springer
- LeCun, Y, Bottou, L., Orr, G., and Muller, K. Efficient backprop. In Orr, G. and Muller, K. (eds.), Neural Networks: Tricks of the trade. Springer, 1998b.
- (1998) Neural Networks: Tricks of the Trade
- LeCun, Y.¹ Bottou, L.² Orr, G.³ Muller, K.⁴

11
- 52249097028
- Nonlinear image representation using divisive normalization
- IEEE Computer Society, Jun 23-28
- Lyu, S and Simoncelli, E P. Nonlinear image representation using divisive normalization. In Proc. Computer Vision and Pattern Recognition, pp. 1-8. IEEE Computer Society, Jun 23-28 2008. doi: 10.1109/CVPR.2008.4587821.
- (2008) Proc. Computer Vision and Pattern Recognition , pp. 1-8
- Lyu, S.¹ Simoncelli, E.P.²

12
- 77956509090
- Rectified linear units improve restricted Boltzmann machines
- Omnipress
- Nair, Vinod and Hinton, Geoffrey E. Rectified linear units improve restricted Boltzmann machines. In ICML, pp. 807-814. Omnipress, 2010.
- (2010) ICML , pp. 807-814
- Nair, V.¹ Hinton, G.E.²

13
- 84897497795
- On the difficulty of training recurrent neural networks
- Pascanu, Razvan, Mikolov, Tomas, and Bengio, Yoshua. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013, pp. 1310-1318, 2013.
- (2013) Proceedings of the 30th International Conference on Machine Learning ICML 2013, Atlanta, GA, USA, 16-21 June 2013 , pp. 1310-1318
- Pascanu, R.¹ Mikolov, T.² Bengio, Y.³

14
- 84969522474
- Parallel training of deep neural networks with natural gradient and parameter averaging
- abs/1410.7455
- Povey, Daniel, Zhang, Xiaohui, and Khudanpur, Sanjeev. Parallel training of deep neural networks with natural gradient and parameter averaging. CoRR, abs/1410.7455, 2014.
- (2014) CoRR
- Povey, D.¹ Zhang, X.² Khudanpur, S.³

15
- 84893409634
- Deep learning made easier by linear transformations in perceptrons
- Raiko, Tapani, Valpola, Harri, and LeCun, Yann. Deep learning made easier by linear transformations in perceptrons. In International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 924-932, 2012.
- (2012) International Conference on Artificial Intelligence and Statistics (AISTATS) , pp. 924-932
- Raiko, T.¹ Valpola, H.² LeCun, Y.³

16
- 84909978410
- Russakovsky, Olga, Deng, Jia, Su, Hao, Krause, Jonathan, Satheesh, Sanjeev, Ma, Sean, Huang, Zhiheng, Karpathy, Andrej, Khosla, Aditya, Bernstein, Michael, Berg, Alexander C, and Fei-Fei, Li. ImageNet Large Scale Visual Recognition Challenge, 2014.
- (2014) ImageNet Large Scale Visual Recognition Challenge
- Russakovsky, O.¹ Deng, J.² Su, H.³ Krause, J.⁴ Satheesh, S.⁵ Ma, S.⁶ Huang, Z.⁷ Karpathy, A.⁸ Khosla, A.⁹ Bernstein, M.¹⁰ Berg, A.C.¹¹ Fei-Fei, L.¹²

17
- 84969522090
- Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
- abs/1312.6120
- Saxe, Andrew M., McClelland, James L., and Ganguli, Surya. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. CoRR, abs/1312.6120, 2013.
- (2013) CoRR
- Saxe, A.M.¹ McClelland, J.L.² Ganguli, S.³

18
- 0037527188
- Improving predictive inference under covariate shift by weighting the log-likelihood function
- October
- Shimodaira, Hidetoshi. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90 (2): 227-244, October 2000.
- (2000) Journal of Statistical Planning and Inference , vol.90 , Issue.2 , pp. 227-244
- Shimodaira, H.¹

19
- 84904163933
- Dropout: A simple way to prevent neural networks from overfitting
- January
- Srivastava, Nitish, Hinton, Geoffrey, Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15 (1): 1929-1958, January 2014.
- (2014) J. Mach. Learn. Res. , vol.15 , Issue.1 , pp. 1929-1958
- Srivastava, N.¹ Hinton, G.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

20
- 84897510162
- On the importance of initialization and momentum in deep learning
- JMLR.org
- Sutskever, Ilya, Martens, James, Dahl, George E., and Hinton, Geoffrey E. On the importance of initialization and momentum in deep learning. In ICML (3), volume 28 of JMLR Proceedings, pp. 1139-1147. JMLR.org, 2013.
- (2013) ICML (3) of JMLR Proceedings , vol.28 , pp. 1139-1147
- Sutskever, I.¹ Martens, J.² Dahl, G.E.³ Hinton, G.E.⁴

21
- 84941122549
- Going deeper with convolutions
- abs/1409.4842
- Szegedy, Christian, Liu, Wei, Jia, Yangqing, Sermanet, Pierre, Reed, Scott, Anguelov, Dragomir, Erhan, Dumitru, Vanhoucke, Vincent, and Rabinovich, Andrew. Going deeper with convolutions. CoRR, abs/1409.4842, 2014.
- (2014) CoRR
- Szegedy, C.¹ Liu, W.² Jia, Y.³ Sermanet, P.⁴ Reed, S.⁵ Anguelov, D.⁶ Erhan, D.⁷ Vanhoucke, V.⁸ Rabinovich, A.⁹

22
- 85162533997
- A convergence analysis of log-linear training
- Shawe-Taylor, J., Zemel, R.S., Bartlett, P., Pereira, F.C.N., and Weinberger, K.Q. (eds.), Granada, Spain, December
- Wiesler, Simon and Ney, Hermann. A convergence analysis of log-linear training. In Shawe-Taylor, J., Zemel, R.S., Bartlett, P., Pereira, F.C.N., and Weinberger, K.Q. (eds.), Advances in Neural Information Processing Systems 24, pp. 657-665, Granada, Spain, December 2011.
- (2011) Advances in Neural Information Processing Systems , vol.24 , pp. 657-665
- Wiesler, S.¹ Ney, H.²

23
- 84905233897
- Mean-normalized stochastic gradient for large-scale deep learning
- Florence, Italy, May
- Wiesler, Simon, Richard, Alexander, Schlüter, Ralf, and Ney, Hermann. Mean-normalized stochastic gradient for large-scale deep learning. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 180-184, Florence, Italy, May 2014.
- (2014) IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 180-184
- Wiesler, S.¹ Richard, A.² Schlüter, R.³ Ney, H.⁴

24
- 84930572185
- Wu, Ren, Yan, Shengen, Shan, Yi, Dang, Qingqing, and Sun, Gang. Deep image: Scaling up image recognition, 2015.
- (2015) Deep Image: Scaling Up Image Recognition
- Wu, R.¹ Yan, S.² Shan, Y.³ Dang, Q.⁴ Sun, G.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.