SCOPUS Information Search Platform

4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings

Volume , Issue , 2016, Pages

8-bit approximations for parallelism in deep learning

(1) Dettmers, Tim a

a UNIVERSITY OF LUGANO (Switzerland)

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION ALGORITHMS; BANDWIDTH; DATA TRANSFER; PROGRAM PROCESSORS;

AVAILABLE BANDWIDTH; COMMUNICATION BANDWIDTH; CONVOLUTIONAL NETWORKS; DATA PARALLELISM; NON-LINEAR ACTIVATION; PREDICTIVE MODELING; PREDICTIVE PERFORMANCE; VERY LARGE SYSTEMS;

DEEP LEARNING;

EID: 85083953599
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (69)

References (21)

1
- 85069497682
- Project adam: Building an efficient and scalable deep learning training system
- Chilimbi, Trishul, Suzue, Yutaka, Apacible, Johnson, and Kalyanaraman, Karthik. Project adam: Building an efficient and scalable deep learning training system. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pp. 571–582, 2014.
- (2014) 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14) , pp. 571-582
- Chilimbi, T.¹ Suzue, Y.² Apacible, J.³ Kalyanaraman, K.⁴

2
- 84866714584
- Multi-column deep neural networks for image classification
- Ciresan, Dan, Meier, Ueli, and Schmidhuber, Jürgen. Multi-column deep neural networks for image classification. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 3642–3649. IEEE, 2012.
- (2012) Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on , pp. 3642-3649
- Ciresan, D.¹ Meier, U.² Schmidhuber, J.³

3
- 84894294885
- Deep learning with COTS HPC systems
- Coates, Adam, Huval, Brody, Wang, Tao, Wu, David, Catanzaro, Bryan, and Ng, Andrew. Deep learning with COTS HPC systems. In Proceedings of the 30th International Conference on Machine Learning, pp. 1337–1345, 2013.
- (2013) Proceedings of the 30th International Conference on Machine Learning , pp. 1337-1345
- Coates, A.¹ Huval, B.² Wang, T.³ Wu, D.⁴ Catanzaro, B.⁵ Ng, A.⁶

4
- 84963513778
- Low precision arithmetic for deep learning
- Courbariaux, Matthieu, Bengio, Yoshua, and David, Jean-Pierre. Low precision arithmetic for deep learning. arXiv preprint arXiv:1412.7024, 2014.
- (2014) Low Precision Arithmetic for Deep Learning
- Courbariaux, M.¹ Bengio, Y.² David, J.-P.³

5
- 84055222005
- Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
- Dahl, George E, Yu, Dong, Deng, Li, and Acero, Alex. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. Audio, Speech, and Language Processing, IEEE Transactions on, 20(1):30–42, 2012.
- (2012) Audio, Speech, and Language Processing, IEEE Transactions on , vol.20 , Issue.1 , pp. 30-42
- Dahl, G.E.¹ Yu, D.² Deng, L.³ Acero, A.⁴

6
- 84877760312
- Large scale distributed deep networks
- Dean, Jeffrey, Corrado, Greg, Monga, Rajat, Chen, Kai, Devin, Matthieu, Mao, Mark, Senior, Andrew, Tucker, Paul, Yang, Ke, Le, Quoc V, et al. Large scale distributed deep networks. In Advances in Neural Information Processing Systems, pp. 1223–1231, 2012.
- (2012) Advances in Neural Information Processing Systems , pp. 1223-1231
- Dean, J.¹ Corrado, G.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Mao, M.⁶ Senior, A.⁷ Tucker, P.⁸ Yang, K.⁹ Le, Q.V.¹⁰

7
- 84892421248
- Maxout networks
- Goodfellow, Ian J, Warde-Farley, David, Mirza, Mehdi, Courville, Aaron, and Bengio, Yoshua. Maxout networks. arXiv preprint arXiv:1302.4389, 2013.
- (2013) Maxout Networks
- Goodfellow, I.J.¹ Warde-Farley, D.² Mirza, M.³ Courville, A.⁴ Bengio, Y.⁵

8
- 84946878550
- Deep learning with limited numerical precision
- Gupta, Suyog, Agrawal, Ankur, Gopalakrishnan, Kailash, and Narayanan, Pritish. Deep learning with limited numerical precision. arXiv preprint arXiv:1502.02551, 2015.
- (2015) Deep Learning with Limited Numerical Precision
- Gupta, S.¹ Agrawal, A.² Gopalakrishnan, K.³ Narayanan, P.⁴

9
- 0031573117
- Long short-term memory
- Hochreiter, Sepp and Schmidhuber, Jürgen. Long short-term memory. Neural computation, 9(8): 1735–1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

10
- 0041914606
- Hochreiter, Sepp, Bengio, Yoshua, Frasconi, Paolo, and Schmidhuber, Jürgen. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies, 2001.
- (2001) Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies
- Hochreiter, S.¹ Bengio, Y.² Frasconi, P.³ Schmidhuber, J.⁴

11
- 84932095919
- One weird trick for parallelizing convolutional neural networks
- Krizhevsky, Alex. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997, 2014.
- (2014) One Weird Trick for Parallelizing Convolutional Neural Networks
- Krizhevsky, A.¹

12
- 84876231242
- Imagenet classification with deep convolutional neural networks
- Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105, 2012.
- (2012) Advances in Neural Information Processing Systems , pp. 1097-1105
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

13
- 84921817164
- Learning representations by back-propagating errors
- Rumelhart, David E, Hinton, Geoffrey E, and Williams, Ronald J. Learning representations by back-propagating errors. Cognitive modeling, 5:3, 1988.
- (1988) Cognitive Modeling , vol.5 , pp. 3
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

14
- 84910651844
- Deep learning in neural networks: An overview
- Schmidhuber, Jürgen. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015.
- (2015) Neural Networks , vol.61 , pp. 85-117
- Schmidhuber, J.¹

15
- 84910069984
- 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs
- Seide, Frank, Fu, Hao, Droppo, Jasha, Li, Gang, and Yu, Dong. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech dnns. In Fifteenth Annual Conference of the International Speech Communication Association, 2014.
- (2014) Fifteenth Annual Conference of the International Speech Communication Association
- Seide, F.¹ Fu, H.² Droppo, J.³ Li, G.⁴ Yu, D.⁵

16
- 85052052023
- Optimizing All-to-All and Allgather Communications on GPGPU Clusters
- Singh, Ashish Kumar. Optimizing All-to-All and Allgather Communications on GPGPU Clusters. PhD thesis, The Ohio State University, 2012.
- (2012) Optimizing All-to-All and Allgather Communications on GPGPU Clusters
- Singh, A.K.¹

17
- 84959142008
- Scalable distributed dnn training using commodity GPU cloud computing
- Strom, Nikko. Scalable distributed dnn training using commodity gpu cloud computing. In Sixteenth Annual Conference of the International Speech Communication Association, 2015.
- (2015) Sixteenth Annual Conference of the International Speech Communication Association
- Strom, N.¹

18
- 33646719765
- High performance rdma based all-to-all broadcast for infiniband clusters
- Springer
- Sur, Sayantan, Bondhugula, Uday Kumar Reddy, Mamidala, Amith, Jin, H-W, and Panda, Dhabaleswar K. High performance rdma based all-to-all broadcast for infiniband clusters. In High Performance Computing–HiPC 2005, pp. 148–157. Springer, 2005.
- (2005) High Performance Computing–HiPC 2005 , pp. 148-157
- Sur, S.¹ Bondhugula, U.K.R.² Mamidala, A.³ Jin, H.-W.⁴ Panda, D.K.⁵

19
- 84893343292
- Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
- Tieleman, Tijmen and Hinton, Geoffrey. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 2012.
- (2012) COURSERA: Neural Networks for Machine Learning , vol.4
- Tieleman, T.¹ Hinton, G.²

20
- 84867754966
- Improving the speed of neural networks on cpus
- Vanhoucke, Vincent, Senior, Andrew, and Mao, Mark Z. Improving the speed of neural networks on cpus. In Proc. Deep Learning and Unsupervised Feature Learning NIPS Workshop, volume 1, 2011.
- (2011) Proc. Deep Learning and Unsupervised Feature Learning NIPS Workshop , vol.1
- Vanhoucke, V.¹ Senior, A.² Mao, M.Z.³

21
- 84930572185
- Deep image: Scaling up image recognition
- Wu, Ren, Yan, Shengen, Shan, Yi, Dang, Qingqing, and Sun, Gang. Deep image: Scaling up image recognition. arXiv preprint arXiv:1501.02876, 2015.
- (2015) Deep Image: Scaling up Image Recognition
- Wu, R.¹ Yan, S.² Shan, Y.³ Dang, Q.⁴ Sun, G.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.