-
1
-
-
85069497682
-
Project Adam: Building an efficient and scalable deep learning training system
-
Chilimbi, Trishul, Suzue, Yutaka, Apacible, Johnson, and Kalyanaraman, Karthik. Project Adam: Building an efficient and scalable deep learning training system. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pp. 571–582, 2014.
-
(2014)
11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14)
, pp. 571-582
-
-
Chilimbi, T.1
Suzue, Y.2
Apacible, J.3
Kalyanaraman, K.4
-
2
-
-
84866714584
-
Multi-column deep neural networks for image classification
-
Ciresan, Dan, Meier, Ueli, and Schmidhuber, Jürgen. Multi-column deep neural networks for image classification. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 3642–3649. IEEE, 2012.
-
(2012)
Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on
, pp. 3642-3649
-
-
Ciresan, D.1
Meier, U.2
Schmidhuber, J.3
-
3
-
-
84894294885
-
Deep learning with COTS HPC systems
-
Coates, Adam, Huval, Brody, Wang, Tao, Wu, David, Catanzaro, Bryan, and Ng, Andrew. Deep learning with COTS HPC systems. In Proceedings of the 30th International Conference on Machine Learning, pp. 1337–1345, 2013.
-
(2013)
Proceedings of the 30th International Conference on Machine Learning
, pp. 1337-1345
-
-
Coates, A.1
Huval, B.2
Wang, T.3
Wu, D.4
Catanzaro, B.5
Ng, A.6
-
5
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
Dahl, George E, Yu, Dong, Deng, Li, and Acero, Alex. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. Audio, Speech, and Language Processing, IEEE Transactions on, 20(1):30–42, 2012.
-
(2012)
Audio, Speech, and Language Processing, IEEE Transactions on
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
6
-
-
84877760312
-
Large scale distributed deep networks
-
Dean, Jeffrey, Corrado, Greg, Monga, Rajat, Chen, Kai, Devin, Matthieu, Mao, Mark, Senior, Andrew, Tucker, Paul, Yang, Ke, Le, Quoc V, et al. Large scale distributed deep networks. In Advances in Neural Information Processing Systems, pp. 1223–1231, 2012.
-
(2012)
Advances in Neural Information Processing Systems
, pp. 1223-1231
-
-
Dean, J.1
Corrado, G.2
Monga, R.3
Chen, K.4
Devin, M.5
Mao, M.6
Senior, A.7
Tucker, P.8
Yang, K.9
Le, Q.V.10
-
7
-
-
84892421248
-
-
arXiv preprint
-
Goodfellow, Ian J, Warde-Farley, David, Mirza, Mehdi, Courville, Aaron, and Bengio, Yoshua. Maxout networks. arXiv preprint arXiv:1302.4389, 2013.
-
(2013)
Maxout Networks
-
-
Goodfellow, I.J.1
Warde-Farley, D.2
Mirza, M.3
Courville, A.4
Bengio, Y.5
-
8
-
-
84946878550
-
-
arXiv preprint
-
Gupta, Suyog, Agrawal, Ankur, Gopalakrishnan, Kailash, and Narayanan, Pritish. Deep learning with limited numerical precision. arXiv preprint arXiv:1502.02551, 2015.
-
(2015)
Deep Learning with Limited Numerical Precision
-
-
Gupta, S.1
Agrawal, A.2
Gopalakrishnan, K.3
Narayanan, P.4
-
10
-
-
0041914606
-
-
Hochreiter, Sepp, Bengio, Yoshua, Frasconi, Paolo, and Schmidhuber, Jürgen. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies, 2001.
-
(2001)
Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies
-
-
Hochreiter, S.1
Bengio, Y.2
Frasconi, P.3
Schmidhuber, J.4
-
12
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
-
(2012)
Advances in Neural Information Processing Systems
, pp. 1097-1105
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
13
-
-
84921817164
-
Learning representations by back-propagating errors
-
Rumelhart, David E, Hinton, Geoffrey E, and Williams, Ronald J. Learning representations by back-propagating errors. Cognitive modeling, 5:3, 1988.
-
(1988)
Cognitive Modeling
, vol.5
, pp. 3
-
-
Rumelhart, D.E.1
Hinton, G.E.2
Williams, R.J.3
-
14
-
-
84910651844
-
Deep learning in neural networks: An overview
-
Schmidhuber, Jürgen. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015.
-
(2015)
Neural Networks
, vol.61
, pp. 85-117
-
-
Schmidhuber, J.1
-
15
-
-
84910069984
-
1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs
-
Seide, Frank, Fu, Hao, Droppo, Jasha, Li, Gang, and Yu, Dong. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs. In Fifteenth Annual Conference of the International Speech Communication Association, 2014.
-
(2014)
Fifteenth Annual Conference of the International Speech Communication Association
-
-
Seide, F.1
Fu, H.2
Droppo, J.3
Li, G.4
Yu, D.5
-
18
-
-
33646719765
-
High performance RDMA based all-to-all broadcast for InfiniBand clusters
-
Springer
-
Sur, Sayantan, Bondhugula, Uday Kumar Reddy, Mamidala, Amith, Jin, H-W, and Panda, Dhabaleswar K. High performance RDMA based all-to-all broadcast for InfiniBand clusters. In High Performance Computing–HiPC 2005, pp. 148–157. Springer, 2005.
-
(2005)
High Performance Computing–HiPC 2005
, pp. 148-157
-
-
Sur, S.1
Bondhugula, U.K.R.2
Mamidala, A.3
Jin, H.-W.4
Panda, D.K.5
-
19
-
-
84893343292
-
Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
-
Tieleman, Tijmen and Hinton, Geoffrey. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 2012.
-
(2012)
COURSERA: Neural Networks for Machine Learning
, vol.4
-
-
Tieleman, T.1
Hinton, G.2
-
20
-
-
84867754966
-
Improving the speed of neural networks on CPUs
-
Vanhoucke, Vincent, Senior, Andrew, and Mao, Mark Z. Improving the speed of neural networks on CPUs. In Proc. Deep Learning and Unsupervised Feature Learning NIPS Workshop, volume 1, 2011.
-
(2011)
Proc. Deep Learning and Unsupervised Feature Learning NIPS Workshop
, vol.1
-
-
Vanhoucke, V.1
Senior, A.2
Mao, M.Z.3
-
21
-
-
84930572185
-
-
arXiv preprint
-
Wu, Ren, Yan, Shengen, Shan, Yi, Dang, Qingqing, and Sun, Gang. Deep image: Scaling up image recognition. arXiv preprint arXiv:1501.02876, 2015.
-
(2015)
Deep Image: Scaling up Image Recognition
-
-
Wu, R.1
Yan, S.2
Shan, Y.3
Dang, Q.4
Sun, G.5