-
1
-
-
84954514573
-
-
NVIDIA cuBLAS https://developer.nvidia.com/cublas.
-
NVIDIA cuBLAS
-
-
-
2
-
-
84971553276
-
-
NVIDIA cuDNN https://developer.nvidia.com/cudnn.
-
NVIDIA cuDNN
-
-
-
3
-
-
84858012279
-
Scalable inference in latent variable models
-
A. Ahmed, M. Aly, J. Gonzalez, S. Narayanamurthy, and A. J. Smola. Scalable inference in latent variable models. In WSDM, 2012.
-
(2012)
WSDM
-
-
Ahmed, A.1
Aly, M.2
Gonzalez, J.3
Narayanamurthy, S.4
Smola, A.J.5
-
4
-
-
84919919193
-
Distributed stochastic gradient MCMC
-
S. Ahn, B. Shahbaba, and M. Welling. Distributed stochastic gradient MCMC. In ICML, 2014.
-
(2014)
ICML
-
-
Ahn, S.1
Shahbaba, B.2
Welling, M.3
-
6
-
-
84990032982
-
-
arXiv preprint arXiv:1512.01274
-
T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274, 2015.
-
(2015)
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
-
-
Chen, T.1
Li, M.2
Li, Y.3
Lin, M.4
Wang, N.5
Wang, M.6
Xiao, T.7
Xu, B.8
Zhang, C.9
Zhang, Z.10
-
7
-
-
85069497682
-
Project Adam: Building an efficient and scalable deep learning training system
-
T. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman. Project Adam: Building an efficient and scalable deep learning training system. In OSDI, 2014.
-
(2014)
OSDI
-
-
Chilimbi, T.1
Suzue, Y.2
Apacible, J.3
Kalyanaraman, K.4
-
8
-
-
84866714584
-
Multi-column deep neural networks for image classification
-
D. Ciresan, U. Meier, and J. Schmidhuber. Multi-column deep neural networks for image classification. In CVPR, 2012.
-
(2012)
CVPR
-
-
Ciresan, D.1
Meier, U.2
Schmidhuber, J.3
-
9
-
-
84897484337
-
Deep learning with COTS HPC systems
-
A. Coates, B. Huval, T. Wang, D. Wu, B. Catanzaro, and A. Ng. Deep learning with COTS HPC systems. In ICML, 2013.
-
(2013)
ICML
-
-
Coates, A.1
Huval, B.2
Wang, T.3
Wu, D.4
Catanzaro, B.5
Ng, A.6
-
10
-
-
85077475089
-
Exploiting bounded staleness to speed up big data analytics
-
H. Cui, J. Cipar, Q. Ho, J. K. Kim, S. Lee, A. Kumar, J. Wei, W. Dai, G. R. Ganger, P. B. Gibbons, G. A. Gibson, and E. P. Xing. Exploiting bounded staleness to speed up big data analytics. In USENIX ATC, 2014.
-
(2014)
USENIX ATC
-
-
Cui, H.1
Cipar, J.2
Ho, Q.3
Kim, J.K.4
Lee, S.5
Kumar, A.6
Wei, J.7
Dai, W.8
Ganger, G.R.9
Gibbons, P.B.10
Gibson, G.A.11
Xing, E.P.12
-
11
-
-
85118315826
-
Exploiting iterative-ness for parallel ML computations
-
H. Cui, A. Tumanov, J. Wei, L. Xu, W. Dai, J. Haber-Kucharsky, Q. Ho, G. R. Ganger, P. B. Gibbons, G. A. Gibson, and E. P. Xing. Exploiting iterative-ness for parallel ML computations. In SoCC, 2014.
-
(2014)
SoCC
-
-
Cui, H.1
Tumanov, A.2
Wei, J.3
Xu, L.4
Dai, W.5
Haber-Kucharsky, J.6
Ho, Q.7
Ganger, G.R.8
Gibbons, P.B.9
Gibson, G.A.10
Xing, E.P.11
-
12
-
-
84971509545
-
Scalable deep learning on distributed GPUs with a GPU-specialized parameter server
-
H. Cui, G. R. Ganger, and P. B. Gibbons. Scalable deep learning on distributed GPUs with a GPU-specialized parameter server. CMU PDL Technical Report (CMU-PDL-15-107), 2015.
-
(2015)
CMU PDL Technical Report
-
-
Cui, H.1
Ganger, G.R.2
Gibbons, P.B.3
-
13
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
G. E. Dahl, D. Yu, L. Deng, and A. Acero. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 20(1), 2012.
-
(2012)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.20
, Issue.1
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
14
-
-
84877760312
-
Large scale distributed deep networks
-
J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le, et al. Large scale distributed deep networks. In NIPS, 2012.
-
(2012)
NIPS
-
-
Dean, J.1
Corrado, G.2
Monga, R.3
Chen, K.4
Devin, M.5
Mao, M.6
Senior, A.7
Tucker, P.8
Yang, K.9
Le, Q.V.10
-
15
-
-
85198028989
-
ImageNet: A large-scale hierarchical image database
-
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
-
(2009)
CVPR
-
-
Deng, J.1
Dong, W.2
Socher, R.3
Li, L.-J.4
Li, K.5
Fei-Fei, L.6
-
16
-
-
84944046597
-
-
arXiv preprint arXiv:1411.4389
-
J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. arXiv preprint arXiv:1411.4389, 2014.
-
(2014)
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
-
-
Donahue, J.1
Hendricks, L.A.2
Guadarrama, S.3
Rohrbach, M.4
Venugopalan, S.5
Saenko, K.6
Darrell, T.7
-
17
-
-
84891720231
-
PRObE: A thousand-node experimental cluster for computer systems research
-
G. Gibson, G. Grider, A. Jacobson, and W. Lloyd. PRObE: A thousand-node experimental cluster for computer systems research. USENIX ;login:, 2013.
-
(2013)
USENIX ;login:
-
-
Gibson, G.1
Grider, G.2
Jacobson, A.3
Lloyd, W.4
-
18
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6), 2012.
-
(2012)
IEEE Signal Processing Magazine
, vol.29
, Issue.6
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.-R.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
-
19
-
-
84898988368
-
More effective distributed ML via a Stale Synchronous Parallel parameter server
-
Q. Ho, J. Cipar, H. Cui, S. Lee, J. K. Kim, P. B. Gibbons, G. A. Gibson, G. R. Ganger, and E. P. Xing. More effective distributed ML via a Stale Synchronous Parallel parameter server. In NIPS, 2013.
-
(2013)
NIPS
-
-
Ho, Q.1
Cipar, J.2
Cui, H.3
Lee, S.4
Kim, J.K.5
Gibbons, P.B.6
Gibson, G.A.7
Ganger, G.R.8
Xing, E.P.9
-
21
-
-
84913555165
-
-
arXiv preprint arXiv:1408.5093
-
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
-
(2014)
Caffe: Convolutional Architecture for Fast Feature Embedding
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.6
Guadarrama, S.7
Darrell, T.8
-
23
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
-
(2012)
NIPS
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
24
-
-
84937912100
-
Scaling distributed machine learning with the parameter server
-
M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B.-Y. Su. Scaling distributed machine learning with the parameter server. In OSDI, 2014.
-
(2014)
OSDI
-
-
Li, M.1
Andersen, D.G.2
Park, J.W.3
Smola, A.J.4
Ahmed, A.5
Josifovski, V.6
Long, J.7
Shekita, E.J.8
Su, B.-Y.9
-
25
-
-
82155188108
-
Piccolo: Building fast, distributed programs with partitioned tables
-
R. Power and J. Li. Piccolo: Building fast, distributed programs with partitioned tables. In OSDI, 2010.
-
(2010)
OSDI
-
-
Power, R.1
Li, J.2
-
26
-
-
84947041871
-
ImageNet large scale visual recognition challenge
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 2015.
-
(2015)
International Journal of Computer Vision
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
Berg, A.C.11
Fei-Fei, L.12
-
28
-
-
84964983441
-
-
arXiv preprint arXiv:1409.4842
-
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. arXiv preprint arXiv:1409.4842, 2014.
-
(2014)
Going Deeper with Convolutions
-
-
Szegedy, C.1
Liu, W.2
Jia, Y.3
Sermanet, P.4
Reed, S.5
Anguelov, D.6
Erhan, D.7
Vanhoucke, V.8
Rabinovich, A.9
-
30
-
-
84906870319
-
Minerva: A scalable and highly efficient training platform for deep learning
-
M. Wang, T. Xiao, J. Li, J. Zhang, C. Hong, and Z. Zhang. Minerva: A scalable and highly efficient training platform for deep learning. NIPS 2014 Workshop of Distributed Matrix Computations, 2014.
-
(2014)
NIPS 2014 Workshop of Distributed Matrix Computations
-
-
Wang, M.1
Xiao, T.2
Li, J.3
Zhang, J.4
Hong, C.5
Zhang, Z.6
-
31
-
-
84912132796
-
-
arXiv preprint arXiv:1405.4402
-
Y. Wang, X. Zhao, Z. Sun, H. Yan, L. Wang, Z. Jin, L. Wang, Y. Gao, J. Zeng, Q. Yang, et al. Towards topic modeling for big data. arXiv preprint arXiv:1405.4402, 2014.
-
(2014)
Towards Topic Modeling for Big Data
-
-
Wang, Y.1
Zhao, X.2
Sun, Z.3
Yan, H.4
Wang, L.5
Jin, Z.6
Wang, L.7
Gao, Y.8
Zeng, J.9
Yang, Q.10
-
32
-
-
84959036260
-
Managed communication and consistency for fast data-parallel iterative analytics
-
J. Wei, W. Dai, A. Qiao, Q. Ho, H. Cui, G. R. Ganger, P. B. Gibbons, G. A. Gibson, and E. P. Xing. Managed communication and consistency for fast data-parallel iterative analytics. In SoCC, 2015.
-
(2015)
SoCC
-
-
Wei, J.1
Dai, W.2
Qiao, A.3
Ho, Q.4
Cui, H.5
Ganger, G.R.6
Gibbons, P.B.7
Gibson, G.A.8
Xing, E.P.9
-
33
-
-
84930572185
-
-
arXiv preprint arXiv:1501.02876
-
R. Wu, S. Yan, Y. Shan, Q. Dang, and G. Sun. Deep image: Scaling up image recognition. arXiv preprint arXiv:1501.02876, 2015.
-
(2015)
Deep Image: Scaling Up Image Recognition
-
-
Wu, R.1
Yan, S.2
Shan, Y.3
Dang, Q.4
Sun, G.5
-
34
-
-
84959228762
-
Beyond short snippets: Deep networks for video classification
-
J. Yue-Hei Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, and G. Toderici. Beyond short snippets: Deep networks for video classification. In CVPR, 2015.
-
(2015)
CVPR
-
-
Yue-Hei Ng, J.1
Hausknecht, M.2
Vijayanarasimhan, S.3
Vinyals, O.4
Monga, R.5
Toderici, G.6
-
35
-
-
84971554516
-
-
arXiv preprint arXiv:1512.06216
-
H. Zhang, Z. Hu, J. Wei, P. Xie, G. Kim, Q. Ho, and E. Xing. Poseidon: A system architecture for efficient GPU-based deep learning on multiple machines. arXiv preprint arXiv:1512.06216, 2015.
-
(2015)
Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines
-
-
Zhang, H.1
Hu, Z.2
Wei, J.3
Xie, P.4
Kim, G.5
Ho, Q.6
Xing, E.7
-
36
-
-
84912111128
-
Asynchronous distributed ADMM algorithm for global variable consensus optimization
-
R. Zhang and J. Kwok. Asynchronous distributed ADMM algorithm for global variable consensus optimization. In ICML, 2014.
-
(2014)
ICML
-
-
Zhang, R.1
Kwok, J.2