Volume , Issue , 2017, Pages 193-205

S-Caffe: Co-designing MPI runtimes and Caffe for scalable deep learning on modern GPU clusters

Author keywords

Caffe; CUDA aware MPI; Deep learning; Distributed training; MPI reduce

Indexed keywords

COBALT; COBALT COMPOUNDS; DESIGN; GRAPHICS PROCESSING UNIT; PARALLEL PROGRAMMING; PROGRAM PROCESSORS;

EID: 85014452127     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/3018743.3018769     Document Type: Conference Paper
Times cited: 124

References (49)
  • 1
    • 85014491906
    • Caffe: Multi-GPU Usage and Performance. https://github.com/yahoo/caffe/blob/master/docs/multigpu.md.
  • 2
    • 85014502106
    • KESCH: Cray CS-Storm System. http://www.cscs.ch/computers/kesch escha/index.html.
  • 3
    • 85014442860
    • Intel Caffe. https://github.com/intelcaffe.
  • 4
    • 85014428233
    • A Unified Runtime System for Heterogeneous Multicore Architectures. http://starpu.gforge.inria.fr.
  • 5
    • 85014443136
    • ILSVRC2012 Dataset. http://image-net.org/challenges/LSVRC/2012/index, 2012. [Online; accessed Dec-2016].
    • (2012)
  • 6
    • 85014480125
    • Caffe Website. http://caffe.berkeleyvision.org/, 2015. [Online; accessed Dec-2016].
    • (2015)
  • 7
    • 85014476564
    • CaffeNet. http://papers.nips.cc/book/advances-in-neural-information-processing-systems-25-2012, 2015. [Online; accessed Dec-2016].
    • (2015)
  • 8
    • 85014434114
    • GPU Direct RDMA. http://docs.nvidia.com/cuda/gpudirectrdma/, 2015. [Online; accessed Dec-2016].
    • (2015)
  • 9
    • 85014443969
    • HPC: Powering Deep Learning. http://computing.ornl.gov/workshops/SMC15/docs/bcatanzaro smcc.pdf, 2015. [Online; accessed Dec-2016].
    • (2015)
  • 10
    • 85014500526
    • LMDB. http://symas.com/mdb/, 2015. [Online; accessed Dec-2016].
    • (2015)
  • 11
    • 85014503814
    • Nvidia Development Platform for Autonomous Cars. http://www.nvidia.com/object/drive-px.html, 2016. [Online; accessed Dec-2016].
    • (2016)
  • 12
    • 85014461247
    • CNTK. http://www.cntk.ai/, 2016. [Online; accessed Dec-2016].
    • (2016)
  • 13
    • 85014461431
    • Nvidia GPUs Comparison. http://www.extremetech.com/computing/194391-nvidias-new-tesla-k80-doubles-up-on-gpu-horsepower, 2016. [Online; accessed Dec-2016].
    • (2016)
  • 15
    • 41249087856
    • General purpose molecular dynamics simulations fully implemented on graphics processing units
    • J. A. Anderson, C. D. Lorenz, and A. Travesset. General Purpose Molecular Dynamics Simulations Fully Implemented on Graphics Processing Units. Journal of Computational Physics, 227(10):5342-5359, 2008.
    • (2008) Journal of Computational Physics , vol.227 , Issue.10 , pp. 5342-5359
    • Anderson, J.A.1    Lorenz, C.D.2    Travesset, A.3
  • 16
    • 85017408696
    • Comparative study of caffe, neon, theano, and torch for deep learning
    • S. Bahrampour, N. Ramakrishnan, L. Schott, and M. Shah. Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning. CoRR, abs/1511.06435, 2016.
    • (2016) CoRR
    • Bahrampour, S.1    Ramakrishnan, N.2    Schott, L.3    Shah, M.4
  • 22
    • 85014448741
    • Cray. http://docs.cray.com/books/004-3689-001/html-004-3689-001/004-3689-001-toc.html, 2016. [Online; accessed Dec-2016].
    • (2016)
  • 23
    • 84971575164
    • GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server
    • H. Cui, H. Zhang, G. R. Ganger, P. B. Gibbons, and E. P. Xing. GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server. In Proceedings of the Eleventh European Conference on Computer Systems, EuroSys'16, pages 4:1-4:16, New York, NY, USA, 2016. ACM. ISBN 978-1-4503-4240-7. doi: 10.1145/2901318.2901323. URL http://doi.acm.org/10.1145/2901318.2901323.
    • (2016) Proceedings of the Eleventh European Conference on Computer Systems, EuroSys'16 , pp. 4:1-4:16
    • Cui, H.1    Zhang, H.2    Ganger, G.R.3    Gibbons, P.B.4    Xing, E.P.5
  • 31
    • 85014424561
    • Inspur. https://github.com/Caffe-MPI/Caffe-MPI.github.io, 2016.
    • (2016)
  • 34
    • 84946590547
    • One weird trick for parallelizing convolutional neural networks
    • A. Krizhevsky. One weird trick for parallelizing convolutional neural networks. CoRR, abs/1404.5997, 2014.
    • (2014) CoRR
    • Krizhevsky, A.1
  • 36
    • 84876231242
    • ImageNet classification with deep convolutional neural networks
    • A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097-1105. Curran Associates, Inc., 2012. URL http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
    • (2012) Advances in Neural Information Processing Systems 25 , pp. 1097-1105
    • Krizhevsky, A.1    Sutskever, I.2    Hinton, G.E.3
  • 42
    • 85014428518
    • Network Based Computing Laboratory. OSU Micro-Benchmarks. http://mvapich.cse.ohio-state.edu/benchmarks/, 2016.
    • (2016) OSU Micro-Benchmarks


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.