Volume , Issue , 2017, Pages 5-14

Can FPGAs beat GPUs in accelerating next-generation deep neural networks?

Author keywords

Accelerator; Deep learning; GPU; Intel Stratix 10 FPGA

Indexed keywords

COMPUTER ARCHITECTURE; DEEP LEARNING; DEEP NEURAL NETWORKS; DIGITAL ARITHMETIC; ENERGY EFFICIENCY; GRAPHICS PROCESSING UNIT; LOGIC GATES; NEXT GENERATION NETWORKS; PARTICLE ACCELERATORS; PROGRAM PROCESSORS; RANDOM ACCESS STORAGE;

EID: 85016004196     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/3020078.3021740     Document Type: Conference Paper
Times cited: 429

References (30)
  • 1
    • M. Courbariaux, Y. Bengio, J.-P. David, "BinaryConnect: Training Deep Neural Networks with binary weights during propagations," NIPS 2015.
  • 5
    • G. Venkatesh, E. Nurvitadhi, D. Marr, "Accelerating Deep Convolutional Networks Using Low-Precision and Sparsity," ICASSP 2017.
  • 6
    • S. Han, H. Mao, W. J. Dally, "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization, and Huffman Coding," ICLR 2016.
  • 7
    • P. Gysel, et al., "Hardware-Oriented Approximation of Convolutional Neural Networks," ICLR Workshop 2016.
  • 8
    • J. Albericio, P. Judd, T. Hetherington, et al., "Cnvlutin: Ineffectual-Neuron-Free Deep Convolutional Neural Network Computing," ISCA 2016.
  • 9
    • S. Han, X. Liu, et al., "EIE: Efficient Inference Engine on Compressed Deep Neural Network," ISCA 2016.
  • 10
    • N. Suda, V. Chandra, et al., "Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks," ISFPGA 2016.
  • 11
    • J. Qiu, et al., "Going Deeper with Embedded FPGA Platform for Convolutional Neural Network," ISFPGA 2016.
  • 12
    • P. K. Gupta, "Accelerating Datacenter Workloads," Keynote at FPL 2016. Slides available at www.fpl2016.org.
  • 13
    • A. Putnam, A. M. Caulfield, et al., "A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services," ISCA 2014.
  • 14
    • S. Y. Kung, "VLSI Array Processors," Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1987.
  • 15
    • A. Pedram, et al., "A High-Performance, Low-Power Linear Algebra Core," ASAP 2011.
  • 16
    • Altera Arria 10 Website. https://www.altera.com/products/fpga/arria-series/arria-10/overview.html
  • 17
    • Altera Stratix 10 Website. https://www.altera.com/products/fpga/stratix-series/stratix-10/overview.html
  • 18
    • Nvidia Titan X Website. http://www.geforce.com/hardware/10series/titan-x-pascal
  • 22
    • G. Baeckler, "HyperPipelining of High-Speed Interface Logic," ISFPGA Tutorial, 2016.
  • 24
    • P. D'Alberto, P. A. Milder, et al., "Generating FPGA Accelerated DFT Libraries," FCCM 2007.
  • 25
    • W. Chen, J. Wilson, et al., "Compressing Neural Networks with the Hashing Trick," ICML 2015.
  • 28
    • E. Nurvitadhi, J. Sim, D. Sheffield, et al., "Accelerating Recurrent Neural Networks in Analytics Servers: Comparison of FPGA, CPU, GPU, and ASIC," FPL 2016.
  • 30
    • E. Nurvitadhi, D. Sheffield, J. Sim, et al., "Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC," FPT 2016.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.