[1] M. Courbariaux, Y. Bengio, and J.-P. David, "BinaryConnect: Training Deep Neural Networks with Binary Weights During Propagations," NIPS 2015.
[5] G. Venkatesh, E. Nurvitadhi, and D. Marr, "Accelerating Deep Convolutional Networks Using Low-Precision and Sparsity," ICASSP 2017.
[6] S. Han, H. Mao, and W. J. Dally, "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization, and Huffman Coding," ICLR 2016.
[7] P. Gysel et al., "Hardware-Oriented Approximation of Convolutional Neural Networks," ICLR Workshop 2016.
[8] J. Albericio, P. Judd, T. Hetherington, et al., "Cnvlutin: Ineffectual-Neuron-Free Deep Convolutional Neural Network Computing," ISCA 2016.
[9] S. Han, X. Liu, et al., "EIE: Efficient Inference Engine on Compressed Deep Neural Network," ISCA 2016.
[10] N. Suda, V. Chandra, et al., "Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks," ISFPGA 2016.
[11] J. Qiu et al., "Going Deeper with Embedded FPGA Platform for Convolutional Neural Network," ISFPGA 2016.
[12] P. K. Gupta, "Accelerating Datacenter Workloads," keynote at FPL 2016. Slides available at www.fpl2016.org.
[13] A. Putnam, A. M. Caulfield, et al., "A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services," ISCA 2014.
[14] S. Y. Kung, "VLSI Array Processors," Prentice-Hall, Upper Saddle River, NJ, USA, 1987.
[15] A. Pedram et al., "A High-Performance, Low-Power Linear Algebra Core," ASAP 2011.
[16] Altera Arria 10 Website. https://www.altera.com/products/fpga/arria-series/arria-10/overview.html
[17] Altera Stratix 10 Website. https://www.altera.com/products/fpga/stratix-series/stratix-10/overview.html
[18] Nvidia Titan X Website. http://www.geforce.com/hardware/10series/titan-x-pascal
[22] G. Baeckler, "HyperPipelining of High-Speed Interface Logic," ISFPGA Tutorial, 2016.
[24] P. D'Alberto, P. A. Milder, et al., "Generating FPGA Accelerated DFT Libraries," FCCM 2007.
[25] W. Chen, J. Wilson, et al., "Compressing Neural Networks with the Hashing Trick," ICML 2015.
[28] E. Nurvitadhi, J. Sim, D. Sheffield, et al., "Accelerating Recurrent Neural Networks in Analytics Servers: Comparison of FPGA, CPU, GPU, and ASIC," FPL 2016.
[30] E. Nurvitadhi, D. Sheffield, J. Sim, et al., "Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC," FPT 2016.