SCOPUS Information Retrieval Platform

Proceedings - 2016 43rd International Symposium on Computer Architecture, ISCA 2016

2016, Pages 367-379

Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks

Authors (3): Chen, Yu Hsin (a); Emer, Joel (a, b); Sze, Vivienne (a)

a Cambridge (United States)

b NVIDIA (United States)

Author keywords

Convolutional Neural Networks; Dataflow; Energy Efficiency; Spatial Architecture

Indexed keywords

ARTIFICIAL INTELLIGENCE; COMPLEX NETWORKS; COMPUTER ARCHITECTURE; CONVOLUTION; COST BENEFIT ANALYSIS; COSTS; DATA FLOW ANALYSIS; DATA HANDLING; DIGITAL STORAGE; ENERGY UTILIZATION; NETWORK ARCHITECTURE; NEURAL NETWORKS;

ANALYSIS FRAMEWORKS; CONVOLUTIONAL NEURAL NETWORK; DATAFLOW; FABRICATED CHIPS; HIGH-DIMENSIONAL; PARALLEL PROCESSING; PROCESSING ENGINE; SPATIAL PARALLELISM;

ENERGY EFFICIENCY;

EID: 84988317007 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ISCA.2016.40 Document Type: Conference Paper

Times cited: 1493

References (43)

1
- 84930630277
- Deep learning
- Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, 2015.
- (2015) Nature , vol.521 , Issue.7553
- LeCun, Y.¹ Bengio, Y.² Hinton, G.³

2
- 84876231242
- Imagenet classification with deep convolutional neural networks
- A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in NIPS, 2012.
- (2012) NIPS
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

3
- 84925410541
- CoRR abs/1409.1556
- K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," CoRR, vol. abs/1409.1556, 2014.
- (2014) Very Deep Convolutional Networks for Large-Scale Image Recognition
- Simonyan, K.¹ Zisserman, A.²

4
- 84988372022
- Going deeper with convolutions
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper With Convolutions," in IEEE CVPR, 2015.
- (2015) IEEE CVPR
- Szegedy, C.¹ Liu, W.² Jia, Y.³ Sermanet, P.⁴ Reed, S.⁵ Anguelov, D.⁶ Erhan, D.⁷ Vanhoucke, V.⁸ Rabinovich, A.⁹

5
- 84988399663
- Deep residual learning for image recognition
- K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in IEEE CVPR, 2016.
- (2016) IEEE CVPR
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

6
- 84940737052
- Rich feature hierarchies for accurate object detection and semantic segmentation
- R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," in IEEE CVPR, 2014.
- (2014) IEEE CVPR
- Girshick, R.¹ Donahue, J.² Darrell, T.³ Malik, J.⁴

7
- 84906347546
- CoRR abs/1312.6229
- P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, "OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks," CoRR, vol. abs/1312.6229, 2013.
- (2013) OverFeat: Integrated Recognition, Localization and Detection Using Convolutional Networks
- Sermanet, P.¹ Eigen, D.² Zhang, X.³ Mathieu, M.⁴ Fergus, R.⁵ LeCun, Y.⁶

8
- 84937964578
- Learning deep features for scene recognition using places database
- B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, "Learning Deep Features for Scene Recognition using Places Database," in NIPS, 2014.
- (2014) NIPS
- Zhou, B.¹ Lapedriza, A.² Xiao, J.³ Torralba, A.⁴ Oliva, A.⁵

9
- 0024767889
- Handwritten digit recognition: Applications of neural network chips and automatic learning
- Y. Le Cun, L. Jackel, B. Boser, J. Denker, H. Graf, I. Guyon, D. Henderson, R. Howard, and W. Hubbard, "Handwritten digit recognition: applications of neural network chips and automatic learning," IEEE Communications Magazine, vol. 27, no. 11, 1989.
- (1989) IEEE Communications Magazine , vol.27 , Issue.11
- Le Cun, Y.¹ Jackel, L.² Boser, B.³ Denker, J.⁴ Graf, H.⁵ Guyon, I.⁶ Henderson, D.⁷ Howard, R.⁸ Hubbard, W.⁹

10
- 84988320060
- Minimizing computation in convolutional neural networks
- J. Cong and B. Xiao, "Minimizing computation in convolutional neural networks," in ICANN, 2014.
- (2014) ICANN
- Cong, J.¹ Xiao, B.²

11
- 82455182918
- Power, programmability, and granularity: The challenges of exascale computing
- B. Dally, "Power, Programmability, and Granularity: The Challenges of ExaScale Computing," in IEEE IPDPS, 2011.
- (2011) IEEE IPDPS
- Dally, B.¹

12
- 84946239930
- Computing's energy problem (and what we can do about it)
- M. Horowitz, "Computing's energy problem (and what we can do about it)," in IEEE ISSCC, 2014.
- (2014) IEEE ISSCC
- Horowitz, M.¹

13
- 77954995378
- Understanding sources of inefficiency in general-purpose chips
- R. Hameed, W. Qadeer, M. Wachs, O. Azizi, A. Solomatnikov, B. C. Lee, S. Richardson, C. Kozyrakis, and M. Horowitz, "Understanding Sources of Inefficiency in General-purpose Chips," in ISCA, 2010.
- (2010) ISCA
- Hameed, R.¹ Qadeer, W.² Wachs, M.³ Azizi, O.⁴ Solomatnikov, A.⁵ Lee, B.C.⁶ Richardson, S.⁷ Kozyrakis, C.⁸ Horowitz, M.⁹

14
- 84944081816
- CoRR abs/1410.0759
- S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer, "cuDNN: Efficient Primitives for Deep Learning," CoRR, vol. abs/1410.0759, 2014.
- (2014) CuDNN: Efficient Primitives for Deep Learning
- Chetlur, S.¹ Woolley, C.² Vandermersch, P.³ Cohen, J.⁴ Tran, J.⁵ Catanzaro, B.⁶ Shelhamer, E.⁷

15
- 84988439736
- A massively parallel coprocessor for convolutional neural networks
- M. Sankaradas, V. Jakkula, S. Cadambi, S. Chakradhar, I. Durdanovic, E. Cosatto, and H. P. Graf, "A Massively Parallel Coprocessor for Convolutional Neural Networks," in IEEE ASAP, 2009.
- (2009) IEEE ASAP
- Sankaradas, M.¹ Jakkula, V.² Cadambi, S.³ Chakradhar, S.⁴ Durdanovic, I.⁵ Cosatto, E.⁶ Graf, H.P.⁷

16
- 79551569552
- Towards an embedded biologically-inspired machine vision processor
- V. Sriram, D. Cox, K. H. Tsoi, and W. Luk, "Towards an embedded biologically-inspired machine vision processor," in FPT, 2010.
- (2010) FPT
- Sriram, V.¹ Cox, D.² Tsoi, K.H.³ Luk, W.⁴

17
- 77955007393
- A dynamically configurable coprocessor for convolutional neural networks
- S. Chakradhar, M. Sankaradas, V. Jakkula, and S. Cadambi, "A Dynamically Configurable Coprocessor for Convolutional Neural Networks," in ISCA, 2010.
- (2010) ISCA
- Chakradhar, S.¹ Sankaradas, M.² Jakkula, V.³ Cadambi, S.⁴

18
- 84892533708
- Memory-centric accelerator design for Convolutional Neural Networks
- M. Peemen, A. A. A. Setio, B. Mesman, and H. Corporaal, "Memory-centric accelerator design for Convolutional Neural Networks," in IEEE ICCD, 2013.
- (2013) IEEE ICCD
- Peemen, M.¹ Setio, A.A.A.² Mesman, B.³ Corporaal, H.⁴

19
- 84988337693
- A 240 g-ops/s mobile coprocessor for deep neural networks
- V. Gokhale, J. Jin, A. Dundar, B. Martini, and E. Culurciello, "A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks," in IEEE CVPRW, 2014.
- (2014) IEEE CVPRW
- Gokhale, V.¹ Jin, J.² Dundar, A.³ Martini, B.⁴ Culurciello, E.⁵

20
- 84946878550
- CoRR abs/1502.02551
- S. Gupta, A. Agrawal, K. Gopalakrishnan, and P. Narayanan, "Deep Learning with Limited Numerical Precision," CoRR, vol. abs/1502.02551, 2015.
- (2015) Deep Learning with Limited Numerical Precision
- Gupta, S.¹ Agrawal, A.² Gopalakrishnan, K.³ Narayanan, P.⁴

21
- 84962921765
- Optimizing FPGA-based accelerator design for deep convolutional neural networks
- C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, and J. Cong, "Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks," in FPGA, 2015.
- (2015) FPGA
- Zhang, C.¹ Li, P.² Sun, G.³ Guan, Y.⁴ Xiao, B.⁵ Cong, J.⁶

22
- 84897780584
- Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning
- T. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, "DianNao: A Small-footprint High-throughput Accelerator for Ubiquitous Machine-learning," in ASPLOS, 2014.
- (2014) ASPLOS
- Chen, T.¹ Du, Z.² Sun, N.³ Wang, J.⁴ Wu, C.⁵ Chen, Y.⁶ Temam, O.⁷

23
- 84959912559
- Shidiannao: Shifting vision processing closer to the sensor
- Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li, T. Luo, X. Feng, Y. Chen, and O. Temam, "ShiDianNao: Shifting Vision Processing Closer to the Sensor," in ISCA, 2015.
- (2015) ISCA
- Du, Z.¹ Fasthuber, R.² Chen, T.³ Ienne, P.⁴ Li, L.⁵ Luo, T.⁶ Feng, X.⁷ Chen, Y.⁸ Temam, O.⁹

24
- 84988406311
- Dadiannao: A machine-learning supercomputer
- Y. Chen, T. Luo, S. Liu, S. Zhang, L. He, J. Wang, L. Li, T. Chen, Z. Xu, N. Sun, and O. Temam, "DaDianNao: A Machine-Learning Supercomputer," in MICRO, 2014.
- (2014) MICRO
- Chen, Y.¹ Luo, T.² Liu, S.³ Zhang, S.⁴ He, L.⁵ Wang, J.⁶ Li, L.⁷ Chen, T.⁸ Xu, Z.⁹ Sun, N.¹⁰ Temam, O.¹¹

25
- 84940782827
- A 1.93TOPS/W scalable deep learning/inference processor with tetra-parallel MIMD architecture for big-data applications
- S. Park, K. Bong, D. Shin, J. Lee, S. Choi, and H.-J. Yoo, "A 1.93TOPS/W scalable deep learning/inference processor with tetra-parallel MIMD architecture for big-data applications," in IEEE ISSCC, 2015.
- (2015) IEEE ISSCC
- Park, S.¹ Bong, K.² Shin, D.³ Lee, J.⁴ Choi, S.⁵ Yoo, H.-J.⁶

26
- 84955438096
- Origami: A convolutional network accelerator
- L. Cavigelli, D. Gschwend, C. Mayer, S. Willi, B. Muheim, and L. Benini, "Origami: A Convolutional Network Accelerator," in GLSVLSI, 2015.
- (2015) GLSVLSI
- Cavigelli, L.¹ Gschwend, D.² Mayer, C.³ Willi, S.⁴ Muheim, B.⁵ Benini, L.⁶

27
- 50149111721
- MATRIX: A reconfigurable computing architecture with configurable instruction distribution and deployable resources
- E. Mirsky and A. DeHon, "MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources," in IEEE FCCM, 1996.
- (1996) IEEE FCCM
- Mirsky, E.¹ DeHon, A.²

28
- 0001892534
- Garp: A MIPS processor with a reconfigurable coprocessor
- J. R. Hauser and J. Wawrzynek, "Garp: a MIPS processor with a reconfigurable coprocessor," in IEEE FCCM, 1997.
- (1997) IEEE FCCM
- Hauser, J.R.¹ Wawrzynek, J.²

29
- 17844392445
- ADRES: An architecture with tightly coupled vliw processor and coarse-grained reconfigurable matrix
- B. Mei, S. Vernalde, D. Verkest, H. D. Man, and R. Lauwereins, "ADRES: An Architecture with Tightly Coupled VLIW Processor and Coarse-Grained Reconfigurable Matrix," in FPL, 2003.
- (2003) FPL
- Mei, B.¹ Vernalde, S.² Verkest, D.³ Man, H.D.⁴ Lauwereins, R.⁵

30
- 84881163269
- Triggered instructions: A control paradigm for spatially-programmed architectures
- A. Parashar, M. Pellauer, M. Adler, B. Ahsan, N. Crago, D. Lustig, V. Pavlov, A. Zhai, M. Gambhir, A. Jaleel, R. Allmon, R. Rayess, S. Maresh, and J. Emer, "Triggered Instructions: A Control Paradigm for Spatially-programmed Architectures," in ISCA, 2013.
- (2013) ISCA
- Parashar, A.¹ Pellauer, M.² Adler, M.³ Ahsan, B.⁴ Crago, N.⁵ Lustig, D.⁶ Pavlov, V.⁷ Zhai, A.⁸ Gambhir, M.⁹ Jaleel, A.¹⁰ Allmon, R.¹¹ Rayess, R.¹² Maresh, S.¹³ Emer, J.¹⁴

31
- 84988399611
- Dynamically specialized datapaths for energy efficient computing
- V. Govindaraju, C.-H. Ho, and K. Sankaralingam, "Dynamically Specialized Datapaths for Energy Efficient Computing," in IEEE HPCA, 2011.
- (2011) IEEE HPCA
- Govindaraju, V.¹ Ho, C.-H.² Sankaralingam, K.³

32
- 34249721013
- The wavescalar architecture
- S. Swanson, A. Schwerin, M. Mercaldi, A. Petersen, A. Putnam, K. Michelson, M. Oskin, and S. J. Eggers, "The WaveScalar Architecture," ACM TOCS, vol. 25, no. 2, 2007.
- (2007) ACM TOCS , vol.25 , Issue.2
- Swanson, S.¹ Schwerin, A.² Mercaldi, M.³ Petersen, A.⁴ Putnam, A.⁵ Michelson, K.⁶ Oskin, M.⁷ Eggers, S.J.⁸

33
- 0036045954
- PipeRench: A virtualized programmable datapath in 0.18 micron technology
- H. Schmit, D. Whelihan, A. Tsai, M. Moe, B. Levine, and R. Reed Taylor, "PipeRench: A virtualized programmable datapath in 0.18 micron technology," in IEEE CICC, 2002.
- (2002) IEEE CICC
- Schmit, H.¹ Whelihan, D.² Tsai, A.³ Moe, M.⁴ Levine, B.⁵ Reed Taylor, R.⁶

34
- 3242815471
- Scaling to the end of silicon with edge architectures
- D. Burger, S. W. Keckler, K. S. McKinley, M. Dahlin, L. K. John, C. Lin, C. R. Moore, J. Burrill, R. G. McDonald, and W. Yoder, "Scaling to the End of Silicon with EDGE Architectures," Computer, vol. 37, no. 7, 2004.
- (2004) Computer , vol.37 , Issue.7
- Burger, D.¹ Keckler, S.W.² McKinley, K.S.³ Dahlin, M.⁴ John, L.K.⁵ Lin, C.⁶ Moore, C.R.⁷ Burrill, J.⁸ McDonald, R.G.⁹ Yoder, W.¹⁰

35
- 84960079095
- Exploring the potential of heterogeneous von neumann/dataflow execution models
- T. Nowatzki, V. Gangadhar, and K. Sankaralingam, "Exploring the Potential of Heterogeneous Von Neumann/Dataflow Execution Models," in ISCA, 2015.
- (2015) ISCA
- Nowatzki, T.¹ Gangadhar, V.² Sankaralingam, K.³

36
- 84955509252
- Convolutional networks and applications in vision
- Y. LeCun, K. Kavukcuoglu, and C. Farabet, "Convolutional networks and applications in vision," in IEEE ISCAS, 2010.
- (2010) IEEE ISCAS
- LeCun, Y.¹ Kavukcuoglu, K.² Farabet, C.³

37
- 77956509090
- Rectified linear units improve restricted boltzmann machines
- V. Nair and G. E. Hinton, "Rectified Linear Units Improve Restricted Boltzmann Machines," in ICML, 2010.
- (2010) ICML
- Nair, V.¹ Hinton, G.E.²

38
- 85083950579
- Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding
- S. Han, H. Mao, and W. J. Dally, "Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding," in ICLR, 2016.
- (2016) ICLR
- Han, S.¹ Mao, H.² Dally, W.J.³

39
- 84913555165
- arXiv preprint arXiv:1408.5093
- Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, "Caffe: Convolutional Architecture for Fast Feature Embedding," arXiv preprint arXiv:1408.5093, 2014.
- (2014) Caffe: Convolutional Architecture for Fast Feature Embedding
- Jia, Y.¹ Shelhamer, E.² Donahue, J.³ Karayev, S.⁴ Long, J.⁵ Girshick, R.⁶ Guadarrama, S.⁷ Darrell, T.⁸

40
- 84881162326
- Convolution engine: Balancing efficiency and flexibility in specialized computing
- W. Qadeer, R. Hameed, O. Shacham, P. Venkatesan, C. Kozyrakis, and M. A. Horowitz, "Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing," in ISCA, 2013.
- (2013) ISCA
- Qadeer, W.¹ Hameed, R.² Shacham, O.³ Venkatesan, P.⁴ Kozyrakis, C.⁵ Horowitz, M.A.⁶

41
- 84988330726
- Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks
- Y.-H. Chen, T. Krishna, J. Emer, and V. Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks," in IEEE ISSCC, 2016.
- (2016) IEEE ISSCC
- Chen, Y.-H.¹ Krishna, T.² Emer, J.³ Sze, V.⁴

42
- 84988372773
- Exploiting spatial architectures for edit distance algorithms
- J. J. Tithi, N. C. Crago, and J. S. Emer, "Exploiting spatial architectures for edit distance algorithms," IEEE ISPASS, 2014.
- (2014) IEEE ISPASS
- Tithi, J.J.¹ Crago, N.C.² Emer, J.S.³

43
- 84864850882
- Towards energy-proportional datacenter memory with mobile dram
- K. T. Malladi, B. C. Lee, F. A. Nothaft, C. Kozyrakis, K. Periyathambi, and M. Horowitz, "Towards energy-proportional datacenter memory with mobile dram," in ISCA, 2012.
- (2012) ISCA
- Malladi, K.T.¹ Lee, B.C.² Nothaft, F.A.³ Kozyrakis, C.⁴ Periyathambi, K.⁵ Horowitz, M.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.