-
1
-
-
84930630277
-
Deep learning
-
Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, 2015.
-
(2015)
Nature
, vol.521
, Issue.7553
-
-
LeCun, Y.1
Bengio, Y.2
Hinton, G.3
-
2
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in NIPS, 2012.
-
(2012)
NIPS
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
4
-
-
84988372022
-
Going deeper with convolutions
-
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper With Convolutions," in IEEE CVPR, 2015.
-
(2015)
IEEE CVPR
-
-
Szegedy, C.1
Liu, W.2
Jia, Y.3
Sermanet, P.4
Reed, S.5
Anguelov, D.6
Erhan, D.7
Vanhoucke, V.8
Rabinovich, A.9
-
5
-
-
84988399663
-
Deep residual learning for image recognition
-
K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in IEEE CVPR, 2016.
-
(2016)
IEEE CVPR
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
6
-
-
84940737052
-
Rich feature hierarchies for accurate object detection and semantic segmentation
-
R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," in IEEE CVPR, 2014.
-
(2014)
IEEE CVPR
-
-
Girshick, R.1
Donahue, J.2
Darrell, T.3
Malik, J.4
-
7
-
-
84906347546
-
OverFeat: Integrated recognition, localization and detection using convolutional networks
-
P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, "OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks," CoRR, vol. abs/1312.6229, 2013.
-
(2013)
CoRR abs/1312.6229
-
-
Sermanet, P.1
Eigen, D.2
Zhang, X.3
Mathieu, M.4
Fergus, R.5
LeCun, Y.6
-
8
-
-
84937964578
-
Learning deep features for scene recognition using Places database
-
B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, "Learning Deep Features for Scene Recognition using Places Database," in NIPS, 2014.
-
(2014)
NIPS
-
-
Zhou, B.1
Lapedriza, A.2
Xiao, J.3
Torralba, A.4
Oliva, A.5
-
9
-
-
0024767889
-
Handwritten digit recognition: Applications of neural network chips and automatic learning
-
Y. Le Cun, L. Jackel, B. Boser, J. Denker, H. Graf, I. Guyon, D. Henderson, R. Howard, and W. Hubbard, "Handwritten digit recognition: applications of neural network chips and automatic learning," IEEE Communications Magazine, vol. 27, no. 11, 1989.
-
(1989)
IEEE Communications Magazine
, vol.27
, Issue.11
-
-
Le Cun, Y.1
Jackel, L.2
Boser, B.3
Denker, J.4
Graf, H.5
Guyon, I.6
Henderson, D.7
Howard, R.8
Hubbard, W.9
-
10
-
-
84988320060
-
Minimizing computation in convolutional neural networks
-
J. Cong and B. Xiao, "Minimizing computation in convolutional neural networks," in ICANN, 2014.
-
(2014)
ICANN
-
-
Cong, J.1
Xiao, B.2
-
11
-
-
82455182918
-
Power, programmability, and granularity: The challenges of exascale computing
-
B. Dally, "Power, Programmability, and Granularity: The Challenges of ExaScale Computing," in IEEE IPDPS, 2011.
-
(2011)
IEEE IPDPS
-
-
Dally, B.1
-
12
-
-
84946239930
-
Computing's energy problem (and what we can do about it)
-
M. Horowitz, "Computing's energy problem (and what we can do about it)," in IEEE ISSCC, 2014.
-
(2014)
IEEE ISSCC
-
-
Horowitz, M.1
-
13
-
-
77954995378
-
Understanding sources of inefficiency in general-purpose chips
-
R. Hameed, W. Qadeer, M. Wachs, O. Azizi, A. Solomatnikov, B. C. Lee, S. Richardson, C. Kozyrakis, and M. Horowitz, "Understanding Sources of Inefficiency in General-purpose Chips," in ISCA, 2010.
-
(2010)
ISCA
-
-
Hameed, R.1
Qadeer, W.2
Wachs, M.3
Azizi, O.4
Solomatnikov, A.5
Lee, B.C.6
Richardson, S.7
Kozyrakis, C.8
Horowitz, M.9
-
14
-
-
84944081816
-
cuDNN: Efficient primitives for deep learning
-
S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer, "cuDNN: Efficient Primitives for Deep Learning," CoRR, vol. abs/1410.0759, 2014.
-
(2014)
CoRR abs/1410.0759
-
-
Chetlur, S.1
Woolley, C.2
Vandermersch, P.3
Cohen, J.4
Tran, J.5
Catanzaro, B.6
Shelhamer, E.7
-
15
-
-
84988439736
-
A massively parallel coprocessor for convolutional neural networks
-
M. Sankaradas, V. Jakkula, S. Cadambi, S. Chakradhar, I. Durdanovic, E. Cosatto, and H. P. Graf, "A Massively Parallel Coprocessor for Convolutional Neural Networks," in IEEE ASAP, 2009.
-
(2009)
IEEE ASAP
-
-
Sankaradas, M.1
Jakkula, V.2
Cadambi, S.3
Chakradhar, S.4
Durdanovic, I.5
Cosatto, E.6
Graf, H.P.7
-
16
-
-
79551569552
-
Towards an embedded biologically-inspired machine vision processor
-
V. Sriram, D. Cox, K. H. Tsoi, and W. Luk, "Towards an embedded biologically-inspired machine vision processor," in FPT, 2010.
-
(2010)
FPT
-
-
Sriram, V.1
Cox, D.2
Tsoi, K.H.3
Luk, W.4
-
17
-
-
77955007393
-
A dynamically configurable coprocessor for convolutional neural networks
-
S. Chakradhar, M. Sankaradas, V. Jakkula, and S. Cadambi, "A Dynamically Configurable Coprocessor for Convolutional Neural Networks," in ISCA, 2010.
-
(2010)
ISCA
-
-
Chakradhar, S.1
Sankaradas, M.2
Jakkula, V.3
Cadambi, S.4
-
18
-
-
84892533708
-
Memory-centric accelerator design for Convolutional Neural Networks
-
M. Peemen, A. A. A. Setio, B. Mesman, and H. Corporaal, "Memory-centric accelerator design for Convolutional Neural Networks," in IEEE ICCD, 2013.
-
(2013)
IEEE ICCD
-
-
Peemen, M.1
Setio, A.A.A.2
Mesman, B.3
Corporaal, H.4
-
19
-
-
84988337693
-
A 240 G-ops/s mobile coprocessor for deep neural networks
-
V. Gokhale, J. Jin, A. Dundar, B. Martini, and E. Culurciello, "A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks," in IEEE CVPRW, 2014.
-
(2014)
IEEE CVPRW
-
-
Gokhale, V.1
Jin, J.2
Dundar, A.3
Martini, B.4
Culurciello, E.5
-
20
-
-
84946878550
-
Deep learning with limited numerical precision
-
S. Gupta, A. Agrawal, K. Gopalakrishnan, and P. Narayanan, "Deep Learning with Limited Numerical Precision," CoRR, vol. abs/1502.02551, 2015.
-
(2015)
CoRR abs/1502.02551
-
-
Gupta, S.1
Agrawal, A.2
Gopalakrishnan, K.3
Narayanan, P.4
-
21
-
-
84962921765
-
Optimizing FPGA-based accelerator design for deep convolutional neural networks
-
C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, and J. Cong, "Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks," in FPGA, 2015.
-
(2015)
FPGA
-
-
Zhang, C.1
Li, P.2
Sun, G.3
Guan, Y.4
Xiao, B.5
Cong, J.6
-
22
-
-
84897780584
-
DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning
-
T. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, "DianNao: A Small-footprint High-throughput Accelerator for Ubiquitous Machine-learning," in ASPLOS, 2014.
-
(2014)
ASPLOS
-
-
Chen, T.1
Du, Z.2
Sun, N.3
Wang, J.4
Wu, C.5
Chen, Y.6
Temam, O.7
-
23
-
-
84959912559
-
ShiDianNao: Shifting vision processing closer to the sensor
-
Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li, T. Luo, X. Feng, Y. Chen, and O. Temam, "ShiDianNao: Shifting Vision Processing Closer to the Sensor," in ISCA, 2015.
-
(2015)
ISCA
-
-
Du, Z.1
Fasthuber, R.2
Chen, T.3
Ienne, P.4
Li, L.5
Luo, T.6
Feng, X.7
Chen, Y.8
Temam, O.9
-
24
-
-
84988406311
-
DaDianNao: A machine-learning supercomputer
-
Y. Chen, T. Luo, S. Liu, S. Zhang, L. He, J. Wang, L. Li, T. Chen, Z. Xu, N. Sun, and O. Temam, "DaDianNao: A Machine-Learning Supercomputer," in MICRO, 2014.
-
(2014)
MICRO
-
-
Chen, Y.1
Luo, T.2
Liu, S.3
Zhang, S.4
He, L.5
Wang, J.6
Li, L.7
Chen, T.8
Xu, Z.9
Sun, N.10
Temam, O.11
-
25
-
-
84940782827
-
A 1.93TOPS/W scalable deep learning/inference processor with tetra-parallel MIMD architecture for big-data applications
-
S. Park, K. Bong, D. Shin, J. Lee, S. Choi, and H.-J. Yoo, "A 1.93TOPS/W scalable deep learning/inference processor with tetra-parallel MIMD architecture for big-data applications," in IEEE ISSCC, 2015.
-
(2015)
IEEE ISSCC
-
-
Park, S.1
Bong, K.2
Shin, D.3
Lee, J.4
Choi, S.5
Yoo, H.-J.6
-
26
-
-
84955438096
-
Origami: A convolutional network accelerator
-
L. Cavigelli, D. Gschwend, C. Mayer, S. Willi, B. Muheim, and L. Benini, "Origami: A Convolutional Network Accelerator," in GLSVLSI, 2015.
-
(2015)
GLSVLSI
-
-
Cavigelli, L.1
Gschwend, D.2
Mayer, C.3
Willi, S.4
Muheim, B.5
Benini, L.6
-
27
-
-
50149111721
-
MATRIX: A reconfigurable computing architecture with configurable instruction distribution and deployable resources
-
E. Mirsky and A. DeHon, "MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources," in IEEE FCCM, 1996.
-
(1996)
IEEE FCCM
-
-
Mirsky, E.1
DeHon, A.2
-
28
-
-
0001892534
-
Garp: A MIPS processor with a reconfigurable coprocessor
-
J. R. Hauser and J. Wawrzynek, "Garp: a MIPS processor with a reconfigurable coprocessor," in IEEE FCCM, 1997.
-
(1997)
IEEE FCCM
-
-
Hauser, J.R.1
Wawrzynek, J.2
-
29
-
-
17844392445
-
ADRES: An architecture with tightly coupled VLIW processor and coarse-grained reconfigurable matrix
-
B. Mei, S. Vernalde, D. Verkest, H. De Man, and R. Lauwereins, "ADRES: An Architecture with Tightly Coupled VLIW Processor and Coarse-Grained Reconfigurable Matrix," in FPL, 2003.
-
(2003)
FPL
-
-
Mei, B.1
Vernalde, S.2
Verkest, D.3
De Man, H.4
Lauwereins, R.5
-
30
-
-
84881163269
-
Triggered instructions: A control paradigm for spatially-programmed architectures
-
A. Parashar, M. Pellauer, M. Adler, B. Ahsan, N. Crago, D. Lustig, V. Pavlov, A. Zhai, M. Gambhir, A. Jaleel, R. Allmon, R. Rayess, S. Maresh, and J. Emer, "Triggered Instructions: A Control Paradigm for Spatially-programmed Architectures," in ISCA, 2013.
-
(2013)
ISCA
-
-
Parashar, A.1
Pellauer, M.2
Adler, M.3
Ahsan, B.4
Crago, N.5
Lustig, D.6
Pavlov, V.7
Zhai, A.8
Gambhir, M.9
Jaleel, A.10
Allmon, R.11
Rayess, R.12
Maresh, S.13
Emer, J.14
-
31
-
-
84988399611
-
Dynamically specialized datapaths for energy efficient computing
-
V. Govindaraju, C.-H. Ho, and K. Sankaralingam, "Dynamically Specialized Datapaths for Energy Efficient Computing," in IEEE HPCA, 2011.
-
(2011)
IEEE HPCA
-
-
Govindaraju, V.1
Ho, C.-H.2
Sankaralingam, K.3
-
32
-
-
34249721013
-
The WaveScalar architecture
-
S. Swanson, A. Schwerin, M. Mercaldi, A. Petersen, A. Putnam, K. Michelson, M. Oskin, and S. J. Eggers, "The WaveScalar Architecture," ACM TOCS, vol. 25, no. 2, 2007.
-
(2007)
ACM TOCS
, vol.25
, Issue.2
-
-
Swanson, S.1
Schwerin, A.2
Mercaldi, M.3
Petersen, A.4
Putnam, A.5
Michelson, K.6
Oskin, M.7
Eggers, S.J.8
-
33
-
-
0036045954
-
PipeRench: A virtualized programmable datapath in 0.18 micron technology
-
H. Schmit, D. Whelihan, A. Tsai, M. Moe, B. Levine, and R. Reed Taylor, "PipeRench: A virtualized programmable datapath in 0.18 micron technology," in IEEE CICC, 2002.
-
(2002)
IEEE CICC
-
-
Schmit, H.1
Whelihan, D.2
Tsai, A.3
Moe, M.4
Levine, B.5
Reed Taylor, R.6
-
34
-
-
3242815471
-
Scaling to the end of silicon with EDGE architectures
-
D. Burger, S. W. Keckler, K. S. McKinley, M. Dahlin, L. K. John, C. Lin, C. R. Moore, J. Burrill, R. G. McDonald, and W. Yoder, "Scaling to the End of Silicon with EDGE Architectures," Computer, vol. 37, no. 7, 2004.
-
(2004)
Computer
, vol.37
, Issue.7
-
-
Burger, D.1
Keckler, S.W.2
McKinley, K.S.3
Dahlin, M.4
John, L.K.5
Lin, C.6
Moore, C.R.7
Burrill, J.8
McDonald, R.G.9
Yoder, W.10
-
35
-
-
84960079095
-
Exploring the potential of heterogeneous von Neumann/dataflow execution models
-
T. Nowatzki, V. Gangadhar, and K. Sankaralingam, "Exploring the Potential of Heterogeneous Von Neumann/Dataflow Execution Models," in ISCA, 2015.
-
(2015)
ISCA
-
-
Nowatzki, T.1
Gangadhar, V.2
Sankaralingam, K.3
-
37
-
-
77956509090
-
Rectified linear units improve restricted Boltzmann machines
-
V. Nair and G. E. Hinton, "Rectified Linear Units Improve Restricted Boltzmann Machines," in ICML, 2010.
-
(2010)
ICML
-
-
Nair, V.1
Hinton, G.E.2
-
38
-
-
85083950579
-
Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding
-
S. Han, H. Mao, and W. J. Dally, "Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding," in ICLR, 2016.
-
(2016)
ICLR
-
-
Han, S.1
Mao, H.2
Dally, W.J.3
-
39
-
-
84913555165
-
Caffe: Convolutional architecture for fast feature embedding
-
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, "Caffe: Convolutional Architecture for Fast Feature Embedding," arXiv preprint arXiv:1408.5093, 2014.
-
(2014)
arXiv preprint arXiv:1408.5093
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.6
Guadarrama, S.7
Darrell, T.8
-
40
-
-
84881162326
-
Convolution engine: Balancing efficiency and flexibility in specialized computing
-
W. Qadeer, R. Hameed, O. Shacham, P. Venkatesan, C. Kozyrakis, and M. A. Horowitz, "Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing," in ISCA, 2013.
-
(2013)
ISCA
-
-
Qadeer, W.1
Hameed, R.2
Shacham, O.3
Venkatesan, P.4
Kozyrakis, C.5
Horowitz, M.A.6
-
41
-
-
84988330726
-
Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks
-
Y.-H. Chen, T. Krishna, J. Emer, and V. Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks," in IEEE ISSCC, 2016.
-
(2016)
IEEE ISSCC
-
-
Chen, Y.-H.1
Krishna, T.2
Emer, J.3
Sze, V.4
-
42
-
-
84988372773
-
Exploiting spatial architectures for edit distance algorithms
-
J. J. Tithi, N. C. Crago, and J. S. Emer, "Exploiting spatial architectures for edit distance algorithms," in IEEE ISPASS, 2014.
-
(2014)
IEEE ISPASS
-
-
Tithi, J.J.1
Crago, N.C.2
Emer, J.S.3
-
43
-
-
84864850882
-
Towards energy-proportional datacenter memory with mobile DRAM
-
K. T. Malladi, B. C. Lee, F. A. Nothaft, C. Kozyrakis, K. Periyathambi, and M. Horowitz, "Towards energy-proportional datacenter memory with mobile DRAM," in ISCA, 2012.
-
(2012)
ISCA
-
-
Malladi, K.T.1
Lee, B.C.2
Nothaft, F.A.3
Kozyrakis, C.4
Periyathambi, K.5
Horowitz, M.6