-
1
-
-
79961040286
-
Toward dark silicon in servers
-
July-Aug.
-
N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki. Toward dark silicon in servers. IEEE Micro, 31(4):6-15, July-Aug. 2011.
-
(2011)
IEEE Micro
, vol.31
, Issue.4
, pp. 6-15
-
-
Hardavellas, N.1
Ferdman, M.2
Falsafi, B.3
Ailamaki, A.4
-
2
-
-
80052528714
-
Dark silicon and the end of multicore scaling
-
Hadi Esmaeilzadeh, Emily Blem, Renee St. Amant, Karthikeyan Sankaralingam, and Doug Burger. Dark silicon and the end of multicore scaling. In ISCA, 2011.
-
(2011)
ISCA
-
-
Esmaeilzadeh, H.1
Blem, E.2
St Amant, R.3
Sankaralingam, K.4
Burger, D.5
-
3
-
-
77952256041
-
Conservation cores: Reducing the energy of mature computations
-
Ganesh Venkatesh, Jack Sampson, Nathan Goulding, Saturnino Garcia, Vladyslav Bryksin, Jose Lugo-Martinez, Steven Swanson, and Michael Bedford Taylor. Conservation cores: Reducing the energy of mature computations. In ASPLOS, 2010.
-
(2010)
ASPLOS
-
-
Venkatesh, G.1
Sampson, J.2
Goulding, N.3
Garcia, S.4
Bryksin, V.5
Lugo-Martinez, J.6
Swanson, S.7
Taylor, M.B.8
-
4
-
-
0016116644
-
Design of ion-implanted mosfet's with very small physical dimensions
-
October
-
R. H. Dennard, F. H. Gaensslen, V. L. Rideout, E. Bassous, and A. R. LeBlanc. Design of ion-implanted mosfet's with very small physical dimensions. IEEE Journal of Solid-State Circuits, 9, October 1974.
-
(1974)
IEEE Journal of Solid-State Circuits
, vol.9
-
-
Dennard, R.H.1
Gaensslen, F.H.2
Rideout, V.L.3
Bassous, E.4
LeBlanc, A.R.5
-
5
-
-
84860270793
-
CPU DB: Recording microprocessor history
-
April
-
Andrew Danowitz, Kyle Kelley, James Mao, John P. Stevenson, and Mark Horowitz. CPU DB: Recording microprocessor history. ACM Queue, 10(4):10:10-10:27, April 2012.
-
(2012)
ACM Queue
, vol.10
, Issue.4
, pp. 10:10-10:27
-
-
Danowitz, A.1
Kelley, K.2
Mao, J.3
Stevenson, J.P.4
Horowitz, M.5
-
7
-
-
84905454486
-
A reconfigurable fabric for accelerating large-scale datacenter services
-
June
-
Andrew Putnam, Adrian Caulfield, Eric Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth, Jan Gray, Michael Haselman, Scott Hauck, Stephen Heil, Amir Hormati, Joo-Young Kim, Sitaram Lanka, James R. Larus, Eric Peterson, Aaron Smith, Jason Thong, Phillip Yi Xiao, and Doug Burger. A reconfigurable fabric for accelerating large-scale datacenter services. In ISCA, June 2014.
-
(2014)
ISCA
-
-
Putnam, A.1
Caulfield, A.2
Chung, E.3
Chiou, D.4
Constantinides, K.5
Demme, J.6
Esmaeilzadeh, H.7
Fowers, J.8
Prashanth, G.9
Gray, J.10
Haselman, M.11
Hauck, S.12
Heil, S.13
Hormati, A.14
Kim, J.-Y.15
Lanka, S.16
Larus, J.R.17
Peterson, E.18
Smith, A.19
Thong, J.20
Xiao, P.Y.21
Burger, D.22
-
8
-
-
79955890625
-
Dynamically specialized datapaths for energy efficient computing
-
Venkatraman Govindaraju, Chen-Han Ho, and Karthikeyan Sankaralingam. Dynamically specialized datapaths for energy efficient computing. In HPCA, 2011.
-
(2011)
HPCA
-
-
Govindaraju, V.1
Ho, C.-H.2
Sankaralingam, K.3
-
9
-
-
84858776502
-
QsCores: Trading dark silicon for scalable energy efficiency with quasi-specific cores
-
Ganesh Venkatesh, John Sampson, Nathan Goulding, Sravanthi Kota Venkata, Steven Swanson, and Michael Taylor. QsCores: Trading dark silicon for scalable energy efficiency with quasi-specific cores. In MICRO, 2011.
-
(2011)
MICRO
-
-
Venkatesh, G.1
Sampson, J.2
Goulding, N.3
Venkata, S.K.4
Swanson, S.5
Taylor, M.6
-
10
-
-
84863374615
-
Bundled execution of recurring traces for energy-efficient general purpose processing
-
Shantanu Gupta, Shuguang Feng, Amin Ansari, Scott Mahlke, and David August. Bundled execution of recurring traces for energy-efficient general purpose processing. In MICRO, 2011.
-
(2011)
MICRO
-
-
Gupta, S.1
Feng, S.2
Ansari, A.3
Mahlke, S.4
August, D.5
-
11
-
-
84939202658
-
Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers
-
Johann Hauswald, Michael A. Laurenzano, Yunqi Zhang, Cheng Li, Austin Rovinski, Arjun Khurana, Ron Dreslinski, Trevor Mudge, Vinicius Petrucci, Lingjia Tang, and Jason Mars. Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers. In ASPLOS, 2015.
-
(2015)
ASPLOS
-
-
Hauswald, J.1
Laurenzano, M.A.2
Zhang, Y.3
Li, C.4
Rovinski, A.5
Khurana, A.6
Dreslinski, R.7
Mudge, T.8
Petrucci, V.9
Tang, L.10
Mars, J.11
-
16
-
-
84862644049
-
Towards a unified architecture for in-RDBMS analytics
-
Xixuan Feng, Arun Kumar, Benjamin Recht, and Christopher Ré. Towards a unified architecture for in-RDBMS analytics. In Proceedings of the International Conference on Management of Data, SIGMOD '12, 2012.
-
(2012)
Proceedings of the International Conference on Management of Data, SIGMOD '12
-
-
Feng, X.1
Kumar, A.2
Recht, B.3
Ré, C.4
-
18
-
-
84913555165
-
-
arXiv preprint arXiv:1408.5093
-
Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
-
(2014)
Caffe: Convolutional Architecture for Fast Feature Embedding
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.6
Guadarrama, S.7
Darrell, T.8
-
22
-
-
0010615068
-
Neurocomputing using the MasPar MP-1
-
K. W. Przytula and V. K. Prasanna, editors chapter 2 Prentice-Hall
-
Kamil A. Grajski. Neurocomputing using the MasPar MP-1. In K. W. Przytula and V. K. Prasanna, editors, Parallel Digital Implementations of Neural Networks, chapter 2, pages 51-76. Prentice-Hall, 1993.
-
(1993)
Parallel Digital Implementations of Neural Networks
, pp. 51-76
-
-
Grajski, K.A.1
-
23
-
-
84876591853
-
Neural acceleration for general-purpose approximate programs
-
Hadi Esmaeilzadeh, Adrian Sampson, Luis Ceze, and Doug Burger. Neural acceleration for general-purpose approximate programs. In MICRO, 2012.
-
(2012)
MICRO
-
-
Esmaeilzadeh, H.1
Sampson, A.2
Ceze, L.3
Burger, D.4
-
24
-
-
0032203257
-
Gradient-based learning applied to document recognition
-
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. In Proceedings of the IEEE, pages 2278-2324, 1998.
-
(1998)
Proceedings of the IEEE
, pp. 2278-2324
-
-
LeCun, Y.1
Bottou, L.2
Bengio, Y.3
Haffner, P.4
-
25
-
-
84965003162
-
-
Nvidia. Jetson
-
Nvidia. Jetson. http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html, 2015.
-
(2015)
-
-
-
26
-
-
50949133669
-
Liblinear: A library for large linear classification
-
June
-
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. Liblinear: A library for large linear classification. J. Mach. Learn. Res., 9:1871-1874, June 2008.
-
(2008)
J. Mach. Learn. Res.
, vol.9
, pp. 1871-1874
-
-
Fan, R.-E.1
Chang, K.-W.2
Hsieh, C.-J.3
Wang, X.-R.4
Lin, C.-J.5
-
27
-
-
84876211743
-
MLPACK: A scalable C++ machine learning library
-
Ryan R. Curtin, James R. Cline, Neil P. Slagle, William B. March, P. Ram, Nishant A. Mehta, and Alexander G. Gray. MLPACK: A scalable C++ machine learning library. Journal of Machine Learning Research, 14:801-805, 2013.
-
(2013)
Journal of Machine Learning Research
, vol.14
, pp. 801-805
-
-
Curtin, R.R.1
Cline, J.R.2
Slagle, N.P.3
March, W.B.4
Ram, P.5
Mehta, N.A.6
Gray, A.G.7
-
28
-
-
84874092321
-
Model-driven level 3 BLAS performance optimization on loongson 3A processor
-
Zhang Xianyi, Wang Qian, and Zhang Yunquan. Model-driven level 3 BLAS performance optimization on loongson 3A processor. In ICPADS, 2012.
-
(2012)
ICPADS
-
-
Xianyi, Z.1
Qian, W.2
Yunquan, Z.3
-
29
-
-
84863614151
-
Factorization machines with libFM
-
May
-
Steffen Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1-57:22, May 2012.
-
(2012)
ACM Trans. Intell. Syst. Technol.
, vol.3
, Issue.3
, pp. 57:1-57:22
-
-
Rendle, S.1
-
30
-
-
79955702502
-
Libsvm: A library for support vector machines
-
May
-
Chih-Chung Chang and Chih-Jen Lin. Libsvm: A library for support vector machines. ACM Trans. Intell. Syst. Technol., 2(3):27:1-27:27, May 2011.
-
(2011)
ACM Trans. Intell. Syst. Technol.
, vol.2
, Issue.3
, pp. 27:1-27:27
-
-
Chang, C.-C.1
Lin, C.-J.2
-
32
-
-
84939194962
-
PuDianNao: A polyvalent machine learning accelerator
-
Daofu Liu, Tianshi Chen, Shaoli Liu, Jinhong Zhou, Shengyuan Zhou, Olivier Temam, Xiaobing Feng, Xuehai Zhou, and Yunji Chen. PuDianNao: A polyvalent machine learning accelerator. In ASPLOS, 2015.
-
(2015)
ASPLOS
-
-
Liu, D.1
Chen, T.2
Liu, S.3
Zhou, J.4
Zhou, S.5
Temam, O.6
Feng, X.7
Zhou, X.8
Chen, Y.9
-
34
-
-
84944081816
-
-
CoRR
-
Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, and Evan Shelhamer. cudnn: Efficient primitives for deep learning. CoRR, 2014.
-
(2014)
Cudnn: Efficient Primitives for Deep Learning
-
-
Chetlur, S.1
Woolley, C.2
Vandermersch, P.3
Cohen, J.4
Tran, J.5
Catanzaro, B.6
Shelhamer, E.7
-
35
-
-
84885624310
-
Parallel architectures for the knn classifier-design of soft IP cores and FPGA implementations
-
September
-
Ioannis Stamoulias and Elias S. Manolakos. Parallel architectures for the knn classifier-design of soft IP cores and FPGA implementations. ACM Trans. Embed. Comput. Syst., 13(2):22:1-22:21, September 2013.
-
(2013)
ACM Trans. Embed. Comput. Syst.
, vol.13
, Issue.2
, pp. 22:1-22:21
-
-
Stamoulias, I.1
Manolakos, E.S.2
-
36
-
-
77955985658
-
IP-cores design for the knn classifier
-
May
-
E.S. Manolakos and I. Stamoulias. IP-cores design for the knn classifier. In ISCAS, May 2010.
-
(2010)
ISCAS
-
-
Manolakos, E.S.1
Stamoulias, I.2
-
38
-
-
34047242620
-
Real-time K-Means clustering for color images on reconfigurable hardware
-
Tsutomu Maruyama. Real-time K-Means clustering for color images on reconfigurable hardware. In ICPR, pages 816-819, 2006.
-
(2006)
ICPR
, pp. 816-819
-
-
Maruyama, T.1
-
39
-
-
50149091681
-
Hyperspectral images clustering on reconfigurable hardware using the k-means algorithm
-
Sept
-
A.G. da S. Filho, A.C. Frery, C.C. de Araujo, H. Alice, J. Cerqueira, J.A. Loureiro, M.E. de Lima, M. das G.S. Oliveira, and M.M. Horta. Hyperspectral images clustering on reconfigurable hardware using the k-means algorithm. In SBCCI, Sept 2003.
-
(2003)
SBCCI
-
-
Filho, A.G. da S.1
Frery, A.C.2
De Araujo, C.C.3
Alice, H.4
Cerqueira, J.5
Loureiro, J.A.6
De Lima, M.E.7
Oliveira, M. das G.S.8
Horta, M.M.9
-
40
-
-
77954269943
-
A heterogeneous FPGA architecture for support vector machine training
-
May
-
M. Papadonikolakis and C. Bouganis. A heterogeneous FPGA architecture for support vector machine training. In FCCM, May 2010.
-
(2010)
FCCM
-
-
Papadonikolakis, M.1
Bouganis, C.2
-
41
-
-
74349084542
-
A massively parallel fpga-based coprocessor for support vector machines
-
April
-
S. Cadambi, I. Durdanovic, V. Jakkula, M. Sankaradass, E. Cosatto, S. Chakradhar, and H.P. Graf. A massively parallel fpga-based coprocessor for support vector machines. In FCCM, April 2009.
-
(2009)
FCCM
-
-
Cadambi, S.1
Durdanovic, I.2
Jakkula, V.3
Sankaradass, M.4
Cosatto, E.5
Chakradhar, S.6
Graf, H.P.7
-
42
-
-
79953123438
-
An energy-efficient heterogeneous system for embedded learning and classification
-
March
-
A. Majumdar, S. Cadambi, and S.T. Chakradhar. An energy-efficient heterogeneous system for embedded learning and classification. Embedded Systems Letters, IEEE, 3(1):42-45, March 2011.
-
(2011)
Embedded Systems Letters IEEE
, vol.3
, Issue.1
, pp. 42-45
-
-
Majumdar, A.1
Cadambi, S.2
Chakradhar, S.T.3
-
43
-
-
84859452113
-
A massively parallel, energy efficient programmable accelerator for learning and classification
-
March
-
Abhinandan Majumdar, Srihari Cadambi, Michela Becchi, Srimat T. Chakradhar, and Hans Peter Graf. A massively parallel, energy efficient programmable accelerator for learning and classification. ACM Trans. Archit. Code Optim., 9(1):6:1-6:30, March 2012.
-
(2012)
ACM Trans. Archit. Code Optim.
, vol.9
, Issue.1
, pp. 6:1-6:30
-
-
Majumdar, A.1
Cadambi, S.2
Becchi, M.3
Chakradhar, S.T.4
Graf, H.P.5
-
44
-
-
80054919955
-
NeuFlow: A runtime reconfigurable dataflow processor for vision
-
June
-
C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello, and Y. LeCun. NeuFlow: A runtime reconfigurable dataflow processor for vision. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on, pages 109-116, June 2011.
-
(2011)
Computer Vision and Pattern Recognition Workshops (CVPRW) 2011 IEEE Computer Society Conference on
, pp. 109-116
-
-
Farabet, C.1
Martini, B.2
Corda, B.3
Akselrod, P.4
Culurciello, E.5
LeCun, Y.6
-
45
-
-
84897780584
-
DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning
-
Tianshi Chen, Zidong Du, Ninghui Sun, Jia Wang, Chengyong Wu, Yunji Chen, and Olivier Temam. DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. In ASPLOS, 2014.
-
(2014)
ASPLOS
-
-
Chen, T.1
Du, Z.2
Sun, N.3
Wang, J.4
Wu, C.5
Chen, Y.6
Temam, O.7
-
46
-
-
84863551827
-
Accelerating neuromorphic vision algorithms for recognition
-
June
-
A.A. Maashri, M. DeBole, M. Cotter, N. Chandramoorthy, Yang Xiao, V. Narayanan, and C. Chakrabarti. Accelerating neuromorphic vision algorithms for recognition. In DAC, June 2012.
-
(2012)
DAC
-
-
Maashri, A.A.1
DeBole, M.2
Cotter, M.3
Chandramoorthy, N.4
Xiao, Y.5
Narayanan, V.6
Chakrabarti, C.7
-
47
-
-
84934280945
-
SNNAP: Approximate computing on programmable socs via neural acceleration
-
Thierry Moreau, Mark Wyse, Jacob Nelson, Adrian Sampson, Hadi Esmaeilzadeh, Luis Ceze, and Mark Oskin. SNNAP: Approximate computing on programmable socs via neural acceleration. In HPCA, 2015.
-
(2015)
HPCA
-
-
Moreau, T.1
Wyse, M.2
Nelson, J.3
Sampson, A.4
Esmaeilzadeh, H.5
Ceze, L.6
Oskin, M.7
-
48
-
-
79961187689
-
A hardware acceleration technique for gradient descent and conjugate gradient
-
June
-
D. Kesler, B. Deka, and R. Kumar. A hardware acceleration technique for gradient descent and conjugate gradient. In SASP, June 2011.
-
(2011)
SASP
-
-
Kesler, D.1
Deka, B.2
Kumar, R.3
-
49
-
-
79953230605
-
A high throughput FPGA-based floating point conjugate gradient implementation for dense matrices
-
January
-
Antonio Roldao and George A. Constantinides. A high throughput FPGA-based floating point conjugate gradient implementation for dense matrices. ACM Trans. Reconfigurable Technol. Syst., 3(1):1:1-1:19, January 2010.
-
(2010)
ACM Trans. Reconfigurable Technol. Syst.
, vol.3
, Issue.1
, pp. 1:1-1:19
-
-
Roldao, A.1
Constantinides, G.A.2
-
50
-
-
34147131364
-
A hybrid approach for mapping conjugate gradient onto an fpga-augmented reconfigurable supercomputer
-
April
-
G.R. Morris, V.K. Prasanna, and R.D. Anderson. A hybrid approach for mapping conjugate gradient onto an fpga-augmented reconfigurable supercomputer. In FCCM, April 2006.
-
(2006)
FCCM
-
-
Morris, G.R.1
Prasanna, V.K.2
Anderson, R.D.3
-
51
-
-
60349119698
-
An implementation of the conjugate gradient algorithm on fpgas
-
April
-
D. DuBois, A. DuBois, T. Boorman, C. Connor, and S. Poole. An implementation of the conjugate gradient algorithm on fpgas. In FCCM, April 2008.
-
(2008)
FCCM
-
-
DuBois, D.1
DuBois, A.2
Boorman, T.3
Connor, C.4
Poole, S.5
-
52
-
-
77954069604
-
FPGA implementation of kNN classifier based on wavelet transform and partial distance search
-
Yao-Jung Yeh, Hui-Ya Li, Wen-Jyi Hwang, and Chiung-Yao Fang. FPGA implementation of kNN classifier based on wavelet transform and partial distance search. In SCIA, 2007.
-
(2007)
SCIA
-
-
Yeh, Y.-J.1
Li, H.-Y.2
Hwang, W.-J.3
Fang, C.-Y.4
-
53
-
-
54949115901
-
CHiMPS: A high-level compilation flow for hybrid CPU-FPGA architectures
-
Andrew R. Putnam, Dave Bennett, Eric Dellinger, Jeff Mason, and Prasanna Sundararajan. CHiMPS: A high-level compilation flow for hybrid CPU-FPGA architectures. In FPGA, 2008.
-
(2008)
FPGA
-
-
Putnam, A.R.1
Bennett, D.2
Dellinger, E.3
Mason, J.4
Sundararajan, P.5
-
54
-
-
77955733335
-
Fpmr: Mapreduce framework on fpga
-
Yi Shan, Bo Wang, Jing Yan, Yu Wang, Ningyi Xu, and Huazhong Yang. Fpmr: Mapreduce framework on fpga. In FPGA, 2010.
-
(2010)
FPGA
-
-
Shan, Y.1
Wang, B.2
Yan, J.3
Wang, Y.4
Xu, N.5
Yang, H.6
-
55
-
-
84912524416
-
A high memory bandwidth fpga accelerator for sparse matrix-vector multiplication
-
IEEE, May
-
Jeremy Fowers, Kalin Ovtcharov, Karin Strauss, Eric Chung, and Greg Stitt. A high memory bandwidth fpga accelerator for sparse matrix-vector multiplication. In FCCM. IEEE, May 2014.
-
(2014)
FCCM
-
-
Fowers, J.1
Ovtcharov, K.2
Strauss, K.3
Chung, E.4
Stitt, G.5
-
56
-
-
79952918458
-
CoRAM: An in-fabric memory architecture for fpga-based computing
-
Eric S. Chung, James C. Hoe, and Ken Mai. CoRAM: An in-fabric memory architecture for fpga-based computing. In FPGA, 2011.
-
(2011)
FPGA
-
-
Chung, E.S.1
Hoe, J.C.2
Mai, K.3
-
57
-
-
84881142714
-
LINQits: Big data on little clients
-
Eric S. Chung, John D. Davis, and Jaewon Lee. LINQits: Big data on little clients. In ISCA, 2013.
-
(2013)
ISCA
-
-
Chung, E.S.1
Davis, J.D.2
Lee, J.3
-
58
-
-
84898616656
-
-
FPL, Sept
-
M. King, A. Khan, A. Agarwal, O. Arcas, and Arvind. Generating infrastructure for FPGA-accelerated applications. In FPL, Sept 2013.
-
(2013)
Generating Infrastructure for FPGA-accelerated Applications
-
-
King, M.1
Khan, A.2
Agarwal, A.3
Arcas, O.4
Arvind5