-
1
-
-
84978755800
-
-
accessed: 2015-09-04
-
"The Green500 List-June 2015," http://www.green500.org/lists/green201506, accessed: 2015-09-04.
-
The Green500 List-June 2015
-
-
-
2
-
-
84886548074
-
Trends in energy-efficient computing: A perspective from the Green500
-
B. Subramaniam, W. Saunders, T. Scogland, and W.-C. Feng, "Trends in energy-efficient computing: A perspective from the Green500," in Green Computing Conference (IGCC), 2013.
-
(2013)
Green Computing Conference (IGCC)
-
-
Subramaniam, B.1
Saunders, W.2
Scogland, T.3
Feng, W.-C.4
-
3
-
-
84863393413
-
A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
-
J. Fowers, G. Brown, P. Cooke, and G. Stitt, "A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-window Applications," in Int. Symp. on Field-programmable Gate Arrays (FPGA), 2012.
-
(2012)
Int. Symp. on Field-programmable Gate Arrays (FPGA)
-
-
Fowers, J.1
Brown, G.2
Cooke, P.3
Stitt, G.4
-
4
-
-
77957919571
-
BLAS comparison on FPGA, CPU and GPU
-
S. Kestur, J. Davis, and O. Williams, "BLAS Comparison on FPGA, CPU and GPU," in Annual Symposium on VLSI (ISVLSI), 2010.
-
(2010)
Annual Symposium on VLSI (ISVLSI)
-
-
Kestur, S.1
Davis, J.2
Williams, O.3
-
5
-
-
79952909372
-
Bridging the GPGPU-FPGA efficiency gap
-
C. W. Fletcher, I. A. Lebedev, N. B. Asadi, D. R. Burke, and J. Wawrzynek, "Bridging the GPGPU-FPGA Efficiency Gap," in Field Programmable Gate Arrays (FPGA), 2011.
-
(2011)
Field Programmable Gate Arrays (FPGA)
-
-
Fletcher, C.W.1
Lebedev, I.A.2
Asadi, N.B.3
Burke, D.R.4
Wawrzynek, J.5
-
6
-
-
77649253148
-
Performance comparison of graphics processors to reconfigurable logic: A case study
-
B. Cope, P. Cheung, W. Luk, and L. Howes, "Performance Comparison of Graphics Processors to Reconfigurable Logic: A Case Study," IEEE Trans. on Computers, vol. 59, no. 4, 2010.
-
(2010)
IEEE Trans. on Computers
, vol.59
, Issue.4
-
-
Cope, B.1
Cheung, P.2
Luk, W.3
Howes, L.4
-
7
-
-
84905454486
-
A reconfigurable fabric for accelerating large-scale datacenter services
-
A. Putnam, A. Caulfield, E. Chung, D. Chiou, K. Constantinides, J. Demme, H. Esmaeilzadeh, J. Fowers, G. Gopal, J. Gray, M. Haselman, S. Hauck, S. Heil, A. Hormati, J.-Y. Kim, S. Lanka, J. Larus, E. Peterson, S. Pope, A. Smith, J. Thong, P. Xiao, and D. Burger, "A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services," in Int. Symp. on Computer Architecture (ISCA), 2014.
-
(2014)
Int. Symp. on Computer Architecture (ISCA)
-
-
Putnam, A.1
Caulfield, A.2
Chung, E.3
Chiou, D.4
Constantinides, K.5
Demme, J.6
Esmaeilzadeh, H.7
Fowers, J.8
Gopal, G.9
Gray, J.10
Haselman, M.11
Hauck, S.12
Heil, S.13
Hormati, A.14
Kim, J.-Y.15
Lanka, S.16
Larus, J.17
Peterson, E.18
Pope, S.19
Smith, A.20
Thong, J.21
Xiao, P.22
Burger, D.23
-
8
-
-
84982813068
-
SDA: Software-defined accelerator for large-scale DNN systems
-
J. Ouyang, S. Lin, W. Qi, Y. Wang, B. Yu, and S. Jiang, "SDA: Software-Defined Accelerator for Large-Scale DNN Systems," in HotChips26, 2014.
-
(2014)
HotChips26
-
-
Ouyang, J.1
Lin, S.2
Qi, W.3
Wang, Y.4
Yu, B.5
Jiang, S.6
-
9
-
-
84978687066
-
Intel Xeon+FPGA platform for the data center
-
Workshop on Reconfigurable Computing for the Masses
-
P. K. Gupta, "Intel Xeon+FPGA Platform for the Data Center," in Field Programmable Logic and Applications (FPL), Workshop on Reconfigurable Computing for the Masses, 2014.
-
(2014)
Field Programmable Logic and Applications (FPL)
-
-
Gupta, P.K.1
-
10
-
-
84922876530
-
CAPI: A coherent accelerator processor interface
-
J. Stuecheli, B. Blaner, C. Johns, and M. Siegel, "CAPI: A Coherent Accelerator Processor Interface," IBM Journal of Research and Development, vol. 59, no. 1, 2015.
-
(2015)
IBM Journal of Research and Development
, vol.59
, Issue.1
-
-
Stuecheli, J.1
Blaner, B.2
Johns, C.3
Siegel, M.4
-
11
-
-
35648995516
-
-
EECS Department, University of California, Berkeley, Tech. Rep.
-
K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, W. L. Plishker, J. Shalf, S. W. Williams, and K. A. Yelick, "The Landscape of Parallel Computing Research: A View from Berkeley," EECS Department, University of California, Berkeley, Tech. Rep., 2006.
-
(2006)
The Landscape of Parallel Computing Research: A View from Berkeley
-
-
Asanovic, K.1
Bodik, R.2
Catanzaro, B.C.3
Gebis, J.J.4
Husbands, P.5
Keutzer, K.6
Patterson, D.A.7
Plishker, W.L.8
Shalf, J.9
Williams, S.W.10
Yelick, K.A.11
-
12
-
-
77954719557
-
The scalable heterogeneous computing (SHOC) benchmark suite
-
A. Danalis, G. Marin, C. McCurdy, J. Meredith, P. Roth, K. Spafford, V. Tipparaju, and J. Vetter, "The Scalable HeterOgeneous Computing (SHOC) Benchmark Suite," in 3rd Workshop on General-Purpose Computation on Graphics Processors (GPGPU), 2010.
-
(2010)
3rd Workshop on General-Purpose Computation on Graphics Processors (GPGPU)
-
-
Danalis, A.1
Marin, G.2
McCurdy, C.3
Meredith, J.4
Roth, P.5
Spafford, K.6
Tipparaju, V.7
Vetter, J.8
-
13
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. Sheaffer, S.-H. Lee, and K. Skadron, "Rodinia: A benchmark suite for heterogeneous computing," in Int. Symp. on Workload Characterization (IISWC), 2009.
-
(2009)
Int. Symp. on Workload Characterization (IISWC)
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.5
Lee, S.-H.6
Skadron, K.7
-
14
-
-
84861017830
-
OpenCL and the 13 dwarfs: A work in progress
-
W.-C. Feng, H. Lin, T. Scogland, and J. Zhang, "OpenCL and the 13 Dwarfs: A Work in Progress," in Int. Conf. on Performance Engineering (ICPE), 2012.
-
(2012)
Int. Conf. on Performance Engineering (ICPE)
-
-
Feng, W.-C.1
Lin, H.2
Scogland, T.3
Zhang, J.4
-
15
-
-
84903825248
-
A unified methodology for a fast benchmarking of parallel architecture
-
A. Guerre, J.-T. Acquaviva, and Y. Lhuillier, "A unified methodology for a fast benchmarking of parallel architecture," in Design, Automation and Test in Europe (DATE), 2014.
-
(2014)
Design, Automation and Test in Europe (DATE)
-
-
Guerre, A.1
Acquaviva, J.-T.2
Lhuillier, Y.3
-
16
-
-
84906342283
-
Analyzing the energy-efficiency of dense linear algebra kernels by power-profiling a hybrid CPU/FPGA system
-
H. Giefers, R. Polig, and C. Hagleitner, "Analyzing the energy-efficiency of dense linear algebra kernels by power-profiling a hybrid CPU/FPGA system," in Application-specific Systems, Architectures and Processors (ASAP), 2014.
-
(2014)
Application-specific Systems, Architectures and Processors (ASAP)
-
-
Giefers, H.1
Polig, R.2
Hagleitner, C.3
-
17
-
-
84918776204
-
The power-performance tradeoffs of the Intel Xeon Phi on HPC applications
-
B. Li, H.-C. Chang, S. L. Song, C.-Y. Su, T. Meyer, J. Mooring, and K. Cameron, "The Power-Performance Tradeoffs of the Intel Xeon Phi on HPC Applications," in Int. Workshop on Large Scale Parallel Processing (LSPP), 2014.
-
(2014)
Int. Workshop on Large Scale Parallel Processing (LSPP)
-
-
Li, B.1
Chang, H.-C.2
Song, S.L.3
Su, C.-Y.4
Meyer, T.5
Mooring, J.6
Cameron, K.7
-
18
-
-
84863347222
-
A performance analysis framework for identifying potential benefits in GPGPU applications
-
J. Sim, A. Dasgupta, H. Kim, and R. Vuduc, "A performance analysis framework for identifying potential benefits in GPGPU applications," in Principles and Practice of Parallel Programming (PPoPP), 2012.
-
(2012)
Principles and Practice of Parallel Programming (PPoPP)
-
-
Sim, J.1
Dasgupta, A.2
Kim, H.3
Vuduc, R.4
-
20
-
-
84887917163
-
CUSPARSE library: A set of basic linear algebra subroutines for sparse matrices
-
M. Naumov, L. S. Chien, P. Vandermersch, and U. Kapasi, "CUSPARSE Library: A Set of Basic Linear Algebra Subroutines for Sparse Matrices," in GPU Technology Conference, 2010.
-
(2010)
GPU Technology Conference
-
-
Naumov, M.1
Chien, L.S.2
Vandermersch, P.3
Kapasi, U.4
-
21
-
-
81355161778
-
The University of Florida sparse matrix collection
-
T. A. Davis and Y. Hu, "The University of Florida Sparse Matrix Collection," ACM Trans. Math. Softw., vol. 38, no. 1, 2011.
-
(2011)
ACM Trans. Math. Softw.
, vol.38
, Issue.1
-
-
Davis, T.A.1
Hu, Y.2
-
22
-
-
84879835573
-
Efficient sparse matrix-vector multiplication on x86-based many-core processors
-
X. Liu, M. Smelyanskiy, E. Chow, and P. Dubey, "Efficient Sparse Matrix-vector Multiplication on x86-based Many-core Processors," in Int. Conf. on Supercomputing (ISC), 2013.
-
(2013)
Int. Conf. on Supercomputing (ISC)
-
-
Liu, X.1
Smelyanskiy, M.2
Chow, E.3
Dubey, P.4
-
24
-
-
77949382525
-
FPGA vs. GPU for sparse matrix vector multiply
-
Y. Zhang, Y. Shalabi, R. Jain, K. Nagar, and J. Bakos, "FPGA vs. GPU for sparse matrix vector multiply," in Field-Programmable Technology (FPT), 2009.
-
(2009)
Field-Programmable Technology (FPT)
-
-
Zhang, Y.1
Shalabi, Y.2
Jain, R.3
Nagar, K.4
Bakos, J.5
-
25
-
-
77951180817
-
Instruction set innovations for the Convey HC-1 computer
-
T. Brewer, "Instruction Set Innovations for the Convey HC-1 Computer," Micro, IEEE, vol. 30, no. 2, 2010.
-
(2010)
Micro, IEEE
, vol.30
, Issue.2
-
-
Brewer, T.1
-
28
-
-
84875673115
-
-
Version 4.304.55 ed., NVIDIA Corp.
-
NVML API REFERENCE MANUAL, Version 4.304.55 ed., NVIDIA Corp., 2012.
-
(2012)
NVML API Reference Manual
-
-
-
29
-
-
84938811918
-
Measuring GPU power with the K20 built-in sensor
-
M. Burtscher, I. Zecena, and Z. Zong, "Measuring GPU Power with the K20 Built-in Sensor," in Proceedings of Workshop on General Purpose Processing Using GPUs, ser. GPGPU-7, 2014.
-
(2014)
Proceedings of Workshop on General Purpose Processing Using GPUs, ser. GPGPU-7
-
-
Burtscher, M.1
Zecena, I.2
Zong, Z.3
-
30
-
-
77957942221
-
RAPL: Memory power estimation and capping
-
H. David, E. Gorbatov, U. R. Hanebutte, R. Khanna, and C. Le, "RAPL: Memory Power Estimation and Capping," in Int. Symp. on Low Power Electronics and Design (ISLPED), 2010.
-
(2010)
Int. Symp. on Low Power Electronics and Design (ISLPED)
-
-
David, H.1
Gorbatov, E.2
Hanebutte, U.R.3
Khanna, R.4
Le, C.5
-
31
-
-
84978635984
-
-
Electronic Educational Devices, accessed: 2015-09-08
-
Watts up? and Watts up? PRO Operators Manual, https://www.wattsupmeters.com, Electronic Educational Devices, accessed: 2015-09-08.
-
Watts Up? and Watts Up? PRO Operators Manual
-
-
-
32
-
-
85043146402
-
-
V2.1 ed., Standard Performance Evaluation Corporation (SPEC), SPEC Power and Performance Committee
-
Power and Performance Benchmark Methodology, V2.1 ed., Standard Performance Evaluation Corporation (SPEC), SPEC Power and Performance Committee, 2012.
-
(2012)
Power and Performance Benchmark Methodology
-
-
-
34
-
-
0034316092
-
Power-aware microarchitecture: Design and modeling challenges for next-generation microprocessors
-
D. Brooks, P. Bose, S. Schuster, H. Jacobson, P. Kudva, A. Buyuktosunoglu, J.-D. Wellman, V. Zyuban, M. Gupta, and P. Cook, "Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors," IEEE Micro, vol. 20, no. 6, 2000.
-
(2000)
IEEE Micro
, vol.20
, Issue.6
-
-
Brooks, D.1
Bose, P.2
Schuster, S.3
Jacobson, H.4
Kudva, P.5
Buyuktosunoglu, A.6
Wellman, J.-D.7
Zyuban, V.8
Gupta, M.9
Cook, P.10
-
35
-
-
0030243819
-
Energy dissipation in general purpose microprocessors
-
R. Gonzalez and M. Horowitz, "Energy dissipation in general purpose microprocessors," Solid-State Circuits, vol. 31, no. 9, 1996.
-
(1996)
Solid-State Circuits
, vol.31
, Issue.9
-
-
Gonzalez, R.1
Horowitz, M.2
-
43
-
-
84903765018
-
-
cusparse, accessed: 2015-09-04.
-
"NVIDIA CUDA Sparse Matrix library," https://developer.nvidia.com/cusparse, accessed: 2015-09-04.
-
NVIDIA CUDA Sparse Matrix Library
-
-
-
45
-
-
70450227686
-
Performance evaluation of the sparse matrix-vector multiplication on modern architectures
-
G. Goumas, K. Kourtis, N. Anastopoulos, V. Karakasis, and N. Koziris, "Performance evaluation of the sparse matrix-vector multiplication on modern architectures," The Journal of Supercomputing, vol. 50, no. 1, 2009.
-
(2009)
The Journal of Supercomputing
, vol.50
, Issue.1
-
-
Goumas, G.1
Kourtis, K.2
Anastopoulos, N.3
Karakasis, V.4
Koziris, N.5
-
46
-
-
10044233808
-
-
Ph.D. dissertation, University of California, Berkeley, CA, USA
-
R. W. Vuduc, "Automatic Performance Tuning of Sparse Matrix Kernels," Ph.D. dissertation, University of California, Berkeley, CA, USA, 2004.
-
(2004)
Automatic Performance Tuning of Sparse Matrix Kernels
-
-
Vuduc, R.W.1
-
47
-
-
0003550735
-
SPARSKIT: A basic tool kit for sparse matrix computations
-
version 2
-
Y. Saad, "SPARSKIT: a basic tool kit for sparse matrix computations," Tech. Rep., 1994, version 2.
-
(1994)
Tech. Rep.
-
-
Saad, Y.1
-
48
-
-
84896855863
-
yaSpMV: Yet another SpMV framework on GPUs
-
S. Yan, C. Li, Y. Zhang, and H. Zhou, "yaSpMV: Yet Another SpMV Framework on GPUs," in Principles and Practice of Parallel Programming (PPoPP), 2014.
-
(2014)
Principles and Practice of Parallel Programming (PPoPP)
-
-
Yan, S.1
Li, C.2
Zhang, Y.3
Zhou, H.4
-
49
-
-
84864051848
-
clSpMV: A cross-platform OpenCL SpMV framework on GPUs
-
B.-Y. Su and K. Keutzer, "clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs," in Supercomputing (ISC), 2012.
-
(2012)
Supercomputing (ISC)
-
-
Su, B.-Y.1
Keutzer, K.2
-
50
-
-
84911360428
-
A unified sparse matrix data format for efficient general sparse matrix-vector multiply on modern processors with wide SIMD units
-
M. Kreutzer, G. Hager, G. Wellein, H. Fehske, and A. R. Bishop, "A unified sparse matrix data format for efficient general sparse matrix-vector multiply on modern processors with wide SIMD units," SIAM Journal on Scientific Computing, vol. 36, no. 5, 2014.
-
(2014)
SIAM Journal on Scientific Computing
, vol.36
, Issue.5
-
-
Kreutzer, M.1
Hager, G.2
Wellein, G.3
Fehske, H.4
Bishop, A.R.5
-
55
-
-
84983319724
-
Stochastic matrix-function estimators: Scalable big-data kernels with high performance
-
P. W. J. Staar, P. K. Barkoutsos, R. Istrate, A. C. I. Malossi, I. Tavernelli, N. Moll, H. Giefers, C. Hagleitner, C. Bekas, and A. Curioni, "Stochastic Matrix-Function Estimators: Scalable Big-Data Kernels with High Performance," in Parallel and Distributed Processing Symposium (IPDPS), 2016.
-
(2016)
Parallel and Distributed Processing Symposium (IPDPS)
-
-
Staar, P.W.J.1
Barkoutsos, P.K.2
Istrate, R.3
Malossi, A.C.I.4
Tavernelli, I.5
Moll, N.6
Giefers, H.7
Hagleitner, C.8
Bekas, C.9
Curioni, A.10
-
56
-
-
0032683760
-
The case for application-specific benchmarking
-
M. Seltzer, D. Krinsky, K. Smith, and X. Zhang, "The Case for Application-Specific Benchmarking," in Hot Topics in Operating Systems (HOTOS), 1999.
-
(1999)
Hot Topics in Operating Systems (HOTOS)
-
-
Seltzer, M.1
Krinsky, D.2
Smith, K.3
Zhang, X.4
-
60
-
-
84978756887
-
Floating-point megafunctions
-
Altera Corp., "Floating-Point Megafunctions," User Guide, 2013.
-
(2013)
User Guide
-
-
Altera Corp1
-
62
-
-
84988038566
-
A survey and evaluation of FPGA high-level synthesis tools
-
Preprint
-
R. Nane, V.-M. Sima, C. Pilato, J. Choi, B. Fort, A. Canis, Y. Chen, H. Hsiao, S. Brown, F. Ferrandi, J. Anderson, and K. Bertels, "A survey and evaluation of FPGA high-level synthesis tools," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 2016, Preprint.
-
(2016)
Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on
-
-
Nane, R.1
Sima, V.2
Pilato, C.3
Choi, J.4
Fort, B.5
Canis, A.6
Chen, Y.7
Hsiao, H.8
Brown, S.9
Ferrandi, F.10
Anderson, J.11
Bertels, K.12
-
63
-
-
84978767122
-
SDAccel development environment
-
Xilinx, Inc., "SDAccel Development Environment," User Guide, 2015.
-
(2015)
User Guide
-
-
Xilinx, Inc.1
-
64
-
-
84955581883
-
Comparative analysis of OpenCL vs. HDL with image-processing kernels on Stratix-V FPGA
-
K. Hill, S. Craciun, A. George, and H. Lam, "Comparative analysis of OpenCL vs. HDL with image-processing kernels on Stratix-V FPGA," in Application-specific Systems, Architectures and Processors (ASAP), 2015.
-
(2015)
Application-specific Systems, Architectures and Processors (ASAP)
-
-
Hill, K.1
Craciun, S.2
George, A.3
Lam, H.4
-
65
-
-
47249127725
-
The case for energy-proportional computing
-
L. A. Barroso and U. Hölzle, "The Case for Energy-Proportional Computing," Computer, vol. 40, no. 12, 2007.
-
(2007)
Computer
, vol.40
, Issue.12
-
-
Barroso, L.A.1
Hölzle, U.2
-
66
-
-
85021450123
-
Energy aware consolidation for cloud computing
-
S. Srikantaiah, A. Kansal, and F. Zhao, "Energy Aware Consolidation for Cloud Computing," in HotPower, 2008.
-
(2008)
HotPower
-
-
Srikantaiah, S.1
Kansal, A.2
Zhao, F.3
-
67
-
-
84901242759
-
A survey on techniques for improving the energy efficiency of large-scale distributed systems
-
A.-C. Orgerie, M. D. d. Assuncao, and L. Lefevre, "A Survey on Techniques for Improving the Energy Efficiency of Large-scale Distributed Systems," ACM Comput. Surv., vol. 46, no. 4, 2014.
-
(2014)
ACM Comput. Surv.
, vol.46
, Issue.4
-
-
Orgerie, A.-C.1
Assuncao, M.D.D.2
Lefevre, L.3
-
69
-
-
84940769996
-
Energy-efficient microserver based on a 12-core 1.8GHz 188K-CoreMark 28nm bulk CMOS 64b SoC for big-data applications with 159GB/s/L memory bandwidth system density
-
R. Luijten, D. Pham, R. Clauberg, M. Cossale, H. Nguyen, and M. Pandya, "Energy-Efficient Microserver Based on a 12-Core 1.8GHz 188K-CoreMark 28nm Bulk CMOS 64b SoC for Big-Data Applications with 159GB/s/L Memory Bandwidth System Density," in Solid-State Circuits Conference (ISSCC), 2015.
-
(2015)
Solid-State Circuits Conference (ISSCC)
-
-
Luijten, R.1
Pham, D.2
Clauberg, R.3
Cossale, M.4
Nguyen, H.5
Pandya, M.6
-
70
-
-
84964876912
-
Performance and productivity evaluation of hybrid-threading HLS versus HDLs
-
G. Wang, H. Lam, A. George, and G. Edwards, "Performance and Productivity Evaluation of Hybrid-Threading HLS versus HDLs," in High Performance Extreme Computing Conference (HPEC), 2015.
-
(2015)
High Performance Extreme Computing Conference (HPEC)
-
-
Wang, G.1
Lam, H.2
George, A.3
Edwards, G.4