-
4
-
-
80052311496
-
A model-driven partitioning and auto-tuning integrated framework for sparse matrix-vector multiplication on GPUs
-
P. Guo, H. Huang, Q. Chen, L. Wang, E.-J. Lee, and P. Chen, "A Model-Driven Partitioning and Auto-Tuning Integrated Framework for Sparse Matrix-Vector Multiplication on GPUs," Proc. TeraGrid Conf. Extreme Digital Discovery (TG '11), pp. 2:1-2:8, 2011.
-
(2011)
Proc. TeraGrid Conf. Extreme Digital Discovery (TG '11)
, pp. 2:1-2:8
-
-
Guo, P.1
Huang, H.2
Chen, Q.3
Wang, L.4
Lee, E.-J.5
Chen, P.6
-
5
-
-
0242533311
-
Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
-
J. Bolz, I. Farmer, E. Grinspun, and P. Schroder, "Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid," ACM Trans. Graphics, vol. 22, no. 3, pp. 917-924, 2003.
-
(2003)
ACM Trans. Graphics
, vol.22
, Issue.3
, pp. 917-924
-
-
Bolz, J.1
Farmer, I.2
Grinspun, E.3
Schroder, P.4
-
6
-
-
84898683364
-
-
NVIDIA CUDA C Programming Guide, Version 4.0, May 2011
-
NVIDIA CUDA C Programming Guide, Version 4.0, May 2011.
-
-
-
-
7
-
-
60649099576
-
Optimizing matrix multiplication for a short-vector SIMD architecture-Cell processor
-
J. Kurzak, W. Alvaro, and J. Dongarra, "Optimizing Matrix Multiplication for a Short-Vector SIMD Architecture-Cell Processor," J. Parallel Computing, vol. 35, no. 3, pp. 138-150, 2009.
-
(2009)
J. Parallel Computing
, vol.35
, Issue.3
, pp. 138-150
-
-
Kurzak, J.1
Alvaro, W.2
Dongarra, J.3
-
8
-
-
1542501019
-
Sparsity: Optimization framework for sparse matrix kernels
-
E.-J. Im, K. Yelick, and R. Vuduc, "Sparsity: Optimization Framework for Sparse Matrix Kernels," Int'l J. High Performance Computing Applications, vol. 18, no. 1, pp. 135-158, 2004.
-
(2004)
Int'l J. High Performance Computing Applications
, vol.18
, Issue.1
, pp. 135-158
-
-
Im, E.-J.1
Yelick, K.2
Vuduc, R.3
-
10
-
-
20744452904
-
Self-adapting linear algebra algorithms and software
-
Feb.
-
J. Demmel, J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R.C.W.R. Vuduc, and K. Yelick, "Self-Adapting Linear Algebra Algorithms and Software," Proc. IEEE, vol. 93, no. 2, pp. 293-312, Feb. 2005.
-
(2005)
Proc. IEEE
, vol.93
, Issue.2
, pp. 293-312
-
-
Demmel, J.1
Dongarra, J.2
Eijkhout, V.3
Fuentes, E.4
Petitet, A.5
Vuduc, R.C.W.R.6
Yelick, K.7
-
12
-
-
78249244772
-
Improving the performance of the sparse matrix vector product with GPUs
-
F. Vazquez, G. Ortega, J.J. Fernandez, and E.M. Garzon, "Improving the Performance of the Sparse Matrix Vector Product with GPUs," Proc. 10th IEEE Int'l Conf. Computer and Information Technology (CIT '10), pp. 1146-1151, 2010.
-
(2010)
Proc. 10th IEEE Int'l Conf. Computer and Information Technology (CIT '10)
, pp. 1146-1151
-
-
Vazquez, F.1
Ortega, G.2
Fernandez, J.J.3
Garzon, E.M.4
-
13
-
-
77949577730
-
Automatically tuning sparse matrix-vector multiplication for GPU architectures
-
A. Monakov, A. Lokhmotov, and A. Avetisyan, "Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures," Proc. Fifth Int'l Conf. High Performance Embedded Architectures and Compilers (HiPEAC '10), pp. 111-125, 2010.
-
(2010)
Proc. Fifth Int'l Conf. High Performance Embedded Architectures and Compilers (HiPEAC '10)
, pp. 111-125
-
-
Monakov, A.1
Lokhmotov, A.2
Avetisyan, A.3
-
15
-
-
77956072107
-
Optimizing sparse matrix-vector multiplication on CUDA
-
June
-
Z. Wang, X. Xu, W. Zhao, Y. Zhang, and S. He, "Optimizing Sparse Matrix-Vector Multiplication on CUDA," Proc. Second Int'l Conf. Education Technology and Computer (ICETC '10), vol. 4, pp. V4-109-V4-113, June 2010.
-
(2010)
Proc. Second Int'l Conf. Education Technology and Computer (ICETC '10)
, vol.4
-
-
Wang, Z.1
Xu, X.2
Zhao, W.3
Zhang, Y.4
He, S.5
-
16
-
-
84857332778
-
Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs
-
J.C. Pichel, F.F. Rivera, M. Fernandez, and A. Rodriguez, "Optimization of Sparse Matrix-Vector Multiplication Using Reordering Techniques on GPUs," Microprocessors and Microsystems, vol. 36, no. 2, pp. 65-77, 2012.
-
(2012)
Microprocessors and Microsystems
, vol.36
, Issue.2
, pp. 65-77
-
-
Pichel, J.C.1
Rivera, F.F.2
Fernandez, M.3
Rodriguez, A.4
-
17
-
-
84862123284
-
Fast sparse matrix-vector multiplication on GPUs: Implications for graph mining
-
Jan.
-
X. Yang, S. Parthasarathy, and P. Sadayappan, "Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining," Proc. VLDB Endowment, vol. 4, no. 4, pp. 231-242, Jan. 2011.
-
(2011)
Proc. VLDB Endowment
, vol.4
, Issue.4
, pp. 231-242
-
-
Yang, X.1
Parthasarathy, S.2
Sadayappan, P.3
-
18
-
-
43449094719
-
Program optimization space pruning for a multithreaded GPU
-
DOI 10.1145/1356058.1356084, Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization
-
S. Ryoo, C.I. Rodrigues, S.S. Stone, S.S. Baghsorkhi, S.-Z. Ueng, J.A. Stratton, and W.-M.W. Hwu, "Program Optimization Space Pruning for a Multithreaded GPU," Proc. ACM Sixth Ann. IEEE/ACM Int'l Symp. Code Generation and Optimization (CGO '08), pp. 195-204, 2008. (Pubitemid 351667266)
-
(2008)
Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization
, pp. 195-204
-
-
Ryoo, S.1
Rodrigues, C.I.2
Stone, S.S.3
Baghsorkhi, S.S.4
Ueng, S.-Z.5
Stratton, J.A.6
Hwu, W.-M.W.7
-
19
-
-
77957679421
-
Model-driven autotuning of sparse matrix-vector multiply on GPUs
-
J.W. Choi, A. Singh, and R.W. Vuduc, "Model-Driven Autotuning of Sparse Matrix-Vector Multiply on GPUs," Proc. 15th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP '10), pp. 115-126, 2010.
-
(2010)
Proc. 15th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP '10)
, pp. 115-126
-
-
Choi, J.W.1
Singh, A.2
Vuduc, R.W.3
-
21
-
-
84886727304
-
Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform
-
S. Xu, W. Xue, and H. Lin, "Performance Modeling and Optimization of Sparse Matrix-Vector Multiplication on NVIDIA CUDA Platform," J. Supercomputing, vol. 63, pp. 710-721, 2013.
-
(2013)
J. Supercomputing
, vol.63
, pp. 710-721
-
-
Xu, S.1
Xue, W.2
Lin, H.3
-
23
-
-
77957561221
-
An adaptive performance modeling tool for GPU architectures
-
S.S. Baghsorkhi, M. Delahaye, S.J. Patel, W.D. Gropp, and W.-M.W. Hwu, "An Adaptive Performance Modeling Tool for GPU Architectures," Proc. 15th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP '10), pp. 105-114, 2010.
-
(2010)
Proc. 15th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP '10)
, pp. 105-114
-
-
Baghsorkhi, S.S.1
Delahaye, M.2
Patel, S.J.3
Gropp, W.D.4
Hwu, W.-M.W.5
-
24
-
-
70450231944
-
An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
-
S. Hong and H. Kim, "An Analytical Model for a GPU Architecture with Memory-Level and Thread-Level Parallelism Awareness," Proc. 36th ACM Ann. Int'l Symp. Computer Architecture (ISCA '09), pp. 152-163, 2009.
-
(2009)
Proc. 36th ACM Ann. Int'l Symp. Computer Architecture (ISCA '09)
, pp. 152-163
-
-
Hong, S.1
Kim, H.2
-
25
-
-
77952204218
-
A performance prediction model for the CUDA GPGPU platform
-
Dec.
-
K. Kothapalli, R. Mukherjee, M. Rehman, S. Patidar, P. Narayanan, and K. Srinathan, "A Performance Prediction Model for the CUDA GPGPU Platform," Proc. Int'l Conf. High Performance Computing (HiPC '09), pp. 463-472, Dec. 2009.
-
(2009)
Proc. Int'l Conf. High Performance Computing (HiPC '09)
, pp. 463-472
-
-
Kothapalli, K.1
Mukherjee, R.2
Rehman, M.3
Patidar, S.4
Narayanan, P.5
Srinathan, K.6
-
26
-
-
81355161778
-
The University of Florida sparse matrix collection
-
T.A. Davis and Y. Hu, "The University of Florida Sparse Matrix Collection," ACM Trans. Math. Software, vol. 38, no. 1, pp. 1:1-1:25, 2011.
-
(2011)
ACM Trans. Math. Software
, vol.38
, Issue.1
, pp. 1:1-1:25
-
-
Davis, T.A.1
Hu, Y.2
-
27
-
-
56749158843
-
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
-
S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel, "Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms," Proc. ACM/IEEE Conf. Supercomputing, 2007.
-
(2007)
Proc. ACM/IEEE Conf. Supercomputing
-
-
Williams, S.1
Oliker, L.2
Vuduc, R.3
Shalf, J.4
Yelick, K.5
Demmel, J.6