-
1
-
-
74049143158
-
Implementing sparse matrix-vector multiplication on throughput-oriented processors
-
DOI:10.1145/1654059.1654078
-
N. Bell and M. Garland: Implementing sparse matrix-vector multiplication on throughput-oriented processors. Proc. SC'09. DOI:10.1145/1654059.1654078
-
Proc. SC'09
-
-
Bell, N.1
Garland, M.2
-
2
-
-
77749340082
-
Model-driven autotuning of sparse matrix-vector multiply on GPUs
-
DOI:10.1145/1693453.1693471
-
J.W. Choi, A. Singh, and R.W. Vuduc: Model-driven autotuning of sparse matrix-vector multiply on GPUs. Proc. PPoPP'10. DOI:10.1145/1693453.1693471
-
Proc. PPoPP'10
-
-
Choi, J.W.1
Singh, A.2
Vuduc, R.W.3
-
4
-
-
80052903010
-
Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems
-
DOI:10.1142/S0129626411000254
-
G. Schubert, G. Hager, H. Fehske, and G. Wellein: Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems. Parallel Processing Letters 21(3), 339-358 (2011). DOI:10.1142/S0129626411000254
-
(2011)
Parallel Processing Letters
, vol.21
, Issue.3
, pp. 339-358
-
-
Schubert, G.1
Hager, G.2
Fehske, H.3
Wellein, G.4
-
5
-
-
84883091562
-
Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results
-
Accepted for publication in Computers & Fluids. Preprint: http://arxiv.org/abs/1112.0850
-
J. Habich, C. Feichtinger, H. Köstler, G. Hager, and G. Wellein: Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results. Accepted for publication in Computers & Fluids. Preprint: http://arxiv.org/abs/1112.0850
-
Computers & Fluids
-
-
Habich, J.1
Feichtinger, C.2
Köstler, H.3
Hager, G.4
Wellein, G.5
-
6
-
-
84855246970
-
An Introduction to Algebraic Multigrid
-
U. Trottenberg et al. (Eds.): Academic Press
-
K. Stüben: An Introduction to Algebraic Multigrid. In: U. Trottenberg et al. (Eds.): Multigrid: Basics, Parallelism and Adaptivity, Academic Press (2000).
-
(2000)
Multigrid: Basics, Parallelism and Adaptivity
-
-
Stüben, K.1
-
7
-
-
84867433749
-
-
http://www.scai.fraunhofer.de/en/business-research-areas/numerical-software/products/samg.html
-
-
-
-
8
-
-
83455182306
-
Performance limitations for sparse matrix-vector multiplications on current multicore environments
-
S. Wagner et al., Springer, ISBN 978-3642138713 DOI:10.1007/978-3-642-13872-0-2
-
G. Schubert, G. Hager and H. Fehske: Performance limitations for sparse matrix-vector multiplications on current multicore environments. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 13-26. DOI:10.1007/978-3-642-13872-0-2
-
(2010)
High Performance Computing in Science and Engineering, Garching/Munich 2009
, pp. 13-26
-
-
Schubert, G.1
Hager, G.2
Fehske, H.3
-
9
-
-
80052898254
-
HICFD - Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures
-
Springer [in print]
-
A. Basermann et al.: HICFD - Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures. In: Proceedings of CiHPC, Springer 2011 [in print]
-
(2011)
Proceedings of CiHPC
-
-
Basermann, A.1
-
10
-
-
73349098372
-
-
Technical Report CNA-150, Center for Numerical Analysis, University of Texas, Aug.
-
R. Grimes, D. Kincaid, and D. Young. ITPACK User's Guide. Technical Report CNA-150, Center for Numerical Analysis, University of Texas, Aug. 1979. http://rene.ma.utexas.edu/CNA/ITPACK/
-
(1979)
ITPACK User's Guide
-
-
Grimes, R.1
Kincaid, D.2
Young, D.3
-
11
-
-
21144451281
-
Fast sparse matrix-vector multiplication for TFlop/s computers
-
J. Palma, J. Dongarra (Eds.): High Performance Computing for Computational Science - VECPAR2002, Springer Berlin DOI:10.1007/3-540-36569-9-18
-
G. Wellein, G. Hager, A. Basermann, and H. Fehske: Fast sparse matrix-vector multiplication for TFlop/s computers. In: J. Palma, J. Dongarra (Eds.): High Performance Computing for Computational Science - VECPAR2002, LNCS 2565, Springer Berlin (2003). DOI:10.1007/3-540-36569-9-18
-
(2003)
LNCS
, vol.2565
-
-
Wellein, G.1
Hager, G.2
Basermann, A.3
Fehske, H.4
-
12
-
-
77949577730
-
Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures
-
Y. Patt, P. Foglia, E. Duesterwald, P. Faraboschi, X. Martorell (Eds.): Springer, ISBN 978-3-642-11514-1 DOI:10.1007/978-3-642-11515-8-10
-
A. Monakov, A. Lokhmotov, A. Avetisyan: Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures. In: Y. Patt, P. Foglia, E. Duesterwald, P. Faraboschi, X. Martorell (Eds.): Lecture Notes in Computer Science, Springer, ISBN 978-3-642-11514-1 (2010), 111-125. DOI:10.1007/978-3-642-11515-8-10
-
(2010)
Lecture Notes in Computer Science
, pp. 111-125
-
-
Monakov, A.1
Lokhmotov, A.2
Avetisyan, A.3
-
13
-
-
79958091044
-
A Memory Efficient and Fast Sparse Matrix Vector Product on a GPU
-
DOI:10.2528/PIER11031607
-
A. Dziekonski, A. Lamecki, M. Mrozowski: A Memory Efficient and Fast Sparse Matrix Vector Product on a GPU. Progress In Electromagnetics Research 116, 49-63 (2011). DOI:10.2528/PIER11031607
-
(2011)
Progress in Electromagnetics Research
, vol.116
, pp. 49-63
-
-
Dziekonski, A.1
Lamecki, A.2
Mrozowski, M.3