SCOPUS Information Search Platform

Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012

2012, Pages 1696-1702

Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation

Authors (6): Kreutzer, Moritz (a); Hager, Georg (a); Wellein, Gerhard (a); Fehske, Holger (b); Basermann, Achim (c); Bishop, Alan R. (d)

a Erlangen Regional Computing Center (Germany)

b UNIVERSITY OF GREIFSWALD (Germany)

c GERMAN AEROSPACE CENTER DLR (Germany)

d LOS ALAMOS NATIONAL LABORATORY (United States)

Author keywords

CUDA; GPGPU; Sparse matrices

Indexed keywords

CUDA; DISTRIBUTED MEMORY; GPGPU; MATRIX STRUCTURE; MEMORY FOOTPRINT; MEMORY OVERHEADS; PARALLELIZATIONS; PERFORMANCE BOTTLENECKS; PERFORMANCE MODEL; PERFORMANCE PROPERTIES; SCALABLE IMPLEMENTATION; SPARSE MATRICES; SPARSE MATRIX-VECTOR MULTIPLICATION; SPARSE SOLVERS; SPARSITY PATTERNS; STORAGE FORMATS; TEST SCENARIO;

DISTRIBUTED PARAMETER NETWORKS; MATRIX ALGEBRA;

PROGRAM PROCESSORS;

EID: 84867417216 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/IPDPSW.2012.211 Document Type: Conference Paper

Times cited : (54)

References (13)

1
- 74049143158
- Implementing sparse matrix-vector multiplication on throughput-oriented processors
- DOI:10.1145/1654059.1654078
- N. Bell and M. Garland: Implementing sparse matrix-vector multiplication on throughput-oriented processors. Proc. SC'09. DOI:10.1145/1654059.1654078
- Proc. SC'09
- Bell, N.¹ Garland, M.²

2
- 77749340082
- Model-driven autotuning of sparse matrix-vector multiply on GPUs
- DOI:10.1145/1693453.1693471
- J.W. Choi, A. Singh, and R.W. Vuduc: Model-driven autotuning of sparse matrix-vector multiply on GPUs. Proc. PPoPP'10. DOI:10.1145/1693453.1693471
- Proc. PPoPP'10
- Choi, J.W.¹ Singh, A.² Vuduc, R.W.³

3
- 79955614550
- A new approach for sparse matrix vector product on NVIDIA GPUs
- DOI:10.1002/cpe.1658
- V. Vázquez, J. J. Fernández, and E. M. Garzón: A new approach for sparse matrix vector product on NVIDIA GPUs. Concurrency and Computation: Practice and Experience 23(8), 815-826 (2011). DOI:10.1002/cpe.1658
- (2011) Concurrency and Computation: Practice and Experience , vol.23 , Issue.8 , pp. 815-826
- Vázquez, V.¹ Fernández, J.J.² Garzón, E.M.³

4
- 80052903010
- Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems
- DOI:10.1142/S0129626411000254
- G. Schubert, G. Hager, H. Fehske, and G. Wellein: Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems. Parallel Processing Letters 21(3), 339-358 (2011). DOI:10.1142/S0129626411000254
- (2011) Parallel Processing Letters , vol.21 , Issue.3 , pp. 339-358
- Schubert, G.¹ Hager, G.² Fehske, H.³ Wellein, G.⁴

5
- 84883091562
- Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results
- Accepted for publication in Computers & Fluids. Preprint: http://arxiv.org/abs/1112.0850
- J. Habich, C. Feichtinger, H. Köstler, G. Hager, and G. Wellein: Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results. Accepted for publication in Computers & Fluids. Preprint: http://arxiv.org/abs/1112.0850
- Computers & Fluids
- Habich, J.¹ Feichtinger, C.² Köstler, H.³ Hager, G.⁴ Wellein, G.⁵

6
- 84855246970
- An Introduction to Algebraic Multigrid
- U. Trottenberg et al. (Eds.): Academic Press
- K. Stüben: An Introduction to Algebraic Multigrid. In: U. Trottenberg et al. (Eds.): Multigrid: Basics, Parallelism and Adaptivity, Academic Press (2000).
- (2000) Multigrid: Basics, Parallelism and Adaptivity
- Stüben, K.¹

7
- 84867433749
- http://www.scai.fraunhofer.de/en/business-research-areas/numerical-software/products/samg.html

8
- 83455182306
- Performance limitations for sparse matrix-vector multiplications on current multicore environments
- S. Wagner et al., Springer, ISBN 978-3642138713 DOI:10.1007/978-3-642-13872-0-2
- G. Schubert, G. Hager and H. Fehske: Performance limitations for sparse matrix-vector multiplications on current multicore environments. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 13-26. DOI:10.1007/978-3-642-13872-0-2
- (2010) High Performance Computing in Science and Engineering, Garching/Munich 2009 , pp. 13-26
- Schubert, G.¹ Hager, G.² Fehske, H.³

9
- 80052898254
- HICFD - Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures
- Springer [in print]
- A. Basermann et al.: HICFD - Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures. In: Proceedings of CiHPC, Springer 2011 [in print]
- (2011) Proceedings of CiHPC
- Basermann, A.¹

10
- 73349098372
- Technical Report CNA-150, Center for Numerical Analysis, University of Texas, Aug.
- R. Grimes, D. Kincaid, and D. Young. ITPACK User's Guide. Technical Report CNA-150, Center for Numerical Analysis, University of Texas, Aug. 1979. http://rene.ma.utexas.edu/CNA/ITPACK/
- (1979) ITPACK User's Guide
- Grimes, R.¹ Kincaid, D.² Young, D.³

11
- 21144451281
- Fast sparse matrix-vector multiplication for TFlop/s computers
- J. Palma, J. Dongarra (Ed.): High Performance Computing for Computational Science - VECPAR2002, Springer Berlin DOI:10.1007/3-540-36569-9-18
- G. Wellein, G. Hager, A. Basermann, and H. Fehske: Fast sparse matrix-vector multiplication for TFlop/s computers. In: J. Palma, J. Dongarra (Ed.): High Performance Computing for Computational Science - VECPAR2002, LNCS 2565, Springer Berlin (2003). DOI:10.1007/3-540-36569-9-18
- (2003) LNCS , vol.2565
- Wellein, G.¹ Hager, G.² Basermann, A.³ Fehske, H.⁴

12
- 77949577730
- Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures
- Y. Patt, P. Foglia, E. Duesterwald, P. Faraboschi, X. Martorell (Eds.): Springer, ISBN 978-3-642-11514-1 DOI:10.1007/978-3-642-11515-8-10
- A. Monakov, A. Lokhmotov, A. Avetisyan: Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures. In: Y. Patt, P. Foglia, E. Duesterwald, P. Faraboschi, X. Martorell (Eds.): Lecture Notes in Computer Science, Springer, ISBN 978-3-642-11514-1 (2010), 111-125. DOI:10.1007/978-3-642-11515-8-10
- (2010) Lecture Notes in Computer Science , pp. 111-125
- Monakov, A.¹ Lokhmotov, A.² Avetisyan, A.³

13
- 79958091044
- A Memory Efficient and Fast Sparse Matrix Vector Product on a GPU
- DOI:10.2528/PIER11031607
- A. Dziekonski, A. Lamecki, M. Mrozowski: A Memory Efficient and Fast Sparse Matrix Vector Product on a GPU. Progress In Electromagnetics Research 116, 49-63 (2011). DOI:10.2528/PIER11031607
- (2011) Progress in Electromagnetics Research , vol.116 , pp. 49-63
- Dziekonski, A.¹ Lamecki, A.² Mrozowski, M.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.