



Volume 56, Issue 4, 2011, Pages 447-466

New Row-grouped CSR format for storing sparse matrices on GPU with implementation in CUDA

Author keywords

CUDA; GPU; Parallel computing; Sparse matrices; SpMV; Thread computing

Indexed keywords

CUDA; GPU; SPARSE MATRICES; SPMV; THREAD COMPUTING;

EID: 84857836454     PISSN: 00017043     EISSN: None     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 17

References (20)
  • 2
    • 84857837437
    • NVIDIA Corporation, May
    • NVIDIA Corporation, CUDA CUBLAS library, PG-00000-002 V3.1, May 2010, http://developer.download.nvidia.com/compute/cuda/3-1/toolkit/docs/CUBLAS-Library-3.1.pdf.
    • (2010) CUDA CUBLAS Library, PG-00000-002 V3.1
  • 3
    • 33845468997
    • LU-GPU: Efficient algorithms for solving dense linear systems on graphics hardware
    • DOI 10.1109/SC.2005.42, Proceedings of the ACM/IEEE 2005 Supercomputing Conference, SC'05
    • N. Galoppo, N. K. Govindaraju, M. Henson, D. Manocha: LU-GPU: Efficient algorithms for solving dense linear systems on graphics hardware. Proceedings ACM/IEEE SC'05, Conference of Supercomputing, Nov. 12-18, 2005, Seattle (USA), doi: 10.1109/SC.2005.42. (Pubitemid 44902346)
    • (2005) Proceedings of the ACM/IEEE 2005 Supercomputing Conference, SC'05 , vol.2005 , pp. 1559955
    • Galoppo, N.; Govindaraju, N.K.; Henson, M.; Manocha, D.
  • 4
    • 67650056991
    • LU, QR and Cholesky factorizations using vector capabilities of GPUs
    • Electrical Engineering and Computer Sciences, University of California, Berkeley
    • V. Volkov, J. Demmel: LU, QR and Cholesky factorizations using vector capabilities of GPUs. Techn. Rep. UCB/EECS-2008-49, Electrical Engineering and Computer Sciences, University of California, Berkeley, 2008.
    • (2008) Techn. Rep. UCB/EECS-2008-49
    • Volkov, V.; Demmel, J.
  • 5
    • 70350368872
    • Efficient sparse matrix-vector multiplication on CUDA
    • NVIDIA Corporation
    • N. Bell, M. Garland: Efficient sparse matrix-vector multiplication on CUDA. Techn. Rep. NVR-2008-004, NVIDIA Corporation 2008.
    • (2008) Techn. Rep. NVR-2008-004
    • Bell, N.; Garland, M.
  • 7
    • 55849145179
    • Improving the performance of multithreaded sparse matrix-vector multiplication using index and value compression
    • Sept. 8-12, Portland (USA)
    • K. Kourtis, G. Goumas, N. Koziris: Improving the performance of multithreaded sparse matrix-vector multiplication using index and value compression. Proc. 37th International Conference on Parallel Processing, Sept. 8-12, 2008, Portland (USA), 511-519.
    • (2008) Proc. 37th International Conference on Parallel Processing , pp. 511-519
    • Kourtis, K.; Goumas, G.; Koziris, N.
  • 11
    • 0013269731
    • The University of Florida sparse matrix collection
    • T. A. Davis, Y. Hu: The University of Florida sparse matrix collection. NA Digest 92 (42), http://www.cise.ufl.edu/research/sparse/matrices/.
    • NA Digest , vol.92 , Issue.42
    • Davis, T.A.; Hu, Y.
  • 13
    • 79953817719
    • NVIDIA Corporation
    • NVIDIA CUDA Programming Guide 3.0. NVIDIA Corporation, 2010, http://developer.download.nvidia.com/compute/cuda/3-0/toolkit/docs/NVIDIA-CUDA-ProgrammingGuide.pdf.
    • (2010) NVIDIA CUDA Programming Guide 3.0
  • 14
    • 77952611196
    • Concurrent number cruncher: A GPU implementation of a general sparse linear solver
    • L. Buatois, G. Caumon, B. Levy: Concurrent number cruncher: a GPU implementation of a general sparse linear solver. Int. J. Parallel Emerg. Distrib. Syst. 24 (2009), 205-223.
    • (2009) Int. J. Parallel Emerg. Distrib. Syst. , vol.24 , pp. 205-223
    • Buatois, L.; Caumon, G.; Levy, B.
  • 16
    • 84857874326
    • Master's thesis, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague
    • J. Vacata: GPGPU: General purpose computation on GPUs. Master's thesis, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, 2008.
    • (2008) GPGPU: General Purpose Computation on GPUs
    • Vacata, J.
  • 17
    • 77949577730
    • Automatically tuning sparse matrix-vector multiplication for GPU architectures
    • Pisa (Italy), Jan. 25-27, (Y. N. Patt, P. Foglia, E. Duesterwald, P. Faraboschi, X. Martorell, eds.), Springer, Berlin 2010
    • A. Monakov, A. Lokhmotov, A. Avetisyan: Automatically tuning sparse matrix-vector multiplication for GPU architectures. Proc. 5th International Conference on High Performance Embedded Architectures and Compilers (HiPEAC 2010), Pisa (Italy), Jan. 25-27, 2010 (Y. N. Patt, P. Foglia, E. Duesterwald, P. Faraboschi, X. Martorell, eds.), Springer, Berlin 2010, 111-125.
    • (2010) Proc. 5th International Conference on High Performance Embedded Architectures and Compilers (HiPEAC 2010) , pp. 111-125
    • Monakov, A.; Lokhmotov, A.; Avetisyan, A.
  • 18
    • 84857885283
    • Nvidia, Cusp 0.1.1. http://code.google.com/p/cusp-library/, 2010.
    • (2010) Nvidia, Cusp 0.1.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.