



Volume 28, Issue 2, 2014, Pages 183-195

Optimization of quasi-diagonal matrix-vector multiplication on GPU

Author keywords

compute unified device architecture (CUDA); Graphics processing unit (GPU); quasi diagonal matrix; sparse matrix; sparse matrix vector multiplication (SpMV)

Indexed keywords

COMPRESSION RATIO (MACHINERY); COMPUTER GRAPHICS; DIFFERENTIAL EQUATIONS; LINEAR ALGEBRA; PARALLEL ARCHITECTURES; PROGRAM PROCESSORS

EID: 84900536807     PISSN: 10943420     EISSN: 17412846     Source Type: Journal    
DOI: 10.1177/1094342013501126     Document Type: Article
Times cited: 38

References (29)
  • 7
    • 77954837026
    • Finite element sparse matrix vector multiplication on graphic processing units
    • Dehnavi MM, Fernandez DM, Giannacopoulos D. Finite element sparse matrix vector multiplication on graphic processing units. IEEE Transactions on Magnetics. 2010 ; 46 (8). 2982-2985
    • (2010) IEEE Transactions on Magnetics , vol.46 , Issue.8 , pp. 2982-2985
    • Dehnavi, M.M.1    Fernandez, D.M.2    Giannacopoulos, D.3
  • 11
    • 67650661447
    • NVIDIA Developer Technology (accessed 11 September 2012)
    • Harris M (2007) Optimizing parallel reduction in CUDA. NVIDIA Developer Technology. Available at: http://developer.download.nvidia.com/assets/cuda/files/reduction.pdf (accessed 11 September 2012).
    • (2007) Optimizing Parallel Reduction in CUDA
    • Harris, M.1
  • 14
    • 79251596328
    • Parallelization methods for implementation of discharge simulation along resin insulator surfaces
    • Li K, et al. Parallelization methods for implementation of discharge simulation along resin insulator surfaces. Computers & Electrical Engineering. 2011 ; 37 (1). 30-40
    • (2011) Computers & Electrical Engineering , vol.37 , Issue.1 , pp. 30-40
    • Li, K.1
  • 17
    • 84876747158
    • NVIDIA 4th ed. (accessed 11 September 2012)
    • NVIDIA (2012a) CUDA Toolkit 4.2 CUBLAS Library, 4th ed. Available at: http://docs.nvidia.com/cuda/cublas/index.html (accessed 11 September 2012).
    • (2012) CUDA Toolkit 4.2 CUBLAS Library
  • 18
    • 84900558239
    • NVIDIA 2nd ed. (accessed 11 September 2012)
    • NVIDIA (2012b) The NVIDIA CUDA sparse matrix library (cuSPARSE), 2nd ed. Available at: http://docs.nvidia.com/cuda/cusparse/index.html (accessed 11 September 2012).
    • (2012) The NVIDIA CUDA Sparse Matrix Library (CuSPARSE)
  • 20
    • 84857332778
    • Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs
    • Pichel JC, Rivera FF, Fernández M, et al. Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs. Microprocessors and Microsystems. 2012 ; 36 (2). 65-77
    • (2012) Microprocessors and Microsystems , vol.36 , Issue.2 , pp. 65-77
    • Pichel, J.C.1    Rivera, F.F.2    Fernández, M.3
  • 21
    • 84863648377
    • Tuning solution of large non-Hermitian linear systems on multiple graphics processing unit accelerated workstations
    • Ries F, De Marco T, Guerrieri R. Tuning solution of large non-Hermitian linear systems on multiple graphics processing unit accelerated workstations. International Journal of High Performance Computing Applications. 2012 ; 26 (3). 296-309
    • (2012) International Journal of High Performance Computing Applications , vol.26 , Issue.3 , pp. 296-309
    • Ries, F.1    De Marco, T.2    Guerrieri, R.3
  • 25
    • 84900552750
    • University of Florida (accessed 11 September 2012)
    • University of Florida (2011) UF sparse matrix collection. Available at: http://www.cise.ufl.edu/research/sparse/matrices/groups.html (accessed 11 September 2012).
    • (2011) UF Sparse Matrix Collection
  • 28
    • 79958031324
    • A novel security-driven scheduling algorithm for precedence-constrained tasks in heterogeneous distributed systems
    • Xiaoyong T, Li K, Zeng Z, et al. A novel security-driven scheduling algorithm for precedence-constrained tasks in heterogeneous distributed systems. IEEE Transactions on Computers. 2011 ; 60 (7). 1017-1029
    • (2011) IEEE Transactions on Computers , vol.60 , Issue.7 , pp. 1017-1029
    • Xiaoyong, T.1    Li, K.2    Zeng, Z.3
  • 29
    • 84862123284
    • Fast sparse matrix-vector multiplication on GPUs: Implications for graph mining
    • Xintian Y, Parthasarathy S, Sadayappan P. Fast sparse matrix-vector multiplication on GPUs: implications for graph mining. Proceedings of the VLDB Endowment. 2011 ; 4 (4). 231-242
    • (2011) Proceedings of the VLDB Endowment , vol.4 , Issue.4 , pp. 231-242
    • Xintian, Y.1    Parthasarathy, S.2    Sadayappan, P.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.