SCOPUS information search platform

International Journal of High Performance Computing Applications

Volume 28, Issue 2, 2014, Pages 183-195

Optimization of quasi-diagonal matrix-vector multiplication on GPU

(5) Yang, Wangdong a,b,c; Li, Kenli a,b,c; Liu, Yan b; Shi, Lin b,c; Wan, Lanjun b,c

a HUNAN CITY UNIVERSITY (China)

b HUNAN UNIVERSITY (China)

c National Supercomputing Centre in Changsha (China)

Author keywords

compute unified device architecture (CUDA); Graphics processing unit (GPU); quasi diagonal matrix; sparse matrix; sparse matrix vector multiplication (SpMV)

Indexed keywords

COMPRESSION RATIO (MACHINERY); COMPUTER GRAPHICS; DIFFERENTIAL EQUATIONS; LINEAR ALGEBRA; PARALLEL ARCHITECTURES; PROGRAM PROCESSORS;

COMPUTE UNIFIED DEVICE ARCHITECTURE(CUDA); GRAPHICS PROCESSING UNIT; QUASI-DIAGONAL MATRICES; SPARSE MATRICES; SPARSE MATRIX-VECTOR MULTIPLICATION;

COMPUTER GRAPHICS EQUIPMENT;

EID: 84900536807 PISSN: 10943420 EISSN: 17412846 Source Type: Journal
DOI: 10.1177/1094342013501126 Document Type: Article

Times cited : (38)

References (29)

1
- 72849129747
- Baskaran MM, Bordawekar R Optimizing sparse matrix-vector multiplication on GPUs using compile-time and run-time strategies. 2008 :
- (2008) Optimizing Sparse Matrix-vector Multiplication on GPUs Using Compile-time and Run-time Strategies
- Baskaran, M.M.¹ Bordawekar, R.²

2
- 84900537722
- Bell N, Garland M 2008 :
- (2008)
- Bell, N.¹ Garland, M.²

3
- 77956238872
- NY: New York
- Boyer B, Dumas J-G, Giorgi P Proceedings of the 4th international workshop on parallel and symbolic computation (PASCO'10). NY: New York ; 2010 July 2010: 80-88.
- (2010) Proceedings of the 4th International Workshop on Parallel and Symbolic Computation (PASCO'10) , pp. 80-88
- Boyer, B.¹ Dumas, J.-G.² Giorgi, P.³

4
- 84900550633
- Buatois L, Caumon G, Levy B High performance computing and communications - third international conference (HPCC'07). 2010: 358-371.
- (2010) High Performance Computing and Communications - Third International Conference (HPCC'07) , pp. 358-371
- Buatois, L.¹ Caumon, G.² Levy, B.³

5
- 68849128505
- LA: Baton Rouge
- Cevahir A, Nukada A, Matsuoka S Proceedings of the international conference on computational science (ICCS'09). LA: Baton Rouge ; 2009 May 2009: 893-903.
- (2009) Proceedings of the International Conference on Computational Science (ICCS'09) , pp. 893-903
- Cevahir, A.¹ Nukada, A.² Matsuoka, S.³

6
- 77957679421
- India: Bangalore
- Choi JW, Singh A, Vuduc RW Proceedings of the 15th ACM SIGPLAN annual symposium on principles and practice of parallel programming (PPoPP'10). India: Bangalore ; 2010 January 2010: 115-126.
- (2010) Proceedings of the 15th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP'10) , pp. 115-126
- Choi, J.W.¹ Singh, A.² Vuduc, R.W.³

7
- 77954837026
- Finite element sparse matrix vector multiplication on graphic processing units
- Dehnavi MM, Fernandez DM, Giannacopoulos D. Finite element sparse matrix vector multiplication on graphic processing units. IEEE Transactions on Magnetics. 2010 ; 46 (8). 2982-2985
- (2010) IEEE Transactions on Magnetics , vol.46 , Issue.8 , pp. 2982-2985
- Dehnavi, M.M.¹ Fernandez, D.M.² Giannacopoulos, D.³

8
- 84856613262
- Feng X, Jin H, Zheng R, et al Proceedings of the 2011 IEEE 17th international conference on parallel and distributed systems (ICPADS2011). 2011: 165-172.
- (2011) Proceedings of the 2011 IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS2011) , pp. 165-172
- Feng, X.¹ Jin, H.² Zheng, R.³

9
- 70449908600
- FL: Miami
- Fujimoto N Proceedings of IEEE international symposium on parallel and distributed processing (IPDPS'08). FL: Miami ; 2008 April 2008: 1-8.
- (2008) Proceedings of IEEE International Symposium on Parallel and Distributed Processing (IPDPS'08) , pp. 1-8
- Fujimoto, N.¹

10
- 79951628991
- Optimizing sparse data structures for matrix-vector multiply
- Guo D, Gropp W. Optimizing sparse data structures for matrix-vector multiply. International Journal of High Performance Computing Applications. 2011 ; 25 (1). 115-131
- (2011) International Journal of High Performance Computing Applications , vol.25 , Issue.1 , pp. 115-131
- Guo, D.¹ Gropp, W.²

11
- 67650661447
- NVIDIA Developer Technology (accessed 11 September 2012)
- Harris M (2007) Optimizing parallel reduction in CUDA. NVIDIA Developer Technology. Available at: http://developer.download.nvidia.com/assets/cuda/files/reduction.pdf (accessed 11 September 2012).
- (2007) Optimizing Parallel Reduction in CUDA
- Harris, M.¹

12
- 78149317310
- Australia: Melbourne
- Hugues MR, Petiton SG Proceedings of 12th IEEE international conference on high performance computing and communications (HPCC'10). Australia: Melbourne ; 2010 September 2010: 122-129.
- (2010) Proceedings of 12th IEEE International Conference on High Performance Computing and Communications (HPCC'10) , pp. 122-129
- Hugues, M.R.¹ Petiton, S.G.²

13
- 77951427635
- Austria: Vienna
- Karakasis V, Goumas G, Koziris N Proceedings of the 2009 international conference on parallel processing (ICPP'09). Austria: Vienna ; 2009 September 2009: 356-364.
- (2009) Proceedings of the 2009 International Conference on Parallel Processing (ICPP'09) , pp. 356-364
- Karakasis, V.¹ Goumas, G.² Koziris, N.³

14
- 79251596328
- Parallelization methods for implementation of discharge simulation along resin insulator surfaces
- Li K, et al. Parallelization methods for implementation of discharge simulation along resin insulator surfaces. Computers & Electrical Engineering. 2011 ; 37 (1). 30-40
- (2011) Computers & Electrical Engineering , vol.37 , Issue.1 , pp. 30-40
- Li, K.¹

15
- 77949577730
- Italy: Pisa
- Monakov A, Lokhmotov A, Avetisyan A Proceedings of international conference on high-performance and embedded architectures and compilers (HiPEAC'10). Italy: Pisa ; 2010 July 2010: 111-115.
- (2010) Proceedings of International Conference on High-performance and Embedded Architectures and Compilers (HiPEAC'10) , pp. 111-115
- Monakov, A.¹ Lokhmotov, A.² Avetisyan, A.³

16
- 84900545484
- NVIDIA 3rd ed (accessed 11 September 2012)
- NVIDIA (2013) A library for sparse linear algebra and graph computations on CUDA (CUSP), 3rd ed. Available at: https://github.com/cusplibrary/cusplibrary (accessed 11 September 2012).
- (2013) A Library for Sparse Linear Algebra and Graph Computations on CUDA (CUSP)

17
- 84876747158
- NVIDIA 4th ed. (accessed 11 September 2012)
- NVIDIA (2012a) CUDA Toolkit 4.2 CUBLAS Library, 4th ed. Available at: http://docs.nvidia.com/cuda/cublas/index.html (accessed 11 September 2012).
- (2012) CUDA Toolkit 4.2 CUBLAS Library

18
- 84900558239
- NVIDIA 2nd ed. (accessed 11 September 2012)
- NVIDIA (2012b) The NVIDIA CUDA sparse matrix library (cuSPARSE), 2nd ed. Available at: http://docs.nvidia.com/cuda/cusparse/index.html (accessed 11 September 2012).
- (2012) The NVIDIA CUDA Sparse Matrix Library (CuSPARSE)

19
- 51049091381
- Ohshima S, Kise K, Katagiri T, et al Proceedings of 7th international meeting on high performance computing for computational science (VECPAR'06). 2006: 305-318.
- (2006) Proceedings of 7th International Meeting on High Performance Computing for Computational Science (VECPAR'06) , pp. 305-318
- Ohshima, S.¹ Kise, K.² Katagiri, T.³

20
- 84857332778
- Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs
- Pichel JC, Rivera FF, Fernández M, et al. Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs. Microprocessors and Microsystems. 2012 ; 36 (2). 65-77
- (2012) Microprocessors and Microsystems , vol.36 , Issue.2 , pp. 65-77
- Pichel, J.C.¹ Rivera, F.F.² Fernández, M.³

21
- 84863648377
- Tuning solution of large non-Hermitian linear systems on multiple graphics processing unit accelerated workstations
- Ries F, De Marco T, Guerrieri R. Tuning solution of large non-Hermitian linear systems on multiple graphics processing unit accelerated workstations. International Journal of High Performance Computing Applications. 2012 ; 26 (3). 296-309
- (2012) International Journal of High Performance Computing Applications , vol.26 , Issue.3 , pp. 296-309
- Ries, F.¹ De Marco, T.² Guerrieri, R.³

22
- 0003550735
- (accessed 11 September 2012)
- Saad Y (2005) Sparskit: a basic tool-kit for sparse matrix computations, version 2. Available at: http://www-users.cs.umn.edu/saad/software/SPARSKIT/sparskit.html (accessed 11 September 2012).
- (2005) Sparskit: A Basic Tool-kit for Sparse Matrix Computations, Version 2
- Saad, Y.¹

23
- 84863697759
- A framework for GPU accelerated deformable object modeling
- Shahingohar A, Eagleson R. A framework for GPU accelerated deformable object modeling. International Journal of High Performance Computing Applications. 2011 ; 26 (3). 203-214
- (2011) International Journal of High Performance Computing Applications , vol.26 , Issue.3 , pp. 203-214
- Shahingohar, A.¹ Eagleson, R.²

24
- 84900558910
- Shan Y, Wu T, Wang Y, et al Proceedings of IEEE 8th symposium on application specific processors (SASP'10). 2010: 67-70.
- (2010) Proceedings of IEEE 8th Symposium on Application Specific Processors (SASP'10) , pp. 67-70
- Shan, Y.¹ Wu, T.² Wang, Y.³

25
- 84900552750
- University of Florida (accessed 11 September 2012)
- University of Florida (2011) UF sparse matrix collection. Available at: http://www.cise.ufl.edu/research/sparse/matrices/groups.html (accessed 11 September 2012).
- (2011) UF Sparse Matrix Collection

26
- 78249244772
- Vázquez F, Ortega G, Fernández JJ, et al Proceedings of IEEE international conference on computer and information technology (CIT'10). 2010: 1146-1151.
- (2010) Proceedings of IEEE International Conference on Computer and Information Technology (CIT'10) , pp. 1146-1151
- Vázquez, F.¹ Ortega, G.² Fernández, J.J.³

27
- 79951616484
- Vuduc R, Chandramowlishwaran A, Choi J, et al Proceedings of the 2nd USENIX conference on hot topics in parallelism (HotPar'10). 2010: 13-13.
- (2010) Proceedings of the 2nd USENIX Conference on Hot Topics in Parallelism (HotPar'10) , pp. 13-13
- Vuduc, R.¹ Chandramowlishwaran, A.² Choi, J.³

28
- 79958031324
- A novel security-driven scheduling algorithm for precedence-constrained tasks in heterogeneous distributed systems
- Xiaoyong T, Li K, Zeng Z, et al. A novel security-driven scheduling algorithm for precedence-constrained tasks in heterogeneous distributed systems. IEEE Transactions on Computers. 2011 ; 60 (7). 1017-1029
- (2011) IEEE Transactions on Computers , vol.60 , Issue.7 , pp. 1017-1029
- Xiaoyong, T.¹ Li, K.² Zeng, Z.³

29
- 84862123284
- Fast sparse matrix-vector multiplication on GPUs: Implications for graph mining
- Xintian Y, Parthasarathy S, Sadayappan P. Fast sparse matrix-vector multiplication on GPUs: implications for graph mining. Proceedings of the VLDB Endowment. 2011 ; 4 (4). 231-242
- (2011) Proceedings of the VLDB Endowment , vol.4 , Issue.4 , pp. 231-242
- Xintian, Y.¹ Parthasarathy, S.² Sadayappan, P.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.