SCOPUS Information Search Platform

Parallel Computing

Volume 38, Issue 8, 2012, Pages 408-420

Automatic tuning of the sparse matrix vector product on GPUs based on the ELLR-T approach

(3) Vázquez, Francisco (a); Fernández, José Jesús (a,b); Garzón, Ester M. (a)

a UNIVERSITY OF ALMERÍA (Spain)

b CENTRO NACIONAL DE BIOTECNOLOGÍA (Spain)

Author keywords

GPU computing; GPU performance modeling; Sparse matrix vector product

Indexed keywords

ACCELERATION FACTORS; AUTOMATIC TUNING; AUTOTUNING; COMPARATIVE ANALYSIS; EVALUATION RESULTS; GPU COMPUTING; GRAPHICS PROCESSING UNITS; MEMORY OPERATIONS; OPTIMUM SELECTION; PERFORMANCE MODELING; SPARSE MATRICES; TEST MATRIX; TWO PARAMETER;

COMPUTER GRAPHICS; MATRIX ALGEBRA; MEMORY ARCHITECTURE;

PROGRAM PROCESSORS;

EID: 84862637273 PISSN: 01678191 EISSN: None Source Type: Journal
DOI: 10.1016/j.parco.2011.08.003 Document Type: Article

Times cited : (35)

References (21)

1
- 77950518538
- A matrix approach to tomographic reconstruction and its implementation on GPUs
- F. Vázquez, E.M. Garzón, J.J. Fernández, A matrix approach to tomographic reconstruction and its implementation on GPUs, Journal of Structural Biology 170 (2010) 146-151.
- (2010) Journal of Structural Biology , vol.170 , pp. 146-151
- Vázquez, F.¹ Garzón, E.M.² Fernández, J.J.³

2
- 0040667844
- Sparse matrix computations on parallel processor arrays
- A.T. Ogielski, W. Aiello, Sparse matrix computations on parallel processor arrays, SIAM Journal of Scientific Computing 14 (1992) 519-530.
- (1992) SIAM Journal of Scientific Computing , vol.14 , pp. 519-530
- Ogielski, A.T.¹ Aiello, W.²

3
- 0031269220
- Improving the memory-system performance of sparse-matrix vector multiplication
- S. Toledo, Improving memory-system performance of sparse matrix-vector multiplication, IBM Journal of Research and Development 41 (6) (1997) 711-725. (Pubitemid 127557044)
- (1997) IBM Journal of Research and Development , vol.41 , Issue.6 , pp. 711-725
- Toledo, S.¹

4
- 60949098907
- Optimization of sparse matrix-vector multiplication on emerging multicore platforms
- S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, J. Demmel, Optimization of sparse matrix-vector multiplication on emerging multicore platforms, Parallel Computing 35 (3) (2009) 178-194.
- (2009) Parallel Computing , vol.35 , Issue.3 , pp. 178-194
- Williams, S.¹ Oliker, L.² Vuduc, R.³ Shalf, J.⁴ Yelick, K.⁵ Demmel, J.⁶

5
- 34547309668
- NVIDIA Version 2.3
- NVIDIA, CUDA Programming guide. Version 2.3, 2009.
- (2009) CUDA Programming Guide

6
- 74349092397
- Khronos Group
- Khronos Group, OpenCL - the open standard for parallel programming of heterogeneous systems.
- OpenCL - The Open Standard for Parallel Programming of Heterogeneous Systems

7
- 74049143158
- Implementing sparse matrix-vector multiplication on throughput-oriented processors
- ACM New York, NY, USA
- N. Bell, M. Garland, Implementing sparse matrix-vector multiplication on throughput-oriented processors, in: SC'09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, ACM, New York, NY, USA, 2009, pp. 1-11.
- (2009) SC'09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 1-11
- Bell, N.¹ Garland, M.²

8
- 77952611196
- Concurrent number cruncher - A GPU implementation of a general sparse linear solver
- L. Buatois, G. Caumon, B. Lévy, Concurrent number cruncher - A GPU implementation of a general sparse linear solver, International Journal of Parallel Emergent and Distributed Systems 24 (3) (2009) 205-223.
- (2009) International Journal of Parallel Emergent and Distributed Systems , vol.24 , Issue.3 , pp. 205-223
- Buatois, L.¹ Caumon, G.² Lévy, B.³

9
- 74049163483
- Optimizing sparse matrix-vector multiplication on GPUs
- April
- M.M. Baskaran, R. Bordawekar, Optimizing sparse matrix-vector multiplication on GPUs, Tech. Rep. Research Report RC24704, IBM, April 2009.
- (2009) Tech. Rep. Research Report RC24704, IBM
- Baskaran, M.M.¹ Bordawekar, R.²

10
- 77749340082
- Model-driven autotuning of sparse matrix-vector multiply on GPUs
- ACM New York, NY, USA
- J.W. Choi, A. Singh, R. Vuduc, Model-driven autotuning of sparse matrix-vector multiply on GPUs, in: PPoPP'10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM, New York, NY, USA, 2010, pp. 115-126.
- (2010) PPoPP'10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming , pp. 115-126
- Choi, J.W.¹ Singh, A.² Vuduc, R.³

11
- 77949577730
- Automatically tuning sparse matrix-vector multiplication for GPU architectures
- LNCS 5952
- A. Monakov, A. Lokhmotov, A. Avetisyan, Automatically tuning sparse matrix-vector multiplication for GPU architectures, in: Proceedings of HiPEAC 2010, LNCS 5952, 2010, pp. 111-125.
- (2010) Proceedings of HiPEAC 2010 , pp. 111-125
- Monakov, A.¹ Lokhmotov, A.² Avetisyan, A.³

12
- 79955614550
- A new approach for sparse matrix vector product on NVIDIA GPUs
- F. Vázquez, J.J. Fernández, E.M. Garzón, A new approach for sparse matrix vector product on NVIDIA GPUs, Concurrency and Computation: Practice and Experience.
- Concurrency and Computation: Practice and Experience
- F. Vázquez¹

13
- 78249244772
- Improving the performance of the sparse matrix vector product with GPUs
- IEEE Computer Society
- F. Vázquez, G. Ortega, J.J. Fernández, E.M. Garzón, Improving the performance of the sparse matrix vector product with GPUs, in: 10th IEEE International Conference on Computer and Information Technology. CIT 2010, IEEE Computer Society, 2010, pp. 1146-1151.
- (2010) 10th IEEE International Conference on Computer and Information Technology. CIT 2010 , pp. 1146-1151
- F. Vázquez¹

14
- 0043281832
- ITPACKV 2D User's guide
- University of Texas at Austin
- D.R. Kincaid, T.C. Oppe, D.M. Young, ITPACKV 2D User's guide, Tech. Rep. CNA-232, Center for Numerical Analysis. University of Texas at Austin, 1989.
- (1989) Tech. Rep. CNA-232, Center for Numerical Analysis
- Kincaid, D.R.¹ Oppe, T.C.² Young, D.M.³

15
- 70450231944
- An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
- News 37 ACM, New York, NY, USA
- S. Hong, H. Kim, An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness, in: SIGARCH Comput. Archit. News 37, vol. 3, ACM, New York, NY, USA, 2009, pp. 152-163.
- (2009) SIGARCH Comput. Archit. , vol.3 , pp. 152-163
- Hong, S.¹ Kim, H.²

16
- 77957561221
- An adaptive performance modeling tool for gpu architectures
- ACM New York, NY, USA
- S. Baghsorkhi, M. Delahaye, S. Patel, W. Gropp, W. Hwu, An adaptive performance modeling tool for gpu architectures, in: PPoPP'10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM, New York, NY, USA, 2010, pp. 105-114.
- (2010) PPoPP'10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming , pp. 105-114
- Baghsorkhi, S.¹ Delahaye, M.² Patel, S.³ Gropp, W.⁴ Hwu, W.⁵

17
- 84862692992
- Next generation CUDA architecture
- NVIDIA
- NVIDIA, Next generation CUDA architecture. Fermi Architecture, 2010.
- (2010) Fermi Architecture

18
- 84862677879
- CUDA C programming
- NVIDIA CUDA Toolkit 2.3, July
- NVIDIA, CUDA C programming. Best practices guide. CUDA Toolkit 2.3, July 2009.
- (2009) Best Practices Guide

19
- 60649099576
- Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor
- Revolutionary Technologies for Acceleration of Emerging Petascale Applications
- J. Kurzak, W. Alvaro, J. Dongarra, Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor, Parallel Computing 35 (3) (2009) 138-150. Revolutionary Technologies for Acceleration of Emerging Petascale Applications.
- (2009) Parallel Computing , vol.35 , Issue.3 , pp. 138-150
- Kurzak, J.¹ Alvaro, W.² Dongarra, J.³

20
- 70349247125
- INTEL, Math kernel library
- INTEL, Math kernel library. reference manual, 2009.
- (2009) Reference Manual

21
- 78249255327
- M.M. Baskaran, R. Bordawekar, Sparse matrix-vector multiplication toolkit for graphics processing units, 2009.
- (2009) Sparse Matrix-vector Multiplication Toolkit for Graphics Processing Units
- Baskaran, M.M.¹ Bordawekar, R.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.