SCOPUS Information Retrieval Platform

Proceedings of the TeraGrid 2011 Conference: Extreme Digital Discovery, TG'11

Volume , Issue , 2011, Pages

A model-driven partitioning and auto-tuning integrated framework for sparse matrix-vector multiplication on GPUs

(6) Guo, Pingᵃ Huang, Heᵃ Chen, Qichangᵃ Wang, Liqiangᵃ Lee, En Juiᵃ Chen, Poᵃ

ᵃ University of Wyoming (United States)

Author keywords

auto tuning; GPU; performance modeling; sparse matrix vector multiplication

Indexed keywords

AUTOTUNING; EXECUTION TIME; GPU; GRAPHICS PROCESSING UNIT; HIGH-PERFORMANCE COMPUTING; INTEGRATED FRAMEWORKS; MODEL-DRIVEN; PERFORMANCE IMPROVEMENTS; PERFORMANCE MODEL; PERFORMANCE MODELING; PROCESSING CAPABILITY; SPARSE MATRICES; SPARSE MATRIX-VECTOR MULTIPLICATION; STORAGE FORMATS;

COMPUTER GRAPHICS EQUIPMENT; COMPUTER SOFTWARE SELECTION AND EVALUATION; PROGRAM PROCESSORS; VECTORS;

MATRIX ALGEBRA;

EID: 80052311496 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2016741.2016744 Document Type: Conference Paper

Times cited : (31)

References (12)

1
- 60649099576
- Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor
- J. Kurzak, W. Alvaro, and J. Dongarra. Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor. Parallel Comput., 35(3):138-150, 2009.
- (2009) Parallel Comput. , vol.35 , Issue.3 , pp. 138-150
- Kurzak, J.¹ Alvaro, W.² Dongarra, J.³

2
- 56749158843
- Optimization of sparse matrix-vector multiplication on emerging multicore platforms
- S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In Proc. 2007 ACM/IEEE Conference on Supercomputing, 2007.
- In Proc. 2007 ACM/IEEE Conference on Supercomputing, 2007
- Williams, S.¹ Oliker, L.² Vuduc, R.³ Shalf, J.⁴ Yelick, K.⁵ Demmel, J.⁶

3
- 74049143158
- Implementing sparse matrix-vector multiplication on throughput-oriented processors
- N. Bell and M. Garland. Implementing sparse matrix-vector multiplication on throughput-oriented processors. In SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pages 1-11, New York, NY, USA, 2009.
- (2009) SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 1-11
- Bell, N.¹ Garland, M.²

4
- 72849129747
- Optimizing sparse matrix-vector multiplication on gpus using compile-time and run-time strategies
- M. M. Baskaran and R. Bordawekar. Optimizing sparse matrix-vector multiplication on gpus using compile-time and run-time strategies. Technical report, Research Report RC24704, IBM TJ Watson Research Center, December 2008.
- (2008) Technical report, Research Report RC24704, IBM TJ Watson Research Center
- Baskaran, M.M.¹ Bordawekar, R.²

5
- 67650694407
- NVIDIA CUDA (Compute Unified Device Architecture): Programming Guide, Version 2.1
- NVIDIA CUDA (Compute Unified Device Architecture): Programming Guide, Version 2.1, December 2008.
- (2008) NVIDIA CUDA (Compute Unified Device Architecture): Programming Guide, Version 2.1

6
- 20744452904
- Self-adapting linear algebra algorithms and software
- J. Demmel, J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. C. Whaley, R. Vuduc, and K. Yelick. Self-adapting linear algebra algorithms and software. Proceedings of the IEEE, 93(2):293-312, 2005.
- (2005) Proceedings of the IEEE , vol.93 , Issue.2 , pp. 293-312
- Demmel, J.¹ Dongarra, J.² Eijkhout, V.³ Fuentes, E.⁴ Petitet, A.⁵ Whaley, R.C.⁶ Vuduc, R.⁷ Yelick, K.⁸

7
- 1542501019
- Sparsity: Optimization framework for sparse matrix kernels
- Eun-Jin Im, Katherine Yelick, and Richard Vuduc. Sparsity: Optimization framework for sparse matrix kernels. Int. J. High Perform. Comput. Appl., 18(1):135-158, 2004.
- (2004) Int. J. High Perform. Comput. Appl. , vol.18 , Issue.1 , pp. 135-158
- Im, E.-J.¹ Yelick, K.² Vuduc, R.³

8
- 0242533311
- Sparse matrix solvers on the gpu: Conjugate gradients and multigrid
- J. Bolz, I. Farmer, E. Grinspun, and P. Schroder. Sparse matrix solvers on the gpu: Conjugate gradients and multigrid. ACM Trans. Graph., 22(3):917-924, 2003.
- (2003) ACM Trans. Graph. , vol.22 , Issue.3 , pp. 917-924
- Bolz, J.¹ Farmer, I.² Grinspun, E.³ Schroder, P.⁴

9
- 77957679421
- Model-driven autotuning of sparse matrix-vector multiply on gpus
- J. W. Choi, A. Singh, and R. W. Vuduc. Model-driven autotuning of sparse matrix-vector multiply on gpus. In PPoPP '10: Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 115-126, New York, NY, USA, 2010.
- (2010) PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming , pp. 115-126
- Choi, J.W.¹ Singh, A.² Vuduc, R.W.³

10
- 70350368872
- Efficient sparse matrix-vector multiplication on cuda
- N. Bell and M. Garland. Efficient sparse matrix-vector multiplication on cuda. Technical report, Nvidia Technical Report NVR-2008-004, 2008.
- (2008) Nvidia Technical Report NVR-2008-004
- Bell, N.¹ Garland, M.²

11
- 74049114159
- Auto-tuning 3-d fft library for cuda gpus
- A. Nukada and S. Matsuoka. Auto-tuning 3-d fft library for cuda gpus. In SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pages 1-10, New York, NY, USA, 2009.
- (2009) SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 1-10
- Nukada, A.¹ Matsuoka, S.²

12
- 79952428965
- Auto-tuning cuda parameters for sparse matrix-vector multiplication on gpus
- Ping Guo and Liqiang Wang. Auto-tuning cuda parameters for sparse matrix-vector multiplication on gpus. In International Conference on Computational and Information Sciences, pages 1154-1157, 2010.
- (2010) International Conference on Computational and Information Sciences , pp. 1154-1157
- Guo, P.¹ Wang, L.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.