SCOPUS Information Retrieval Platform

ACM SIGPLAN Notices

Volume 45, Issue 5, 2010, Pages 115-125

Model-driven autotuning of sparse matrix-vector multiply on GPUs

(3) Choi, Jee W (a); Singh, Amik (b); Vuduc, Richard W (c)

a GEORGIA INSTITUTE OF TECHNOLOGY (United States)

b INDIAN INSTITUTE OF TECHNOLOGY ROORKEE (India)

c GEORGIA INSTITUTE OF TECHNOLOGY (United States)

Author keywords

GPU; Performance modeling; Sparse matrix vector multiplication

Indexed keywords

AUTOTUNING; COMPRESSED SPARSE ROW; DIRECTLY MODEL; EXHAUSTIVE SEARCH; GPU; GRAPHICS PROCESSING UNIT; INPUT MATRICES; MODEL-DRIVEN; MULTITHREADED; OFFLINE; PARAMETER-TUNING; PERFORMANCE LIMITATIONS; PERFORMANCE MODEL; PERFORMANCE MODELING; PERFORMANCE TUNING; RUNTIMES; SPARSE MATRICES; SPARSE MATRIX-VECTOR MULTIPLICATION; STORAGE FORMATS; VECTOR PROCESSORS; PROGRAM PROCESSORS; TUNING; VECTORS

EID: 77957679421 PISSN: 15232867 EISSN: None Source Type: Journal
DOI: 10.1145/1837853.1693471 Document Type: Conference Paper

Times cited : (195)

References (22)

1
- 77957660323
- NVIDIA CUDA (Compute Unified Device Architecture): Programming Guide Version 2.1 December
- NVIDIA CUDA (Compute Unified Device Architecture): Programming Guide, Version 2.1, December 2008.
- (2008)

2
- 72849129747
- Technical Report RC24704 (W0812-047) IBM T.J.Watson Research Center Yorktown Heights NY USA December
- Muthu Manikandan Baskaran and Rajesh Bordawekar. Optimizing sparse matrix-vector multiplication on GPUs using compile-time and run-time strategies. Technical Report RC24704 (W0812-047), IBM T.J.Watson Research Center, Yorktown Heights, NY, USA, December 2008.
- (2008) Optimizing Sparse Matrix-Vector Multiplication on GPUs using Compile-Time and Run-Time Strategies
- Baskaran, M.M.¹ Bordawekar, R.²

3
- 77956260008
- Efficient sparse matrix-vector multiplication on CUDA
- Portland, OR, USA, November, (to appear)
- Nathan Bell and Michael Garland. Efficient sparse matrix-vector multiplication on CUDA. In Proc. ACM/IEEE Conf. Supercomputing (SC), Portland, OR, USA, November 2009. (to appear).
- (2009) Proc. ACM/IEEE Conf. Supercomputing (SC)
- Bell, N.¹ Garland, M.²

4
- 77953998137
- Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
- San Diego, CA, USA, July
- Jeff Bolz, Ian Farmer, Eitan Grinspun, and Peter Schröder. Sparse matrix solvers on the GPU: Conjugate gradients and multigrid. In Proc. Special Interest Group on Graphics Conf. (SIGGRAPH), San Diego, CA, USA, July 2003. doi: http://dx.doi.org/10.1145/882262.882364.
- (2003) Proc. Special Interest Group on Graphics Conf. (SIGGRAPH)
- Bolz, J.¹ Farmer, I.² Grinspun, E.³ Schröder, P.⁴

5
- 70449699793
- General-purpose sparse matrix building blocks using the NVIDIA CUDA technology platform
- Boston, MA, USA, October
- Matthias Christen and Olaf Schenk. General-purpose sparse matrix building blocks using the NVIDIA CUDA technology platform. In Proc. Workshop on General-Purpose Processing on Graphics Processing Units (GPGPU), Boston, MA, USA, October 2007.
- (2007) Proc. Workshop on General-Purpose Processing on Graphics Processing Units (GPGPU)
- Christen, M.¹ Schenk, O.²

6
- 25144499116
- Vectorized sparse matrix multiply for compressed row storage
- 2005 of LNCS, Springer Berlin / Heidelberg
- Eduardo F. D'Azevedo, Mark R. Fahey, and Richard T. Mills. Vectorized sparse matrix multiply for compressed row storage. In Proc. Int'l. Conf. Computational Science (ICCS), volume 3514/2005 of LNCS, pages 99-106. Springer Berlin / Heidelberg, 2005. doi: http://dx.doi.org/10.1007/11428831_13.
- (2005) Proc. Int'l. Conf. Computational Science (ICCS) , vol.3514 , pp. 99-106
- D'Azevedo, E.F.¹ Fahey, M.R.² Mills, R.T.³

7
- 20744452904
- Self-adapting linear algebra algorithms and software
- February
- James Demmel, Jack Dongarra, Viktor Eijkhout, Erika Fuentes, Antoine Petitet, Richard Vuduc, R. Clint Whaley, and Katherine Yelick. Self-adapting linear algebra algorithms and software. Proc. IEEE, 93(2):293-312, February 2005. doi: http://dx.doi.org/10.1109/JPROC.2004.840848.
- (2005) Proc. IEEE , vol.93 , Issue.2 , pp. 293-312
- Demmel, J.¹ Dongarra, J.² Eijkhout, V.³ Fuentes, E.⁴ Petitet, A.⁵ Vuduc, R.⁶ Whaley, R.C.⁷ Yelick, K.⁸

8
- 51549093017
- Sparse matrix computations on manycore GPUs
- Anaheim, CA, USA
- Michael Garland. Sparse matrix computations on manycore GPUs. In Proc. ACM/IEEE Design Automation Conf. (DAC), pages 2-6, Anaheim, CA, USA, 2008. doi: http://dx.doi.org/10.1145/1391469.1391473.
- (2008) Proc. ACM/IEEE Design Automation Conf. (DAC) , pp. 2-6
- Garland, M.¹

9
- 0035370546
- Towards a fast sparse symmetric matrix-vector multiplication
- June
- Roman Geus and Stefan Röllin. Towards a fast sparse symmetric matrix-vector multiplication. Parallel Computing, 27(7):883-896, June 2001. doi: http://dx.doi.org/10.1016/S0167-8191(01)00073-74
- (2001) Parallel Computing , vol.27 , Issue.7 , pp. 883-896
- Geus, R.¹ Röllin, S.²

10
- 70450231944
- An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
- Austin, TX, USA, June
- Sunpyo Hong and Hyesoon Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In Proc. ACM Int'l. Symp. Comp. Arch. (ISCA), pages 152-163, Austin, TX, USA, June 2009. doi: http://dx.doi.org/10.1145/1555815.1555775.
- (2009) Proc. ACM Int'l. Symp. Comp. Arch. (ISCA) , pp. 152-163
- Hong, S.¹ Kim, H.²

11
- 1542501019
- SPARSITY: Optimization framework for sparse matrix kernels
- February
- Eun-Jin Im, Katherine Yelick, and Richard Vuduc. SPARSITY: Optimization framework for sparse matrix kernels. Int'l J. of High Performance Computing Applications (IJHPCA), 18(1):135-158, February 2004. doi: http://dx.doi.org/10.1177/1094342004041296.
- (2004) Int'l J. of High Performance Computing Applications (IJHPCA) , vol.18 , Issue.1 , pp. 135-158
- Im, E.-J.¹ Yelick, K.² Vuduc, R.³

12
- 35248834555
- Parallel finite element analysis platform for the Earth Simulator: GeoFEM
- of LNCS, Springer
- Hiroshi Okuda, Kengo Nakajima, Mikio Iizuka, Li Chen, and Hisashi Nakamura. Parallel finite element analysis platform for the Earth Simulator: GeoFEM. In Proc. Int'l. Conf. Computational Science (ICCS), volume 2659 of LNCS, pages 773-780. Springer, 2003. doi: http://dx.doi.org/10.1007/3-540-44863-2_75.
- (2003) Proc. Int'l. Conf. Computational Science (ICCS) , vol.2659 , pp. 773-780
- Okuda, H.¹ Nakajima, K.² Iizuka, M.³ Chen, L.⁴ Nakamura, H.⁵

13
- 85031264203
- Improving performance of sparse matrix-vector multiplication
- Portland, OR, USA
- Ali Pinar and Michael T. Heath. Improving performance of sparse matrix-vector multiplication. In Proc. ACM/IEEE Conf. Supercomputing (SC), Portland, OR, USA, 1999. doi: http://dx.doi.org/10.1145/331532.331562.
- (1999) Proc. ACM/IEEE Conf. Supercomputing (SC)
- Pinar, A.¹ Heath, M.T.²

14
- 0003877041
- Springer Verlag
- John R. Rice and Ronald F. Boisvert. Solving elliptic problems using ELLPACK. Springer Verlag, 1984.
- (1984) Solving Elliptic Problems using ELLPACK
- Rice, J.R.¹ Boisvert, R.F.²

15
- 0003550735
- version 2., March
- Yousef Saad. SPARSKIT: A basic tool kit for sparse matrix computations, version 2. http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html, March 2005.
- (2005) SPARSKIT: A Basic Tool Kit for Sparse Matrix Computations
- Saad, Y.¹

16
- 78651284120
- Scan primitives for GPU computing
- San Diego, CA, USA
- Shubhabrata Sengupta, Mark Harris, Yao Zhang, and John D. Owens. Scan primitives for GPU computing. In Proc. ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware, San Diego, CA, USA, 2007.
- (2007) Proc. ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware
- Sengupta, S.¹ Harris, M.² Zhang, Y.³ Owens, J.D.⁴

17
- 70350771131
- Benchmarking GPUs to tune dense linear algebra
- Austin, TX, USA, November
- Vasily Volkov and James W. Demmel. Benchmarking GPUs to tune dense linear algebra. In Proc. ACM/IEEE Conf. on Supercomputing (SC), Austin, TX, USA, November 2008.
- (2008) Proc. ACM/IEEE Conf. on Supercomputing (SC)
- Volkov, V.¹ Demmel, J.W.²

18
- 24344485098
- OSKI: A library of automatically tuned sparse matrix kernels
- Richard Vuduc, James W. Demmel, and Katherine A. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. In Proc. SciDAC, J. Phys.: Conf. Series, volume 16, pages 521-530, 2005. doi: http://dx.doi.org/10.1088/1742-6596/16/1/071.
- (2005) Proc. SciDAC, J. Phys.: Conf. Series , vol.16 , pp. 521-530
- Vuduc, R.¹ Demmel, J.W.² Yelick, K.A.³

19
- 10044233808
- PhD thesis, University of California, Berkeley, CA, USA, January
- Richard W. Vuduc. Automatic performance tuning of sparse matrix kernels. PhD thesis, University of California, Berkeley, CA, USA, January 2004.
- (2004) Automatic Performance Tuning of Sparse Matrix Kernels
- Vuduc, R.W.¹

20
- 33646389518
- Fast sparse matrix-vector multiplication by exploiting variable block structure
- LNCS, Sorrento, Italy, September, Springer
- Richard W. Vuduc and Hyun-Jin Moon. Fast sparse matrix-vector multiplication by exploiting variable block structure. In Proc. High-Performance Computing and Communications Conf., volume LNCS 3726/2005, pages 807-816, Sorrento, Italy, September 2005. Springer. doi: http://dx.doi.org/10.1007/11557654_91.
- (2005) Proc. High-Performance Computing and Communications Conf. , vol.3726 , pp. 807-816
- Vuduc, R.W.¹ Moon, H.-J.²

21
- 60949098907
- Optimizing sparse matrix-vector multiply on emerging multicore platforms
- March
- Sam Williams, Richard Vuduc, Leonid Oliker, John Shalf, Katherine Yelick, and James Demmel. Optimizing sparse matrix-vector multiply on emerging multicore platforms. Journal of Parallel Computing, 35(3):178-194, March 2009. doi: http://dx.doi.org/10.1016/j.parco.2008.12.006.
- (2009) Journal of Parallel Computing , vol.35 , Issue.3 , pp. 178-194
- Williams, S.¹ Vuduc, R.² Oliker, L.³ Shalf, J.⁴ Yelick, K.⁵ Demmel, J.⁶

22
- 20744459570
- Is search really necessary to generate high-performance BLAS?
- February
- Kamen Yotov, Xiaoming Li, Gang Ren, María Jesús Garzarán, David Padua, Keshav Pingali, and Paul Stodghill. Is search really necessary to generate high-performance BLAS? Proc. IEEE, 93(2):358-386, February 2005. doi: .
- (2005) Proc IEEE , vol.93 , Issue.2 , pp. 358-386
- Yotov, K.¹ Li, X.² Ren, G.³ Garzarán, M.J.⁴ Padua, D.⁵ Pingali, K.⁶ Stodghill, P.⁷

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.