SCOPUS Information Search Platform

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Volume , Issue , 2013, Pages

Accelerating sparse matrix-vector multiplication on GPUs using bit-representation-optimized schemes

(9) Tang, Wai Teng a; Tan, Wen Jun a; Ray, Rajarshi b; Wong, Yi Wen b; Chen, Weiguang b; Kuo, Shyh Hao c; Goh, Rick Siow Mong c; Turner, Stephen John a; Wong, Weng Fai b

a NANYANG TECHNOLOGICAL UNIVERSITY (Singapore)

b NATIONAL UNIVERSITY OF SINGAPORE (Singapore)

c INSTITUTE OF HIGH PERFORMANCE COMPUTING (Singapore)

Author keywords

Data compression; GPU; Matrix vector multiplication; Memory bandwidth; Parallelism; Sparse matrix format

Indexed keywords

CLUSTERING ALGORITHMS; DIGITAL STORAGE; MATRIX ALGEBRA; MULTIPROCESSING SYSTEMS; OPTIMIZATION; PROGRAM PROCESSORS;

GPU; MATRIX VECTOR MULTIPLICATION; MEMORY BANDWIDTHS; PARALLELISM; SPARSE MATRIX FORMATS;

DATA COMPRESSION;

EID: 84899694907 PISSN: 21674329 EISSN: 21674337 Source Type: Conference Proceeding
DOI: 10.1145/2503210.2503234 Document Type: Conference Paper

Times cited : (49)

References (27)

1
- 0030491606
- An approximate minimum degree ordering algorithm
- Oct.
- P. R. Amestoy, T. A. Davis, and I. S. Duff. An approximate minimum degree ordering algorithm. SIAM J. Matrix Anal. Appl., 17(4):886-905, Oct. 1996.
- (1996) SIAM J. Matrix Anal. Appl. , vol.17 , Issue.4 , pp. 886-905
- Amestoy, P.R.¹ Davis, T.A.² Duff, I.S.³

2
- 74049163483
- Technical report, RC24704, IBM T. J. Watson
- M. M. Baskaran and R. Bordawekar. Optimizing sparse matrix-vector multiplication on GPUs. Technical report, RC24704, IBM T. J. Watson, 2009.
- (2009) Optimizing Sparse Matrix-vector Multiplication on GPUs
- Baskaran, M.M.¹ Bordawekar, R.²

3
- 78650279432
- Pattern-based sparse matrix representation for memory-efficient SMVM kernels
- New York, NY, USA
- M. Belgin, G. Back, and C. J. Ribbens. Pattern-based sparse matrix representation for memory-efficient SMVM kernels. In Proceedings of the 23rd international conference on Supercomputing, ICS'09, pages 100-109, New York, NY, USA, 2009.
- (2009) Proceedings of the 23rd International Conference on Supercomputing, ICS'09 , pp. 100-109
- Belgin, M.¹ Back, G.² Ribbens, C.J.³

4
- 74049143158
- Implementing sparse matrix-vector multiplication on throughput-oriented processors
- New York, NY, USA
- N. Bell and M. Garland. Implementing sparse matrix-vector multiplication on throughput-oriented processors. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC'09, pages 18:1-18:11, New York, NY, USA, 2009.
- (2009) Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC'09 , pp. 18:1-18:11
- Bell, N.¹ Garland, M.²

5
- 84899687916
- Cusp library
- N. Bell and M. Garland. Cusp library, 2012. http://cusp-library.googlecode.com.
- (2012)
- Bell, N.¹ Garland, M.²

6
- 77957679421
- Model-driven autotuning of sparse matrix-vector multiply on GPUs
- New York, NY, USA
- J. W. Choi, A. Singh, and R. W. Vuduc. Model-driven autotuning of sparse matrix-vector multiply on GPUs. In Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'10, pages 115-126, New York, NY, USA, 2010.
- (2010) Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'10 , pp. 115-126
- Choi, J.W.¹ Singh, A.² Vuduc, R.W.³

7
- 81355161778
- The University of Florida sparse matrix collection
- Dec.
- T. A. Davis and Y. Hu. The University of Florida sparse matrix collection. ACM Trans. Math. Softw., 38(1):1:1-1:25, Dec. 2011. http://www.cise.ufl.edu/research/sparse/matrices/.
- (2011) ACM Trans. Math. Softw. , vol.38 , Issue.1 , pp. 1:1-1:25
- Davis, T.A.¹ Hu, Y.²

8
- 25144499116
- Vectorized sparse matrix multiply for compressed row storage format
- Berlin, Heidelberg
- E. F. D'Azevedo, M. R. Fahey, and R. T. Mills. Vectorized sparse matrix multiply for compressed row storage format. In Proceedings of the 5th international conference on Computational Science-Volume Part I, ICCS'05, pages 99-106, Berlin, Heidelberg, 2005.
- (2005) Proceedings of the 5th International Conference on Computational Science-Volume Part I, ICCS'05 , pp. 99-106
- D'Azevedo, E.F.¹ Fahey, M.R.² Mills, R.T.³

9
- 0003645035
- Prentice Hall Professional Technical Reference
- A. George and J. W. Liu. Computer Solution of Large Sparse Positive Definite Systems. Prentice Hall Professional Technical Reference, 1981.
- (1981) Computer Solution of Large Sparse Positive Definite Systems
- George, A.¹ Liu, J.W.²

10
- 84858763464
- High-performance sparse matrix-vector multiplication on GPUs for structured grid computations
- New York, NY, USA
- J. Godwin, J. Holewinski, and P. Sadayappan. High-performance sparse matrix-vector multiplication on GPUs for structured grid computations. In Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, GPGPU-5, pages 47-56, New York, NY, USA, 2012.
- (2012) Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, GPGPU-5 , pp. 47-56
- Godwin, J.¹ Holewinski, J.² Sadayappan, P.³

11
- 84864039129
- Automatically generating and tuning GPU code for sparse matrix-vector multiplication from a high-level representation
- New York, NY, USA
- D. Grewe and A. Lokhmotov. Automatically generating and tuning GPU code for sparse matrix-vector multiplication from a high-level representation. In Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-4, pages 12:1-12:8, New York, NY, USA, 2011.
- (2011) Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-4 , pp. 12:1-12:8
- Grewe, D.¹ Lokhmotov, A.²

12
- 0004972603
- PhD thesis, University of California Berkeley
- E.-J. Im. Optimizing the performance of sparse matrix-vector multiplication. PhD thesis, University of California Berkeley, 2000.
- (2000) Optimizing the Performance of Sparse Matrix-vector Multiplication
- Im, E.-J.¹

13
- 84949647432
- Optimizing sparse matrix computations for register reuse in SPARSITY
- London, UK
- E.-J. Im and K. A. Yelick. Optimizing sparse matrix computations for register reuse in SPARSITY. In Proceedings of the International Conference on Computational Sciences-Part I, ICCS'01, pages 127-136, London, UK, 2001.
- (2001) Proceedings of the International Conference on Computational Sciences-Part I, ICCS'01 , pp. 127-136
- Im, E.-J.¹ Yelick, K.A.²

14
- 77950369345
- Data clustering: 50 years beyond k-means
- June
- A. K. Jain. Data clustering: 50 years beyond k-means. Pattern Recogn. Lett., 31(8):651-666, June 2010.
- (2010) Pattern Recogn. Lett. , vol.31 , Issue.8 , pp. 651-666
- Jain, A.K.¹

15
- 84899678759
- D. R. Kincaid, T. C. Oppe, and D. M. Young. ITPACKV 2D User Guide, CNA-232, 1989. http://rene.ma.utexas.edu/CNA/ITPACK/manuals/userv2d/.
- (1989) ITPACKV 2D User Guide, CNA-232
- Kincaid, D.R.¹ Oppe, T.C.² Young, D.M.³

16
- 55849146932
- Optimizing sparse matrix-vector multiplication using index and value compression
- New York, NY, USA
- K. Kourtis, G. Goumas, and N. Koziris. Optimizing sparse matrix-vector multiplication using index and value compression. In Proceedings of the 5th Conference on Computing frontiers, CF'08, pages 87-96, New York, NY, USA, 2008.
- (2008) Proceedings of the 5th Conference on Computing Frontiers, CF'08 , pp. 87-96
- Kourtis, K.¹ Goumas, G.² Koziris, N.³

17
- 2942628343
- Optimizing sparse matrix-vector product computations using unroll and jam
- May
- J. Mellor-Crummey and J. Garvin. Optimizing sparse matrix-vector product computations using unroll and jam. Int. J. High Perform. Comput. Appl., 18(2):225-236, May 2004.
- (2004) Int. J. High Perform. Comput. Appl. , vol.18 , Issue.2 , pp. 225-236
- Mellor-Crummey, J.¹ Garvin, J.²

18
- 77949577730
- Automatically tuning sparse matrix-vector multiplication for GPU architectures
- Berlin, Heidelberg
- A. Monakov, A. Lokhmotov, and A. Avetisyan. Automatically tuning sparse matrix-vector multiplication for GPU architectures. In Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers, HiPEAC'10, pages 111-125, Berlin, Heidelberg, 2010.
- (2010) Proceedings of the 5th International Conference on High Performance Embedded Architectures and Compilers, HiPEAC'10 , pp. 111-125
- Monakov, A.¹ Lokhmotov, A.² Avetisyan, A.³

19
- 84899678731
- Nvidia Compute Unified Device Architecture (CUDA). http://www.nvidia.com/object/cuda_home_new.html.
- Nvidia Compute Unified Device Architecture (CUDA)

20
- 85031264203
- Improving performance of sparse matrix-vector multiplication
- New York, NY, USA
- A. Pinar and M. T. Heath. Improving performance of sparse matrix-vector multiplication. In Proceedings of the 1999 ACM/IEEE conference on Supercomputing, Supercomputing'99, New York, NY, USA, 1999.
- (1999) Proceedings of the 1999 ACM/IEEE Conference on Supercomputing, Supercomputing'99
- Pinar, A.¹ Heath, M.T.²

21
- 24144467633
- Iterative methods for sparse linear systems
- Philadelphia, PA, USA, 2nd edition
- Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2nd edition, 2003.
- (2003) Society for Industrial and Applied Mathematics
- Saad, Y.¹

22
- 84864051848
- clSpMV: A cross-platform OpenCL SpMV framework on GPUs
- New York, NY, USA
- B.-Y. Su and K. Keutzer. clSpMV: A cross-platform OpenCL SpMV framework on GPUs. In Proceedings of the 26th ACM international conference on Supercomputing, ICS'12, pages 353-364, New York, NY, USA, 2012.
- (2012) Proceedings of the 26th ACM International Conference on Supercomputing, ICS'12 , pp. 353-364
- Su, B.-Y.¹ Keutzer, K.²

23
- 79955614550
- A new approach for sparse matrix vector product on NVIDIA GPUs
- June
- F. Vazquez, J. J. Fernandez, and E. M. Garzon. A new approach for sparse matrix vector product on NVIDIA GPUs. Concurr. Comput.: Pract. Exper., 23(8):815-826, June 2011.
- (2011) Concurr. Comput.: Pract. Exper. , vol.23 , Issue.8 , pp. 815-826
- Vazquez, F.¹ Fernandez, J.J.² Garzon, E.M.³

24
- 24344485098
- OSKI: A library of automatically tuned sparse matrix kernels
- R. Vuduc, J. W. Demmel, and K. A. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. In Proc. SciDAC, J. Physics: Conf. Ser., volume 16, pages 521-530, 2005.
- (2005) Proc. SciDAC, J. Physics: Conf. Ser. , vol.16 , pp. 521-530
- Vuduc, R.¹ Demmel, J.W.² Yelick, K.A.³

25
- 33646389518
- Fast sparse matrix-vector multiplication by exploiting variable block structure
- Berlin, Heidelberg
- R. W. Vuduc and H.-J. Moon. Fast sparse matrix-vector multiplication by exploiting variable block structure. In Proceedings of the First international conference on High Performance Computing and Communications, HPCC'05, pages 807-816, Berlin, Heidelberg, 2005.
- (2005) Proceedings of the First International Conference on High Performance Computing and Communications, HPCC'05 , pp. 807-816
- Vuduc, R.W.¹ Moon, H.-J.²

26
- 34547468948
- Accelerating sparse matrix computations via data compression
- New York, NY, USA
- J. Willcock and A. Lumsdaine. Accelerating sparse matrix computations via data compression. In Proceedings of the 20th annual international conference on Supercomputing, ICS'06, pages 307-316, New York, NY, USA, 2006.
- (2006) Proceedings of the 20th Annual International Conference on Supercomputing, ICS'06 , pp. 307-316
- Willcock, J.¹ Lumsdaine, A.²

27
- 56749158843
- Optimization of sparse matrix-vector multiplication on emerging multicore platforms
- New York, NY, USA
- S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In Proceedings of the 2007 ACM/IEEE conference on Supercomputing, SC'07, pages 38:1-38:12, New York, NY, USA, 2007.
- (2007) Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, SC'07 , pp. 38:1-38:12
- Williams, S.¹ Oliker, L.² Vuduc, R.³ Shalf, J.⁴ Yelick, K.⁵ Demmel, J.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.