SCOPUS Information Search Platform

Proceedings of the 2012 International Conference on High Performance Computing and Simulation, HPCS 2012

Volume , Issue , 2012, Pages 496-502

Accurate CUDA performance modeling for sparse matrix-vector multiplication

(2) Guo, Ping (a); Wang, Liqiang (a)

(a) University of Wyoming (United States)

Author keywords

CUDA; GPU; Performance modeling; Sparse Matrix Vector Multiplication

Indexed keywords

CUDA; EXECUTION TIME; GPU; PERFORMANCE MODELING; SPARSE MATRIX-VECTOR MULTIPLICATION; TEST CASE; CIRCUIT SIMULATION; SOFTWARE ENGINEERING

EID: 84866980089
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/HPCSim.2012.6266964
Document Type: Conference Paper

Times cited (7)

References (18)

1
- 74049143158
- Implementing sparse matrix-vector multiplication on throughput-oriented processors
- N. Bell and M. Garland, "Implementing sparse matrix-vector multiplication on throughput-oriented processors," in SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, New York, NY, USA, 2009, pp. 1-11.
- SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, New York, NY, USA, 2009, pp. 1-11
- Bell, N.¹ Garland, M.²

2
- 84867015272
- GPU performance prediction using parametrized models
- A. Resios and V. Holdermans, "GPU performance prediction using parametrized models," Master's thesis, Utrecht University, 2011.
- (2011) Master's thesis, Utrecht University
- Resios, A.¹ Holdermans, V.²

3
- 0242533311
- Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
- J. Bolz, I. Farmer, E. Grinspun, and P. Schroder, "Sparse matrix solvers on the GPU: conjugate gradients and multigrid," ACM Trans. Graph., vol. 22, no. 3, pp. 917-924, 2003.
- (2003) ACM Trans. Graph., vol. 22, no. 3, pp. 917-924
- Bolz, J.¹ Farmer, I.² Grinspun, E.³ Schroder, P.⁴

4
- 84856182857
- NVIDIA CUDA C Programming Guide, Version 4.0
- NVIDIA CUDA C Programming Guide, Version 4.0, May 2011.
- (2011) NVIDIA CUDA C Programming Guide, Version 4.0

5
- 60649099576
- Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor
- J. Kurzak, W. Alvaro, and J. Dongarra, "Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor," Parallel Comput., vol. 35, no. 3, pp. 138-150, 2009.
- (2009) Parallel Comput., vol. 35, no. 3, pp. 138-150
- Kurzak, J.¹ Alvaro, W.² Dongarra, J.³

6
- 20744452904
- Self-adapting linear algebra algorithms and software
- J. Demmel, J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. Vuduc, R. C. Whaley, and K. Yelick, "Self-adapting linear algebra algorithms and software," Proceedings of the IEEE, vol. 93, no. 2, pp. 293-312, 2005.
- (2005) Proceedings of the IEEE, vol. 93, no. 2, pp. 293-312
- Demmel, J.¹ Dongarra, J.² Eijkhout, V.³ Fuentes, E.⁴ Petitet, A.⁵ Vuduc, R.⁶ Whaley, R.C.⁷ Yelick, K.⁸

7
- 1542501019
- Sparsity: Optimization framework for sparse matrix kernels
- E.-J. Im, K. Yelick, and R. Vuduc, "Sparsity: Optimization framework for sparse matrix kernels," Int. J. High Perform. Comput. Appl., vol. 18, no. 1, pp. 135-158, 2004.
- (2004) Int. J. High Perform. Comput. Appl., vol. 18, no. 1, pp. 135-158
- Im, E.-J.¹ Yelick, K.² Vuduc, R.³

8
- 72849129747
- Optimizing sparse matrix-vector multiplication on GPUs using compile-time and run-time strategies
- M. M. Baskaran and R. Bordawekar, "Optimizing sparse matrix-vector multiplication on GPUs using compile-time and run-time strategies," Research Report RC24704, IBM TJ Watson Research Center, Tech. Rep., December 2008.
- (2008) Research Report RC24704, IBM TJ Watson Research Center, Tech. Rep., December
- Baskaran, M.M.¹ Bordawekar, R.²

9
- 79952428965
- Auto-tuning CUDA parameters for sparse matrix-vector multiplication on GPUs
- Proceedings of the 2010 International Conference on Computational and Information Sciences, ser. ICCIS '10. Washington, DC, USA: IEEE Computer Society
- P. Guo and L. Wang, "Auto-tuning CUDA parameters for sparse matrix-vector multiplication on GPUs," in Proceedings of the 2010 International Conference on Computational and Information Sciences, ser. ICCIS '10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 1154-1157.
- (2010) ICCIS '10, Washington, DC, USA, pp. 1154-1157
- Guo, P.¹ Wang, L.²

10
- 74049114159
- Auto-tuning 3-D FFT library for CUDA GPUs
- A. Nukada and S. Matsuoka, "Auto-tuning 3-D FFT library for CUDA GPUs," in SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, New York, NY, USA, 2009, pp. 1-10.
- SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, New York, NY, USA, 2009, pp. 1-10
- Nukada, A.¹ Matsuoka, S.²

11
- 78249244772
- Improving the performance of the sparse matrix vector product with GPUs
- Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology, ser. CIT '10. Washington, DC, USA: IEEE Computer Society
- F. Vazquez, G. Ortega, J. J. Fernandez, and E. M. Garzon, "Improving the performance of the sparse matrix vector product with GPUs," in Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology, ser. CIT '10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 1146-1151.
- (2010) CIT '10, pp. 1146-1151
- Vazquez, F.¹ Ortega, G.² Fernandez, J.J.³ Garzon, E.M.⁴

12
- 77957679421
- Model-driven autotuning of sparse matrix-vector multiply on GPUs
- J. W. Choi, A. Singh, and R. W. Vuduc, "Model-driven autotuning of sparse matrix-vector multiply on GPUs," in PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, New York, NY, USA, 2010, pp. 115-126.
- PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, New York, NY, USA, 2010, pp. 115-126
- Choi, J.W.¹ Singh, A.² Vuduc, R.W.³

13
- 84886723259
- Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform
- S. Xu, W. Xue, and H. Lin, "Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform," The Journal of Supercomputing, pp. 1-12, 2011.
- (2011) The Journal of Supercomputing, pp. 1-12
- Xu, S.¹ Xue, W.² Lin, H.³

14
- 79955921273
- A quantitative performance analysis model for GPU architectures
- Y. Zhang and J. Owens, "A quantitative performance analysis model for GPU architectures," in High Performance Computer Architecture (HPCA), 2011 IEEE 17th International Symposium on, Feb. 2011, pp. 382-393.
- High Performance Computer Architecture (HPCA), 2011 IEEE 17th International Symposium on, Feb. 2011, pp. 382-393
- Zhang, Y.¹ Owens, J.²

15
- 77749337497
- An adaptive performance modeling tool for GPU architectures
- Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ser. PPoPP '10. New York, NY, USA: ACM
- S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, and W.-m. W. Hwu, "An adaptive performance modeling tool for GPU architectures," in Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ser. PPoPP '10. New York, NY, USA: ACM, 2010, pp. 105-114.
- (2010) PPoPP '10, pp. 105-114
- Baghsorkhi, S.S.¹ Delahaye, M.² Patel, S.J.³ Gropp, W.D.⁴ Hwu, W.W.⁵

16
- 70450231944
- An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
- Proceedings of the 36th Annual International Symposium on Computer Architecture, ser. ISCA '09. New York, NY, USA: ACM
- S. Hong and H. Kim, "An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness," in Proceedings of the 36th Annual International Symposium on Computer Architecture, ser. ISCA '09. New York, NY, USA: ACM, 2009, pp. 152-163.
- (2009) ISCA '09, pp. 152-163
- Hong, S.¹ Kim, H.²

17
- 77952204218
- A performance prediction model for the CUDA GPGPU platform
- K. Kothapalli, R. Mukherjee, M. Rehman, S. Patidar, P. Narayanan, and K. Srinathan, "A performance prediction model for the CUDA GPGPU platform," in High Performance Computing (HiPC), 2009 International Conference on, Dec. 2009, pp. 463-472.
- High Performance Computing (HiPC), 2009 International Conference on, Dec. 2009, pp. 463-472
- Kothapalli, K.¹ Mukherjee, R.² Rehman, M.³ Patidar, S.⁴ Narayanan, P.⁵ Srinathan, K.⁶

18
- 56749158843
- Optimization of sparse matrix-vector multiplication on emerging multicore platforms
- S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel, "Optimization of sparse matrix-vector multiplication on emerging multicore platforms," in Proc. 2007 ACM/IEEE Conference on Supercomputing, 2007.
- Proc. 2007 ACM/IEEE Conference on Supercomputing, 2007
- Williams, S.¹ Oliker, L.² Vuduc, R.³ Shalf, J.⁴ Yelick, K.⁵ Demmel, J.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.