Volume , Issue , 2012, Pages 496-502

Accurate CUDA performance modeling for sparse matrix-vector multiplication

Author keywords

CUDA; GPU; Performance modeling; Sparse Matrix Vector Multiplication

Indexed keywords

CUDA; EXECUTION TIME; GPU; PERFORMANCE MODELING; SPARSE MATRIX-VECTOR MULTIPLICATION; TEST CASE

EID: 84866980089     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/HPCSim.2012.6266964     Document Type: Conference Paper
Times cited: 7

References (18)
  • 3
    • 0242533311
    • Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
    • J. Bolz, I. Farmer, E. Grinspun, and P. Schroder, "Sparse matrix solvers on the GPU: conjugate gradients and multigrid," ACM Trans. Graph., vol. 22, no. 3, pp. 917-924, 2003.
    • (2003) ACM Trans. Graph. , vol.22 , Issue.3 , pp. 917-924
    • Bolz, J.1    Farmer, I.2    Grinspun, E.3    Schroder, P.4
  • 5
    • 60649099576
    • Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor
    • J. Kurzak, W. Alvaro, and J. Dongarra, "Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor," Parallel Comput., vol. 35, no. 3, pp. 138-150, 2009.
    • (2009) Parallel Comput. , vol.35 , Issue.3 , pp. 138-150
    • Kurzak, J.1    Alvaro, W.2    Dongarra, J.3
  • 7
    • 1542501019
    • Sparsity: Optimization framework for sparse matrix kernels
    • E.-J. Im, K. Yelick, and R. Vuduc, "Sparsity: Optimization framework for sparse matrix kernels," Int. J. High Perform. Comput. Appl., vol. 18, no. 1, pp. 135-158, 2004.
    • (2004) Int. J. High Perform. Comput. Appl. , vol.18 , Issue.1 , pp. 135-158
    • Im, E.-J.1    Yelick, K.2    Vuduc, R.3
  • 9
    • 79952428965
    • Auto-tuning CUDA parameters for sparse matrix-vector multiplication on GPUs
    • Proceedings of the 2010 International Conference on Computational and Information Sciences, ser. ICCIS '10. Washington, DC, USA: IEEE Computer Society
    • P. Guo and L. Wang, "Auto-tuning CUDA parameters for sparse matrix-vector multiplication on GPUs," in Proceedings of the 2010 International Conference on Computational and Information Sciences, ser. ICCIS '10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 1154-1157.
    • (2010) ICCIS '10, Washington, DC, USA, pp. 1154-1157
    • Guo, P.1    Wang, L.2
  • 11
    • 78249244772
    • Improving the performance of the sparse matrix vector product with gpus
    • Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology, ser. CIT '10. Washington, DC, USA: IEEE Computer Society
    • F. Vazquez, G. Ortega, J. J. Fernandez, and E. M. Garzon, "Improving the performance of the sparse matrix vector product with gpus," in Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology, ser. CIT '10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 1146-1151.
    • (2010) CIT '10 , pp. 1146-1151
    • Vazquez, F.1    Ortega, G.2    Fernandez, J.J.3    Garzon, E.M.4
  • 13
    • 84886723259
    • Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform
    • S. Xu, W. Xue, and H. Lin, "Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform," The Journal of Supercomputing, pp. 1-12, 2011.
    • (2011) The Journal of Supercomputing , pp. 1-12
    • Xu, S.1    Xue, W.2    Lin, H.3
  • 15
    • 77749337497
    • An adaptive performance modeling tool for GPU architectures
    • Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ser. PPoPP '10. New York, NY, USA: ACM
    • S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, and W.-m. W. Hwu, "An adaptive performance modeling tool for GPU architectures," in Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ser. PPoPP '10. New York, NY, USA: ACM, 2010, pp. 105-114.
    • (2010) PPoPP '10 , pp. 105-114
    • Baghsorkhi, S.S.1    Delahaye, M.2    Patel, S.J.3    Gropp, W.D.4    Hwu, W.W.5
  • 16
    • 70450231944
    • An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
    • Proceedings of the 36th annual international symposium on Computer architecture, ser. ISCA '09. New York, NY, USA: ACM
    • S. Hong and H. Kim, "An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness," in Proceedings of the 36th annual international symposium on Computer architecture, ser. ISCA '09. New York, NY, USA: ACM, 2009, pp. 152-163.
    • (2009) ISCA '09 , pp. 152-163
    • Hong, S.1    Kim, H.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.