



Volume 26, Issue 1, 2015, Pages 196-205

Performance analysis and optimization for SpMV on GPU using probabilistic modeling

Author keywords

GPU; performance modeling; probability mass function; sparse matrix vector multiplication

Indexed keywords

COMPUTER HARDWARE; FUNCTIONS; GRAPHICS PROCESSING UNIT;

EID: 84919470072     PISSN: 10459219     EISSN: None     Source Type: Journal    
DOI: 10.1109/TPDS.2014.2308221     Document Type: Article
Times cited : (214)

References (36)
  • 3
    • 84900536807
    • Optimization of quasi diagonal matrix-vector multiplication on GPU
    • W. Yang, K. Li, Y. Liu, L. Shi, and C. Wang, Optimization of Quasi Diagonal Matrix-Vector Multiplication on GPU, Int'l J. High Performance Computing Applications, first published on September 2, 2013, doi:10.1177/1094342013501126, http://hpc.sagepub.com/content/early/2013/09/02/1094342013501126.full.pdf
    • (2013) Int'l J. High Performance Computing Applications , vol.2
    • Yang, W.1    Li, K.2    Liu, Y.3    Shi, L.4    Wang, C.5
  • 4
    • 0242533311
    • Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid
    • J. Bolz, I. Farmer, E. Grinspun, and P. Schroder, "Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid," ACM Trans. Graphics, vol. 22, no. 3, pp. 917-924, July 2003.
    • (2003) ACM Trans. Graphics , vol.22 , Issue.3 , pp. 917-924
    • Bolz, J.1    Farmer, I.2    Grinspun, E.3    Schroder, P.4
  • 8
    • 84857332778
    • Optimization of Sparse Matrix-Vector Multiplication Using Reordering Techniques on GPUs
    • J.C. Pichel, F.F. Rivera, M. Fernandez, and A. Rodriguez, "Optimization of Sparse Matrix-Vector Multiplication Using Reordering Techniques on GPUs," Microprocessors and Microsystems, vol. 36, no. 2, pp. 65-77, 2012.
    • (2012) Microprocessors and Microsystems , vol.36 , Issue.2 , pp. 65-77
    • Pichel, J.C.1    Rivera, F.F.2    Fernandez, M.3    Rodriguez, A.4
  • 10
    • 84919494711
    • High-level strategies for parallel shared-memory sparse matrix-vector multiplication
    • A.-J.N. Yzelman and D. Roose, "High-Level Strategies for Parallel Shared-Memory Sparse Matrix-Vector Multiplication," IEEE Trans. Parallel and Distributed Systems, vol. 25, no. 1, pp. 116-125, http://doi.ieeecomputersociety.org/10.1109/TPDS.2013.31, Jan. 2014.
    • (2014) IEEE Trans. Parallel and Distributed Systems , vol.25 , Issue.1 , pp. 116-125
    • Yzelman, A.-J.N.1    Roose, D.2
  • 13
    • 60649099576
    • Optimizing matrix multiplication for a short-vector simd architecture-cell processor
    • J. Kurzak, W. Alvaro, and J. Dongarra, "Optimizing Matrix Multiplication for a Short-Vector SIMD Architecture-Cell Processor," Parallel Computing, vol. 35, no. 3, pp. 138-150, 2009.
    • (2009) Parallel Computing , vol.35 , Issue.3 , pp. 138-150
    • Kurzak, J.1    Alvaro, W.2    Dongarra, J.3
  • 20
    • 84862123284
    • Fast sparse matrix-vector multiplication on GPUs: Implications for graph mining
    • X. Yang, S. Parthasarathy, and P. Sadayappan, "Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining," Proc. VLDB Endowment, vol. 4, no. 4, pp. 231-242, Jan. 2011.
    • (2011) Proc. VLDB Endowment , vol.4 , Issue.4 , pp. 231-242
    • Yang, X.1    Parthasarathy, S.2    Sadayappan, P.3
  • 21
    • 84855223315
    • Generating optimal CUDA sparse matrix-vector product implementations for evolving GPU hardware
    • A.H. El Zein and A.P. Rendell, "Generating Optimal CUDA Sparse Matrix-Vector Product Implementations for Evolving GPU Hardware," Concurrency and Computation: Practice and Experience, vol. 24, no. 1, pp. 3-13, 2012.
    • (2012) Concurrency and Computation: Practice and Experience , vol.24 , Issue.1 , pp. 3-13
    • Zein A.H.El1    Rendell, A.P.2
  • 26
    • 84881061313
    • Parallel sparse approximate inverse preconditioning on graphic processing units
    • M.M. Dehnavi, D. Fernandez, and J.L. Gaudiot, "Parallel Sparse Approximate Inverse Preconditioning on Graphic Processing Units," IEEE Trans. Parallel and Distributed Systems, vol. 24, no. 9, pp. 1852-1862, Sept. 2013.
    • (2013) IEEE Trans. Parallel and Distributed Systems , vol.24 , Issue.9 , pp. 1852-1862
    • Dehnavi, M.M.1    Fernandez, D.2    Gaudiot, J.L.3
  • 28
    • 70450231944
    • An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness
    • S. Hong and H. Kim, "An Analytical Model for a GPU Architecture with Memory-Level and Thread-Level Parallelism Awareness," Proc. 36th Ann. Int'l Symp. Computer Architecture (ISCA '09), pp. 152-163, 2009.
    • (2009) Proc. 36th Ann. Int'l Symp. Computer Architecture (ISCA '09) , pp. 152-163
    • Hong, S.1    Kim, H.2
  • 32
    • 84898682038
    • A performance modeling and optimization analysis tool for sparse matrix-vector multiplication on GPUs
    • P. Guo, L. Wang, and P. Chen, "A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs," IEEE Trans. Parallel and Distributed Systems, vol. 25, no. 5, pp. 1112-1123, 2014.
    • (2014) IEEE Trans. Parallel and Distributed Systems , vol.25 , Issue.5 , pp. 1112-1123
    • Guo, P.1    Wang, L.2    Chen, P.3
  • 33
    • 84885948161
    • Sparse matrix vector multiplication on the single-chip cloud computer many-core processor
    • J.C. Pichel and F.F. Rivera, "Sparse Matrix Vector Multiplication on the Single-Chip Cloud Computer Many-Core Processor," J. Parallel and Distributed Computing, vol. 73, no. 12, pp. 1539-1550, 2013.
    • (2013) J. Parallel and Distributed Computing , vol.73 , Issue.12 , pp. 1539-1550
    • Pichel, J.C.1    Rivera, F.F.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.