SCOPUS Information Retrieval Platform

International Journal of High Performance Computing Applications

Volume 18, Issue 2 SPEC. ISS., 2004, Pages 225-236

Optimizing sparse matrix-vector product computations using unroll and jam

(2) Mellor-Crummey, John M. a; Garvin, John a

a Rice University (United States)

Author keywords

Data structures; Matrix vector product; Microfactors; Performance optimization; Sparse matrices; Sparse matrix format

Indexed keywords

BANDWIDTH; BUFFER STORAGE; COMPUTER ARCHITECTURE; COMPUTER SYSTEMS; DATA STRUCTURES; OPTIMIZATION; VECTORS; MICROFACTORS; PERFORMANCE OPTIMIZATION; SPARSE MATRIX FORMAT; SPARSE MATRIX-VECTOR PRODUCT; MATRIX ALGEBRA

EID: 2942628343 PISSN: 10943420 EISSN: None Source Type: Journal
DOI: 10.1177/1094342004038951 Document Type: Article

Times cited : (85)

References (18)

1
- 0345629462
- A high performance algorithm using pre-processing for the sparse matrix-vector multiplication
- November
- Agarwal, R., Gustavson, F., and Zubair, M. November 1992. A high performance algorithm using pre-processing for the sparse matrix-vector multiplication. In Proceedings of Supercomputing 1992, Minneapolis, MN.
- (1992) Proceedings of Supercomputing 1992, Minneapolis, MN
- Agarwal, R.¹ Gustavson, F.² Zubair, M.³

2
- 0001775038
- A catalogue of optimizing transformations
- In J. Rustin, editor
- Allen, F., and Cocke, J., 1972. A catalogue of optimizing transformations. In J. Rustin, editor, Design and Optimization of Compilers, Prentice-Hall, Englewood Cliffs, NJ.
- (1972) Design and Optimization of Compilers, Prentice-Hall, Englewood Cliffs, NJ
- Allen, F.¹ Cocke, J.²

3
- 0025447908
- Improving register allocation for subscripted variables
- June
- Callahan, D., Carr, S., and Kennedy, K., June 1990. Improving register allocation for subscripted variables. In Proceedings of the SIGPLAN'90 Conference on Programming Language Design and Implementation, White Plains, NY.
- (1990) Proceedings of the SIGPLAN'90 Conference on Programming Language Design and Implementation, White Plains, NY
- Callahan, D.¹ Carr, S.² Kennedy, K.³

4
- 0000493064
- Estimating interlock and improving balance for pipelined machines
- Callahan, D., Cocke, J., and Kennedy, K., 1988. Estimating interlock and improving balance for pipelined machines. Journal of Parallel and Distributed Computing 5(4):334-358.
- (1988) Journal of Parallel and Distributed Computing , vol.5 , Issue.4 , pp. 334-358
- Callahan, D.¹ Cocke, J.² Kennedy, K.³

5
- 0028549474
- Improving the ratio of memory operations to floating-point operations in loops
- Carr, S., and Kennedy, K., 1994. Improving the ratio of memory operations to floating-point operations in loops. ACM Transactions on Programming Languages and Systems 16(6):1768-1810.
- (1994) ACM Transactions on Programming Languages and Systems , vol.16 , Issue.6 , pp. 1768-1810
- Carr, S.¹ Kennedy, K.²

6
- 0037883031
- The design and implementation of a parallel unstructured Euler solver using software primitives, AIAA-92-0562
- January
- Das, R., Mavriplis, D., Saltz, J., Gupta, S., and Ponnusamy, R., January 1992. The design and implementation of a parallel unstructured Euler solver using software primitives, AIAA-92-0562. In Proceedings of the 30th Aerospace Sciences Meeting, AIAA.
- (1992) Proceedings of the 30th Aerospace Sciences Meeting, AIAA
- Das, R.¹ Mavriplis, D.² Saltz, J.³ Gupta, S.⁴ Ponnusamy, R.⁵

7
- 0003645035
- Prentice-Hall, Englewood Cliffs, NJ
- George, A., and Liu, G., 1981. Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, Englewood Cliffs, NJ.
- (1981) Computer Solution of Large Sparse Positive Definite Systems
- George, A.¹ Liu, G.²

8
- 2942615888
- Improving the performance of sparse matrix-vector multiplication by blocking
- July
- Gropp, W., Kaushik, D., Keyes, D., and Smith, B. July 2000. Improving the performance of sparse matrix-vector multiplication by blocking. Talk presented at SIAM Annual Meeting, San Juan, Puerto Rico. Available as http://www.icase.edu/~keyes/multivec.pdf.
- (2000) SIAM Annual Meeting, San Juan, Puerto Rico
- Gropp, W.¹ Kaushik, D.² Keyes, D.³ Smith, B.⁴

9
- 0004972603
- Optimizing the performance of sparse matrix-vector multiplication
- PhD thesis, University of California Berkeley
- Im, E.-J., 2000. Optimizing the performance of sparse matrix-vector multiplication. PhD thesis, University of California Berkeley.
- (2000)
- Im, E.-J.¹

10
- 84949647432
- Optimizing sparse matrix computations for register reuse in SPARSITY
- In V. N. Alexandrov, J. Dongarra, B. A. Juliano, R. S. Renner, and C. J. K. Tan, editors; Springer, Berlin
- Im, E.-J., and Yelick, K.A., 2001. Optimizing sparse matrix computations for register reuse in SPARSITY. In V. N. Alexandrov, J. Dongarra, B. A. Juliano, R. S. Renner, and C. J. K. Tan, editors, Proceedings of International Conference on Computational Science, Lecture Notes in Computer Science Vol. 2073, Springer, Berlin, pp. 127-136.
- (2001) Proceedings of International Conference on Computational Science, Lecture Notes in Computer Science , vol.2073 , pp. 127-136
- Im, E.-J.¹ Yelick, K.A.²

11
- 78149347218
- Predictive performance and scalability modeling of a large-scale application
- November
- Kerbyson, D., Alme, H., Hoisie, A., Petrini, F., Wasserman, H., and Gittings, M., November 2001. Predictive performance and scalability modeling of a large-scale application. In Supercomputing 2001, Denver, CO.
- (2001) Supercomputing 2001, Denver, CO
- Kerbyson, D.¹ Alme, H.² Hoisie, A.³ Petrini, F.⁴ Wasserman, H.⁵ Gittings, M.⁶

12
- 2942626618
- National Institute of Standards and Technology (NIST), 2003. Matrix market. http://math.nist.gov/MatrixMarket
- (2001)

13
- 0003635989
- NSPCG user's guide
- December; Center for Numerical Analysis, The University of Texas at Austin
- Oppe, T., Joubert, W., and Kinkaid, D., December 1988. NSPCG user's guide. Technical Report, Center for Numerical Analysis, The University of Texas at Austin.
- (1988) Technical Report
- Oppe, T.¹ Joubert, W.² Kinkaid, D.³

14
- 0039771978
- RSIM: An execution-driven simulator for ILP-based shared-memory multiprocessors and uniprocessors
- February; Also appears in IEEE TCCA Newsletter (October)
- Pai, V.S., Ranganathan, P., and Adve, S.V., February 1997. RSIM: an execution-driven simulator for ILP-based shared-memory multiprocessors and uniprocessors. In Proceedings of the Third Workshop on Computer Architecture Education. Also appears in IEEE TCCA Newsletter (October).
- (1997) Proceedings of the Third Workshop on Computer Architecture Education
- Pai, V.S.¹ Ranganathan, P.² Adve, S.V.³

15
- 0345871024
- Data structures to vectorize CG algorithms for general sparsity patterns
- Paolini, G., and di Brozolo, G.R., 1989. Data structures to vectorize CG algorithms for general sparsity patterns. BIT 29:703-718.
- (1989) BIT , vol.29 , pp. 703-718
- Paolini, G.¹ Di Brozolo, G.R.²

16
- 0342773466
- Krylov subspace methods on supercomputers
- Research Institute for Advanced Computer Science, NASA Research Center
- Saad, Y., 1988. Krylov subspace methods on supercomputers. Technical Report 88.40, Research Institute for Advanced Computer Science, NASA Research Center.
- (1988) Technical Report 88.40
- Saad, Y.¹

17
- 0031269220
- Improving the memory-system performance of sparse-matrix vector multiplication
- Toledo, S., 1997. Improving the memory-system performance of sparse-matrix vector multiplication. IBM Journal of Research and Development 41(6):711-725.
- (1997) IBM Journal of Research and Development , vol.41 , Issue.6 , pp. 711-725
- Toledo, S.¹

18
- 84990830919
- Performance optimizations and bounds for sparse matrix-vector multiply
- November
- Vuduc, R., Demmel, J., Yelick, K., Kamil, S., Nishtala, R., and Lee, B., November 2002. Performance optimizations and bounds for sparse matrix-vector multiply. Proceedings of SC'02: High Performance Networking and Computing.
- (2002) Proceedings of SC'02: High Performance Networking and Computing
- Vuduc, R.¹ Demmel, J.² Yelick, K.³ Kamil, S.⁴ Nishtala, R.⁵ Lee, B.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.