SCOPUS Information Retrieval Platform

IBM Journal of Research and Development

Volume 41, Issue 6, 1997, Pages 711-725

Improving the memory-system performance of sparse-matrix vector multiplication

(1) Toledo, S a,b,c,d

a PALO ALTO RESEARCH CENTER (United States)

b TEL AVIV UNIVERSITY (Israel)

c MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

d IBM T J WATSON RESEARCH CENTER (United States)

Author keywords

[No Author keywords available]

Indexed keywords

BUFFER STORAGE; COMPUTATIONAL METHODS; MATRIX ALGEBRA; PROGRAM PROCESSORS; REDUCED INSTRUCTION SET COMPUTING; RESPONSE TIME (COMPUTER SYSTEMS); STORAGE ALLOCATION (COMPUTER); VECTORS;

SPARSE MATRIX VECTOR MULTIPLICATION;

PARALLEL PROCESSING SYSTEMS;

EID: 0031269220 PISSN: 00188646 EISSN: None Source Type: Journal
DOI: 10.1147/rd.416.0711 Document Type: Article

Times cited: (124)

References (22)

1
- 0003473816
- SIAM Press, Philadelphia
- R. Barrett, M. Berry, T. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM Press, Philadelphia, 1993.
- (1993) Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods
- Barrett, R.¹ Berry, M.² Chan, T.³ Demmel, J.⁴ Donato, J.⁵ Dongarra, J.⁶ Eijkhout, V.⁷ Pozo, R.⁸ Romine, C.⁹ Van Der Vorst, H.¹⁰

2
- 0028386843
- The Design and Implementation of a Parallel Unstructured Euler Solver Using Software Primitives
- R. Das, D. J. Mavriplis, J. Saltz, S. Gupta, and R. Ponnusamy, "The Design and Implementation of a Parallel Unstructured Euler Solver Using Software Primitives," AIAA J. 32, 489-496 (1994).
- (1994) AIAA J. , vol.32 , pp. 489-496
- Das, R.¹ Mavriplis, D.J.² Saltz, J.³ Gupta, S.⁴ Ponnusamy, R.⁵

3
- 85031661900
- Characterizing the Behavior of Sparse Algorithms on Caches
- IEEE Computer Society Press, Piscataway, NJ
- O. Temam and W. Jalby, "Characterizing the Behavior of Sparse Algorithms on Caches," Proceedings of Supercomputing '92, IEEE Computer Society Press, Piscataway, NJ, 1992, pp. 578-587.
- (1992) Proceedings of Supercomputing '92 , pp. 578-587
- Temam, O.¹ Jalby, W.²

4
- 0007426480
- Renumbering Unstructured Grids to Improve the Performance of Codes on Hierarchical Memory Machines
- Numerical Analysis Group, Oxford University Computing Laboratory, Oxford, England, May
- D. A. Burgess and M. B. Giles, "Renumbering Unstructured Grids to Improve the Performance of Codes on Hierarchical Memory Machines," Technical Report 95/06, Numerical Analysis Group, Oxford University Computing Laboratory, Oxford, England, May 1995.
- (1995) Technical Report 95/06
- Burgess, D.A.¹ Giles, M.B.²

5
- 84983621818
- A High-Performance Algorithm Using Pre-Processing for Sparse Matrix-Vector Multiplication
- IEEE Computer Society Press, Piscataway, NJ, November
- R. C. Agarwal, F. G. Gustavson, and M. Zubair, "A High-Performance Algorithm Using Pre-Processing for Sparse Matrix-Vector Multiplication," Proceedings of Supercomputing '92, IEEE Computer Society Press, Piscataway, NJ, November 1992, pp. 32-41.
- (1992) Proceedings of Supercomputing '92 , pp. 32-41
- Agarwal, R.C.¹ Gustavson, F.G.² Zubair, M.³

6
- 0343802000
- Argonne National Laboratory, Argonne, IL
- Satish Balay, William Gropp, Lois Curfman McInnes, and Barry Smith, PETSc 2.0 Users' Manual, Technical Report ANL-95/11, Revision 2.0.15, Argonne National Laboratory, Argonne, IL, 1996.
- (1996) PETSc 2.0 Users' Manual, Technical Report ANL-95/11, Revision 2.0.15
- Balay, S.¹ Gropp, W.² McInnes, L.C.³ Smith, B.⁴

7
- 0014612601
- Reducing the Bandwidth of Sparse Symmetric Matrices
- New York
- E. Cuthill and J. McKee, "Reducing the Bandwidth of Sparse Symmetric Matrices," Proceedings of the 24th National Conference of the Association for Computing Machinery, New York, 1969, pp. 157-172.
- (1969) Proceedings of the 24th National Conference of the Association for Computing Machinery , pp. 157-172
- Cuthill, E.¹ McKee, J.²

8
- 0003554096
- PWS Publishing Company, New York
- Yousef Saad, Iterative Methods for Sparse Linear Systems, PWS Publishing Company, New York, 1996.
- (1996) Iterative Methods for Sparse Linear Systems
- Saad, Y.¹

9
- 0029292848
- Superscalar Instruction Execution in the 21164 Alpha Microprocessor
- April
- John H. Edmondson, Paul Rubinfeld, Ronald Preston, and Vidya Rajagopalan, "Superscalar Instruction Execution in the 21164 Alpha Microprocessor," IEEE Micro, pp. 33-43 (April 1995).
- (1995) IEEE Micro , pp. 33-43
- Edmondson, J.H.¹ Rubinfeld, P.² Preston, R.³ Rajagopalan, V.⁴

10
- 0030125973
- UltraSparc I: A Four-Issue Processor Supporting Multimedia
- April
- Marc Tremblay and J. Michael O'Connor, "UltraSparc I: A Four-Issue Processor Supporting Multimedia," IEEE Micro, pp. 42-49 (April 1996).
- (1996) IEEE Micro , pp. 42-49
- Tremblay, M.¹ O'Connor, J.M.²

11
- 0028427170
- Improving Performance of Linear Algebra Algorithms for Dense Matrices Using Algorithmic Prefetch
- R. C. Agarwal, F. G. Gustavson, and M. Zubair, "Improving Performance of Linear Algebra Algorithms for Dense Matrices Using Algorithmic Prefetch," IBM J. Res. Develop. 38, 265-275 (1994).
- (1994) IBM J. Res. Develop. , vol.38 , pp. 265-275
- Agarwal, R.C.¹ Gustavson, F.G.² Zubair, M.³

12
- 0003690936
- Ph.D. thesis, Rice University, Houston, May
- A. K. Porterfield, "Software Methods for Improvement of Cache Performance on Supercomputer Applications," Ph.D. thesis, Rice University, Houston, May 1989.
- (1989) Software Methods for Improvement of Cache Performance on Supercomputer Applications
- Porterfield, A.K.¹

13
- 0004033521
- Ph.D. thesis, Stanford University, March
- Todd C. Mowry, "Tolerating Latency Through Software-Controlled Data Prefetching," Ph.D. thesis, Stanford University, March 1994.
- (1994) Tolerating Latency Through Software-Controlled Data Prefetching
- Mowry, T.C.¹

14
- 0029218142
- High-Performance Parallel Implementations of the NAS Kernel Benchmarks on the IBM SP2
- R. C. Agarwal, B. Alpern, L. Carter, F. G. Gustavson, D. J. Klepacki, R. Lawrence, and M. Zubair, "High-Performance Parallel Implementations of the NAS Kernel Benchmarks on the IBM SP2," IBM Syst. J. 34, 263-272 (1995).
- (1995) IBM Syst. J. , vol.34 , pp. 263-272
- Agarwal, R.C.¹ Alpern, B.² Carter, L.³ Gustavson, F.G.⁴ Klepacki, D.J.⁵ Lawrence, R.⁶ Zubair, M.⁷

15
- 0028511878
- POWER2: Next Generation of the RISC System/6000 Family
- S. W. White and S. Dhawan, "POWER2: Next Generation of the RISC System/6000 Family," IBM J. Res. Develop. 38, 493-502 (1994).
- (1994) IBM J. Res. Develop. , vol.38 , pp. 493-502
- White, S.W.¹ Dhawan, S.²

16
- 0028508236
- The POWER2 Performance Monitor
- E. H. Welbon, C. C. Chan-Nui, D. J. Shippy, and D. A. Hicks, "The POWER2 Performance Monitor," IBM J. Res. Develop. 38, 545-554 (1994).
- (1994) IBM J. Res. Develop. , vol.38 , pp. 545-554
- Welbon, E.H.¹ Chan-Nui, C.C.² Shippy, D.J.³ Hicks, D.A.⁴

17
- 0030672717
- Fast and Effective Algorithms for Graph Partitioning and Sparse-Matrix Ordering
- A. Gupta, "Fast and Effective Algorithms for Graph Partitioning and Sparse-Matrix Ordering," IBM J. Res. Develop. 41, 171-184 (1997).
- (1997) IBM J. Res. Develop. , vol.41 , pp. 171-184
- Gupta, A.¹

18
- 0040487002
- WGPP: Watson Graph Partitioning (and Sparse Matrix Ordering) Package
- IBM Thomas J. Watson Research Center, Yorktown Heights, NY, May
- Anshul Gupta, "WGPP: Watson Graph Partitioning (and Sparse Matrix Ordering) Package," Technical Report RC-20453, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, May 1996.
- (1996) Technical Report RC-20453
- Gupta, A.¹

19
- 0018515759
- Basic Linear Algebra Subprograms for Fortran Usage
- C. L. Lawson, R. J. Hanson, D. R. Kincaid, and F. T. Krogh, "Basic Linear Algebra Subprograms for Fortran Usage," ACM Trans. Math. Software 5, 308-323 (1979).
- (1979) ACM Trans. Math. Software , vol.5 , pp. 308-323
- Lawson, C.L.¹ Hanson, R.J.² Kincaid, D.R.³ Krogh, F.T.⁴

20
- 0000602242
- The Effect of Ordering on Preconditioned Conjugate Gradient
- Iain S. Duff and Gérard Meurant, "The Effect of Ordering on Preconditioned Conjugate Gradient," BIT 29, 635-657 (1989).
- (1989) BIT , vol.29 , pp. 635-657
- Duff, I.S.¹ Meurant, G.²

21
- 0347031952
- Ph.D. thesis, University of Illinois at Urbana-Champaign, May
- R. L. Lee, "The Effectiveness of Caches and Data Prefetch Buffers in Large-Scale Shared Memory Multiprocessors," Ph.D. thesis, University of Illinois at Urbana-Champaign, May 1987.
- (1987) The Effectiveness of Caches and Data Prefetch Buffers in Large-Scale Shared Memory Multiprocessors
- Lee, R.L.¹

22
- 0026267802
- An Effective On-Chip Preloading Scheme to Reduce Data Access Penalty
- IEEE Computer Society Press, Piscataway, NJ
- J.-L. Baer and T.-F. Chen, "An Effective On-Chip Preloading Scheme to Reduce Data Access Penalty," Proceedings of Supercomputing '91, IEEE Computer Society Press, Piscataway, NJ, 1991, pp. 176-186.
- (1991) Proceedings of Supercomputing '91 , pp. 176-186
- Baer, J.-L.¹ Chen, T.-F.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.