SCOPUS Information Retrieval Platform

Journal of Parallel and Distributed Computing

Volume 22, Issue 3, 1994, Pages 523-537

Scalability issues affecting the design of a dense linear algebra library

(3) Dongarra, Jack J. a,b; van de Geijn, Robert A. c; Walker, David W. b

a UNIVERSITY OF TENNESSEE (United States)

b OAK RIDGE NATIONAL LABORATORY (United States)

c UNIVERSITY OF TEXAS AT AUSTIN (United States)

Author keywords

[No Author keywords available]

Indexed keywords

EID: 0000778168 PISSN: 07437315 EISSN: None Source Type: Journal
DOI: 10.1006/jpdc.1994.1108 Document Type: Article

Times cited : (48)

References (49)

1
- 0025536635
- LAPACK: A portable linear algebra library for high-performance computers
- IEEE Press
- Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J. J., DuCroz, J., Greenbaum, A., Hammarling, S., McKenney, A., and Sorensen, D. LAPACK: A portable linear algebra library for high-performance computers. Proceedings of Supercomputing '90, IEEE Press, 1990. pp. 1-10.
- (1990) Proceedings of Supercomputing 90 , pp. 1-10
- Anderson, E.¹ Bai, Z.² Bischof, C.³ Demmel, J.⁴ Dongarra, J.J.⁵ Ducroz, J.⁶ Greenbaum, A.⁷ Hammarling, S.⁸ McKenney, A.⁹ Sorensen, D.¹⁰

2
- 0003706460
- Philadelphia
- Anderson, E., Bai, Z., Demmel, J., Dongarra, J., DuCroz, J., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., and Sorensen, D. LAPACK Users' Guide. SIAM, Philadelphia, 1992.
- (1992) LAPACK Users' Guide. SIAM
- Anderson, E.¹ Bai, Z.² Demmel, J.³ Dongarra, J.⁴ Ducroz, J.⁵ Greenbaum, A.⁶ Hammarling, S.⁷ McKenney, A.⁸ Ostrouchov, S.⁹ Sorensen, D.¹⁰

3
- 85067631613
- Basic linear algebra communication subprograms
- IEEE Comput. Soc. Press
- Anderson, E., Benzoni, A., Dongarra, J., Moulton, S., Ostrouchov, S., Tourancheau, B., and van de Geijn, R. Basic linear algebra communication subprograms. Sixth Distributed Memory Computing Conference Proceedings. IEEE Comput. Soc. Press. 1991, pp. 287-290.
- (1991) Sixth Distributed Memory Computing Conference Proceedings , pp. 287-290
- Anderson, E.¹ Benzoni, A.² Dongarra, J.³ Moulton, S.⁴ Ostrouchov, S.⁵ Tourancheau, B.⁶ Van De Geijn, R.⁷

4
- 0242343480
- LAPACK for distributed memory architectures: Progress report
- SIAM
- Anderson, E., Benzoni, A., Dongarra, J. J., Moulton, S., Ostrouchov, S., Tourancheau, B., and van de Geijn, R. LAPACK for distributed memory architectures: Progress report. Parallel Processing for Scientific Computing, Fifth SIAM Conference. SIAM, 1991.
- (1991) Parallel Processing for Scientific Computing, Fifth SIAM Conference
- Anderson, E.¹ Benzoni, A.² Dongarra, J.J.³ Moulton, S.⁴ Ostrouchov, S.⁵ Tourancheau, B.⁶ Van De Geijn, R.⁷

5
- 0039408378
- Engineering Computing and Analysis Technical Report ECA-TR-147, Boeing Computer Services
- Ashcraft, C. C. The distributed solution of linear systems using the torus wrap data mapping. Engineering Computing and Analysis Technical Report ECA-TR-147, Boeing Computer Services. 1990.
- (1990) The Distributed Solution of Linear Systems Using the Torus Wrap Data Mapping.
- Ashcraft, C.C.¹

6
- 85027595794
- Engineering Computing and Analysis Technical Report ECA-TR-161
- Boeing Computer Services
- Ashcraft, C. C. A taxonomy of distributed dense LU factorization methods. Engineering Computing and Analysis Technical Report ECA-TR-161, Boeing Computer Services. 1991.
- (1991) A Taxonomy of Distributed Dense LU Factorization Methods
- Ashcraft, C.C.¹

7
- 0001175581
- Design of a parallel nonsymmetric eigenroutine toolbox
- Sincovec, R. (Ed.)
- Bai, Z., and Demmel, J. Design of a parallel nonsymmetric eigenroutine toolbox. In Sincovec, R. (Ed.), Proceedings of Sixth SIAM Conference on Parallel Processing for Scientific Computing. SIAM Press, 1993.
- (1993) Proceedings of Sixth SIAM Conference on Parallel Processing for Scientific Computing. SIAM Press
- Bai, Z.¹ Demmel, J.²

8
- 0025997771
- Using Strassen’s algorithm to accelerate the solution of linear systems
- Bailey, D. H., Lee, K., and Simon, H. D. Using Strassen’s algorithm to accelerate the solution of linear systems. J. Supercomputing 4 (1990), 357-371.
- (1990) J. Supercomputing , vol.4 , pp. 357-371
- Bailey, D.H.¹ Lee, K.² Simon, H.D.³

9
- 3042648854
- The LINPACK benchmark on the AP 1000: Preliminary report
- Brent, R. P. The LINPACK benchmark on the AP 1000: Preliminary report. Proceedings of the 2nd CAP Workshop. Nov. 1991.
- (1991) Proceedings of the 2nd CAP Workshop
- Brent, R.P.¹

10
- 0002924772
- ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers
- IEEE Comput. Soc. Press
- Choi, J., Dongarra, J. J., Pozo, R., and Walker, D. W. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation. IEEE Comput. Soc. Press, 1992. pp. 120-127.
- (1992) Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation , pp. 120-127
- Choi, J.¹ Dongarra, J.J.² Pozo, R.³ Walker, D.W.⁴

11
- 85027610980
- Elsevier, Amsterdam
- Choi, J., Dongarra, J. J., and Walker, D. W. The design of scalable software libraries for distributed memory concurrent computers. Proceedings of the CNRS-NSF Workshop on Environments and Tools for Parallel Scientific Computing. Elsevier, Amsterdam, 1993.
- (1993) The Design of Scalable Software Libraries for Distributed Memory Concurrent Computers
- Choi, J.¹ Dongarra, J.J.² Walker, D.W.³

12
- 35248831050
- Electromagnetic scattering calculations on the Intel Touchstone Delta
- IEEE Comput. Soc. Press
- Cwik, T., Patterson, J., and Scott, D. Electromagnetic scattering calculations on the Intel Touchstone Delta. Proceedings of Supercomputing '92. IEEE Comput. Soc. Press, 1992. pp. 538-542.
- (1992) Proceedings of Supercomputing 92 , pp. 538-542
- Cwik, T.¹ Patterson, J.² Scott, D.³

13
- 84855320879
- Technical Report, Argonne National Laboratory, Mathematics and Computer Science Division
- Demmel, J., Dongarra, J. J., Du Croz, J., Greenbaum, A., Hammarling, S., and Sorensen, D. Prospectus for the development of a linear algebra library for high performance computers. Technical Report 97, Argonne National Laboratory, Mathematics and Computer Science Division, Sept. 1987.
- (1987) Prospectus for the Development of a Linear Algebra Library for High Performance Computers , vol.97
- Demmel, J.¹ Dongarra, J.J.² Du Croz, J.³ Greenbaum, A.⁴ Hammarling, S.⁵ Sorensen, D.⁶

14
- 84947657247
- LINPACK benchmark: Performance of various computers using standard linear equations software
- Dongarra, J. J. LINPACK benchmark: Performance of various computers using standard linear equations software. Supercomputing Rev. 5, 3 (March 1992), 54-63.
- (1992) Supercomputing Rev , vol.5 , Issue.3 , pp. 54-63
- Dongarra, J.J.¹

15
- 0003555195
- Philadelphia
- Dongarra, J. J., Bunch, J., Moler, C., and Stewart, G. W. LINPACK User's Guide. SIAM, Philadelphia, 1979.
- (1979) LINPACK User's Guide. SIAM
- Dongarra, J.J.¹ Bunch, J.² Moler, C.³ Stewart, G.W.⁴

16
- 84911589505
- Technical Report, Argonne National Laboratory. Mathematics and Computer Science Division, Apr
- Dongarra, J. J., Du Croz, J., Duff, I., and Hammarling, S. A proposal for a set of level 3 basic linear algebra subprograms. Technical Report 88, Argonne National Laboratory. Mathematics and Computer Science Division, Apr. 1987.
- (1987) A Proposal for a Set of Level 3 Basic Linear Algebra Subprograms , vol.88
- Dongarra, J.J.¹ Du Croz, J.² Duff, I.³ Hammarling, S.⁴

17
- 0025402476
- A set of level 3 basic linear algebra subprograms
- Dongarra, J. J., Duff, I., Du Croz, J., and Hammarling, S. A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Software 16 (March 1990), 1-17.
- (1990) ACM Trans. Math. Software , vol.16 , pp. 1-17
- Dongarra, J.J.¹ Duff, I.² Du Croz, J.³ Hammarling, S.⁴

18
- 0003793981
- Philadelphia
- Dongarra, J. J., Duff, I. S., Sorensen, D. C., and van der Vorst, H. A. Solving Linear Systems on Vector and Shared Memory Computers. SIAM, Philadelphia, 1990.
- (1990) Solving Linear Systems on Vector and Shared Memory Computers. SIAM
- Dongarra, J.J.¹ Duff, I.S.² Sorensen, D.C.³ Van Der Vorst, H.A.⁴

19
- 0003517895
- Technical Report TM-12231
- Oak Ridge National Laboratory
- Dongarra, J. J., Hempel, R., Hey, A. J. G., and Walker, D. W. A proposal for a user-level message passing interface in a distributed memory environment. Technical Report TM-12231. Oak Ridge National Laboratory. Feb. 1993.
- (1993) A Proposal for a User-Level Message Passing Interface in a Distributed Memory Environment
- Dongarra, J.J.¹ Hempel, R.² Hey, A.J.G.³ Walker, D.W.⁴

20
- 85027614764
- Technical Report CS-90-115, University of Tennessee at Knoxville
- Computer Science Department
- Dongarra, J. J., and Ostrouchov, S. LAPACK block factorization algorithms on the Intel iPSC/860. Technical Report CS-90-115, University of Tennessee at Knoxville, Computer Science Department, Oct. 1990.
- (1990) LAPACK Block Factorization Algorithms on the Intel iPSC/860
- Dongarra, J.J.¹ Ostrouchov, S.²

21
- 0039169604
- An object oriented design for high performance linear algebra on distributed memory architectures
- Dongarra, J. J., Pozo, R., and Walker, D. W. An object oriented design for high performance linear algebra on distributed memory architectures. Proceedings of Object Oriented Numerics Conference. 1993.
- (1993) Proceedings of Object Oriented Numerics Conference
- Dongarra, J.J.¹ Pozo, R.² Walker, D.W.³

22
- 0004060334
- Two-dimensional basic linear algebra communication subprograms
- Computer Science Department, University of Tennessee. Knoxville, TN
- Dongarra, J. J., and van de Geijn, R. A. Two-dimensional basic linear algebra communication subprograms. Technical Report LAPACK working note 37, Computer Science Department, University of Tennessee. Knoxville, TN, Oct. 1991.
- (1991) Technical Report LAPACK Working Note , vol.37
- Dongarra, J.J.¹ Van De Geijn, R.A.²

23
- 0026912004
- Reduction to condensed form for the eigenvalue problem on distributed memory architectures
- Dongarra, J. J. and van de Geijn, R. A. Reduction to condensed form for the eigenvalue problem on distributed memory architectures. Parallel Comput. 18 (1992), 973-982.
- (1992) Parallel Comput , vol.18 , pp. 973-982
- Dongarra, J.J.¹ Van De Geijn, R.A.²

24
- 0026991394
- A look at scalable dense linear algebra libraries
- In J. H. Saltz (Ed.), IEEE Press
- Dongarra, J. J., van de Geijn, R. A., and Walker, D. W. A look at scalable dense linear algebra libraries. In J. H. Saltz (Ed.), Proceedings of the 1992 Scalable High Performance Computing Conference. IEEE Press, 1992.
- (1992) Proceedings of the 1992 Scalable High Performance Computing Conference
- Dongarra, J.J.¹ Van De Geijn, R.A.² Walker, D.W.³

25
- 0002663082
- GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm
- Douglas, C. C., Heroux, M., Slishman, G., and Smith, R. M. GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm. J. Comput. Phys. 110 (1994), 1-10.
- (1994) J. Comput. Phys. , vol.110 , pp. 1-10
- Douglas, C.C.¹ Heroux, M.² Slishman, G.³ Smith, R.M.⁴

26
- 84888771978
- Large dense numerical linear algebra in 1993: The parallel computing influence
- Edelman, A. Large dense numerical linear algebra in 1993: The parallel computing influence. Int. J. Supercomputing Appl. 7, 2 (1993).
- (1993) Int. J. Supercomputing Appl. , vol.7 , Issue.2
- Edelman, A.¹

27
- 85027594380
- Fortran D language specification. Technical Report CRPC-TR90079. Center for Research on Parallel Computation
- Fox, G. C., Hiranandani, S., Kennedy, K., Koelbel, C., Kremer, U., Tseng, C-W., and Wu, M-Y. Fortran D language specification. Technical Report CRPC-TR90079. Center for Research on Parallel Computation, Rice University, Dec. 1990.
- (1990) Fortran D Language Specification
- Fox, G.C.¹ Hiranandani, S.² Kennedy, K.³ Koelbel, C.⁴ Kremer, U.⁵ Tseng, C.-W.⁶ Wu, M.-Y.⁷

28
- 0003506603
- Prentice-Hall, Englewood Cliffs, NJ
- Fox, G. C., Johnson, M. A., Lyzenga, G. A., Otto, S. W., Salmon, J. K., and Walker, D. W. Solving Problems on Concurrent Processors. Vol. 1. Prentice-Hall, Englewood Cliffs, NJ, 1988.
- (1988) Solving Problems on Concurrent Processors , vol.1
- Fox, G.C.¹ Johnson, M.A.² Lyzenga, G.A.³ Otto, S.W.⁴ Salmon, J.K.⁵ Walker, D.W.⁶

29
- 0003407903
- Lecture Notes in Computer Science, Springer-Verlag, Berlin
- Garbow, B. S., Boyle, J. M., Dongarra, J. J., and Moler, C. B. Matrix Eigensystem Routines—EISPACK Guide Extension. Lecture Notes in Computer Science. Vol. 51. Springer-Verlag, Berlin. 1977.
- (1977) Matrix Eigensystem Routines—EISPACK Guide Extension , vol.51
- Garbow, B.S.¹ Boyle, J.M.² Dongarra, J.J.³ Moler, C.B.⁴

30
- 0042625581
- Technical Report TM-11616
- Oak Ridge National Laboratory
- Geist, G. A., Heath, M. T., Peyton, B. W., and Worley, P. H. A user's guide to PICL: A portable instrumented communication library. Technical Report TM-11616, Oak Ridge National Laboratory, Oct. 1990.
- (1990) A User's Guide to PICL: A Portable Instrumented Communication Library
- Geist, G.A.¹ Heath, M.T.² Peyton, B.W.³ Worley, P.H.⁴

31
- 0027644684
- The scalability of FFT on parallel computers
- A detailed version is available as Technical Report TR 90-53, Department of Computer Science, University of Minnesota. MN 55455
- Gupta, A., and Kumar, V. The scalability of FFT on parallel computers. IEEE Trans. Parallel Distrib. Systems 4, 7 (July 1993). A detailed version is available as Technical Report TR 90-53, Department of Computer Science, University of Minnesota. MN 55455.
- (1993) IEEE Trans. Parallel Distrib. Systems , vol.4 , Issue.7
- Gupta, A.¹ Kumar, V.²

32
- 0024012163
- Reevaluating Amdahl's law
- Gustafson, J. Reevaluating Amdahl's law. Comm. ACM 31, 5 (1988), 532-533.
- (1988) Comm. ACM , vol.31 , Issue.5 , pp. 532-533
- Gustafson, J.¹

33
- 0026202198
- The design of a scalable, fixed-time computer benchmark
- Gustafson, J., Rover, D., Elbert, S., and Carter, M. The design of a scalable, fixed-time computer benchmark. J. Parallel Distrib. Comput. 12 (1991), 388-401.
- (1991) J. Parallel Distrib. Comput. , vol.12 , pp. 388-401
- Gustafson, J.¹ Rover, D.² Elbert, S.³ Carter, M.⁴

34
- 36849016613
- Technical Report SAND92-0792
- Sandia National Laboratories
- Hendrickson, B., and Womble, D. The torus-wrap mapping for dense matrix computations on massively parallel computers. Technical Report SAND92-0792, Sandia National Laboratories, Apr. 1992.
- (1992) The Torus-Wrap Mapping for Dense Matrix Computations on Massively Parallel Computers
- Hendrickson, B.¹ Womble, D.²

35
- 79952176071
- Version 0.4
- High Performance Fortran Forum. High Performance Fortran Language Specification, Version 0.4. Nov. 1992.
- (1992) High Performance Fortran Language Specification

36
- 0025637437
- Exploiting fast matrix multiplication within the level 3 BLAS
- Higham, N. J. Exploiting fast matrix multiplication within the level 3 BLAS. ACM Trans. Math. Software 16, 4 (1990), 352-368.
- (1990) ACM Trans. Math. Software , vol.16 , Issue.4 , pp. 352-368
- Higham, N.J.¹

37
- 0022909361
- Distributed routing algorithms for broadcasting and personalized communication in hypercubes
- Ho, C.-T., and Johnsson, S. L. Distributed routing algorithms for broadcasting and personalized communication in hypercubes. Proceedings of the 1986 International Conference on Parallel Processing. IEEE. 1986, pp. 640-648.
- (1986) Proceedings of the 1986 International Conference on Parallel Processing. IEEE , pp. 640-648
- Ho, C.-T.¹ Johnsson, S.L.²

38
- 0004185241
- Hilger, Bristol
- Hockney, R. W., and Jesshope, C. R. Parallel Computers. Hilger, Bristol, 1981.
- (1981) Parallel Computers
- Hockney, R.W.¹ Jesshope, C.R.²

39
- 3543092493
- Analyzing scalability of parallel algorithms and architectures. Technical report, TR-91-18. Computer Science Department, University of Minnesota. June 1991
- A short version of the paper, Urbana, IL, Oct, 1991
- Kumar, V., and Gupta, A. Analyzing scalability of parallel algorithms and architectures. Technical report, TR-91-18. Computer Science Department, University of Minnesota. June 1991. J. Parallel Distrib. Comput. 22, 3 (1994) 379-391. A short version of the paper appears in the Proceedings of the 1991 International Conference on Supercomputing. Germany, and as an invited paper in the Proceedings of the 29th Annual Allerton Conference on Communication, Control and Computing. Urbana, IL, Oct. 1991.
- (1994) J. Parallel Distrib. Comput , vol.22 , Issue.3 , pp. 379-391
- Kumar, V.¹ Gupta, A.²

40
- 12444284722
- Block-cyclic dense linear algebra
- Technical Report TR-04-92. Harvard University
- Lichtenstein, W., and Johnsson, S. L. Block-cyclic dense linear algebra. Technical Report TR-04-92. Harvard University. Center for Research in Computing Technology, Jan. 1992.
- (1992) Center for Research in Computing Technology
- Lichtenstein, W.¹ Johnsson, S.L.²

41
- 84913396228
- Technical Report YALEU/DCS/RR-387. Department of Computer Science, Yale University
- Saad, Y., and Schultz, M. H. Parallel direct methods for solving banded linear systems. Technical Report YALEU/DCS/RR-387. Department of Computer Science, Yale University, 1985.
- (1985) Parallel Direct Methods for Solving Banded Linear Systems.
- Saad, Y.¹ Schultz, M.H.²

42
- 12444263806
- Technical report. Numerical Mathematics Group. Lawrence Livermore National Laboratory
- Skjellum, A. J., and Baldwin, C. The multicomputer toolbox: Scalable parallel libraries for large-scale concurrent applications. Technical report. Numerical Mathematics Group. Lawrence Livermore National Laboratory, Dec. 1991.
- (1991) The Multicomputer Toolbox: Scalable Parallel Libraries for Large-Scale Concurrent Applications
- Skjellum, A.J.¹ Baldwin, C.²

43
- 10444243598
- LU factorization of sparse, unsymmetric, Jacobian matrices on multicomputers
- Walker, D. W. and Stout, Q. F. (Eds.)
- Skjellum, A. J., and Leung, A. LU factorization of sparse, unsymmetric, Jacobian matrices on multicomputers. In Walker, D. W. and Stout, Q. F. (Eds.). Proceedings of the Fifth Distributed Memory Concurrent Computing Conference. IEEE Press. 1990, pp. 328-337.
- (1990) Proceedings of the Fifth Distributed Memory Concurrent Computing Conference. IEEE Press , pp. 328-337
- Skjellum, A.J.¹ Leung, A.²

44
- 0003595562
- Lecture Notes in Computer Science, Springer-Verlag, Berlin
- Smith, B. T., Boyle, J. M., Dongarra, J. J., Garbow, B. S., Ikebe, Y., Klema, V. C., and Moler, C. B. Matrix Eigensystem Routines - EISPACK Guide, Lecture Notes in Computer Science, Vol. 6. Springer-Verlag, Berlin, 1976.
- (1976) Matrix Eigensystem Routines - EISPACK Guide , vol.6
- Smith, B.T.¹ Boyle, J.M.² Dongarra, J.J.³ Garbow, B.S.⁴ Ikebe, Y.⁵ Klema, V.C.⁶ Moler, C.B.⁷

45
- 34250487811
- Gaussian elimination is not optimal
- Strassen, V. Gaussian elimination is not optimal. Numer. Math. 13 (1969), 354-356.
- (1969) Numer. Math. , vol.13 , pp. 354-356
- Strassen, V.¹

46
- 0002853545
- Scalable problems and memory-bounded speedup
- Sun, X.-H., and Ni, L. Scalable problems and memory-bounded speedup. J. Parallel Distrib. Computing 19, 1 (1993), 27-37.
- (1993) J. Parallel Distrib. Computing , vol.19 , Issue.1 , pp. 27-37
- Sun, X.-H.¹ Ni, L.²

47
- 85027614460
- Cambridge, MA
- Thinking Machines Corporation. CM-5 Technical Summary. Cambridge, MA, 1991.
- (1991) CM-5 Technical Summary

48
- 85027592685
- Computer Science Report TR-91-28
- Univ. of Texas
- van de Geijn, R. A. Massively parallel LINPACK benchmark on the Intel Touchstone Delta and iPSC/860 systems. Computer Science Report TR-91-28, Univ. of Texas, 1991.
- (1991) Massively Parallel LINPACK Benchmark on the Intel Touchstone Delta and iPSC/860 Systems
- Van De Geijn, R.A.¹

49
- 0025639404
- Data redistribution and concurrency
- Van de Velde, E. F. Data redistribution and concurrency. Parallel Comput. 16 (Dec 1990).
- (1990) Parallel Comput , vol.16
- Van De Velde, E.F.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.