-
1
-
-
0028513316
-
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
-
Agarwal R.C., Gustavson F.G., Zubair M. Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms. IBM J. Res. Dev. 38(5):1994;563-576.
-
(1994)
IBM J. Res. Dev.
, vol.38
, Issue.5
, pp. 563-576
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
2
-
-
0028427170
-
Improving performance of linear algebra algorithms for dense matrices using algorithmic prefetch
-
Agarwal R.C., Gustavson F.G., Zubair M. Improving performance of linear algebra algorithms for dense matrices using algorithmic prefetch. IBM J. Res. Dev. 38(3):1994;265-275.
-
(1994)
IBM J. Res. Dev.
, vol.38
, Issue.3
, pp. 265-275
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
4
-
-
1642303799
-
-
Specification Sheets, March
-
P.R. Amestoy, I.S. Duff, J. L'Excellent, J. Koster, M. Tuma, MUltifrontal Massively Parallel Solver (MUMPS version 4.1), Specification Sheets, March 2000. http://www.enseeiht.fr/lima/apo/MUMPS/doc.html.
-
(2000)
MUltifrontal Massively Parallel Solver (MUMPS Version 4.1)
-
-
Amestoy, P.R.1
Duff, I.S.2
L'Excellent, J.3
Koster, J.4
Tuma, M.5
-
5
-
-
18044400448
-
A recursive formulation of Cholesky factorization of a matrix in packed storage
-
Andersen B.S., Waśniewski J., Gustavson F.G. A recursive formulation of Cholesky factorization of a matrix in packed storage. ACM Trans. Math. Softw. 27:2001;214-244.
-
(2001)
ACM Trans. Math. Softw.
, vol.27
, pp. 214-244
-
-
Andersen, B.S.1
Waśniewski, J.2
Gustavson, F.G.3
-
6
-
-
0024901312
-
The influence of relaxed supernode partitions on the multifrontal method
-
Ashcraft C., Grimes R. The influence of relaxed supernode partitions on the multifrontal method. ACM Trans. Math. Softw. 15(4):1989;291-309.
-
(1989)
ACM Trans. Math. Softw.
, vol.15
, Issue.4
, pp. 291-309
-
-
Ashcraft, C.1
Grimes, R.2
-
7
-
-
0030661485
-
Optimizing matrix multiply using PHIPAC: A portable, high-performance, ANSI C coding methodology
-
Vienna, Austria
-
J. Bilmes, K. Asanovic, C.W. Chin, J. Demmel, Optimizing matrix multiply using PHIPAC: a portable, high-performance, ANSI C coding methodology, in: Proceedings of the International Conference on Supercomputing, Vienna, Austria, 1997.
-
(1997)
Proceedings of the International Conference on Supercomputing
-
-
Bilmes, J.1
Asanovic, K.2
Chin, C.W.3
Demmel, J.4
-
8
-
-
0003615167
-
-
SIAM, Philadelphia, PA
-
L.S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, R.C. Whaley, ScaLAPACK User's Guide, SIAM, Philadelphia, PA, 1997. http://www.netlib.org.
-
(1997)
ScaLAPACK User's Guide
-
-
Blackford, L.S.1
Choi, J.2
Cleary, A.3
D'Azevedo, E.4
Demmel, J.5
Dhillon, I.6
Dongarra, J.7
Hammarling, S.8
Henry, G.9
Petitet, A.10
Stanley, K.11
Walker, D.12
Whaley, R.C.13
-
9
-
-
0003459808
-
-
Ph.D. Thesis, MIT Department of Electrical Engineering and Computer Science, September
-
R.D. Blumofe, Executing multithreaded programs efficiently, Ph.D. Thesis, MIT Department of Electrical Engineering and Computer Science, September 1995.
-
(1995)
Executing Multithreaded Programs Efficiently
-
-
Blumofe, R.D.1
-
10
-
-
0002634823
-
Scheduling multithreaded computations by work stealing
-
Santa Fe, New Mexico, IEEE Computer Society Press, November
-
R.D. Blumofe, C.E. Leiserson, Scheduling multithreaded computations by work stealing, in: Proceedings of the 35th Annual Symposium on Foundations of Computer Science, Santa Fe, New Mexico, IEEE Computer Society Press, November 1994, pp. 356-368.
-
(1994)
Proceedings of the 35th Annual Symposium on Foundations of Computer Science
, pp. 356-368
-
-
Blumofe, R.D.1
Leiserson, C.E.2
-
11
-
-
0002924772
-
ScaLAPACK: A scalable linear algebra for distributed memory concurrent computers
-
Also available as University of Tennessee Technical Report CS-92-181
-
J. Choi, J. Dongarra, R. Pozo, D. Walker, ScaLAPACK: a scalable linear algebra for distributed memory concurrent computers, in: Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, 1992, pp. 120-127. Also available as University of Tennessee Technical Report CS-92-181.
-
(1992)
Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation
, pp. 120-127
-
-
Choi, J.1
Dongarra, J.2
Pozo, R.3
Walker, D.4
-
13
-
-
0012493293
-
A User's Guide to the Blacs v1.0
-
Technical Report UT CS-95-281, University of Tennessee
-
J. Dongarra, R. Whaley, A User's Guide to the Blacs v1.0, Technical Report UT CS-95-281, LAPACK Working Note 94, University of Tennessee, 1995. http://www.netlib.org/blacs/.
-
(1995)
LAPACK Working Note
, vol.94
-
-
Dongarra, J.1
Whaley, R.2
-
14
-
-
0025401417
-
Algorithm 679: A set of level 3 basic linear algebra subprograms
-
Dongarra J.J., Du Croz J., Hammarling S., Duff I. Algorithm 679: a set of level 3 basic linear algebra subprograms. ACM Trans. Math. Softw. 16(1):1990;18-28.
-
(1990)
ACM Trans. Math. Softw.
, vol.16
, Issue.1
, pp. 18-28
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
15
-
-
0025402476
-
A set of level 3 basic linear algebra subprograms
-
Dongarra J.J., Du Croz J., Hammarling S., Duff I. A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Softw. 16(1):1990;1-17.
-
(1990)
ACM Trans. Math. Softw.
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
16
-
-
1642389912
-
A new recursive implementation of sparse Cholesky factorization
-
Lausanne, Switzerland, August
-
J.J. Dongarra, P. Raghavan, A new recursive implementation of sparse Cholesky factorization, in: Proceedings of the 16th IMACS World Congress 2000 on Scientific Computing, Applications, Mathematics, and Simulation, Lausanne, Switzerland, August 2000.
-
(2000)
Proceedings of the 16th IMACS World Congress 2000 on Scientific Computing, Applications, Mathematics, and Simulation
-
-
Dongarra, J.J.1
Raghavan, P.2
-
17
-
-
0020822138
-
The multifrontal solution of indefinite sparse symmetric linear equations
-
Duff I., Reid J. The multifrontal solution of indefinite sparse symmetric linear equations. ACM Trans. Math. Softw. 9:1983;302-325.
-
(1983)
ACM Trans. Math. Softw.
, vol.9
, pp. 302-325
-
-
Duff, I.1
Reid, J.2
-
18
-
-
0022754738
-
Parallel implementation of multifrontal schemes
-
I.S. Duff, Parallel implementation of multifrontal schemes, Parallel Comput. 3 (1986).
-
(1986)
Parallel Comput.
, vol.3
-
-
Duff, I.S.1
-
19
-
-
0034224207
-
Applying recursion to serial and parallel QR factorization leads to better performance
-
Elmroth E., Gustavson F. Applying recursion to serial and parallel QR factorization leads to better performance. IBM J. Res. Dev. 44(4):2000;605-624.
-
(2000)
IBM J. Res. Dev.
, vol.44
, Issue.4
, pp. 605-624
-
-
Elmroth, E.1
Gustavson, F.2
-
20
-
-
0012536008
-
A faster and simpler recursive algorithm for the LAPACK routine DGELS
-
Elmroth E., Gustavson F.G. A faster and simpler recursive algorithm for the LAPACK routine DGELS. BIT. 41:2001;936-949.
-
(2001)
BIT
, vol.41
, pp. 936-949
-
-
Elmroth, E.1
Gustavson, F.G.2
-
21
-
-
0031622953
-
The implementation of the Cilk-5 multithreaded language
-
Frigo M., Leiserson C.E., Randall K.H. The implementation of the Cilk-5 multithreaded language. ACM SIGPLAN Notices. 33(5):1998;212-223.
-
(1998)
ACM SIGPLAN Notices
, vol.33
, Issue.5
, pp. 212-223
-
-
Frigo, M.1
Leiserson, C.E.2
Randall, K.H.3
-
23
-
-
0031140712
-
Highly scalable parallel algorithms for sparse matrix factorization
-
Gupta A., Karypis G., Kumar V. Highly scalable parallel algorithms for sparse matrix factorization. IEEE Trans. Parallel Distrib. Syst. 8(5):1997;502-520.
-
(1997)
IEEE Trans. Parallel Distrib. Syst.
, vol.8
, Issue.5
, pp. 502-520
-
-
Gupta, A.1
Karypis, G.2
Kumar, V.3
-
24
-
-
84947926251
-
Recursive blocked data formats and BLAS's for dense linear algebra algorithms
-
in: B. Kågström, J. Dongarra, E. Elmroth, J. Waśniewski (Eds.), Proceedings of the Fourth International Workshop on Applied Parallel Computing and Large Scale Scientific and Industrial Problems (PARA'98), Springer, Umeå, Sweden, June
-
F. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, P. Ling, Recursive blocked data formats and BLAS's for dense linear algebra algorithms, in: B. Kågström, J. Dongarra, E. Elmroth, J. Waśniewski (Eds.), Proceedings of the Fourth International Workshop on Applied Parallel Computing and Large Scale Scientific and Industrial Problems (PARA'98), Lecture Notes in Computer Science Number 1541, Springer, Umeå, Sweden, June 1998, pp. 574-578.
-
(1998)
Lecture Notes in Computer Science
, vol.1541
, pp. 574-578
-
-
Gustavson, F.1
Henriksson, A.2
Jonsson, I.3
Kågström, B.4
Ling, P.5
-
25
-
-
0031273280
-
Recursion leads to automatic variable blocking for dense linear-algebra algorithms
-
Gustavson F.G. Recursion leads to automatic variable blocking for dense linear-algebra algorithms. IBM J. Res. Dev. 41:1997;737-755.
-
(1997)
IBM J. Res. Dev.
, vol.41
, pp. 737-755
-
-
Gustavson, F.G.1
-
26
-
-
0034312453
-
Minimal-storage high-performance Cholesky factorization via blocking and recursion
-
Gustavson F.G., Jonsson I. Minimal-storage high-performance Cholesky factorization via blocking and recursion. IBM J. Res. Dev. 44:2000;823-850.
-
(2000)
IBM J. Res. Dev.
, vol.44
, pp. 823-850
-
-
Gustavson, F.G.1
Jonsson, I.2
-
27
-
-
0036467470
-
PaStiX: A high-performance parallel direct solver for sparse symmetric definite systems
-
Hénon P., Ramet P., Roman J. PaStiX: a high-performance parallel direct solver for sparse symmetric definite systems. Parallel Comput. 28:2002;301-321.
-
(2002)
Parallel Comput.
, vol.28
, pp. 301-321
-
-
Hénon, P.1
Ramet, P.2
Roman, J.3
-
29
-
-
1642274432
-
-
Intel, Math Kernel Library (MKL), 2001. http://www.intel.com/software/products/mkl/.
-
(2001)
Math Kernel Library (MKL)
-
-
-
30
-
-
84886852438
-
Parallel and fully recursive multifrontal supernodal sparse Cholesky
-
Part II, Amsterdam, April
-
D. Irony, G. Shklarski, S. Toledo, Parallel and fully recursive multifrontal supernodal sparse Cholesky, in: Proceedings of the International Conference on Computational Science (ICCS 2002), Part II, Amsterdam, April 2002, pp. 335-344.
-
(2002)
Proceedings of the International Conference on Computational Science (ICCS 2002)
, pp. 335-344
-
-
Irony, D.1
Shklarski, G.2
Toledo, S.3
-
31
-
-
0003406235
-
PSPASES: Scalable parallel direct solver library for sparse symmetric positive definite linear systems
-
User's Manual for Version 1.0.3, Department of Computer Science, University of Minnesota, revised 1999
-
M. Joshi, A. Gupta, F. Gustavson, G. Karypis, V. Kumar, PSPASES: scalable parallel direct solver library for sparse symmetric positive definite linear systems, in: User's Manual for Version 1.0.3, Technical Report TR 97-059, Department of Computer Science, University of Minnesota, 1997, revised 1999.
-
(1997)
Technical Report TR 97-059
-
-
Joshi, M.1
Gupta, A.2
Gustavson, F.3
Karypis, G.4
Kumar, V.5
-
32
-
-
1642355738
-
PSPASES: An efficient and scalable parallel sparse direct solver
-
Annapolis, MD, February, Unpublished article
-
M. Joshi, A. Gupta, F. Gustavson, G. Karypis, V. Kumar, PSPASES: an efficient and scalable parallel sparse direct solver, in: Proceedings of the International Workshop on Frontiers of Parallel Numerical Computations and Applications (Frontiers'99), Annapolis, MD, February 1999, Unpublished article. http://www-users.cs.umn.edu/~mjoshi
-
(1999)
Proceedings of the International Workshop on Frontiers of Parallel Numerical Computations and Applications (Frontiers'99)
-
-
Joshi, M.1
Gupta, A.2
Gustavson, F.3
Karypis, G.4
Kumar, V.5
-
33
-
-
0028459839
-
DXML: A high-performance scientific subroutine library
-
Kamath C., Ho R., Manley D.P. DXML: a high-performance scientific subroutine library. Dig. Tech. J. 6(3):1994;44-56.
-
(1994)
Dig. Tech. J.
, vol.6
, Issue.3
, pp. 44-56
-
-
Kamath, C.1
Ho, R.2
Manley, D.P.3
-
34
-
-
0022785798
-
On the storage requirement in the out-of-core multifrontal method for sparse factorization
-
Liu J.W.H. On the storage requirement in the out-of-core multifrontal method for sparse factorization. ACM Trans. Math. Softw. 12(3):1986;249-264.
-
(1986)
ACM Trans. Math. Softw.
, vol.12
, Issue.3
, pp. 249-264
-
-
Liu, J.W.H.1
-
35
-
-
0024877196
-
The multifrontal method and paging in sparse Cholesky factorization
-
Liu J.W.H. The multifrontal method and paging in sparse Cholesky factorization. ACM Trans. Math. Softw. 15(4):1989;310-325.
-
(1989)
ACM Trans. Math. Softw.
, vol.15
, Issue.4
, pp. 310-325
-
-
Liu, J.W.H.1
-
36
-
-
0026840122
-
The multifrontal method for sparse matrix solution: Theory and practice
-
Liu J.W.H. The multifrontal method for sparse matrix solution: theory and practice. SIAM Rev. 34(1):1992;82-109.
-
(1992)
SIAM Rev.
, vol.34
, Issue.1
, pp. 82-109
-
-
Liu, J.W.H.1
-
37
-
-
0007990948
-
Sparse factorization with two-level scheduling in PARDISO
-
Portsmouth, VA, March, (CD-ROM)
-
O. Schenk, K. Gärtner, Sparse factorization with two-level scheduling in PARDISO, in: Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing, Portsmouth, VA, March 2001, p. 10 (CD-ROM).
-
(2001)
Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing
, p. 10
-
-
Schenk, O.1
Gärtner, K.2
-
39
-
-
1642298823
-
-
Cilk-5.3.2 Reference Manual, MIT Laboratory for Computer Science, Cambridge, MA, November
-
Cilk-5.3.2 Reference Manual, Supercomputing Technologies Group, MIT Laboratory for Computer Science, Cambridge, MA, November 2001. http://supertech.lcs.mit.edu/cilk.
-
(2001)
Supercomputing Technologies Group
-
-
-
40
-
-
0003753533
-
-
August, The MathWorks Inc., Natick, MA
-
MATLAB Reference Guide, August 1992, The MathWorks Inc., Natick, MA.
-
(1992)
MATLAB Reference Guide
-
-
-
41
-
-
0031496750
-
Locality of reference in LU decomposition with partial pivoting
-
Toledo S. Locality of reference in LU decomposition with partial pivoting. SIAM J. Matrix Anal. Appl. 18(4):1997;1065-1081.
-
(1997)
SIAM J. Matrix Anal. Appl.
, vol.18
, Issue.4
, pp. 1065-1081
-
-
Toledo, S.1
-
42
-
-
0003418094
-
Automatically tuned linear algebra software
-
Computer Science Department, University of Tennessee
-
R.C. Whaley, J.J. Dongarra, Automatically tuned linear algebra software, Technical Report, Computer Science Department, University of Tennessee, 1998. http://www.netlib.org/atlas.
-
(1998)
Technical Report
-
-
Whaley, R.C.1
Dongarra, J.J.2