-
1
-
-
0028513316
-
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
-
R. C. Agarwal, F. G. Gustavson, and M. Zubair. Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms. IBM Journal of Research and Development, 38(5):563-576, 1994.
-
(1994)
IBM Journal of Research and Development
, vol.38
, Issue.5
, pp. 563-576
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
2
-
-
0028427170
-
Improving performance of linear algebra algorithms for dense matrices using algorithmic prefetch
-
R. C. Agarwal, F. G. Gustavson, and M. Zubair. Improving performance of linear algebra algorithms for dense matrices using algorithmic prefetch. IBM Journal of Research and Development, 38(3):265-275, 1994.
-
(1994)
IBM Journal of Research and Development
, vol.38
, Issue.3
, pp. 265-275
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
4
-
-
0030661485
-
Optimizing matrix multiply using PHIPAC: A portable, high-performance, ANSI C coding methodology
-
Vienna, Austria
-
J. Bilmes, K. Asanovic, C. W. Chin, and J. Demmel. Optimizing matrix multiply using PHIPAC: a portable, high-performance, ANSI C coding methodology. In Proceedings of the International Conference on Supercomputing, Vienna, Austria, 1997.
-
(1997)
Proceedings of the International Conference on Supercomputing
-
-
Bilmes, J.1
Asanovic, K.2
Chin, C.W.3
Demmel, J.4
-
5
-
-
84886850879
-
-
available online from
-
Compaq. Compaq extended math library (CXML). Software and documentation available online from http://www.compaq.com/math/, 2001.
-
Software and Documentation
-
-
-
6
-
-
0025401417
-
Algorithm 679: A set of level 3 basic linear algebra subprograms
-
J. J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff. Algorithm 679: A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 16(1):18-28, 1990.
-
(1990)
ACM Transactions on Mathematical Software
, vol.16
, Issue.1
, pp. 18-28
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
7
-
-
0025402476
-
A set of level 3 basic linear algebra subprograms
-
J. J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff. A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 16(1):1-17, 1990.
-
(1990)
ACM Transactions on Mathematical Software
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
8
-
-
1642389912
-
A new recursive implementation of sparse Cholesky factorization
-
Aug
-
J. J. Dongarra and P. Raghavan. A new recursive implementation of sparse Cholesky factorization. In Proceedings of the 16th IMACS World Congress 2000 on Scientific Computing, Applications, Mathematics, and Simulation, Lausanne, Switzerland, Aug. 2000.
-
(2000)
Proceedings of the 16th IMACS World Congress 2000 on Scientific Computing, Applications, Mathematics, and Simulation, Lausanne, Switzerland
-
-
Dongarra, J.J.1
Raghavan, P.2
-
9
-
-
0034224207
-
Applying recursion to serial and parallel QR factorization leads to better performance
-
E. Elmroth and F. Gustavson. Applying recursion to serial and parallel QR factorization leads to better performance. IBM Journal of Research and Development, 44(4):605-624, 2000.
-
(2000)
IBM Journal of Research and Development
, vol.44
, Issue.4
, pp. 605-624
-
-
Elmroth, E.1
Gustavson, F.2
-
10
-
-
0012536008
-
A faster and simpler recursive algorithm for the LAPACK routine DGELS
-
E. Elmroth and F. G. Gustavson. A faster and simpler recursive algorithm for the LAPACK routine DGELS. BIT, 41:936-949, 2001.
-
(2001)
BIT
, vol.41
, pp. 936-949
-
-
Elmroth, E.1
Gustavson, F.G.2
-
11
-
-
0347507496
-
The implementation of the Cilk-5 multithreaded language
-
M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. ACM SIGPLAN Notices, 33(5):212-223, 1998.
-
(1998)
ACM SIGPLAN Notices
, vol.33
, Issue.5
, pp. 212-223
-
-
Frigo, M.1
Leiserson, C.E.2
Randall, K.H.3
-
12
-
-
84947926251
-
Recursive blocked data formats and BLAS's for dense linear algebra algorithms
-
B. Kågström, J. Dongarra, E. Elmroth, and J. Waśniewski, editors, number 1541 in Lecture Notes in Computer Science, Umeå, Sweden, June, Springer
-
F. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, and P. Ling. Recursive blocked data formats and BLAS's for dense linear algebra algorithms. In B. Kågström, J. Dongarra, E. Elmroth, and J. Waśniewski, editors, Proceedings of the 4th International Workshop on Applied Parallel Computing and Large Scale Scientific and Industrial Problems (PARA '98), number 1541 in Lecture Notes in Computer Science, pages 574-578, Umeå, Sweden, June 1998. Springer.
-
(1998)
Proceedings of the 4th International Workshop on Applied Parallel Computing and Large Scale Scientific and Industrial Problems (PARA '98)
, pp. 574-578
-
-
Gustavson, F.1
Henriksson, A.2
Jonsson, I.3
Kågström, B.4
Ling, P.5
-
13
-
-
0031273280
-
Recursion leads to automatic variable blocking for dense linear-algebra algorithms
-
Nov
-
F. G. Gustavson. Recursion leads to automatic variable blocking for dense linear-algebra algorithms. IBM Journal of Research and Development, 41:737-755, Nov. 1997.
-
(1997)
IBM Journal of Research and Development
, vol.41
, pp. 737-755
-
-
Gustavson, F.G.1
-
14
-
-
0034312453
-
Minimal-storage high-performance Cholesky factorization via blocking and recursion
-
Nov
-
F. G. Gustavson and I. Jonsson. Minimal-storage high-performance Cholesky factorization via blocking and recursion. IBM Journal of Research and Development, 44:823-850, Nov. 2000.
-
(2000)
IBM Journal of Research and Development
, vol.44
, pp. 823-850
-
-
Gustavson, F.G.1
Jonsson, I.2
-
15
-
-
84886842084
-
-
Software and documentation available online from
-
IBM. Engineering and scientific subroutine library (ESSL). Software and documentation available online from http://www-1.ibm.com/servers/eservers/pseries/software/sp/essl.html, 2001.
-
Engineering and Scientific Subroutine Library (ESSL)
-
-
-
16
-
-
84872201157
-
-
Software and documentation available online from
-
Intel. Math kernel library (MKL). Software and documentation available online from http://www.intel.com/software/products/mkl/, 2001.
-
Math Kernel Library (MKL)
-
-
-
17
-
-
0028459839
-
DXML: A high-performance scientific subroutine library
-
C. Kamath, R. Ho, and D. P. Manley. DXML: a high-performance scientific subroutine library. Digital Technical Journal, 6(3):44-56, 1994.
-
(1994)
Digital Technical Journal
, vol.6
, Issue.3
, pp. 44-56
-
-
Kamath, C.1
Ho, R.2
Manley, D.P.3
-
18
-
-
0022785798
-
On the storage requirement in the out-of-core multifrontal method for sparse factorization
-
J. W. H. Liu. On the storage requirement in the out-of-core multifrontal method for sparse factorization. ACM Transactions on Mathematical Software, 12(3):249-264, 1986.
-
(1986)
ACM Transactions on Mathematical Software
, vol.12
, Issue.3
, pp. 249-264
-
-
Liu, J.W.H.1
-
19
-
-
0024877196
-
The multifrontal method and paging in sparse Cholesky factorization
-
J. W. H. Liu. The multifrontal method and paging in sparse Cholesky factorization. ACM Transactions on Mathematical Software, 15(4):310-325, 1989.
-
(1989)
ACM Transactions on Mathematical Software
, vol.15
, Issue.4
, pp. 310-325
-
-
Liu, J.W.H.1
-
20
-
-
0026840122
-
The multifrontal method for sparse matrix solution: Theory and practice
-
J. W. H. Liu. The multifrontal method for sparse matrix solution: Theory and practice. SIAM Review, 34(1):82-109, 1992.
-
(1992)
SIAM Review
, vol.34
, Issue.1
, pp. 82-109
-
-
Liu, J.W.H.1
-
21
-
-
0000018801
-
Block sparse Cholesky algorithms on advanced uniprocessor computers
-
E. G. Ng and B. W. Peyton. Block sparse Cholesky algorithms on advanced uniprocessor computers. SIAM Journal on Scientific Computing, 14(5):1034-1056, 1993.
-
(1993)
SIAM Journal on Scientific Computing
, vol.14
, Issue.5
, pp. 1034-1056
-
-
Ng, E.G.1
Peyton, B.W.2
-
22
-
-
0007990948
-
Sparse factorization with two-level scheduling in PARDISO
-
pages on CDROM, Portsmouth, Virginia, Mar
-
O. Schenk and K. Gärtner. Sparse factorization with two-level scheduling in PARDISO. In Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing, 10 pages on CD-ROM, Portsmouth, Virginia, Mar. 2001.
-
(2001)
Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing
, pp. 10
-
-
Schenk, O.1
Gärtner, K.2
-
23
-
-
84886831435
-
-
Software and documentation available online from
-
SGI. Scientific computing software library (SCSL). Software and documentation available online from http://www.sgi.com/software/scsl.html, 1993-2001.
-
Scientific Computing Software Library (SCSL)
-
-
-
24
-
-
0842274646
-
-
Cambridge, MA. Cilk-5.3 Reference Manual, June, Available online at
-
Supercomputing Technologies Group, MIT Laboratory for Computer Science, Cambridge, MA. Cilk-5.3 Reference Manual, June 2000. Available online at http://supertech.lcs.mit.edu/cilk.
-
(2000)
MIT Laboratory for Computer Science
-
-
-
25
-
-
0031496750
-
Locality of reference in LU decomposition with partial pivoting
-
S. Toledo. Locality of reference in LU decomposition with partial pivoting. SIAM Journal on Matrix Analysis and Applications, 18(4):1065-1081, 1997.
-
(1997)
SIAM Journal on Matrix Analysis and Applications
, vol.18
, Issue.4
, pp. 1065-1081
-
-
Toledo, S.1