-
1
-
-
0028513316
-
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
-
R. C. Agarwal, F. G. Gustavson, and M. Zubair, Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms, IBM J. Res. Develop., 38 (1994), pp. 563-576.
-
(1994)
IBM J. Res. Develop.
, vol.38
, pp. 563-576
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
2
-
-
84937408012
-
Automatic generation of block-recursive codes
-
in Euro-Par 2000 Parallel Processing, A. Bode et al., eds.; Springer-Verlag, New York
-
N. Ahmed and K. Pingali, Automatic generation of block-recursive codes, in Euro-Par 2000 Parallel Processing, A. Bode et al., eds., Lecture Notes in Comput. Sci. 1900, Springer-Verlag, New York, 2000, pp. 368-378.
-
(2000)
Lecture Notes in Comput. Sci. 1900
, pp. 368-378
-
-
Ahmed, N.1
Pingali, K.2
-
3
-
-
18044400448
-
A recursive formulation of Cholesky factorization of a matrix in packed storage
-
B. Andersen, F. Gustavson, and J. Waśniewski, A recursive formulation of Cholesky factorization of a matrix in packed storage, ACM Trans. Math. Software, 27 (2001), pp. 214-244.
-
(2001)
ACM Trans. Math. Software
, vol.27
, pp. 214-244
-
-
Andersen, B.1
Gustavson, F.2
Waśniewski, J.3
-
4
-
-
0003706460
-
-
SIAM, Philadelphia
-
E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney and D. Sorensen, LAPACK Users' Guide, 3rd ed., SIAM, Philadelphia, 1999.
-
(1999)
LAPACK Users' Guide, 3rd Ed.
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Blackford, S.4
Demmel, J.5
Dongarra, J.6
Du Croz, J.7
Greenbaum, A.8
Hammarling, S.9
McKenney, A.10
Sorensen, D.11
-
5
-
-
84870413589
-
Automatically tuned linear algebra software
-
ATLAS, Automatically Tuned Linear Algebra Software, http://math-atlas.sourceforge.net/.
-
-
-
-
6
-
-
0027608822
-
On computing condition numbers for the nonsymmetric eigenproblem
-
Z. Bai, J. Demmel, and A. McKenney, On computing condition numbers for the nonsymmetric eigenproblem, ACM Trans. Math. Software, 19 (1993), pp. 202-223.
-
(1993)
ACM Trans. Math. Software
, vol.19
, pp. 202-223
-
-
Bai, Z.1
Demmel, J.2
McKenney, A.3
-
7
-
-
33846349887
-
A hierarchical O(n log n) force calculation algorithm
-
J. Barnes and P. Hut, A hierarchical O(n log n) force calculation algorithm, Nature, 324 (1986), pp. 446-449.
-
(1986)
Nature
, vol.324
, pp. 446-449
-
-
Barnes, J.1
Hut, P.2
-
9
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable high-performance ANSI C methodology
-
J. Bilmes, K. Asanovic, C.-W. Chin, and J. Demmel, Optimizing matrix multiply using PHiPAC: A portable high-performance ANSI C methodology, in Proceedings of the International Conference on Supercomputing, Vienna, 1997, pp. 340-347.
-
Proceedings of the International Conference on Supercomputing, Vienna, 1997
, pp. 340-347
-
-
Bilmes, J.1
Asanovic, K.2
Chin, C.-W.3
Demmel, J.4
-
10
-
-
0001951009
-
The WY representation for products of Householder matrices
-
C. Bischof and C. Van Loan, The WY representation for products of Householder matrices, SIAM J. Sci. Statist. Comput., 8 (1987), pp. s2-s13.
-
(1987)
SIAM J. Sci. Statist. Comput.
, vol.8
-
-
Bischof, C.1
Van Loan, C.2
-
11
-
-
0036401631
-
The multishift QR algorithm. Part I: Maintaining well-focused shifts and level 3 performance
-
K. Braman, R. Byers, and R. Mathias, The multishift QR algorithm. Part I: Maintaining well-focused shifts and level 3 performance, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 929-947.
-
(2002)
SIAM J. Matrix Anal. Appl.
, vol.23
, pp. 929-947
-
-
Braman, K.1
Byers, R.2
Mathias, R.3
-
12
-
-
0036400807
-
The multishift QR algorithm. Part II: Aggressive early deflation
-
K. Braman, R. Byers, and R. Mathias, The multishift QR algorithm. Part II: Aggressive early deflation, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 948-973.
-
(2002)
SIAM J. Matrix Anal. Appl.
, vol.23
, pp. 948-973
-
-
Braman, K.1
Byers, R.2
Mathias, R.3
-
13
-
-
0003555195
-
-
SIAM, Philadelphia
-
J. R. Bunch, J. J. Dongarra, C. B. Moler, and G. W. Stewart, LINPACK User's Guide, SIAM, Philadelphia, 1979.
-
(1979)
LINPACK User's Guide
-
-
Bunch, J.R.1
Dongarra, J.J.2
Moler, C.B.3
Stewart, G.W.4
-
14
-
-
0031223129
-
Compiler blockability of dense matrix factorizations
-
S. Carr and R. B. Lehoucq, Compiler blockability of dense matrix factorizations, ACM Trans. Math. Software, 23 (1997), pp. 336-361.
-
(1997)
ACM Trans. Math. Software
, vol.23
, pp. 336-361
-
-
Carr, S.1
Lehoucq, R.B.2
-
15
-
-
0036870763
-
Recursive array layouts and fast matrix multiplication
-
S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. Thottethodi, Recursive array layouts and fast matrix multiplication, IEEE Trans. Parallel Distrib. Systems, 13 (2002), pp. 1105-1123.
-
(2002)
IEEE Trans. Parallel Distrib. Systems
, vol.13
, pp. 1105-1123
-
-
Chatterjee, S.1
Lebeck, A.R.2
Patnala, P.K.3
Thottethodi, M.4
-
16
-
-
38149012697
-
The solution of the matrix equations AXB - CXD = E and (YA - DZ, YC - BZ) = (E, F)
-
K.-W. E. Chu, The solution of the matrix equations AXB - CXD = E and (YA - DZ, YC - BZ) = (E, F), Linear Algebra Appl., 93 (1987), pp. 93-105.
-
(1987)
Linear Algebra Appl.
, vol.93
, pp. 93-105
-
-
Chu, K.-W.E.1
-
17
-
-
0000659575
-
A divide and conquer method for the symmetric tridiagonal eigenproblem
-
J. J. M. Cuppen, A divide and conquer method for the symmetric tridiagonal eigenproblem, Numer. Math., 36 (1981), pp. 177-195.
-
(1981)
Numer. Math.
, vol.36
, pp. 177-195
-
-
Cuppen, J.J.M.1
-
18
-
-
0010976738
-
Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form
-
K. Dackland and B. Kågström, Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form, ACM Trans. Math. Software, 25 (1999), pp. 425-454.
-
(1999)
ACM Trans. Math. Software
, vol.25
, pp. 425-454
-
-
Dackland, K.1
Kågström, B.2
-
19
-
-
45949117656
-
Computing stable eigendecompositions of matrix pencils
-
J. Demmel and B. Kågström, Computing stable eigendecompositions of matrix pencils, Linear Algebra Appl., 88/89 (1987), pp. 139-186.
-
(1987)
Linear Algebra Appl.
, vol.88-89
, pp. 139-186
-
-
Demmel, J.1
Kågström, B.2
-
20
-
-
0026913668
-
Stability of block algorithms with fast level-3 BLAS
-
J. W. Demmel and N. J. Higham, Stability of block algorithms with fast level-3 BLAS, ACM Trans. Math. Software, 18 (1992), pp. 274-291.
-
(1992)
ACM Trans. Math. Software
, vol.18
, pp. 274-291
-
-
Demmel, J.W.1
Higham, N.J.2
-
21
-
-
0025401417
-
Algorithm 679: A set of level 3 basic linear algebra subprograms
-
J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling, Algorithm 679: A set of level 3 basic linear algebra subprograms, ACM Trans. Math. Software, 16 (1990), pp. 18-28.
-
(1990)
ACM Trans. Math. Software
, vol.16
, pp. 18-28
-
-
Dongarra, J.1
Du Croz, J.2
Duff, I.3
Hammarling, S.4
-
22
-
-
0025402476
-
A set of level 3 basic linear algebra subprograms
-
J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling, A set of level 3 basic linear algebra subprograms, ACM Trans. Math. Software, 16 (1990), pp. 1-17.
-
(1990)
ACM Trans. Math. Software
, vol.16
, pp. 1-17
-
-
Dongarra, J.1
Du Croz, J.2
Duff, I.3
Hammarling, S.4
-
23
-
-
0023983122
-
An extended set of Fortran basic linear algebra subroutines
-
J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, An extended set of Fortran basic linear algebra subroutines, ACM Trans. Math. Software, 14 (1988), pp. 1-17.
-
(1988)
ACM Trans. Math. Software
, vol.14
, pp. 1-17
-
-
Dongarra, J.1
Du Croz, J.2
Hammarling, S.3
Hanson, R.J.4
-
24
-
-
0035176737
-
Recursive approach in sparse matrix LU factorization
-
J. Dongarra, V. Eijkhout, and P. Łuszczek, Recursive approach in sparse matrix LU factorization, Sci. Programming, 9 (2001), pp. 51-60.
-
(2001)
Sci. Programming
, vol.9
, pp. 51-60
-
-
Dongarra, J.1
Eijkhout, V.2
Łuszczek, P.3
-
25
-
-
0029324485
-
Software libraries for linear algebra computations on high performance computers
-
J. J. Dongarra and D. W. Walker, Software libraries for linear algebra computations on high performance computers, SIAM Rev., 37 (1995), pp. 151-180.
-
(1995)
SIAM Rev.
, vol.37
, pp. 151-180
-
-
Dongarra, J.J.1
Walker, D.W.2
-
26
-
-
0002663082
-
GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix multiply algorithm
-
C. G. Douglas, M. Heroux, G. Slishman, and R. M. Smith, GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix multiply algorithm, J. Comput. Phys., 110 (1994), pp. 1-10.
-
(1994)
J. Comput. Phys.
, vol.110
, pp. 1-10
-
-
Douglas, C.G.1
Heroux, M.2
Slishman, G.3
Smith, R.M.4
-
27
-
-
84947936389
-
New serial and parallel recursive QR factorization algorithms for SMP systems
-
in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al. eds.; Springer-Verlag, New York
-
E. Elmroth and F. G. Gustavson, New serial and parallel recursive QR factorization algorithms for SMP systems, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al. eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 120-128.
-
(1998)
Lecture Notes in Comput. Sci. 1541
, pp. 120-128
-
-
Elmroth, E.1
Gustavson, F.G.2
-
28
-
-
0034224207
-
Applying recursion to serial and parallel QR factorization leads to better performance
-
E. Elmroth and F. G. Gustavson, Applying recursion to serial and parallel QR factorization leads to better performance, IBM J. Res. Develop., 44 (2000), pp. 605-624.
-
(2000)
IBM J. Res. Develop.
, vol.44
, pp. 605-624
-
-
Elmroth, E.1
Gustavson, F.G.2
-
29
-
-
0012536008
-
A faster and simpler recursive algorithm for the LAPACK routine DGELS
-
E. Elmroth and F. G. Gustavson, A faster and simpler recursive algorithm for the LAPACK routine DGELS, BIT, 41 (2001), pp. 936-949.
-
(2001)
BIT
, vol.41
, pp. 936-949
-
-
Elmroth, E.1
Gustavson, F.G.2
-
30
-
-
84957033906
-
High-performance library software for QR factorization
-
in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds.; Springer-Verlag, New York
-
E. Elmroth and F. G. Gustavson, High-performance library software for QR factorization, in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds., Lecture Notes in Comput. Sci. 1947, Springer-Verlag, New York, 2001, pp. 53-63.
-
(2001)
Lecture Notes in Comput. Sci. 1947
, pp. 53-63
-
-
Elmroth, E.1
Gustavson, F.G.2
-
32
-
-
0038716587
-
QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism
-
J. D. Frens and D. S. Wise, QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism, in Proceedings of the 2003 ACM Symposium on Principles and Practice of Parallel Programming, ACM SIGPLAN Notices, 38 (10) (2003), pp. 144-154.
-
(2003)
Proceedings of the 2003 ACM Symposium on Principles and Practice of Parallel Programming, ACM SIGPLAN Notices
, vol.38
, Issue.10
, pp. 144-154
-
-
Frens, J.D.1
Wise, D.S.2
-
33
-
-
0348209599
-
A fast Fourier transform compiler
-
M. Frigo, A fast Fourier transform compiler, in Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation, ACM SIGPLAN Notices, 34 (3) (1999), pp. 169-180.
-
(1999)
Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation, ACM SIGPLAN Notices
, vol.34
, Issue.3
, pp. 169-180
-
-
Frigo, M.1
-
35
-
-
0033350255
-
Cache-oblivious algorithms
-
M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran, Cache-oblivious algorithms, in Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, New York, 1999, IEEE Computer Society, Los Alamitos, CA, 1999.
-
Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, New York, 1999, IEEE Computer Society, Los Alamitos, CA, 1999
-
-
Frigo, M.1
Leiserson, C.E.2
Prokop, H.3
Ramachandran, S.4
-
36
-
-
0003100264
-
Parallel algorithms for dense linear algebra computations
-
K. A. Gallivan, R. J. Plemmons, and A. H. Sameh, Parallel algorithms for dense linear algebra computations, SIAM Rev., 32 (1990), pp. 54-135.
-
(1990)
SIAM Rev.
, vol.32
, pp. 54-135
-
-
Gallivan, K.A.1
Plemmons, R.J.2
Sameh, A.H.3
-
38
-
-
0018721357
-
A Hessenberg-Schur method for the matrix problem AX + XB = C
-
G. Golub, S. Nash, and C. Van Loan, A Hessenberg-Schur method for the matrix problem AX + XB = C, IEEE Trans. Automat. Control, AC-24 (1979), pp. 909-913.
-
(1979)
IEEE Trans. Automat. Control
, vol.AC-24
, pp. 909-913
-
-
Golub, G.1
Nash, S.2
Van Loan, C.3
-
39
-
-
0004236492
-
-
Johns Hopkins University Press, Baltimore, MD
-
G. Golub and C. Van Loan, Matrix Computations, 3rd ed., Johns Hopkins University Press, Baltimore, MD, 1996.
-
(1996)
Matrix Computations, 3rd Ed.
-
-
Golub, G.1
Van Loan, C.2
-
40
-
-
1542392269
-
On reducing TLB misses in matrix multiplication
-
Department of Computer Sciences, University of Texas at Austin
-
K. Goto and R. van de Geijn, On Reducing TLB Misses in Matrix Multiplication, Technical Report TR-2002-55, FLAME Working Note 9, Department of Computer Sciences, University of Texas at Austin, 2002.
-
(2002)
Technical Report TR-2002-55, FLAME Working Note 9
-
-
Goto, K.1
Van De Geijn, R.2
-
41
-
-
0039435412
-
Formal linear algebra methods environment (FLAME)
-
J. Gunnels, F. G. Gustavson, G. Henry, and R. van de Geijn, Formal linear algebra methods environment (FLAME), ACM Trans. Math. Software, 27 (2001), pp. 422-455.
-
(2001)
ACM Trans. Math. Software
, vol.27
, pp. 422-455
-
-
Gunnels, J.1
Gustavson, F.G.2
Henry, G.3
Van De Geijn, R.4
-
42
-
-
1842843487
-
Fault-tolerant high-performance matrix-matrix multiplication
-
Department of Computer Sciences, University of Texas at Austin
-
J. A. Gunnels, D. S. Katz, E. S. Quintana-Orti, and R. van de Geijn, Fault-Tolerant High-Performance Matrix-Matrix Multiplication, FLAME Technical Report TR-2000-34, Working Note 2, Department of Computer Sciences, University of Texas at Austin, 2000.
-
(2000)
FLAME Technical Report TR-2000-34, Working Note 2
-
-
Gunnels, J.A.1
Katz, D.S.2
Quintana-Orti, E.S.3
Van De Geijn, R.4
-
43
-
-
0031273280
-
Recursion leads to automatic variable blocking for dense linear-algebra algorithms
-
F. G. Gustavson, Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM J. Res. Develop., 41 (1997), pp. 737-755.
-
(1997)
IBM J. Res. Develop.
, vol.41
, pp. 737-755
-
-
Gustavson, F.G.1
-
44
-
-
84901913528
-
New generalized data structures for matrices lead to a variety of high performance algorithms
-
in The Architectures for Scientific Software, R. F. Boisvert and P. T. P. Tang, eds.; Kluwer Academic, Dordrecht, The Netherlands
-
F. G. Gustavson, New generalized data structures for matrices lead to a variety of high performance algorithms, in The Architectures for Scientific Software, R. F. Boisvert and P. T. P. Tang, eds., IFIP Conference Proceedings 188, Kluwer Academic, Dordrecht, The Netherlands, pp. 211-234.
-
IFIP Conference Proceedings 188
, pp. 211-234
-
-
Gustavson, F.G.1
-
45
-
-
0037230301
-
High-performance linear algebra algorithms using new generalized data structures for matrices
-
F. G. Gustavson, High-performance linear algebra algorithms using new generalized data structures for matrices, IBM J. Res. Develop., 47 (2003), pp. 31-55.
-
(2003)
IBM J. Res. Develop.
, vol.47
, pp. 31-55
-
-
Gustavson, F.G.1
-
46
-
-
84947926251
-
Recursive blocked data formats and BLAS's for dense linear algebra algorithms
-
in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds.; Springer-Verlag, New York
-
F. G. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, and P. Ling, Recursive blocked data formats and BLAS's for dense linear algebra algorithms, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 195-206.
-
(1998)
Lecture Notes in Comput. Sci. 1541
, pp. 195-206
-
-
Gustavson, F.G.1
Henriksson, A.2
Jonsson, I.3
Kågström, B.4
Ling, P.5
-
47
-
-
84947907655
-
Superscalar GEMM-based level 3 BLAS - The on-going evolution of a portable and high-performance library
-
in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds.; Springer-Verlag, New York
-
F. G. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, and P. Ling, Superscalar GEMM-based level 3 BLAS - The on-going evolution of a portable and high-performance library, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 207-215.
-
(1998)
Lecture Notes in Comput. Sci. 1541
, pp. 207-215
-
-
Gustavson, F.G.1
Henriksson, A.2
Jonsson, I.3
Kågström, B.4
Ling, P.5
-
48
-
-
0034312453
-
Minimal-storage high-performance Cholesky factorization via blocking and recursion
-
F. G. Gustavson and I. Jonsson, Minimal-storage high-performance Cholesky factorization via blocking and recursion, IBM J. Res. Develop., 44 (2000), pp. 823-849.
-
(2000)
IBM J. Res. Develop.
, vol.44
, pp. 823-849
-
-
Gustavson, F.G.1
Jonsson, I.2
-
50
-
-
0000567621
-
Numerical solution of the stable, non-negative definite Lyapunov equation
-
S. J. Hammarling, Numerical solution of the stable, non-negative definite Lyapunov equation, IMA J. Numer. Anal., 2 (1982), pp. 303-323.
-
(1982)
IMA J. Numer. Anal.
, vol.2
, pp. 303-323
-
-
Hammarling, S.J.1
-
51
-
-
0343910469
-
High-performance matrix multiplication on the IBM SP high node
-
Master's thesis, UMNAD-98.235, Department of Computing Science, Umeå University, Umeå, Sweden
-
A. Henriksson and I. Jonsson, High-Performance Matrix Multiplication on the IBM SP High Node, Master's thesis, UMNAD-98.235, Department of Computing Science, Umeå University, Umeå, Sweden, 1998.
-
(1998)
-
-
Henriksson, A.1
Jonsson, I.2
-
52
-
-
0024143903
-
Fortran codes for estimating the one-norm of a real or complex matrix with applications to condition estimation
-
N. J. Higham, Fortran codes for estimating the one-norm of a real or complex matrix with applications to condition estimation, ACM Trans. Math. Software, 14 (1988), pp. 381-396.
-
(1988)
ACM Trans. Math. Software
, vol.14
, pp. 381-396
-
-
Higham, N.J.1
-
53
-
-
0001045175
-
Perturbation theory and backward error for AX - XB = C
-
N. J. Higham, Perturbation theory and backward error for AX - XB = C, BIT, 33 (1993), pp. 124-136.
-
(1993)
BIT
, vol.33
, pp. 124-136
-
-
Higham, N.J.1
-
58
-
-
84886852438
-
Parallel and fully recursive multifrontal supernodal sparse Cholesky
-
in Computational Science - ICCS 2002, P. Sloot et al., eds.; Springer-Verlag, Berlin
-
D. Irony, G. Shklarski, and S. Toledo, Parallel and fully recursive multifrontal supernodal sparse Cholesky, in Computational Science - ICCS 2002, P. Sloot et al., eds., Lecture Notes in Comput. Sci. 2330, Springer-Verlag, Berlin, 2002, pp. 335-344.
-
(2002)
Lecture Notes in Comput. Sci. 2330
, pp. 335-344
-
-
Irony, D.1
Shklarski, G.2
Toledo, S.3
-
59
-
-
24244446738
-
Analysis of processor and memory utilization of recursive algorithms for Sylvester-type matrix equations using performance monitoring
-
Technical Report UMINF-03.16, Department of Computing Science, Umeå University, Umeå, Sweden
-
I. Jonsson, Analysis of Processor and Memory Utilization of Recursive Algorithms for Sylvester-Type Matrix Equations Using Performance Monitoring, Technical Report UMINF-03.16, Department of Computing Science, Umeå University, Umeå, Sweden, 2003.
-
(2003)
-
-
Jonsson, I.1
-
60
-
-
1842832563
-
Parallel triangular Sylvester-type matrix equation solvers for SMP systems using recursive blocking
-
in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds.; Springer-Verlag, New York
-
I. Jonsson and B. Kågström, Parallel triangular Sylvester-type matrix equation solvers for SMP systems using recursive blocking, in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds., Lecture Notes in Comput. Sci. 1947, Springer-Verlag, New York, 2001, pp. 64-73.
-
(2001)
Lecture Notes in Comput. Sci. 1947
, pp. 64-73
-
-
Jonsson, I.1
Kågström, B.2
-
61
-
-
1842843484
-
Parallel two-sided Sylvester-type matrix equation solvers for SMP systems using recursive blocking
-
in Applied Parallel Computing: Advanced Scientific Computing, J. Fagerholm et al., eds.; Springer-Verlag, New York
-
I. Jonsson and B. Kågström, Parallel two-sided Sylvester-type matrix equation solvers for SMP systems using recursive blocking, in Applied Parallel Computing: Advanced Scientific Computing, J. Fagerholm et al., eds., Lecture Notes in Comput. Sci. 2367, Springer-Verlag, New York, 2002, pp. 297-306.
-
(2002)
Lecture Notes in Comput. Sci. 2367
, pp. 297-306
-
-
Jonsson, I.1
Kågström, B.2
-
62
-
-
19044380439
-
Recursive blocked algorithms for solving triangular systems - Part I: One-sided and coupled Sylvester-type matrix equations
-
I. Jonsson and B. Kågström, Recursive blocked algorithms for solving triangular systems - Part I: One-sided and coupled Sylvester-type matrix equations, ACM Trans. Math. Software, 28 (2002), pp. 392-415.
-
(2002)
ACM Trans. Math. Software
, vol.28
, pp. 392-415
-
-
Jonsson, I.1
Kågström, B.2
-
63
-
-
19044400922
-
Recursive blocked algorithms for solving triangular systems - Part II: Two sided and generalized Sylvester and Lyapunov equations
-
I. Jonsson and B. Kågström, Recursive blocked algorithms for solving triangular systems - Part II: Two sided and generalized Sylvester and Lyapunov equations, ACM Trans. Math. Software, 28 (2002), pp. 416-435.
-
(2002)
ACM Trans. Math. Software
, vol.28
, pp. 416-435
-
-
Jonsson, I.1
Kågström, B.2
-
64
-
-
1842843483
-
RECSY - a high performance library for Sylvester-type matrix equations
-
I. Jonsson and B. Kågström, RECSY - A High Performance Library for Sylvester-Type Matrix Equations, http://www.cs.umu.se/research/parallel/recsy, 2003.
-
(2003)
-
-
Jonsson, I.1
Kågström, B.2
-
65
-
-
21844500600
-
A perturbation analysis of the generalized Sylvester equation (AR - LB, DR - LE) = (C, F)
-
B. Kågström, A perturbation analysis of the generalized Sylvester equation (AR - LB, DR - LE) = (C, F), SIAM J. Matrix Anal. Appl., 15 (1994), pp. 1045-1060.
-
(1994)
SIAM J. Matrix Anal. Appl.
, vol.15
, pp. 1045-1060
-
-
Kågström, B.1
-
66
-
-
0032155271
-
GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark
-
B. Kågström, P. Ling, and C. Van Loan, GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark, ACM Trans. Math. Software, 24 (1998), pp. 268-302.
-
(1998)
ACM Trans. Math. Software
, vol.24
, pp. 268-302
-
-
Kågström, B.1
Ling, P.2
Van Loan, C.3
-
67
-
-
0032155342
-
Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues
-
B. Kågström, P. Ling, and C. Van Loan, Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues, ACM Trans. Math. Software, 24 (1998), pp. 303-316.
-
(1998)
ACM Trans. Math. Software
, vol.24
, pp. 303-316
-
-
Kågström, B.1
Ling, P.2
Van Loan, C.3
-
69
-
-
0041103179
-
Computing eigenspaces with specified eigenvalues of a regular matrix pair (A, B) and condition estimation: Theory, algorithms and software
-
B. Kågström, and P. Poromaa, Computing eigenspaces with specified eigenvalues of a regular matrix pair (A, B) and condition estimation: Theory, algorithms and software, Numer. Algorithms, 12 (1996), pp. 369-407.
-
(1996)
Numer. Algorithms
, vol.12
, pp. 369-407
-
-
Kågström, B.1
Poromaa, P.2
-
70
-
-
0030092417
-
LAPACK-style algorithms and software for solving the generalized Sylvester equation and estimating the separation between regular matrix pairs
-
B. Kågström and P. Poromaa, LAPACK-style algorithms and software for solving the generalized Sylvester equation and estimating the separation between regular matrix pairs, ACM Trans. Math. Software, 22 (1996), pp. 78-103.
-
(1996)
ACM Trans. Math. Software
, vol.22
, pp. 78-103
-
-
Kågström, B.1
Poromaa, P.2
-
71
-
-
0040740214
-
A generalized state-space approach for the additive decomposition of a transfer matrix
-
B. Kågström, and P. Van Dooren, A generalized state-space approach for the additive decomposition of a transfer matrix, Internat. J. Numer. Linear Algebra Appl., 1 (1992), pp. 165-181.
-
(1992)
Internat. J. Numer. Linear Algebra Appl.
, vol.1
, pp. 165-181
-
-
Kågström, B.1
Van Dooren, P.2
-
72
-
-
0024700168
-
Generalized Schur methods with condition estimators for solving the generalized Sylvester equation
-
B. Kågström, and L. Westin, Generalized Schur methods with condition estimators for solving the generalized Sylvester equation, IEEE Trans. Automat. Control, 34 (1989), pp. 745-751.
-
(1989)
IEEE Trans. Automat. Control
, vol.34
, pp. 745-751
-
-
Kågström, B.1
Westin, L.2
-
73
-
-
84976859541
-
The cache performance and optimizations of blocked algorithms
-
M. S. Lam, E. E. Rothberg, and M. E. Wolf, The cache performance and optimizations of blocked algorithms, in Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 1991, pp. 63-74.
-
Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 1991
, pp. 63-74
-
-
Lam, M.S.1
Rothberg, E.E.2
Wolf, M.E.3
-
74
-
-
0018515759
-
Basic linear algebra subprograms for Fortran usage
-
C. Lawson, R. Hanson, D. Kincaid, and F. Krogh, Basic linear algebra subprograms for Fortran usage, ACM Trans. Math. Software, 5 (1979), pp. 308-323.
-
(1979)
ACM Trans. Math. Software
, vol.5
, pp. 308-323
-
-
Lawson, C.1
Hanson, R.2
Kincaid, D.3
Krogh, F.4
-
75
-
-
0038835621
-
High-performance recursive BLAS kernels using new data formats for the QR factorization
-
Master's thesis, UMNAD-235.00, Department of Computing Science, Umeå University, Umeå, Sweden
-
A. Lindkvist, High-Performance Recursive BLAS Kernels Using New Data Formats for the QR Factorization, Master's thesis, UMNAD-235.00, Department of Computing Science, Umeå University, Umeå, Sweden, 2000.
-
(2000)
-
-
Lindkvist, A.1
-
76
-
-
0004235292
-
-
The MathWorks Inc., Natick, MA
-
MathWorks, Using MATLAB, The MathWorks Inc., Natick, MA, 2002.
-
(2002)
Using MATLAB
-
-
-
77
-
-
0042235298
-
Tiling, block data layout, and memory hierarchy performance
-
N. Park, B. Hong, and V. K. Prasanna, Tiling, block data layout, and memory hierarchy performance, IEEE Trans. Parallel Distrib. Systems, 14 (2003), pp. 640-654.
-
(2003)
IEEE Trans. Parallel Distrib. Systems
, vol.14
, pp. 640-654
-
-
Park, N.1
Hong, B.2
Prasanna, V.K.3
-
78
-
-
84976734144
-
The influence of the compiler on the cost of mathematical software - in particular on the cost of triangular factorization
-
B. N. Parlett and Y. Wang, The influence of the compiler on the cost of mathematical software - In particular on the cost of triangular factorization, ACM Trans. Math. Software, 1 (1975), pp. 35-46.
-
(1975)
ACM Trans. Math. Software
, vol.1
, pp. 35-46
-
-
Parlett, B.N.1
Wang, Y.2
-
79
-
-
0032342953
-
Numerical solution of generalized Lyapunov equations
-
T. Penzl, Numerical solution of generalized Lyapunov equations, Adv. Comput. Math., 8 (1998), pp. 33-48.
-
(1998)
Adv. Comput. Math.
, vol.8
, pp. 33-48
-
-
Penzl, T.1
-
80
-
-
84947916433
-
Parallel algorithms for triangular Sylvester equations: Design, scheduling and scalability issues
-
in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds.; Springer-Verlag, New York
-
P. Poromaa, Parallel algorithms for triangular Sylvester equations: Design, scheduling and scalability issues, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 438-446.
-
(1998)
Lecture Notes in Comput. Sci. 1541
, pp. 438-446
-
-
Poromaa, P.1
-
81
-
-
0039428372
-
High performance computing: Algorithms and library software for Sylvester equations and certain eigenvalue problems with applications in condition estimation
-
Ph.D. Thesis, UMINF-97.16, Department of Computing Science, Umeå University, Umeå, Sweden
-
P. Poromaa, High Performance Computing: Algorithms and Library Software for Sylvester Equations and Certain Eigenvalue Problems with Applications in Condition Estimation, Ph.D. Thesis, UMINF-97.16, Department of Computing Science, Umeå University, Umeå, Sweden, 1997.
-
(1997)
-
-
Poromaa, P.1
-
82
-
-
85039512365
-
Out-of-core SVD and QR decompositions
-
E. Rabani and S. Toledo, Out-of-core SVD and QR decompositions, in Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing, Portsmouth, VA, CD-ROM, SIAM, Philadelphia, 2001.
-
Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing, Portsmouth, VA, CD-ROM, SIAM, Philadelphia, 2001
-
-
Rabani, E.1
Toledo, S.2
-
83
-
-
0037142952
-
Very large electronic structure calculations using an out-of-core filter-diagonalization method
-
E. Rabani and S. Toledo, Very large electronic structure calculations using an out-of-core filter-diagonalization method, J. Comput. Phys., 180 (2002), pp. 256-269.
-
(2002)
J. Comput. Phys.
, vol.180
, pp. 256-269
-
-
Rabani, E.1
Toledo, S.2
-
84
-
-
0003596534
-
Space-filling curves
-
Springer-Verlag, Berlin
-
H. Sagan, Space-Filling Curves, Springer-Verlag, Berlin, 1994.
-
(1994)
-
-
Sagan, H.1
-
85
-
-
28144458231
-
Skeletons from the treecode closet
-
J. K. Salmon and M. S. Warren, Skeletons from the treecode closet, J. Comput. Phys., 111 (1994), pp. 136-155.
-
(1994)
J. Comput. Phys.
, vol.111
, pp. 136-155
-
-
Salmon, J.K.1
Warren, M.S.2
-
86
-
-
0028443162
-
Fast parallel tree codes for gravitational and fluid dynamical n-body problems
-
J. K. Salmon, M. S. Warren, and G. S. Winckelmans, Fast parallel tree codes for gravitational and fluid dynamical n-body problems, Internat. J. Supercomput. Appl., 8 (1994), pp. 129-142.
-
(1994)
Internat. J. Supercomput. Appl.
, vol.8
, pp. 129-142
-
-
Salmon, J.K.1
Warren, M.S.2
Winckelmans, G.S.3
-
87
-
-
0021644214
-
The quadtree and related hierarchical data structures
-
H. Samet, The quadtree and related hierarchical data structures, Comput. Surveys, 16 (1984), pp. 188-260.
-
(1984)
Comput. Surveys
, vol.16
, pp. 188-260
-
-
Samet, H.1
-
88
-
-
0003078924
-
A storage-efficient WY representation for products of Householder transformations
-
R. Schreiber and C. Van Loan, A storage-efficient WY representation for products of Householder transformations, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 53-57.
-
(1989)
SIAM J. Sci. Statist. Comput.
, vol.10
, pp. 53-57
-
-
Schreiber, R.1
Van Loan, C.2
-
89
-
-
1642274431
-
Scientific computing software library (SCSL)
-
SGI
-
SGI, Scientific Computing Software Library (SCSL), software and documentation available from http://www.sgi.com/software/scsl.html, 1993-2003.
-
(1993)
-
-
-
90
-
-
85039535130
-
-
SLICOT, The SLICOT Library and the Numerics in Control Network (NICONET) website, http://www.win.tue.nl/niconet/.
-
-
-
-
91
-
-
0003203931
-
Matrix eigensystem routines-EISPACK guide
-
Springer-Verlag, Berlin
-
B. T. Smith, J. M. Boyle, J. J. Dongarra, B. S. Garbow, Y. Ikebe, V. C. Klema, and C. B. Moler, Matrix Eigensystem Routines-EISPACK Guide, Lecture Notes in Comput. Sci. 6, Springer-Verlag, Berlin, 1976.
-
(1976)
Lecture Notes in Comput. Sci.
, vol.6
-
-
Smith, B.T.1
Boyle, J.M.2
Dongarra, J.J.3
Garbow, B.S.4
Ikebe, Y.5
Klema, V.C.6
Moler, C.B.7
-
94
-
-
34250487811
-
Gaussian elimination is not optimal
-
V. Strassen, Gaussian elimination is not optimal, Numer. Math., 13 (1969), pp. 354-356.
-
(1969)
Numer. Math.
, vol.13
, pp. 354-356
-
-
Strassen, V.1
-
95
-
-
0031496750
-
Locality of reference in LU decomposition with partial pivoting
-
S. Toledo, Locality of reference in LU decomposition with partial pivoting, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 1065-1081.
-
(1997)
SIAM J. Matrix Anal. Appl.
, vol.18
, pp. 1065-1081
-
-
Toledo, S.1
-
96
-
-
32844469834
-
The top 500 supercomputer sites
-
TOP500
-
TOP500, The Top 500 Supercomputer Sites, http://www.top500.org/.
-
-
-
-
97
-
-
0037173976
-
A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
-
V. Valsalam and A. Skjellum, A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels, Concurrency Computat. Pract. Exper., 14 (2002), pp. 805-839.
-
(2002)
Concurrency Computat. Pract. Exper.
, vol.14
, pp. 805-839
-
-
Valsalam, V.1
Skjellum, A.2
-
98
-
-
67650257712
-
Statistical models for automatic performance tuning
-
Springer-Verlag, New York
-
R. Vuduc, J. W. Demmel, and J. A. Bilmes, Statistical models for automatic performance tuning, in Proceedings of the International Conference on Computational Science, Lecture Notes in Comput. Sci. 2073, Springer-Verlag, New York, 2001, pp. 117-126.
-
(2001)
Proceedings of the International Conference on Computational Science, Lecture Notes in Comput. Sci. 2073
, pp. 117-126
-
-
Vuduc, R.1
Demmel, J.W.2
Bilmes, J.A.3
-
99
-
-
1842843479
-
Memory hierarchy optimizations and performance bounds for sparse A^T Ax
-
Springer-Verlag, New York
-
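R. Vuduc, A. Gyulassy, J. W. Demmel, and K. A. Yelick, Memory hierarchy optimizations and performance bounds for sparse A^T Ax, in Proceedings of the ICCS Workshop on Parallel Linear Algebra, Lecture Notes in Comput. Sci. 2660, Springer-Verlag, New York, 2003, pp. 705-714.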
-
(2003)
Proceedings of the ICCS Workshop on Parallel Linear Algebra, Lecture Notes in Comput. Sci. 2660
, pp. 705-714
-
-
Vuduc, R.1
Gyulassy, A.2
Demmel, J.W.3
Yelick, K.A.4
-
100
-
-
85039523487
-
Automated empirical optimization of software and the ATLAS project
-
R. C. Whaley, A. Petitet, and J. Dongarra, Automated empirical optimization of software and the ATLAS project, LAPACK Working Note 147, 2000; see also http://sourceforge.net/projects/math-atlas/.
-
LAPACK Working Note 147, 2000
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.3
-
102
-
-
0034819362
-
Language support for Morton order matrices
-
D. S. Wise, G. A. Alexander, J. D. Frens, and Y. H. Gu, Language support for Morton order matrices, ACM SIGPLAN Notices, 36 (7) (2001), pp. 24-33.
-
(2001)
ACM SIGPLAN Notices
, vol.36
, Issue.7
, pp. 24-33
-
-
Wise, D.S.1
Alexander, G.A.2
Frens, J.D.3
Gu, Y.H.4
-
104
-
-
0034447396
-
Transforming loops to recursion for multi-level memory hierarchies
-
Q. Yi, V. Adve, and K. Kennedy, Transforming loops to recursion for multi-level memory hierarchies, ACM SIGPLAN Notices, 35 (5) (2000), pp. 169-181.
-
(2000)
ACM SIGPLAN Notices
, vol.35
, Issue.5
, pp. 169-181
-
-
Yi, Q.1
Adve, V.2
Kennedy, K.3