Volume 46, Issue 1, 2004, Pages 3-45

Recursive blocked algorithms and hybrid data structures for dense matrix library software

Author keywords

Automatic variable blocking; Dense linear algebra; ESSL; Factorizations; GEMM based; Hybrid data structures; LAPACK; Level 3 BLAS; Library software; Matrix equations; RECSY; Recursion; SLICOT; SMP parallelization; Superscalar; Superscalar kernels

Indexed keywords

ALGOL (PROGRAMMING LANGUAGE); ALGORITHMS; CODES (SYMBOLS); COMPUTER SOFTWARE; DIGITAL COMPUTERS; DIGITAL LIBRARIES; LINEAR SYSTEMS; MATRIX ALGEBRA; PROBLEM SOLVING;

EID: 1842832833     PISSN: 00361445     EISSN: None     Source Type: Journal    
DOI: 10.1137/S0036144503428693     Document Type: Article
Times cited : (124)

References (104)
  • 1
    • 0028513316
    • Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
    • R. C. Agarwal, F. G. Gustavson, and M. Zubair, Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms, IBM J. Res. Develop., 38 (1994), pp. 563-576.
    • (1994) IBM J. Res. Develop. , vol.38 , pp. 563-576
    • Agarwal, R.C.1    Gustavson, F.G.2    Zubair, M.3
  • 2
    • 84937408012
    • Automatic generation of block-recursive codes
    • in Euro-Par 2000 Parallel Processing, A. Bode et al., eds.; Springer-Verlag, New York
    • N. Ahmed and K. Pingali, Automatic generation of block-recursive codes, in Euro-Par 2000 Parallel Processing, A. Bode et al., eds., Lecture Notes in Comput. Sci. 1900, Springer-Verlag, New York, 2000, pp. 368-378.
    • (2000) Lecture Notes in Comput. Sci. 1900 , pp. 368-378
    • Ahmed, N.1    Pingali, K.2
  • 3
    • 18044400448
    • A recursive formulation of Cholesky factorization of a matrix in packed storage
    • B. Andersen, F. Gustavson, and J. Waśniewski, A recursive formulation of Cholesky factorization of a matrix in packed storage, ACM Trans. Math. Software, 27 (2001), pp. 214-244.
    • (2001) ACM Trans. Math. Software , vol.27 , pp. 214-244
    • Andersen, B.1    Gustavson, F.2    Waśniewski, J.3
  • 5
    • 84870413589
    • Automatically tuned linear algebra software
    • ATLAS, Automatically Tuned Linear Algebra Software, http://math-atlas.sourceforge.net/.
  • 6
    • 0027608822
    • On computing condition numbers for the nonsymmetric eigenproblem
    • Z. Bai, J. Demmel, and A. McKenney, On computing condition numbers for the nonsymmetric eigenproblem, ACM Trans. Math. Software, 19 (1993), pp. 202-223.
    • (1993) ACM Trans. Math. Software , vol.19 , pp. 202-223
    • Bai, Z.1    Demmel, J.2    McKenney, A.3
  • 7
    • 33846349887
    • A hierarchical O(n log n) force calculation algorithm
    • J. Barnes and P. Hut, A hierarchical O(n log n) force calculation algorithm, Nature, 324 (1986), pp. 446-449.
    • (1986) Nature , vol.324 , pp. 446-449
    • Barnes, J.1    Hut, P.2
  • 10
    • 0001951009
    • The WY representation for products of Householder matrices
    • C. Bischof and C. Van Loan, The WY representation for products of Householder matrices, SIAM J. Sci. Statist. Comput., 8 (1987), pp. s2-s13.
    • (1987) SIAM J. Sci. Statist. Comput. , vol.8 , pp. s2-s13
    • Bischof, C.1    Van Loan, C.2
  • 11
    • 0036401631
    • The multishift QR algorithm. Part I: Maintaining well-focused shifts and level 3 performance
    • K. Braman, R. Byers, and R. Mathias, The multishift QR algorithm. Part I: Maintaining well-focused shifts and level 3 performance, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 929-947.
    • (2002) SIAM J. Matrix Anal. Appl. , vol.23 , pp. 929-947
    • Braman, K.1    Byers, R.2    Mathias, R.3
  • 12
    • 0036400807
    • The multishift QR algorithm. Part II: Aggressive early deflation
    • K. Braman, R. Byers, and R. Mathias, The multishift QR algorithm. Part II: Aggressive early deflation, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 948-973.
    • (2002) SIAM J. Matrix Anal. Appl. , vol.23 , pp. 948-973
    • Braman, K.1    Byers, R.2    Mathias, R.3
  • 14
    • 0031223129
    • Compiler blockability of dense matrix factorizations
    • S. Carr and R. B. Lehoucq, Compiler blockability of dense matrix factorizations, ACM Trans. Math. Software, 23 (1997), pp. 336-361.
    • (1997) ACM Trans. Math. Software , vol.23 , pp. 336-361
    • Carr, S.1    Lehoucq, R.B.2
  • 16
    • 38149012697
    • The solution of the matrix equations AXB - CXD = E and (YA - DZ, YC - BZ) = (E, F)
    • K.-W. E. Chu, The solution of the matrix equations AXB - CXD = E and (YA - DZ, YC - BZ) = (E, F), Linear Algebra Appl., 93 (1987), pp. 93-105.
    • (1987) Linear Algebra Appl. , vol.93 , pp. 93-105
    • Chu, K.-W.E.1
  • 17
    • 0000659575
    • A divide and conquer method for the symmetric tridiagonal eigenproblem
    • J. J. M. Cuppen, A divide and conquer method for the symmetric tridiagonal eigenproblem, Numer. Math., 36 (1981), pp. 177-195.
    • (1981) Numer. Math. , vol.36 , pp. 177-195
    • Cuppen, J.J.M.1
  • 18
    • 0010976738
    • Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form
    • K. Dackland and B. Kågström, Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form, ACM Trans. Math. Software, 25 (1999), pp. 425-454.
    • (1999) ACM Trans. Math. Software , vol.25 , pp. 425-454
    • Dackland, K.1    Kågström, B.2
  • 19
    • 45949117656
    • Computing stable eigendecompositions of matrix pencils
    • J. Demmel and B. Kågström, Computing stable eigendecompositions of matrix pencils, Linear Algebra Appl., 88/89 (1987), pp. 139-186.
    • (1987) Linear Algebra Appl. , vol.88-89 , pp. 139-186
    • Demmel, J.1    Kågström, B.2
  • 20
    • 0026913668
    • Stability of block algorithms with fast level-3 BLAS
    • J. W. Demmel and N. J. Higham, Stability of block algorithms with fast level-3 BLAS, ACM Trans. Math. Software, 18 (1992), pp. 274-291.
    • (1992) ACM Trans. Math. Software , vol.18 , pp. 274-291
    • Demmel, J.W.1    Higham, N.J.2
  • 24
    • 0035176737
    • Recursive approach in sparse matrix LU factorization
    • J. Dongarra, V. Eijkhout, and P. Łuszczek, Recursive approach in sparse matrix LU factorization, Sci. Programming, 9 (2001), pp. 51-60.
    • (2001) Sci. Programming , vol.9 , pp. 51-60
    • Dongarra, J.1    Eijkhout, V.2    Łuszczek, P.3
  • 25
    • 0029324485
    • Software libraries for linear algebra computations on high performance computers
    • J. J. Dongarra and D. W. Walker, Software libraries for linear algebra computations on high performance computers, SIAM Rev., 37 (1995), pp. 151-180.
    • (1995) SIAM Rev. , vol.37 , pp. 151-180
    • Dongarra, J.J.1    Walker, D.W.2
  • 26
    • 0002663082
    • GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix multiply algorithm
    • C. G. Douglas, M. Heroux, G. Slishman, and R. M. Smith, GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix multiply algorithm, J. Comput. Phys., 110 (1994), pp. 1-10.
    • (1994) J. Comput. Phys. , vol.110 , pp. 1-10
    • Douglas, C.G.1    Heroux, M.2    Slishman, G.3    Smith, R.M.4
  • 27
    • 84947936389
    • New serial and parallel recursive QR factorization algorithms for SMP systems
    • in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al. eds.; Springer-Verlag, New York
    • E. Elmroth and F. G. Gustavson, New serial and parallel recursive QR factorization algorithms for SMP systems, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al. eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 120-128.
    • (1998) Lecture Notes in Comput. Sci. 1541 , pp. 120-128
    • Elmroth, E.1    Gustavson, F.G.2
  • 28
    • 0034224207
    • Applying recursion to serial and parallel QR factorization leads to better performance
    • E. Elmroth and F. G. Gustavson, Applying recursion to serial and parallel QR factorization leads to better performance, IBM J. Res. Develop., 44 (2000), pp. 605-624.
    • (2000) IBM J. Res. Develop. , vol.44 , pp. 605-624
    • Elmroth, E.1    Gustavson, F.G.2
  • 29
    • 0012536008
    • A faster and simpler recursive algorithm for the LAPACK routine DGELS
    • E. Elmroth and F. G. Gustavson, A faster and simpler recursive algorithm for the LAPACK routine DGELS, BIT, 41 (2001), pp. 936-949.
    • (2001) BIT , vol.41 , pp. 936-949
    • Elmroth, E.1    Gustavson, F.G.2
  • 30
    • 84957033906
    • High-performance library software for QR factorization
    • in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds.; Springer-Verlag, New York
    • E. Elmroth and F. G. Gustavson, High-performance library software for QR factorization, in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds., Lecture Notes in Comput. Sci. 1947, Springer-Verlag, New York, 2001, pp. 53-63.
    • (2001) Lecture Notes in Comput. Sci. 1947 , pp. 53-63
    • Elmroth, E.1    Gustavson, F.G.2
  • 36
    • 0003100264
    • Parallel algorithms for dense linear algebra computations
    • K. A. Gallivan, R. J. Plemmons, and A. H. Sameh, Parallel algorithms for dense linear algebra computations, SIAM Rev., 32 (1990), pp. 54-135.
    • (1990) SIAM Rev. , vol.32 , pp. 54-135
    • Gallivan, K.A.1    Plemmons, R.J.2    Sameh, A.H.3
  • 38
    • 0018721357
    • A Hessenberg-Schur method for the matrix problem AX + XB = C
    • G. Golub, S. Nash, and C. Van Loan, A Hessenberg-Schur method for the matrix problem AX + XB = C, IEEE Trans. Automat. Control, AC-24 (1979), pp. 909-913.
    • (1979) IEEE Trans. Automat. Control , vol.AC-24 , pp. 909-913
    • Golub, G.1    Nash, S.2    Van Loan, C.3
  • 40
    • 1542392269
    • On reducing TLB misses in matrix multiplication
    • Department of Computer Sciences, University of Texas at Austin
    • K. Goto and R. van de Geijn, On Reducing TLB Misses in Matrix Multiplication, Technical Report TR-2002-55, FLAME Working Note 9, Department of Computer Sciences, University of Texas at Austin, 2002.
    • (2002) Technical Report TR-2002-55, FLAME Working Note 9
    • Goto, K.1    Van De Geijn, R.2
  • 43
    • 0031273280
    • Recursion leads to automatic variable blocking for dense linear-algebra algorithms
    • F. G. Gustavson, Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM J. Res. Develop., 41 (1997), pp. 737-755.
    • (1997) IBM J. Res. Develop. , vol.41 , pp. 737-755
    • Gustavson, F.G.1
  • 44
    • 84901913528
    • New generalized data structures for matrices lead to a variety of high performance algorithms
    • in The Architectures for Scientific Software, R. F. Boisvert and P. T. P. Tang, eds.; Kluwer Academic, Dordrecht, The Netherlands
    • F. G. Gustavson, New generalized data structures for matrices lead to a variety of high performance algorithms, in The Architectures for Scientific Software, R. F. Boisvert and P. T. P. Tang, eds., IFIP Conference Proceedings 188, Kluwer Academic, Dordrecht, The Netherlands, pp. 211-234.
    • IFIP Conference Proceedings 188 , pp. 211-234
    • Gustavson, F.G.1
  • 45
    • 0037230301
    • High-performance linear algebra algorithms using new generalized data structures for matrices
    • F. G. Gustavson, High-performance linear algebra algorithms using new generalized data structures for matrices, IBM J. Res. Develop., 47 (2003), pp. 31-55.
    • (2003) IBM J. Res. Develop. , vol.47 , pp. 31-55
    • Gustavson, F.G.1
  • 46
    • 84947926251
    • Recursive blocked data formats and BLAS's for dense linear algebra algorithms
    • in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds.; Springer-Verlag, New York
    • F. G. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, and P. Ling, Recursive blocked data formats and BLAS's for dense linear algebra algorithms, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 195-206.
    • (1998) Lecture Notes in Comput. Sci. 1541 , pp. 195-206
    • Gustavson, F.G.1    Henriksson, A.2    Jonsson, I.3    Kågström, B.4    Ling, P.5
  • 47
    • 84947907655
    • Superscalar GEMM-based level 3 BLAS - The on-going evolution of a portable and high-performance library
    • in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds.; Springer-Verlag, New York
    • F. G. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, and P. Ling, Superscalar GEMM-based level 3 BLAS - The on-going evolution of a portable and high-performance library, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 207-215.
    • (1998) Lecture Notes in Comput. Sci. 1541 , pp. 207-215
    • Gustavson, F.G.1    Henriksson, A.2    Jonsson, I.3    Kågström, B.4    Ling, P.5
  • 48
    • 0034312453
    • Minimal-storage high-performance Cholesky factorization via blocking and recursion
    • F. G. Gustavson and I. Jonsson, Minimal-storage high-performance Cholesky factorization via blocking and recursion, IBM J. Res. Develop., 44 (2000), pp. 823-849.
    • (2000) IBM J. Res. Develop. , vol.44 , pp. 823-849
    • Gustavson, F.G.1    Jonsson, I.2
  • 50
    • 0000567621
    • Numerical solution of the stable, non-negative definite Lyapunov equation
    • S. J. Hammarling, Numerical solution of the stable, non-negative definite Lyapunov equation, IMA J. Numer. Anal., 2 (1982), pp. 303-323.
    • (1982) IMA J. Numer. Anal. , vol.2 , pp. 303-323
    • Hammarling, S.J.1
  • 51
    • 0343910469
    • High-performance matrix multiplication on the IBM SP high node
    • Master's thesis, UMNAD-98.235, Department of Computing Science, Umeå University, Umeå, Sweden
    • A. Henriksson and I. Jonsson, High-Performance Matrix Multiplication on the IBM SP High Node, Master's thesis, UMNAD-98.235, Department of Computing Science, Umeå University, Umeå, Sweden, 1998.
    • (1998)
    • Henriksson, A.1    Jonsson, I.2
  • 52
    • 0024143903
    • Fortran codes for estimating the one-norm of a real or complex matrix with applications to condition estimation
    • N. J. Higham, Fortran codes for estimating the one-norm of a real or complex matrix with applications to condition estimation, ACM Trans. Math. Software, 14 (1988), pp. 381-396.
    • (1988) ACM Trans. Math. Software , vol.14 , pp. 381-396
    • Higham, N.J.1
  • 53
    • 0001045175
    • Perturbation theory and backward error for AX - XB = C
    • N. J. Higham, Perturbation theory and backward error for AX - XB = C, BIT, 33 (1993), pp. 124-136.
    • (1993) BIT , vol.33 , pp. 124-136
    • Higham, N.J.1
  • 58
    • 84886852438
    • Parallel and fully recursive multifrontal supernodal sparse Cholesky
    • in Computational Science - ICCS 2002, P. Sloot et al., eds.; Springer-Verlag, Berlin
    • D. Irony, G. Shklarski, and S. Toledo, Parallel and fully recursive multifrontal supernodal sparse Cholesky, in Computational Science - ICCS 2002, P. Sloot et al., eds., Lecture Notes in Comput. Sci. 2330, Springer-Verlag, Berlin, 2002, pp. 335-344.
    • (2002) Lecture Notes in Comput. Sci. 2330 , pp. 335-344
    • Irony, D.1    Shklarski, G.2    Toledo, S.3
  • 59
    • 24244446738
    • Analysis of processor and memory utilization of recursive algorithms for Sylvester-type matrix equations using performance monitoring
    • Technical Report UMINF-03.16, Department of Computing Science, Umeå University, Umeå, Sweden
    • I. Jonsson, Analysis of Processor and Memory Utilization of Recursive Algorithms for Sylvester-Type Matrix Equations Using Performance Monitoring, Technical Report UMINF-03.16, Department of Computing Science, Umeå University, Umeå, Sweden, 2003.
    • (2003)
    • Jonsson, I.1
  • 60
    • 1842832563
    • Parallel triangular Sylvester-type matrix equation solvers for SMP systems using recursive blocking
    • in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds.; Springer-Verlag, New York
    • I. Jonsson and B. Kågström, Parallel triangular Sylvester-type matrix equation solvers for SMP systems using recursive blocking, in Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, T. Sørvik et al., eds., Lecture Notes in Comput. Sci. 1947, Springer-Verlag, New York, 2001, pp. 64-73.
    • (2001) Lecture Notes in Comput. Sci. 1947 , pp. 64-73
    • Jonsson, I.1    Kågström, B.2
  • 61
    • 1842843484
    • Parallel two-sided Sylvester-type matrix equation solvers for SMP systems using recursive blocking
    • in Applied Parallel Computing: Advanced Scientific Computing, J. Fagerholm et al., eds.; Springer-Verlag, New York
    • I. Jonsson and B. Kågström, Parallel two-sided Sylvester-type matrix equation solvers for SMP systems using recursive blocking, in Applied Parallel Computing: Advanced Scientific Computing, J. Fagerholm et al., eds., Lecture Notes in Comput. Sci. 2367, Springer-Verlag, New York, 2002, pp. 297-306.
    • (2002) Lecture Notes in Comput. Sci. 2367 , pp. 297-306
    • Jonsson, I.1    Kågström, B.2
  • 62
    • 19044380439
    • Recursive blocked algorithms for solving triangular systems - Part I: One-sided and coupled Sylvester-type matrix equations
    • I. Jonsson and B. Kågström, Recursive blocked algorithms for solving triangular systems - Part I: One-sided and coupled Sylvester-type matrix equations, ACM Trans. Math. Software, 28 (2002), pp. 392-415.
    • (2002) ACM Trans. Math. Software , vol.28 , pp. 392-415
    • Jonsson, I.1    Kågström, B.2
  • 63
    • 19044400922
    • Recursive blocked algorithms for solving triangular systems - Part II: Two sided and generalized Sylvester and Lyapunov equations
    • I. Jonsson and B. Kågström, Recursive blocked algorithms for solving triangular systems - Part II: Two sided and generalized Sylvester and Lyapunov equations, ACM Trans. Math. Software, 28 (2002), pp. 416-435.
    • (2002) ACM Trans. Math. Software , vol.28 , pp. 416-435
    • Jonsson, I.1    Kågström, B.2
  • 64
    • 1842843483
    • RECSY - A high performance library for Sylvester-type matrix equations
    • I. Jonsson and B. Kågström, RECSY - A High Performance Library for Sylvester-Type Matrix Equations, http://www.cs.umu.se/research/parallel/recsy, 2003.
    • (2003)
    • Jonsson, I.1    Kågström, B.2
  • 65
    • 21844500600
    • A perturbation analysis of the generalized Sylvester equation (AR - LB, DR - LE) = (C, F)
    • B. Kågström, A perturbation analysis of the generalized Sylvester equation (AR - LB, DR - LE) = (C, F), SIAM J. Matrix Anal. Appl., 15 (1994), pp. 1045-1060.
    • (1994) SIAM J. Matrix Anal. Appl. , vol.15 , pp. 1045-1060
    • Kågström, B.1
  • 66
    • 0032155271
    • GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark
    • B. Kågström, P. Ling, and C. Van Loan, GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark, ACM Trans. Math. Software, 24 (1998), pp. 268-302.
    • (1998) ACM Trans. Math. Software , vol.24 , pp. 268-302
    • Kågström, B.1    Ling, P.2    Van Loan, C.3
  • 67
    • 0032155342
    • Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues
    • B. Kågström, P. Ling, and C. Van Loan, Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues, ACM Trans. Math. Software, 24 (1998), pp. 303-316.
    • (1998) ACM Trans. Math. Software , vol.24 , pp. 303-316
    • Kågström, B.1    Ling, P.2    Van Loan, C.3
  • 69
    • 0041103179
    • Computing eigenspaces with specified eigenvalues of a regular matrix pair (A, B) and condition estimation: Theory, algorithms and software
    • B. Kågström and P. Poromaa, Computing eigenspaces with specified eigenvalues of a regular matrix pair (A, B) and condition estimation: Theory, algorithms and software, Numer. Algorithms, 12 (1996), pp. 369-407.
    • (1996) Numer. Algorithms , vol.12 , pp. 369-407
    • Kågström, B.1    Poromaa, P.2
  • 70
    • 0030092417
    • LAPACK-style algorithms and software for solving the generalized Sylvester equation and estimating the separation between regular matrix pairs
    • B. Kågström and P. Poromaa, LAPACK-style algorithms and software for solving the generalized Sylvester equation and estimating the separation between regular matrix pairs, ACM Trans. Math. Software, 22 (1996), pp. 78-103.
    • (1996) ACM Trans. Math. Software , vol.22 , pp. 78-103
    • Kågström, B.1    Poromaa, P.2
  • 71
    • 0040740214
    • A generalized state-space approach for the additive decomposition of a transfer matrix
    • B. Kågström and P. Van Dooren, A generalized state-space approach for the additive decomposition of a transfer matrix, Internat. J. Numer. Linear Algebra Appl., 1 (1992), pp. 165-181.
    • (1992) Internat. J. Numer. Linear Algebra Appl. , vol.1 , pp. 165-181
    • Kågström, B.1    Van Dooren, P.2
  • 72
    • 0024700168
    • Generalized Schur methods with condition estimators for solving the generalized Sylvester equation
    • B. Kågström and L. Westin, Generalized Schur methods with condition estimators for solving the generalized Sylvester equation, IEEE Trans. Automat. Control, 34 (1989), pp. 745-751.
    • (1989) IEEE Trans. Automat. Control , vol.34 , pp. 745-751
    • Kågström, B.1    Westin, L.2
  • 75
    • 0038835621
    • High-performance recursive BLAS kernels using new data formats for the QR factorization
    • Master's thesis, UMNAD-235.00, Department of Computing Science, Umeå University, Umeå, Sweden
    • A. Lindkvist, High-Performance Recursive BLAS Kernels Using New Data Formats for the QR Factorization, Master's thesis, UMNAD-235.00, Department of Computing Science, Umeå University, Umeå, Sweden, 2000.
    • (2000)
    • Lindkvist, A.1
  • 76
    • 0004235292
    • The MathWorks Inc., Natick, MA
    • MathWorks, Using MATLAB, The MathWorks Inc., Natick, MA, 2002.
    • (2002) Using MATLAB
  • 78
    • 84976734144
    • The influence of the compiler on the cost of mathematical software - in particular on the cost of triangular factorization
    • B. N. Parlett and Y. Wang, The influence of the compiler on the cost of mathematical software - In particular on the cost of triangular factorization, ACM Trans. Math. Software, 1 (1975), pp. 35-46.
    • (1975) ACM Trans. Math. Software , vol.1 , pp. 35-46
    • Parlett, B.N.1    Wang, Y.2
  • 79
    • 0032342953
    • Numerical solution of generalized Lyapunov equations
    • T. Penzl, Numerical solution of generalized Lyapunov equations, Adv. Comput. Math., 8 (1998), pp. 33-48.
    • (1998) Adv. Comput. Math. , vol.8 , pp. 33-48
    • Penzl, T.1
  • 80
    • 84947916433
    • Parallel algorithms for triangular Sylvester equations: Design, scheduling and scalability issues
    • in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds.; Springer-Verlag, New York
    • P. Poromaa, Parallel algorithms for triangular Sylvester equations: Design, scheduling and scalability issues, in Applied Parallel Computing: Large Scale Scientific and Industrial Problems, B. Kågström et al., eds., Lecture Notes in Comput. Sci. 1541, Springer-Verlag, New York, 1998, pp. 438-446.
    • (1998) Lecture Notes in Comput. Sci. 1541 , pp. 438-446
    • Poromaa, P.1
  • 81
    • 0039428372
    • High performance computing: Algorithms and library software for Sylvester equations and certain eigenvalue problems with applications in condition estimation
    • Ph.D. Thesis, UMINF-97.16, Department of Computing Science, Umeå University, Umeå, Sweden
    • P. Poromaa, High Performance Computing: Algorithms and Library Software for Sylvester Equations and Certain Eigenvalue Problems with Applications in Condition Estimation, Ph.D. Thesis, UMINF-97.16, Department of Computing Science, Umeå University, Umeå, Sweden, 1997.
    • (1997)
    • Poromaa, P.1
  • 83
    • 0037142952
    • Very large electronic structure calculations using an out-of-core filter-diagonalization method
    • E. Rabani and S. Toledo, Very large electronic structure calculations using an out-of-core filter-diagonalization method, J. Comput. Phys., 180 (2002), pp. 256-269.
    • (2002) J. Comput. Phys. , vol.180 , pp. 256-269
    • Rabani, E.1    Toledo, S.2
  • 84
    • 0003596534
    • Space-filling curves
    • Springer-Verlag, Berlin
    • H. Sagan, Space-Filling Curves, Springer-Verlag, Berlin, 1994.
    • (1994)
    • Sagan, H.1
  • 85
    • 28144458231
    • Skeletons from the treecode closet
    • J. K. Salmon and M. S. Warren, Skeletons from the treecode closet, J. Comput. Phys., 111 (1994), pp. 136-155.
    • (1994) J. Comput. Phys. , vol.111 , pp. 136-155
    • Salmon, J.K.1    Warren, M.S.2
  • 86
    • 0028443162
    • Fast parallel tree codes for gravitational and fluid dynamical n-body problems
    • J. K. Salmon, M. S. Warren, and G. S. Winckelmans, Fast parallel tree codes for gravitational and fluid dynamical n-body problems, Internat. J. Supercomput. Appl., 8 (1994), pp. 129-142.
    • (1994) Internat. J. Supercomput. Appl. , vol.8 , pp. 129-142
    • Salmon, J.K.1    Warren, M.S.2    Winckelmans, G.S.3
  • 87
    • 0021644214
    • The quadtree and related hierarchical data structures
    • H. Samet, The quadtree and related hierarchical data structures, Comput. Surveys, 16 (1984), pp. 188-260.
    • (1984) Comput. Surveys , vol.16 , pp. 188-260
    • Samet, H.1
  • 88
    • 0003078924
    • A storage-efficient WY representation for products of Householder transformations
    • R. Schreiber and C. Van Loan, A storage-efficient WY representation for products of Householder transformations, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 53-57.
    • (1989) SIAM J. Sci. Statist. Comput. , vol.10 , pp. 53-57
    • Schreiber, R.1    Van Loan, C.2
  • 89
    • 1642274431
    • Scientific computing software library (SCSL)
    • SGI
    • SGI, Scientific Computing Software Library (SCSL), software and documentation available from http://www.sgi.com/software/scsl.html, 1993-2003.
    • (1993)
  • 90
    • 85039535130
    • SLICOT, The SLICOT Library and the Numerics in Control Network (NICONET) website, http://www.win.tue.nl/niconet/.
  • 94
    • 34250487811
    • Gaussian elimination is not optimal
    • V. Strassen, Gaussian elimination is not optimal, Numer. Math., 13 (1969), pp. 354-356.
    • (1969) Numer. Math. , vol.13 , pp. 354-356
    • Strassen, V.1
  • 95
    • 0031496750
    • Locality of reference in LU decomposition with partial pivoting
    • S. Toledo, Locality of reference in LU decomposition with partial pivoting, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 1065-1081.
    • (1997) SIAM J. Matrix Anal. Appl. , vol.18 , pp. 1065-1081
    • Toledo, S.1
  • 96
    • 32844469834
    • The top 500 supercomputer sites
    • TOP500
    • TOP500, The Top 500 Supercomputer Sites, http://www.top500.org/.
  • 97
    • 0037173976
    • A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
    • V. Valsalam and A. Skjellum, A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels, Concurrency Computat. Pract. Exper., 14 (2002), pp. 805-839.
    • (2002) Concurrency Computat. Pract. Exper. , vol.14 , pp. 805-839
    • Valsalam, V.1    Skjellum, A.2
  • 100
    • 85039523487
    • Automated empirical optimization of software and the ATLAS project
    • R. C. Whaley, A. Petitet, and J. Dongarra, Automated empirical optimization of software and the ATLAS project, LAPACK Working Note 147, 2000; see also http://sourceforge.net/projects/math-atlas/.
    • LAPACK Working Note 147, 2000
    • Whaley, R.C.1    Petitet, A.2    Dongarra, J.3
  • 104
    • 0034447396
    • Transforming loops to recursion for multi-level memory hierarchies
    • Q. Yi, V. Adve, and K. Kennedy, Transforming loops to recursion for multi-level memory hierarchies, ACM SIGPLAN Notices, 35 (5) (2000), pp. 169-181.
    • (2000) ACM SIGPLAN Notices , vol.35 , Issue.5 , pp. 169-181
    • Yi, Q.1    Adve, V.2    Kennedy, K.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.