Volume 93, Issue 2, 2005, Pages 293-311

Self-Adapting Linear Algebra Algorithms and Software

Author keywords

Adaptive methods; Basic Linear Algebra Subprograms (BLAS); Dense kernels; Iterative methods; Linear systems; Matrix matrix product; Matrix vector product; Performance optimization; Preconditioners; Sparse kernels

Indexed keywords

ALGORITHMS; COMPUTER ARCHITECTURE; COMPUTER SIMULATION; COMPUTER SOFTWARE; ITERATIVE METHODS; LINEAR SYSTEMS; MATHEMATICAL MODELS; MATRIX ALGEBRA; OPTIMIZATION; VECTORS

EID: 20744452904     PISSN: 00189219     EISSN: None     Source Type: Journal    
DOI: 10.1109/JPROC.2004.840848     Document Type: Conference Paper
Times cited: 160

References (81)
  • 1
    • 0030661485
    • Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
    • Vienna, Austria
    • J. Bilmes, K. Asanović, C. Chin, and J. Demmel, "Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology," presented at the Int. Conf. Supercomputing, Vienna, Austria, 1997.
    • (1997) Int. Conf. Supercomputing
    • Bilmes, J.1    Asanović, K.2    Chin, C.3    Demmel, J.4
  • 2
    • 0343462141
    • Automated empirical optimization of software and the ATLAS project
    • R. C. Whaley, A. Petitet, and J. J. Dongarra, "Automated empirical optimization of software and the ATLAS project," Parallel Comput., vol. 27, no. 1-2, pp. 3-35, 2001.
    • (2001) Parallel Comput. , vol.27 , Issue.1-2 , pp. 3-35
    • Whaley, R.C.1    Petitet, A.2    Dongarra, J.J.3
  • 4
    • 0023349165
    • Distribution of mathematical software via electronic mail
    • J. Dongarra and E. Grosse, "Distribution of mathematical software via electronic mail," Commun. ACM, vol. 30, no. 5, pp. 403-407, 1987.
    • (1987) Commun. ACM , vol.30 , Issue.5 , pp. 403-407
    • Dongarra, J.1    Grosse, E.2
  • 5
    • 20744452158
    • The development of a floating-point validation package
    • Como, Italy
    • J. Du Croz and M. Pont, "The development of a floating-point validation package," presented at the 8th Symp. Computer Arithmetic, Como, Italy, 1987.
    • (1987) 8th Symp. Computer Arithmetic
    • Du Croz, J.1    Pont, M.2
  • 6
    • 20744442970
    • [Online]
    • W. Kahan. (1987) Paranoia. [Online]. Available: http://www.netlib.org/
    • (1987) Paranoia
    • Kahan, W.1
  • 7
    • 0039169604
    • An object oriented design for high performance linear algebra on distributed memory architectures
    • J. Dongarra, R. Pozo, and D. Walker, "An object oriented design for high performance linear algebra on distributed memory architectures," in Proc. Object Oriented Numerics Conf, 1993, pp. 268-269.
    • (1993) Proc. Object Oriented Numerics Conf , pp. 268-269
    • Dongarra, J.1    Pozo, R.2    Walker, D.3
  • 9
    • 0009876420
    • Installation guide and design of the HPF 1.1 interface to ScaLAPACK, SLHPF (LAPACK Working Note 137)
    • Univ. Tennessee, Knoxville, TN
    • L. Blackford, J. Dongarra, C. Papadopoulos, and R. C. Whaley, "Installation guide and design of the HPF 1.1 interface to ScaLAPACK, SLHPF (LAPACK Working Note 137)," Univ. Tennessee, Knoxville, TN, Tech. Rep. UT CS-98-396, 1998.
    • (1998) Tech. Rep , vol.UT CS-98-396
    • Blackford, L.1    Dongarra, J.2    Papadopoulos, C.3    Whaley, R.C.4
  • 10
    • 13244279577
    • Minimizing development and maintenance costs in supporting persistently optimized BLAS
    • to be published
    • R. C. Whaley and A. Petitet, "Minimizing development and maintenance costs in supporting persistently optimized BLAS," Softw. Pract. Exper., to be published.
    • Softw. Pract. Exper.
    • Whaley, R.C.1    Petitet, A.2
  • 13
    • 0003418094
    • Automatically tuned linear algebra software
    • Univ. Tennessee, Knoxville
    • _, "Automatically Tuned Linear Algebra Software," Univ. Tennessee, Knoxville, Tech. Rep. UT-CS-97-366, 1997.
    • (1997) Tech. Rep. , vol.UT-CS-97-366
  • 16
    • 0003533835
    • The fastest fourier transform in the west
    • Massachusetts Inst. Technol.
    • M. Frigo and S. G. Johnson, "The Fastest Fourier Transform in the West," Massachusetts Inst. Technol., Tech. Rep. MIT-LCS-TR-728, 1997.
    • (1997) Tech. Rep. , vol.MIT-LCS-TR-728
    • Frigo, M.1    Johnson, S.G.2
  • 18
    • 0042014175
    • A proposal for standard linear algebra subprograms
    • R. Hanson, F. Krogh, and C. Lawson, "A proposal for standard linear algebra subprograms," ACM SIGNUM Newslett., vol. 8, no. 16, p. 16, 1973.
    • (1973) ACM SIGNUM Newslett. , vol.8 , Issue.16 , pp. 16
    • Hanson, R.1    Krogh, F.2    Lawson, C.3
  • 19
    • 0018515759
    • Basic linear algebra subprograms for Fortran usage
    • C. Lawson, R. Hanson, D. Kincaid, and F. Krogh, "Basic linear algebra subprograms for Fortran usage," ACM Trans. Math. Softw., vol. 5, no. 3, pp. 308-323, 1979.
    • (1979) ACM Trans. Math. Softw. , vol.5 , Issue.3 , pp. 308-323
    • Lawson, C.1    Hanson, R.2    Kincaid, D.3    Krogh, F.4
  • 20
    • 0023982822
    • Algorithm 656: An extended set of basic linear algebra subprograms: Model implementation and test programs
    • J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson, "Algorithm 656: An extended set of basic linear algebra subprograms: Model implementation and test programs," ACM Trans. Math. Softw., vol. 14, no. 1, pp. 18-32, 1988.
    • (1988) ACM Trans. Math. Softw. , vol.14 , Issue.1 , pp. 18-32
    • Dongarra, J.1    Du Croz, J.2    Hammarling, S.3    Hanson, R.4
  • 21
    • 0023983122
    • An extended set of FORTRAN basic linear algebra subprograms
    • J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson, "An extended set of FORTRAN basic linear algebra subprograms," ACM Trans. Math. Softw., vol. 14, no. 1, pp. 1-17, 1988.
    • (1988) ACM Trans. Math. Softw. , vol.14 , Issue.1 , pp. 1-17
  • 25
    • 0040831411
    • GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark
    • Dept. Comput. Sci., Umeå Univ., Umeå, Sweden
    • B. Kågström, P. Ling, and C. van Loan, "GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark," Dept. Comput. Sci., Umeå Univ., Umeå, Sweden, Tech. Rep. UMINF 95-18, 1995.
    • (1995) Tech. Rep. , vol.UMINF 95-18
    • Kågström, B.1    Ling, P.2    Van Loan, C.3
  • 26
    • 0032155271
    • GEMM-based level 3 BLAS: High performance model implementations and performance evaluation benchmark
    • B. Kågström, P. Ling, and C. van Loan, "GEMM-based level 3 BLAS: High performance model implementations and performance evaluation benchmark," ACM Trans. Math. Softw., vol. 24, no. 3, pp. 268-302, 1998.
    • (1998) ACM Trans. Math. Softw. , vol.24 , Issue.3 , pp. 268-302
  • 27
    • 0028443077
    • A parallel block implementation of level 3 BLAS for MIMD vector processors
    • M. Dayde, I. Duff, and A. Petitet, "A parallel block implementation of level 3 BLAS for MIMD vector processors," ACM Trans. Math. Softw., vol. 20, no. 2, pp. 178-193, 1994.
    • (1994) ACM Trans. Math. Softw. , vol.20 , Issue.2 , pp. 178-193
    • Dayde, M.1    Duff, I.2    Petitet, A.3
  • 29
    • 84947907655
    • Superscalar GEMM-based level 3 BLAS-The on-going evolution of a portable and high-performance library
    • B. Kågström, J. Dongarra, E. Elmroth, and J. Waśniewski, Eds.
    • _, "Superscalar GEMM-based level 3 BLAS-The on-going evolution of a portable and high-performance library," in Lecture Notes in Computer Science, Applied Parallel Computing, PARA'98, vol. 1541, B. Kågström, J. Dongarra, E. Elmroth, and J. Waśniewski, Eds., 1998, pp. 207-215.
    • (1998) Lecture Notes in Computer Science, Applied Parallel Computing, PARA'98 , vol.1541 , pp. 207-215
  • 30
    • 1542501019
    • Sparsity: Optimization framework for sparse matrix kernels
    • Feb.
    • E.-J. Im, K. Yelick, and R. Vuduc, "Sparsity: Optimization framework for sparse matrix kernels," Int. J. High Perform. Comput. Appl., vol. 18, no. 1, pp. 135-158, Feb. 2004.
    • (2004) Int. J. High Perform. Comput. Appl. , vol.18 , Issue.1 , pp. 135-158
    • Im, E.-J.1    Yelick, K.2    Vuduc, R.3
  • 32
    • 20744433790
    • Performance modeling and analysis of cache blocking in sparse matrix vector multiply
    • Univ. California, Berkeley
    • B. C. Lee, R. Vuduc, J. W. Demmel, K. A. Yelick, M. deLorimier, and L. Zhong, "Performance modeling and analysis of cache blocking in sparse matrix vector multiply," Univ. California, Berkeley, Tech. Rep. UCB/CSD-04-1335, 2004.
    • (2004) Tech. Rep. , vol.UCB-CSD-04-1335
    • Lee, B.C.1    Vuduc, R.2    Demmel, J.W.3    Yelick, K.A.4    Delorimier, M.5    Zhong, L.6
  • 39
    • 1842512506
    • Sparse matrices in MATLAB: Design and implementation
    • Xerox PARC
    • J. R. Gilbert, C. Moler, and R. Schreiber, "Sparse matrices in MATLAB: Design and implementation," Xerox PARC, Tech. Rep. CSL-91-04, 1991.
    • (1991) Tech. Rep. , vol.CSL-91-04
    • Gilbert, J.R.1    Moler, C.2    Schreiber, R.3
  • 41
    • 0001714824
    • Cache miss equations: A compiler framework for analyzing and tuning memory behavior
    • S. Ghosh, M. Martonosi, and S. Malik, "Cache miss equations: A compiler framework for analyzing and tuning memory behavior," ACM Trans. Program. Lang. Syst., vol. 21, no. 4, pp. 703-746, 1999.
    • (1999) ACM Trans. Program. Lang. Syst. , vol.21 , Issue.4 , pp. 703-746
    • Ghosh, S.1    Martonosi, M.2    Malik, S.3
  • 42
    • 0030190854
    • Improving data locality with loop transformations
    • Jul.
    • K. S. McKinley, S. Carr, and C.-W. Tseng, "Improving data locality with loop transformations," ACM Trans. Program. Lang. Syst., vol. 18, no. 4, pp. 424-453, Jul. 1996.
    • (1996) ACM Trans. Program. Lang. Syst. , vol.18 , Issue.4 , pp. 424-453
    • McKinley, K.S.1    Carr, S.2    Tseng, C.-W.3
  • 44
    • 20744459543
    • [Online]
    • K. Goto. (2004) High-performance BLAS. [Online]. Available: www.cs.utexas.edu/users/flame/goto
    • (2004) High-performance BLAS
    • Goto, K.1
  • 49
    • 20744458817
    • A technique for accelerating the convergence of restarted GMRES
    • Dept. Comput. Sci., Univ. Colorado
    • A. H. Baker, E. R. Jessup, and T. Manteuffel, "A technique for accelerating the convergence of restarted GMRES," Dept. Comput. Sci., Univ. Colorado, Tech. Rep. CU-CS-045-03, 2003.
    • (2003) Tech. Rep. , vol.CU-CS-045-03
    • Baker, A.H.1    Jessup, E.R.2    Manteuffel, T.3
  • 50
    • 0029546874
    • Using linear algebra for intelligent information retrieval
    • M. W. Berry, S. T. Dumais, and G. W. O'Brien, "Using linear algebra for intelligent information retrieval," SIAM Rev., vol. 37, no. 4, pp. 573-595, 1995.
    • (1995) SIAM Rev. , vol.37 , Issue.4 , pp. 573-595
    • Berry, M.W.1    Dumais, S.T.2    O'Brien, G.W.3
  • 53
    • 85031264203
    • Improving performance of sparse matrix-vector multiplication
    • Portland, OR
    • A. Pinar and M. Heath, "Improving performance of sparse matrix-vector multiplication," presented at the ACM/IEEE Conf. Supercomputing, Portland, OR, 1999.
    • (1999) ACM/IEEE Conf. Supercomputing
    • Pinar, A.1    Heath, M.2
  • 54
    • 20744448169
    • [Online]
    • W. J. Stewart. (1995) MARCA models home page. [Online]. Available: www.csc.ncsu.edu/faculty/WStewart/MARCA_Models/MARCA_Models.html
    • (1995) MARCA Models Home Page
    • Stewart, W.J.1
  • 55
    • 0039762246
    • Adaptive use of iterative methods in interior point methods for linear programming
    • Univ. Maryland, College Park
    • W. Wang and D. P. O'Leary, "Adaptive use of iterative methods in interior point methods for linear programming," Univ. Maryland, College Park, Tech. Rep. UMIACS-95-111, 1995.
    • (1995) Tech. Rep. , vol.UMIACS-95-111
    • Wang, W.1    O'Leary, D.P.2
  • 56
    • 4243148480
    • Authoritative sources in a hyperlinked environment
    • J. M. Kleinberg, "Authoritative sources in a hyperlinked environment," J. ACM, vol. 46, no. 5, pp. 604-632, 1999.
    • (1999) J. ACM , vol.46 , Issue.5 , pp. 604-632
    • Kleinberg, J.M.1
  • 58
    • 20744454097
    • Toward a fast parallel sparse matrix-vector multiplication
    • E. H. D'Hollander, J. R. Joubert, F. J. Peters, and H. Sips, Eds.
    • R. Geus and S. Röllin, "Toward a fast parallel sparse matrix-vector multiplication," in Proc. Int. Conf. Parallel Computing (ParCo), E. H. D'Hollander, J. R. Joubert, F. J. Peters, and H. Sips, Eds., 1999, pp. 308-315.
    • (1999) Proc. Int. Conf. Parallel Computing (ParCo) , pp. 308-315
    • Geus, R.1    Röllin, S.2
  • 59
    • 0039958691
    • Improving memory-system performance of sparse matrix-vector multiplication
    • Minneapolis, MN
    • S. Toledo, "Improving memory-system performance of sparse matrix-vector multiplication," presented at the 8th SIAM Conf. Parallel Processing for Scientific Computing, Minneapolis, MN, 1997.
    • (1997) 8th SIAM Conf. Parallel Processing for Scientific Computing
    • Toledo, S.1
  • 60
    • 0242590437
    • Advanced compiler optimizations for sparse computations
    • A. J. C. Bik and H. A. G. Wijshoff, "Advanced compiler optimizations for sparse computations," J. Parallel Distrib. Comput., vol. 31, no. 1, pp. 14-24, 1995.
    • (1995) J. Parallel Distrib. Comput. , vol.31 , Issue.1 , pp. 14-24
    • Bik, A.J.C.1    Wijshoff, H.A.G.2
  • 61
    • 85117190330
    • A framework for sparse matrix code synthesis from high-level specifications
    • Dallas, TX
    • N. Ahmed, N. Mateev, K. Pingali, and P. Stodghill, "A framework for sparse matrix code synthesis from high-level specifications," presented at the Conf. Supercomputing 2000, Dallas, TX.
    • (2000) Conf. Supercomputing
    • Ahmed, N.1    Mateev, N.2    Pingali, K.3    Stodghill, P.4
  • 65
    • 0036375922
    • Experiences tuning SMG98 - A semicoarsening multigrid benchmark based on the hypre library
    • G. Jin and J. Mellor-Crummey, "Experiences tuning SMG98 - A semicoarsening multigrid benchmark based on the hypre library," in Proc. Int. Conf. Supercomputing, 2002, pp. 305-314.
    • (2002) Proc. Int. Conf. Supercomputing , pp. 305-314
    • Jin, G.1    Mellor-Crummey, J.2
  • 67
    • 0018011681
    • Two fast algorithms for sparse matrices: Multiplication and permuted transposition
    • F. G. Gustavson, "Two fast algorithms for sparse matrices: Multiplication and permuted transposition," ACM Trans. Math. Softw., vol. 4, no. 3, pp. 250-269, 1978.
    • (1978) ACM Trans. Math. Softw. , vol.4 , Issue.3 , pp. 250-269
    • Gustavson, F.G.1
  • 68
    • 20744457691
    • Sparse matrix multiplication
    • Nov.
    • P. Briggs, "Sparse matrix multiplication," SIGPLAN Notices, vol. 31, no. 11, pp. 33-37, Nov. 1996.
    • (1996) SIGPLAN Notices , vol.31 , Issue.11 , pp. 33-37
    • Briggs, P.1
  • 69
    • 2342573740
    • Structure prediction and computation of sparse matrix products
    • E. Cohen, "Structure prediction and computation of sparse matrix products," J. Combinatorial Optimization, vol. 2, no. 4, pp. 307-332, 1999.
    • (1999) J. Combinatorial Optimization , vol.2 , Issue.4 , pp. 307-332
    • Cohen, E.1
  • 71
    • 0034625292
    • A general parallel sparse-blocked matrix multiply for linear scaling scf theory
    • M. Challacombe, "A general parallel sparse-blocked matrix multiply for linear scaling scf theory," Comput. Phys. Commun., vol. 128, p. 93, 2000.
    • (2000) Comput. Phys. Commun. , vol.128 , pp. 93
    • Challacombe, M.1
  • 72
    • 0031488469
    • Deflated and augmented Krylov subspace techniques
    • A. Chapman and Y. Saad, "Deflated and augmented Krylov subspace techniques," Numer. Linear Algebra Appl., vol. 4, no. 1, pp. 43-66, 1997.
    • (1997) Numer. Linear Algebra Appl. , vol.4 , Issue.1 , pp. 43-66
    • Chapman, A.1    Saad, Y.2
  • 73
    • 0000048673
    • GMRes: A generalized minimal residual algorithm for solving nonsymmetric linear systems
    • Y. Saad and M. H. Schultz, "GMRes: A generalized minimal residual algorithm for solving nonsymmetric linear systems," SIAM J. Sci. Stat. Comput., vol. 7, pp. 856-869, 1986.
    • (1986) SIAM J. Sci. Stat. Comput. , vol.7 , pp. 856-869
    • Saad, Y.1    Schultz, M.H.2
  • 74
    • 12444275589
    • A proposed standard for numerical metadata
    • Innovative Comput. Lab., Univ. Tennessee
    • V. Eijkhout and E. Fuentes, "A proposed standard for numerical metadata," Innovative Comput. Lab., Univ. Tennessee, Tech. Rep. ICL-UT-03-02, 2003.
    • (2003) Tech. Rep. , vol.ICL-UT-03-02
    • Eijkhout, V.1    Fuentes, E.2
  • 75
    • 0000094594
    • An iteration method for the solution of the eigenvalue problem of linear differential and integral operators
    • C. Lanczos, "An iteration method for the solution of the eigenvalue problem of linear differential and integral operators," J. Res. Nat. Bureau Stand., vol. 45, pp. 255-282, 1950.
    • (1950) J. Res. Nat. Bureau Stand. , vol.45 , pp. 255-282
    • Lanczos, C.1
  • 76
    • 72449139170
    • Computational variants of the Lanczos method for the eigenproblem
    • C. Paige, "Computational variants of the Lanczos method for the eigenproblem," J. Inst. Math. Appl., vol. 10, pp. 373-381, 1972.
    • (1972) J. Inst. Math. Appl. , vol.10 , pp. 373-381
    • Paige, C.1
  • 77
    • 0000005482
    • Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems
    • H. van der Vorst, "Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems," SIAM J. Sci. Stat. Comput., vol. 13, pp. 631-644, 1992.
    • (1992) SIAM J. Sci. Stat. Comput. , vol.13 , pp. 631-644
    • Van Der Vorst, H.1
  • 78
    • 20744438146
    • [Online]
    • MatrixMarket [Online]. Available: http://math.nist.gov/Matrix-Market
  • 79
    • 84937397986
    • Parallel multilevel algorithms for multi-constraint graph partitioning
    • K. Schloegel, G. Karypis, and V. Kumar, "Parallel multilevel algorithms for multi-constraint graph partitioning," in Proc. EuroPar 2000, pp. 296-310.
    • Proc. EuroPar 2000 , pp. 296-310
    • Schloegel, K.1    Karypis, G.2    Kumar, V.3
  • 81
    • 20744444975
    • Automatic determination of matrix blocks
    • Dept. Comput. Sci., Univ. Tennessee
    • V. Eijkhout, "Automatic determination of matrix blocks," Dept. Comput. Sci., Univ. Tennessee, Tech. Rep. UT-CS-01-458, 2001.
    • (2001) Tech. Rep. , vol.UT-CS-01-458
    • Eijkhout, V.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.