



Volume 36, Issue 3, 2009, Pages

Programming matrix algorithms-by-blocks for thread-level parallelism

Author keywords

High performance; Libraries; Linear algebra; Multithreaded architectures

Indexed keywords

ANECDOTAL EVIDENCES; APRIORI; CLASSICAL ALGORITHMS; CORE FUNCTIONALITY; DATA DEPENDENCIES; DESIGN DECISIONS; HIGH PRODUCTIVITY; HIGH-PERFORMANCE; LU FACTORIZATION; MATRIX ALGORITHMS; MULTITHREADED ARCHITECTURES; OUT OF ORDER; OUT-OF-CORE COMPUTATION; PERFORMANCE IMPROVEMENTS; PROBLEM DOMAIN; PROGRAMMABILITY; QR FACTORIZATIONS; RUNTIME SYSTEMS; SEPARATION OF CONCERNS; THREAD LEVEL PARALLELISM; VIABLE SOLUTIONS;

EID: 70349755577     PISSN: 00983500     EISSN: 15577295     Source Type: Journal    
DOI: 10.1145/1527286.1527288     Document Type: Article
Times cited: 116

References (46)
  • 1
    • 0037834788
    • OpenMP issues arising in the development of parallel BLAS and LAPACK libraries
    • ADDISON, C., REN, Y., AND VAN WAVEREN, M. 2003. OpenMP issues arising in the development of parallel BLAS and LAPACK libraries. Sci. Program. 11, 2, 95-104.
    • (2003) Sci. Program , vol.11 , Issue.2 , pp. 95-104
    • Addison, C.1    Ren, Y.2    Van Waveren, M.3
  • 5
    • 48849099645
    • Families of algorithms related to the inversion of a symmetric positive definite matrix
    • BIENTINESI, P., GUNTER, B., AND VAN DE GEIJN, R. A. 2008. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Trans. Math. Softw. 35, 1, 1-22.
    • (2008) ACM Trans. Math. Softw. , vol.35 , Issue.1 , pp. 1-22
    • Bientinesi, P.1    Gunter, B.2    Van De Geijn, R.A.3
  • 6
    • 17644370328
    • Representing linear algebra algorithms in code: The FLAME application programming interfaces
    • BIENTINESI, P., QUINTANA-ORTÍ, E. S., AND VAN DE GEIJN, R. A. 2005. Representing linear algebra algorithms in code: The FLAME application programming interfaces. ACM Trans. Math. Softw. 31, 1, 27-59.
    • (2005) ACM Trans. Math. Softw , vol.31 , Issue.1 , pp. 27-59
    • Bientinesi, P.1    Quintana-Ortí, E.S.2    Van De Geijn, R.A.3
  • 15
    • 0025402476
    • A set of level 3 basic linear algebra sub-programs
    • DONGARRA, J. J., DU CROZ, J., HAMMARLING, S., AND DUFF, I. 1990. A set of level 3 basic linear algebra sub-programs. ACM Trans. Math. Softw. 16, 1, 1-17.
    • (1990) ACM Trans. Math. Softw , vol.16 , Issue.1 , pp. 1-17
    • Dongarra, J.J.1    Du Croz, J.2    Hammarling, S.3    Duff, I.4
  • 18
    • 1842832833
    • Recursive blocked algorithms and hybrid data structures for dense matrix library software
    • ELMROTH, E., GUSTAVSON, F., JONSSON, I., AND KAGSTROM, B. 2004. Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Rev. 46, 1, 3-45.
    • (2004) SIAM Rev. , vol.46 , Issue.1 , pp. 3-45
    • Elmroth, E.1    Gustavson, F.2    Jonsson, I.3    Kagstrom, B.4
  • 20
    • 44249094647
    • Anatomy of a high-performance matrix multiplication
    • GOTO, K. AND VAN DE GEIJN, R. A. 2008. Anatomy of a high-performance matrix multiplication. ACM Trans. Math. Softw. 34, 3, 1-25.
    • (2008) ACM Trans. Math. Softw , vol.34 , Issue.3 , pp. 1-25
    • Goto, K.1    Van De Geijn, R.A.2
  • 23
    • 17644368925
    • Parallel out-of-core computation and updating the QR factorization
    • GUNTER, B. C. AND VAN DE GEIJN, R. A. 2005. Parallel out-of-core computation and updating the QR factorization. ACM Trans. Math. Softw. 31, 1, 60-78.
    • (2005) ACM Trans. Math. Softw , vol.31 , Issue.1 , pp. 60-78
    • Gunter, B.C.1    Van De Geijn, R.A.2
  • 25
    • 38049087210
    • Three algorithms for Cholesky factorization on distributed memory using packed storage
    • Lecture Notes in Computer Science 4699. Springer, Berlin/Heidelberg, Germany
    • GUSTAVSON, F. G., KARLSSON, L., AND KAGSTROM, B. 2007. Three algorithms for Cholesky factorization on distributed memory using packed storage. In Proceedings of the Workshop on State-of-the-Art in Scientific Computing. Lecture Notes in Computer Science, vol. 4699. Springer, Berlin/Heidelberg, Germany, 550-559.
    • (2007) Proceedings of the Workshop on State-of-the-Art in Scientific Computing , pp. 550-559
    • Gustavson, F.G.1    Karlsson, L.2    Kagstrom, B.3
  • 30
    • 47349106165
    • FLAME Working Note #12 TR-2004-2015 Department of Computer Sciences, University of Texas at Austin, Austin
    • LOW, T. M. AND VAN DE GEIJN, R. 2004. An API for manipulating matrices stored by blocks. FLAME Working Note #12 TR-2004-2015 Department of Computer Sciences, University of Texas at Austin, Austin.
    • (2004) An API for Manipulating Matrices Stored by Blocks
    • Low, T.M.1    Van De Geijn, R.2
  • 33
    • 0030157365
    • Global arrays: A nonuniform memory access programming model for high-performance computers
    • June
    • NIEPLOCHA, J., HARRISON, R., AND LITTLEFIELD, R. 1996. Global arrays: A nonuniform memory access programming model for high-performance computers. J. Supercomput. 10, 2 (June), 197-220.
    • (1996) J. Supercomput , vol.10 , Issue.2 , pp. 197-220
    • Nieplocha, J.1    Harrison, R.2    Littlefield, R.3
  • 39
    • 0022026625
    • Analysis of pairwise pivoting in Gaussian elimination
    • SORENSEN, D. C. 1985. Analysis of pairwise pivoting in Gaussian elimination. IEEE Trans. Comput. 34, 3, 274-278.
    • (1985) IEEE Trans. Comput , vol.34 , Issue.3 , pp. 274-278
    • Sorensen, D.C.1
  • 40
    • 0035003299
    • A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization
    • STRAZDINS, P. 2001. A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization. Int. J. Parall. Distrib. Syst. Netw. 4, 1, 26-35.
    • (2001) Int. J. Parall. Distrib. Syst. Netw , vol.4 , Issue.1 , pp. 26-35
    • Strazdins, P.1
  • 41
    • 0002831423
    • A survey of out-of-core algorithms in numerical linear algebra
    • J. Abello and J. S. Vitter, Eds. American Mathematical Society, Boston
    • TOLEDO, S. 1999. A survey of out-of-core algorithms in numerical linear algebra. In External Memory Algorithms, J. Abello and J. S. Vitter, Eds. American Mathematical Society, Boston, 161-179.
    • (1999) External Memory Algorithms , pp. 161-179
    • Toledo, S.1
  • 42
    • 0037173976
    • A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
    • VALSALAM, V. AND SKJELLUM, A. 2002. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels. Concurr. Computat. Pract. Exper. 14, 10, 805-840.
    • (2002) Concurr. Computat. Pract. Exper , vol.14 , Issue.10 , pp. 805-840
    • Valsalam, V.1    Skjellum, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.