-
1
-
-
0037834788
-
OpenMP issues arising in the development of parallel BLAS and LAPACK libraries
-
ADDISON, C., REN, Y., AND VAN WAVEREN, M. 2003. OpenMP issues arising in the development of parallel BLAS and LAPACK libraries. Sci. Program. 11, 2, 95-104.
-
(2003)
Sci. Program
, vol.11
, Issue.2
, pp. 95-104
-
-
Addison, C.1
Ren, Y.2
Van Waveren, M.3
-
3
-
-
0003706460
-
-
Society for Industrial and Applied Mathematics, Philadelphia
-
ANDERSON, E., BAI, Z., BISCHOF, C., BLACKFORD, L. S., DEMMEL, J., DONGARRA, J. J., DU CROZ, J., HAMMARLING, S., GREENBAUM, A., MCKENNEY, A., AND SORENSEN, D. 1999. LAPACK Users' Guide, 3rd Ed. Society for Industrial and Applied Mathematics, Philadelphia.
-
(1999)
LAPACK Users' Guide, 3rd Ed.
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Blackford, L.S.4
Demmel, J.5
Dongarra, J.J.6
Du Croz, J.7
Hammarling, S.8
Greenbaum, A.9
McKenney, A.10
Sorensen, D.11
-
4
-
-
0003660984
-
-
Tech. rep. ANL-95/11-Revision 2.1.5, Argonne National Laboratory, Argonne
-
BALAY, S., BUSCHELMAN, K., EIJKHOUT, V., GROPP, W. D., KAUSHIK, D., KNEPLEY, M. G., MCINNES, L. C., SMITH, B. F., AND ZHANG, H. 2004. PETSc users manual. Tech. rep. ANL-95/11-Revision 2.1.5, Argonne National Laboratory, Argonne.
-
(2004)
PETSc Users Manual
-
-
Balay, S.1
Buschelman, K.2
Eijkhout, V.3
Gropp, W.D.4
Kaushik, D.5
Knepley, M.G.6
McInnes, L.C.7
Smith, B.F.8
Zhang, H.9
-
5
-
-
48849099645
-
Families of algorithms related to the inversion of a symmetric positive definite matrix
-
BIENTINESI, P., GUNTER, B., AND VAN DE GEIJN, R. A. 2008. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Trans. Math. Softw. 35, 1, 1-22.
-
(2008)
ACM Trans. Math. Softw.
, vol.35
, Issue.1
, pp. 1-22
-
-
Bientinesi, P.1
Gunter, B.2
Van De Geijn, R.A.3
-
6
-
-
17644370328
-
Representing linear algebra algorithms in code: The FLAME application programming interfaces
-
BIENTINESI, P., QUINTANA-ORTÍ, E. S., AND VAN DE GEIJN, R. A. 2005. Representing linear algebra algorithms in code: The FLAME application programming interfaces. ACM Trans. Math. Softw. 31, 1, 27-59.
-
(2005)
ACM Trans. Math. Softw
, vol.31
, Issue.1
, pp. 27-59
-
-
Bientinesi, P.1
Quintana-Ortí, E.S.2
Van De Geijn, R.A.3
-
7
-
-
51049101584
-
-
LAPACK Working Note 191 UT-CS-07-600. University of Tennessee, Knoxville
-
BUTTARI, A., LANGOU, J., KURZAK, J., AND DONGARRA, J. 2007. A class of parallel tiled linear algebra algorithms for multicore architectures. LAPACK Working Note 191 UT-CS-07-600. University of Tennessee, Knoxville.
-
(2007)
A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures
-
-
Buttari, A.1
Langou, J.2
Kurzak, J.3
Dongarra, J.4
-
8
-
-
50249105132
-
Parallel tiled QR factorization for multicore architectures
-
BUTTARI, A., LANGOU, J., KURZAK, J., AND DONGARRA, J. 2008. Parallel tiled QR factorization for multicore architectures. Concurr. Computat. Pract. Experi. 20, 13, 1573-1590.
-
(2008)
Concurr. Computat. Pract. Experi
, vol.20
, Issue.13
, pp. 1573-1590
-
-
Buttari, A.1
Langou, J.2
Kurzak, J.3
Dongarra, J.4
-
9
-
-
35248843628
-
SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures
-
San Diego
-
CHAN, E., QUINTANA-ORTÍ, E. S., QUINTANA-ORTÍ, G., AND VAN DE GEIJN, R. 2007a. SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. In Proceedings of the 19th ACM Symposium on Parallelism in Algorithms and Architectures, San Diego, 116-125.
-
(2007)
Proceedings of the 19th ACM Symposium on Parallelism in Algorithms and Architectures
, pp. 116-125
-
-
Chan, E.1
Quintana-Ortí, E.S.2
Quintana-Ortí, G.3
Van De Geijn, R.4
-
10
-
-
67650056933
-
SuperMatrix: A multithreaded runtime scheduling system for algorithms-by-blocks
-
Salt Lake City
-
CHAN, E., VAN ZEE, F. G., BIENTINESI, P., QUINTANA-ORTÍ, E. S., QUINTANA-ORTÍ, G., AND VAN DE GEIJN, R. 2008. SuperMatrix: A multithreaded runtime scheduling system for algorithms-by-blocks. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Salt Lake City, 123-132.
-
(2008)
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 123-132
-
-
Chan, E.1
Van Zee, F.G.2
Bientinesi, P.3
Quintana-Ortí, E.S.4
Quintana-Ortí, G.5
Van De Geijn, R.6
-
11
-
-
51049099053
-
Satisfying your dependencies with SuperMatrix
-
CHAN, E., VAN ZEE, F. G., QUINTANA-ORTÍ, E. S., QUINTANA-ORTÍ, G., AND VAN DE GEIJN, R. 2007b. Satisfying your dependencies with SuperMatrix. In Proceedings of the 2007 IEEE International Conference on Cluster Computing, Austin, 91-99.
-
(2007)
Proceedings of the 2007 IEEE International Conference on Cluster Computing
, pp. 91-99
-
-
Chan, E.1
Van Zee, F.G.2
Quintana-Ortí, E.S.3
Quintana-Ortí, G.4
Van De Geijn, R.5
-
12
-
-
0002924772
-
ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers
-
McLean
-
CHOI, J., DONGARRA, J. J., POZO, R., AND WALKER, D. W. 1992. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. In Proceedings of the 4th Symposium on the Frontiers of Massively Parallel Computation, McLean, 120-127.
-
(1992)
Proceedings of the 4th Symposium on the Frontiers of Massively Parallel Computation
, pp. 120-127
-
-
Choi, J.1
Dongarra, J.J.2
Pozo, R.3
Walker, D.W.4
-
14
-
-
0003555195
-
-
SIAM, Philadelphia
-
DONGARRA, J. J., BUNCH, J. R., MOLER, C. B., AND STEWART, G. W. 1979. LINPACK Users' Guide. SIAM, Philadelphia.
-
(1979)
LINPACK Users' Guide
-
-
Dongarra, J.J.1
Bunch, J.R.2
Moler, C.B.3
Stewart, G.W.4
-
15
-
-
0025402476
-
A set of level 3 basic linear algebra sub-programs
-
DONGARRA, J. J., DU CROZ, J., HAMMARLING, S., AND DUFF, I. 1990. A set of level 3 basic linear algebra sub-programs. ACM Trans. Math. Softw. 16, 1, 1-17.
-
(1990)
ACM Trans. Math. Softw
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
16
-
-
0023983122
-
An extended set of Fortran basic linear algebra subprograms
-
DONGARRA, J. J., DU CROZ, J., HAMMARLING, S., AND HANSON, R. J. 1988. An extended set of Fortran basic linear algebra subprograms. ACM Trans. Math. Softw. 14, 1, 1-17.
-
(1988)
ACM Trans. Math. Softw.
, vol.14
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Hanson, R.J.4
-
18
-
-
1842832833
-
Recursive blocked algorithms and hybrid data structures for dense matrix library software
-
ELMROTH, E., GUSTAVSON, F., JONSSON, I., AND KAGSTROM, B. 2004. Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Rev. 46, 1, 3-45.
-
(2004)
SIAM Rev.
, vol.46
, Issue.1
, pp. 3-45
-
-
Elmroth, E.1
Gustavson, F.2
Jonsson, I.3
Kagstrom, B.4
-
20
-
-
44249094647
-
Anatomy of a high-performance matrix multiplication
-
GOTO, K. AND VAN DE GEIJN, R. A. 2008. Anatomy of a high-performance matrix multiplication. ACM Trans. Math. Softw. 34, 3, 1-25.
-
(2008)
ACM Trans. Math. Softw
, vol.34
, Issue.3
, pp. 1-25
-
-
Goto, K.1
Van De Geijn, R.A.2
-
21
-
-
0004217970
-
-
MIT Press, Cambridge
-
GROPP, W., LUSK, E., AND SKJELLUM, A. 1994. Using MPI. MIT Press, Cambridge.
-
(1994)
Using MPI
-
-
Gropp, W.1
Lusk, E.2
Skjellum, A.3
-
22
-
-
0039435412
-
FLAME: Formal linear algebra methods environment
-
GUNNELS, J. A., GUSTAVSON, F. G., HENRY, G. M., AND VAN DE GEIJN, R. A. 2001. FLAME: Formal linear algebra methods environment. ACM Trans. Math. Softw. 27, 4, 422-455.
-
(2001)
ACM Trans. Math. Softw
, vol.27
, Issue.4
, pp. 422-455
-
-
Gunnels, J.A.1
Gustavson, F.G.2
Henry, G.M.3
Van De Geijn, R.A.4
-
23
-
-
17644368925
-
Parallel out-of-core computation and updating the QR factorization
-
GUNTER, B. C. AND VAN DE GEIJN, R. A. 2005. Parallel out-of-core computation and updating the QR factorization. ACM Trans. Math. Softw. 31, 1, 60-78.
-
(2005)
ACM Trans. Math. Softw
, vol.31
, Issue.1
, pp. 60-78
-
-
Gunter, B.C.1
Van De Geijn, R.A.2
-
24
-
-
79959486752
-
Programming with tiles
-
Salt Lake City
-
GUO, J., BIKSHANDI, G., FRAGUELA, B., GARZARAN, M., AND PADUA, D. 2008. Programming with tiles. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Salt Lake City, 111-122.
-
(2008)
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 111-122
-
-
Guo, J.1
Bikshandi, G.2
Fraguela, B.3
Garzaran, M.4
Padua, D.5
-
25
-
-
38049087210
-
Three algorithms for Cholesky factorization on distributed memory using packed storage
-
Lecture Notes in Computer Science 4699. Springer, Berlin/Heidelberg, Germany
-
GUSTAVSON, F. G., KARLSSON, L., AND KAGSTROM, B. 2007. Three algorithms for Cholesky factorization on distributed memory using packed storage. In Proceedings of the Workshop on State-of-the-Art in Scientific Computing. Lecture Notes in Computer Science, vol. 4699. Springer, Berlin/Heidelberg, Germany, 550-559.
-
(2007)
Proceedings of the Workshop on State-of-the-Art in Scientific Computing
, pp. 550-559
-
-
Gustavson, F.G.1
Karlsson, L.2
Kagstrom, B.3
-
27
-
-
33745328323
-
Rapid development of high-performance out-of-core solvers
-
Lecture Notes in Computer Science. Springer, Berlin/Heidelberg, Germany
-
JOFFRAIN, T., QUINTANA-ORTÍ, E. S., AND VAN DE GEIJN, R. A. 2004. Rapid development of high-performance out-of-core solvers. In Proceedings of the Workshop on State-of-the-Art in Scientific Computing. Lecture Notes in Computer Science, vol. 3732. Springer, Berlin/Heidelberg, Germany, 413-422.
-
(2004)
Proceedings of the Workshop on State-of-the-Art in Scientific Computing
, vol.3732
, pp. 413-422
-
-
Joffrain, T.1
Quintana-Ortí, E.S.2
Van De Geijn, R.A.3
-
29
-
-
0018515759
-
Basic linear algebra subprograms for Fortran usage
-
LAWSON, C. L., HANSON, R. J., KINCAID, D. R., AND KROGH, F. T. 1979. Basic linear algebra subprograms for Fortran usage. ACM Trans. Math. Softw. 5, 3, 308-323.
-
(1979)
ACM Trans. Math. Softw
, vol.5
, Issue.3
, pp. 308-323
-
-
Lawson, C.L.1
Hanson, R.J.2
Kincaid, D.R.3
Krogh, F.T.4
-
30
-
-
47349106165
-
-
FLAME Working Note #12 TR-2004-2015 Department of Computer Sciences, University of Texas at Austin, Austin
-
LOW, T. M. AND VAN DE GEIJN, R. 2004. An API for manipulating matrices stored by blocks. FLAME Working Note #12 TR-2004-2015 Department of Computer Sciences, University of Texas at Austin, Austin.
-
(2004)
An API for Manipulating Matrices Stored by Blocks
-
-
Low, T.M.1
Van De Geijn, R.2
-
31
-
-
0346675759
-
Compiler and Software Distributed Shared Memory Support for Irregular Applications
-
LU, H., COX, A. L., DWARKADAS, S., RAJAMONY, R., AND ZWAENEPOEL, W. 1997. Compiler and software distributed shared memory support for irregular applications. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Las Vegas, 48-56.
-
(1997)
SIGPLAN Notices (ACM Special Interest Group on Programming Languages)
, vol.32
, Issue.7
, pp. 48-56
-
-
Lu, H.1
Cox, A.L.2
Dwarkadas, S.3
Rajamony, R.4
Zwaenepoel, W.5
-
32
-
-
38049132009
-
Toward scalable matrix multiply on multithreaded architectures
-
Rennes, France
-
MARKER, B. A., VAN ZEE, F. G., GOTO, K., QUINTANA-ORTÍ, G., AND VAN DE GEIJN, R. A. 2007. Toward scalable matrix multiply on multithreaded architectures. In Proceedings of the 13th International European Conference on Parallel and Distributed Computing (Rennes, France). 748-757.
-
(2007)
Proceedings of the 13th International European Conference on Parallel and Distributed Computing
, pp. 748-757
-
-
Marker, B.A.1
Van Zee, F.G.2
Goto, K.3
Quintana-Ortí, G.4
Van De Geijn, R.A.5
-
33
-
-
0030157365
-
Global arrays: A nonuniform memory access programming model for high-performance computers
-
June
-
NIEPLOCHA, J., HARRISON, R., AND LITTLEFIELD, R. 1996. Global arrays: A nonuniform memory access programming model for high-performance computers. J. Supercomput. 10, 2 (June), 197-220.
-
(1996)
J. Supercomput
, vol.10
, Issue.2
, pp. 197-220
-
-
Nieplocha, J.1
Harrison, R.2
Littlefield, R.3
-
35
-
-
70349752889
-
Design of scalable dense linear algebra libraries for multithreaded architectures: The LU factorization
-
Miami
-
QUINTANA-ORTÍ, G., QUINTANA-ORTÍ, E. S., CHAN, E., VAN DE GEIJN, R., AND VAN ZEE, F. G. 2008a. Design of scalable dense linear algebra libraries for multithreaded architectures: The LU factorization. In Proceedings of the Workshop on Multithreaded Architectures and Applications, Miami, 1-8.
-
(2008)
Proceedings of the Workshop on Multithreaded Architectures and Applications
, pp. 1-8
-
-
Quintana-Ortí, G.1
Quintana-Ortí, E.S.2
Chan, E.3
Van De Geijn, R.4
Van Zee, F.G.5
-
36
-
-
47349122478
-
Scheduling of QR factorization algorithms on SMP and multi-core architectures
-
Distributed and Network-Based Processing (Toulouse, France)
-
QUINTANA-ORTÍ, G., QUINTANA-ORTÍ, E. S., CHAN, E., VAN ZEE, F. G., AND VAN DE GEIJN, R. A. 2008b. Scheduling of QR factorization algorithms on SMP and multi-core architectures. In Proceedings of the 16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (Toulouse, France). 301-307.
-
(2008)
Proceedings of the 16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing
, pp. 301-307
-
-
Quintana-Ortí, G.1
Quintana-Ortí, E.S.2
Chan, E.3
Van Zee, F.G.4
Van De Geijn, R.A.5
-
37
-
-
70349737531
-
An algorithm-by-blocks for SuperMatrix band Cholesky factorization
-
Toulouse, France
-
QUINTANA-ORTÍ, G., QUINTANA-ORTÍ, E. S., REMÓN, A., AND VAN DE GEIJN, R. 2008c. An algorithm-by-blocks for SuperMatrix band Cholesky factorization. In Proceedings of the 8th International Meeting on High-Performance Computing for Computational Science (Toulouse, France). 1-13.
-
(2008)
Proceedings of the 8th International Meeting on High-Performance Computing for Computational Science
, pp. 1-13
-
-
Quintana-Ortí, G.1
Quintana-Ortí, E.S.2
Remón, A.3
Van De Geijn, R.4
-
39
-
-
0022026625
-
Analysis of pairwise pivoting in Gaussian elimination
-
SORENSEN, D. C. 1985. Analysis of pairwise pivoting in Gaussian elimination. IEEE Trans. Comput. 34, 3, 274-278.
-
(1985)
IEEE Trans. Comput
, vol.34
, Issue.3
, pp. 274-278
-
-
Sorensen, D.C.1
-
40
-
-
0035003299
-
A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization
-
STRAZDINS, P. 2001. A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization. Int. J. Parall. Distrib. Syst. Netw. 4, 1, 26-35.
-
(2001)
Int. J. Parall. Distrib. Syst. Netw
, vol.4
, Issue.1
, pp. 26-35
-
-
Strazdins, P.1
-
41
-
-
0002831423
-
A survey of out-of-core algorithms in numerical linear algebra
-
J. Abello and J. S. Vitter, Eds. American Mathematical Society, Boston
-
TOLEDO, S. 1999. A survey of out-of-core algorithms in numerical linear algebra. In External Memory Algorithms, J. Abello and J. S. Vitter, Eds. American Mathematical Society, Boston, 161-179.
-
(1999)
External Memory Algorithms
, pp. 161-179
-
-
Toledo, S.1
-
42
-
-
0037173976
-
A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
-
VALSALAM, V. AND SKJELLUM, A. 2002. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels. Concurr. Computat. Pract. Exper. 14, 10, 805-840.
-
(2002)
Concurr. Computat. Pract. Exper
, vol.14
, Issue.10
, pp. 805-840
-
-
Valsalam, V.1
Skjellum, A.2
-
45
-
-
85029485046
-
Compiler analysis for irregular problems in Fortran D
-
Lecture Notes in Computer Science. Springer, Berlin/Heidelberg, Germany
-
VON HANXLEDEN, R., KENNEDY, K., KOELBEL, C. H., DAS, R., AND SALTZ, J. H. 1992. Compiler analysis for irregular problems in Fortran D. In Proceedings of the 5th Workshop on Languages and Compilers for Parallel Computing. Lecture Notes in Computer Science, vol.757. Springer, Berlin/Heidelberg, Germany, 97-111.
-
(1992)
Proceedings of the 5th Workshop on Languages and Compilers for Parallel Computing
, vol.757
, pp. 97-111
-
-
Von Hanxleden, R.1
Kennedy, K.2
Koelbel, C.H.3
Das, R.4
Saltz, J.H.5
-
46
-
-
0034819362
-
Language support for Morton-order matrices
-
Snowbird
-
WISE, D. S., FRENS, J. D., GU, Y., AND ALEXANDER, G. A. 2001. Language support for Morton-order matrices. In Proceedings of the 8th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Snowbird, 24-33.
-
(2001)
Proceedings of the 8th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 24-33
-
-
Wise, D.S.1
Frens, J.D.2
Gu, Y.3
Alexander, G.A.4