-
1
-
-
0003706460
-
-
SIAM, Philadelphia Also available from
-
E. ANDERSON, Z. BAI, C. BISCHOF, J. DEMMEL, J. DONGARRA, J. DU CROZ, A. GREENBAUM, S. HAMMARLING, A. MCKENNEY, S. OSTROUCHOV, AND D. SORENSEN, LAPACK Users' Guide, SIAM, Philadelphia, 1992. Also available from http://www.netlib.org/lapack/.
-
(1992)
LAPACK Users' Guide
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Demmel, J.4
Dongarra, J.5
Du Croz, J.6
Greenbaum, A.7
Hammarling, S.8
McKenney, A.9
Ostrouchov, S.10
Sorensen, D.11
-
2
-
-
0029370767
-
A three-dimensional approach to parallel matrix multiplication
-
R. C. AGARWAL, S. M. BALLE, F. G. GUSTAVSON, M. JOSHI, AND P. PALKAR, A three-dimensional approach to parallel matrix multiplication, IBM J. Res. Dev., 39 (1995), pp. 575-582.
-
(1995)
IBM J. Res. Dev.
, vol.39
, pp. 575-582
-
-
Agarwal, R.C.1
Balle, S.M.2
Gustavson, F.G.3
Joshi, M.4
Palkar, P.5
-
3
-
-
18044400448
-
A recursive formulation of Cholesky factorization of a matrix in packed storage
-
DOI 10.1145/383738.383741
-
B. S. ANDERSEN, F. GUSTAVSON, AND J. WASNIEWSKI, A recursive formulation of Cholesky factorization of a matrix in packed storage format, ACM Trans. Math. Software, 27 (2001), pp. 214-244. (Pubitemid 33602326)
-
(2001)
ACM Transactions on Mathematical Software
, vol.27
, Issue.2
, pp. 214-244
-
-
Andersen, B.S.1
Wasniewski, J.2
Gustavson, F.G.3
-
4
-
-
84937408012
-
Automatic generation of block-recursive codes
-
London, UK, Springer-Verlag, Berlin
-
N. AHMED AND K. PINGALI, Automatic generation of block-recursive codes, in Euro-Par '00: Proceedings of the 6th International Euro-Par Conference on Parallel Processing, London, UK, Springer-Verlag, Berlin, 2000, pp. 368-378.
-
(2000)
Euro-Par '00: Proceedings of the 6th International Euro-Par Conference on Parallel Processing
, pp. 368-378
-
-
Ahmed, N.1
Pingali, K.2
-
6
-
-
0001314661
-
The fan-both family of column-based distributed Cholesky factorization algorithms
-
J. R. Gilbert, A. George, and J. W. H. Liu, eds., Springer-Verlag, Berlin
-
C. ASHCRAFT, The fan-both family of column-based distributed Cholesky factorization algorithms, in Graph Theory and Sparse Matrix Computation, IMA Volumes in Mathematics and Its Applications 56, J. R. Gilbert, A. George, and J. W. H. Liu, eds., Springer-Verlag, Berlin, 1993, pp. 159-190
-
(1993)
Graph Theory and Sparse Matrix Computation, IMA Volumes in Mathematics and Its Applications
, vol.56
, pp. 159-190
-
-
Ashcraft, C.1
-
7
-
-
0024082546
-
The input/output complexity of sorting and related problems
-
DOI 10.1145/48529.48535
-
A. AGGARWAL AND J. S. VITTER, The input/output complexity of sorting and related problems, Comm. ACM, 31 (1988), pp. 1116-1127. (Pubitemid 18662481)
-
(1988)
Communications of the ACM
, vol.31
, Issue.9
, pp. 1116-1127
-
-
Aggarwal, A.1
Vitter, J.S.2
-
8
-
-
35248813384
-
Optimal sparse matrix dense vector multiplication in the I/O-model
-
DOI 10.1145/1248377.1248391, SPAA'07: Proceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures
-
M. A. BENDER, G. S. BRODAL, R. FAGERBERG, R. JACOB, AND E. VICARI, Optimal sparse matrix dense vector multiplication in the I/O-model, in SPAA '07: Proceedings of the 19th Annual ACM Symposium on Parallel Algorithms and Architectures, ACM, New York, 2007, pp. 61-70. (Pubitemid 47568555)
-
(2007)
Annual ACM Symposium on Parallelism in Algorithms and Architectures
, pp. 61-70
-
-
Bender, M.A.1
Brodal, G.S.2
Fagerberg, R.3
Jacob, R.4
Vicari, E.5
-
9
-
-
0036401631
-
The multishift QR algorithm. Part I: Maintaining well-focused shifts and level 3 performance
-
K. BRAMAN, R. BYERS, AND R. MATHIAS, The multishift QR algorithm. Part I: Maintaining well-focused shifts and level 3 performance, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 929-947.
-
(2002)
SIAM J. Matrix Anal. Appl.
, vol.23
, pp. 929-947
-
-
Braman, K.1
Byers, R.2
Mathias, R.3
-
10
-
-
0036400807
-
The multishift QR algorithm. Part II: Aggressive early deflation
-
K. BRAMAN, R. BYERS, AND R. MATHIAS, The multishift QR algorithm. Part II: Aggressive early deflation, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 948-973.
-
(2002)
SIAM J. Matrix Anal. Appl.
, vol.23
, pp. 948-973
-
-
Braman, K.1
Byers, R.2
Mathias, R.3
-
11
-
-
0003615167
-
-
SIAM, Philadelphia Also available from
-
L. S. BLACKFORD, J. CHOI, A. CLEARY, E. D'AZEVEDO, J. DEMMEL, I. DHILLON, J. DONGARRA, S. HAMMARLING, G. HENRY, A. PETITET, K. STANLEY, D. WALKER, AND R. C. WHALEY, ScaLAPACK Users' Guide, SIAM, Philadelphia, 1997. Also available from http://www.netlib.org/scalapack/.
-
(1997)
ScaLAPACK Users' Guide
-
-
Blackford, L.S.1
Choi, J.2
Cleary, A.3
D'Azevedo, E.4
Demmel, J.5
Dhillon, I.6
Dongarra, J.7
Hammarling, S.8
Henry, G.9
Petitet, A.10
Stanley, K.11
Walker, D.12
Whaley, R.C.13
-
12
-
-
80054042669
-
Basic linear algebra subprograms technical (BLAST) forum standard
-
L. S. BLACKFORD, J. DEMMEL, J. DONGARRA, I. DUFF, S. HAMMARLING, G. HENRY, M. HEROUX, L. KAUFMAN, A. LUMSDAINE, A. PETITET, R. POZO, K. REMINGTON, R. C. WHALEY, Z. MAANY, F. KROGH, G. CORLISS, C. HU, B. KEARFOTT, W. WALSTER, AND J. WOLFF V. GUDENBERG, Basic linear algebra subprograms technical (BLAST) forum standard, Int. J. Supercomput. Appl. High Perform. Comput., 15 (2001), pp. 1-315.
-
(2001)
Int. J. Supercomput. Appl. High Perform. Comput.
, vol.15
, pp. 1-315
-
-
Blackford, L.S.1
Demmel, J.2
Dongarra, J.3
Duff, I.4
Hammarling, S.5
Henry, G.6
Heroux, M.7
Kaufman, L.8
Lumsdaine, A.9
Petitet, A.10
Pozo, R.11
Remington, K.12
Whaley, R.C.13
Maany, Z.14
Krogh, F.15
Corliss, G.16
Hu, C.17
Kearfott, B.18
Walster, W.19
Wolff v. Gudenberg, J.20
-
13
-
-
19044386208
-
An Updated Set of Basic Linear Algebra Subprograms (BLAS)
-
DOI 10.1145/567806.567807
-
L. S. BLACKFORD, J. DEMMEL, J. DONGARRA, I. DUFF, S. HAMMARLING, G. HENRY, M. HEROUX, L. KAUFMAN, A. LUMSDAINE, A. PETITET, R. POZO, K. REMINGTON, AND R. C. WHALEY, An updated set of basic linear algebra subprograms (BLAS), ACM Trans. Math. Software, 28 (2002), pp. 135-151. (Pubitemid 135701673)
-
(2002)
ACM Transactions on Mathematical Software
, vol.28
, Issue.2
, pp. 135-151
-
-
Blackford, L.S.1
Demmel, J.2
Dongarra, J.3
Duff, I.4
Hammarling, S.5
Henry, G.6
Heroux, M.7
Kaufman, L.8
Lumsdaine, A.9
Petitet, A.10
Pozo, R.11
Remington, K.12
Whaley, R.C.13
-
14
-
-
80054039665
-
Communication-optimal parallel and sequential eigenvalue and singular value algorithms
-
University of California-Berkeley
-
G. BALLARD, J. DEMMEL, AND I. DUMITRIU, Communication-Optimal Parallel and Sequential Eigenvalue and Singular Value Algorithms, EECS Technical Report EECS-2011-14, University of California-Berkeley, 2011.
-
(2011)
EECS Technical Report EECS-2011-14
-
-
Ballard, G.1
Demmel, J.2
Dumitriu, I.3
-
15
-
-
79251563454
-
Communication-optimal parallel and sequential Cholesky decomposition
-
G. BALLARD, J. DEMMEL, O. HOLTZ, AND O. SCHWARTZ, Communication-optimal parallel and sequential Cholesky decomposition, SIAM J. Sci. Comput., 32 (2010), pp. 3495-3523.
-
(2010)
SIAM J. Sci. Comput.
, vol.32
, pp. 3495-3523
-
-
Ballard, G.1
Demmel, J.2
Holtz, O.3
Schwartz, O.4
-
16
-
-
79959674766
-
Graph expansion and communication costs of fast matrix multiplication
-
to appear
-
G. BALLARD, J. DEMMEL, O. HOLTZ, AND O. SCHWARTZ, Graph expansion and communication costs of fast matrix multiplication, in Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2011), 2011, to appear.
-
(2011)
Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2011)
-
-
Ballard, G.1
Demmel, J.2
Holtz, O.3
Schwartz, O.4
-
17
-
-
80052309144
-
Minimizing communication in linear algebra
-
University of California-Berkeley
-
G. BALLARD, J. DEMMEL, O. HOLTZ, AND O. SCHWARTZ, Minimizing Communication in Linear Algebra, EECS Technical Report EECS-2011-15, University of California-Berkeley, 2011.
-
(2011)
EECS Technical Report EECS-2011-15
-
-
Ballard, G.1
Demmel, J.2
Holtz, O.3
Schwartz, O.4
-
18
-
-
84966228742
-
Some stable methods for calculating inertia and solving symmetric linear systems
-
J. BUNCH AND L. KAUFMAN, Some stable methods for calculating inertia and solving symmetric linear systems, Math. Comp., 31 (1977), pp. 163-179.
-
(1977)
Math. Comp.
, vol.31
, pp. 163-179
-
-
Bunch, J.1
Kaufman, L.2
-
19
-
-
51049101584
-
A class of parallel tiled linear algebra algorithms for multicore architectures
-
A. BUTTARI, J. LANGOU, J. KURZAK, AND J. J. DONGARRA, A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures, Technical Report 191, LAPACK Working Note, 2007.
-
(2007)
Technical Report 191, LAPACK Working Note
-
-
Buttari, A.1
Langou, J.2
Kurzak, J.3
Dongarra, J.J.4
-
20
-
-
0012881041
-
Algorithm 807: The SBR toolbox-software for successive band reduction
-
C. H. BISCHOF, B. LANG, AND X. SUN, Algorithm 807: The SBR toolbox-software for successive band reduction, ACM Trans. Math. Software, 26 (2000), pp. 602-616.
-
(2000)
ACM Trans. Math. Software
, vol.26
, pp. 602-616
-
-
Bischof, C.H.1
Lang, B.2
Sun, X.3
-
21
-
-
0039699635
-
A framework for symmetric band reduction
-
C. H. BISCHOF, B. LANG, AND X. SUN, A framework for symmetric band reduction, ACM Trans. Math. Software, 26 (2000), pp. 581-601.
-
(2000)
ACM Trans. Math. Software
, vol.26
, pp. 581-601
-
-
Bischof, C.H.1
Lang, B.2
Sun, X.3
-
22
-
-
0001951009
-
The WY representation for products of Householder matrices
-
C. BISCHOF AND C. VAN LOAN, The WY representation for products of Householder matrices, SIAM J. Sci. Statist. Comput., 8 (1987), pp. 2-13.
-
(1987)
SIAM J. Sci. Statist. Comput.
, vol.8
, pp. 2-13
-
-
Bischof, C.1
Van Loan, C.2
-
24
-
-
0003712293
-
-
Ph.D. thesis, Montana State University, Bozeman, MT
-
L. CANNON, A Cellular Computer to Implement the Kalman Filter Algorithm, Ph.D. thesis, Montana State University, Bozeman, MT, 1969.
-
(1969)
A Cellular Computer to Implement the Kalman Filter Algorithm
-
-
Cannon, L.1
-
25
-
-
0004116989
-
-
MIT Press, Cambridge, MA
-
T. CORMEN, C. LEISERSON, R. RIVEST, AND C. STEIN, Introduction to Algorithms, 2nd ed., MIT Press, Cambridge, MA, 2001.
-
(2001)
Introduction to Algorithms, 2nd Ed
-
-
Cormen, T.1
Leiserson, C.2
Rivest, R.3
Stein, C.4
-
26
-
-
33244497406
-
Cache-oblivious dynamic programming
-
DOI 10.1145/1109557.1109622, Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms
-
R. A. CHOWDHURY AND V. RAMACHANDRAN, Cache-oblivious dynamic programming, in Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, Philadelphia, ACM, New York, 2006, pp. 591-600. (Pubitemid 43275280)
-
(2006)
Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
, pp. 591-600
-
-
Chowdhury, R.A.1
Ramachandran, V.2
-
27
-
-
77954898947
-
Brief announcement: Lower bounds on communication for sparse Cholesky factorization of a model problem
-
P.-Y. DAVID, J. DEMMEL, L. GRIGORI, AND S. PEYRONNET, Brief announcement: Lower bounds on communication for sparse Cholesky factorization of a model problem, in Proceedings of the 22nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2010.
-
(2010)
Proceedings of the 22nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)
-
-
David, P.-Y.1
Demmel, J.2
Grigori, L.3
Peyronnet, S.4
-
28
-
-
35548978022
-
Fast linear algebra is stable
-
J. DEMMEL, I. DUMITRIU, AND O. HOLTZ, Fast linear algebra is stable, Numer. Math., 108 (2007), pp. 59-91.
-
(2007)
Numer. Math.
, vol.108
, pp. 59-91
-
-
Demmel, J.1
Dumitriu, I.2
Holtz, O.3
-
29
-
-
80054022827
-
CS 267 course notes: Applications of parallel processing
-
University of California
-
J. DEMMEL, CS 267 Course Notes: Applications of Parallel Processing, Computer Science Division, University of California, 1996. http://www.cs.berkeley.edu/~demmel/cs267.
-
(1996)
Computer Science Division
-
-
Demmel, J.1
-
30
-
-
0003252789
-
Applied numerical linear algebra
-
J. DEMMEL, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
-
(1997)
SIAM Philadelphia
-
-
Demmel, J.1
-
31
-
-
77953980008
-
Communication-optimal parallel and sequential QR and LU factorizations
-
University of California-Berkeley, to appear in SIAM J. Sci. Comput.
-
J. DEMMEL, L. GRIGORI, M. HOEMMEN, AND J. LANGOU, Communication-Optimal Parallel and Sequential QR and LU Factorizations, EECS Technical Report EECS-2008-89, University of California-Berkeley, 2008, to appear in SIAM J. Sci. Comput.
-
(2008)
EECS Technical Report EECS-2008-89
-
-
Demmel, J.1
Grigori, L.2
Hoemmen, M.3
Langou, J.4
-
32
-
-
77953980008
-
-
available from
-
J. DEMMEL, L. GRIGORI, M. HOEMMEN, AND J. LANGOU, Implementing communication-optimal parallel and sequential QR and LU factorizations, 2008, available from http://arxiv.org/abs/0809.2407.
-
(2008)
Implementing Communication-optimal Parallel and Sequential QR and LU Factorizations
-
-
Demmel, J.1
Grigori, L.2
Hoemmen, M.3
Langou, J.4
-
33
-
-
80054013486
-
-
ACM/IEEE, Austin, TX
-
J. DEMMEL, L. GRIGORI, AND H. XIANG, Communication-Avoiding Gaussian elimination, Supercomputing 08, ACM/IEEE, Austin, TX, 2008.
-
(2008)
Communication-Avoiding Gaussian Elimination, Supercomputing 08
-
-
Demmel, J.1
Grigori, L.2
Xiang, H.3
-
34
-
-
80051667036
-
CALU: A communication optimal LU factorization algorithm
-
University of California-Berkeley submitted to SIAM J. Matrix Anal. Appl
-
J. DEMMEL, L. GRIGORI, AND H. XIANG, CALU: A Communication Optimal LU Factorization Algorithm, EECS Technical Report EECS-2010-29, University of California-Berkeley, 2010, submitted to SIAM J. Matrix Anal. Appl.
-
(2010)
EECS Technical Report EECS-2010-29
-
-
Demmel, J.1
Grigori, L.2
Xiang, H.3
-
35
-
-
0000456144
-
Parallel matrix and graph algorithms
-
E. DEKEL, D. NASSIMI, AND S. SAHNI, Parallel matrix and graph algorithms, SIAM J. Comput., 10 (1981), pp. 657-675.
-
(1981)
SIAM J. Comput.
, vol.10
, pp. 657-675
-
-
Dekel, E.1
Nassimi, D.2
Sahni, S.3
-
36
-
-
84947936389
-
New serial and parallel recursive QR factorization algorithms for SMP systems
-
B. Kågström et al., eds. Springer, Berlin
-
E. ELMROTH AND F. GUSTAVSON, New serial and parallel recursive QR factorization algorithms for SMP systems, in Applied Parallel Computing. Large Scale Scientific and Industrial Problems, Lecture Notes in Comput. Sci. 1541, B. Kågström et al., eds., Springer, Berlin, 1998, pp. 120-128.
-
(1998)
Applied Parallel Computing. Large Scale Scientific and Industrial Problems, Lecture Notes in Comput. Sci.
, vol.1541
, pp. 120-128
-
-
Elmroth, E.1
Gustavson, F.2
-
37
-
-
0034224207
-
Applying recursion to serial and parallel QR factorization leads to better performance
-
E. ELMROTH AND F. GUSTAVSON, Applying recursion to serial and parallel QR factorization leads to better performance, IBM J. Res. Dev., 44 (2000), pp. 605-624.
-
(2000)
IBM J. Res. Dev.
, vol.44
, pp. 605-624
-
-
Elmroth, E.1
Gustavson, F.2
-
38
-
-
1842832833
-
Recursive blocked algorithms and hybrid data structures for dense matrix library software
-
E. ELMROTH, F. GUSTAVSON, I. JONSSON, AND B. KÅGSTRÖM, Recursive blocked algorithms and hybrid data structures for dense matrix library software, SIAM Rev., 46 (2004), pp. 3-45.
-
(2004)
SIAM Rev.
, vol.46
, pp. 3-45
-
-
Elmroth, E.1
Gustavson, F.2
Jonsson, I.3
Kågström, B.4
-
39
-
-
0033350255
-
Cache-oblivious algorithms
-
M. FRIGO, C. E. LEISERSON, H. PROKOP, AND S. RAMACHANDRAN, Cache-oblivious algorithms, in Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society, Washington, DC, 1999, pp. 285-297. (Pubitemid 30539973)
-
(1999)
Annual Symposium on Foundations of Computer Science - Proceedings
, pp. 285-297
-
-
Frigo, M.1
Leiserson, C.E.2
Prokop, H.3
Ramachandran, S.4
-
40
-
-
1442337668
-
QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism
-
J. D. FRENS AND D. S. WISE, QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism, SIGPLAN Not., 38 (2003), pp. 144-154.
-
(2003)
SIGPLAN Not.
, vol.38
, pp. 144-154
-
-
Frens, J.D.1
Wise, D.S.2
-
41
-
-
0000264382
-
Nested dissection of a regular finite element mesh
-
A. GEORGE, Nested dissection of a regular finite element mesh, SIAM J. Numer. Anal., 10 (1973), pp. 345-363.
-
(1973)
SIAM J. Numer. Anal.
, vol.10
, pp. 345-363
-
-
George, A.1
-
42
-
-
17644368925
-
Parallel out-of-core computation and updating of the QR factorization
-
DOI 10.1145/1055531.1055534
-
B. C. GUNTER AND R. A. VAN DE GEIJN, Parallel out-of-core computation and updating of the QR factorization, ACM Trans. Math. Software, 31 (2005), pp. 60-78. (Pubitemid 40557862)
-
(2005)
ACM Transactions on Mathematical Software
, vol.31
, Issue.1
, pp. 60-78
-
-
Gunter, B.C.1
Van De Geijn, R.A.2
-
43
-
-
0039435412
-
FLAME: Formal linear algebra methods environment
-
DOI 10.1145/504210.504213
-
J. A. GUNNELS, F. G. GUSTAVSON, G. M. HENRY, AND R. A. VAN DE GEIJN, FLAME: Formal linear algebra methods environment, ACM Trans. Math. Software, 27 (2001), pp. 422-455. (Pubitemid 33602331)
-
(2001)
ACM Transactions on Mathematical Software
, vol.27
, Issue.4
, pp. 422-455
-
-
Gunnels, J.A.1
Gustavson, F.G.2
Henry, G.M.3
Van De Geijn, R.A.4
-
44
-
-
77953973267
-
Parallel block schemes for large-scale least-squares computations
-
University of Illinois Press, Champaign, IL
-
G. H. GOLUB, R. J. PLEMMONS, AND A. SAMEH, Parallel block schemes for large-scale least-squares computations, in High-Speed Computing: Scientific Applications and Algorithm Design, University of Illinois Press, Champaign, IL, 1988, pp. 171-179.
-
(1988)
High-Speed Computing: Scientific Applications and Algorithm Design
, pp. 171-179
-
-
Golub, G.H.1
Plemmons, R.J.2
Sameh, A.3
-
45
-
-
0010865720
-
The analysis of a nested dissection algorithm
-
J. R. GILBERT AND R. E. TARJAN, The analysis of a nested dissection algorithm, Numer. Math., 50 (1987), pp. 377-404.
-
(1987)
Numer. Math.
, vol.50
, pp. 377-404
-
-
Gilbert, J.R.1
Tarjan, R.E.2
-
46
-
-
0031273280
-
Recursion leads to automatic variable blocking for dense linear-algebra algorithms
-
F. G. GUSTAVSON, Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM J. Res. Dev., 41 (1997), pp. 737-756.
-
(1997)
IBM J. Res. Dev.
, vol.41
, pp. 737-756
-
-
Gustavson, F.G.1
-
47
-
-
0004236492
-
-
Johns Hopkins University Press, Baltimore, MD
-
G. GOLUB AND C. VAN LOAN, Matrix Computations, 3rd ed., Johns Hopkins University Press, Baltimore, MD, 1996.
-
(1996)
Matrix Computations, 3rd Ed
-
-
Golub, G.1
Van Loan, C.2
-
48
-
-
84971853043
-
I/O complexity: The red-blue pebble game
-
ACM, New York
-
J. W. HONG AND H. T. KUNG, I/O complexity: The red-blue pebble game, in Proceedings of the 13th Annual ACM Symposium on Theory of Computing, ACM, New York, 1981, pp. 326-333.
-
(1981)
Proceedings of the 13th Annual ACM Symposium on Theory of Computing
, pp. 326-333
-
-
Hong, J.W.1
Kung, H.T.2
-
49
-
-
0039236804
-
Complexity bounds for regular finite difference and finite element grids
-
A. J. HOFFMAN, M. S. MARTIN, AND D. J. ROSE, Complexity bounds for regular finite difference and finite element grids, SIAM J. Numer. Anal., 10 (1973), pp. 364-369.
-
(1973)
SIAM J. Numer. Anal.
, vol.10
, pp. 364-369
-
-
Hoffman, A.J.1
Martin, M.S.2
Rose, D.J.3
-
50
-
-
0036493233
-
Trading replication for communication in parallel distributed-memory dense solvers
-
D. IRONY AND S. TOLEDO, Trading replication for communication in parallel distributed-memory dense solvers, Parallel Process. Lett., 12 (2002), pp. 79-94. (Pubitemid 34668795)
-
(2002)
Parallel Processing Letters
, vol.12
, Issue.1
, pp. 79-94
-
-
Irony, D.1
Toledo, S.2
-
51
-
-
10844258198
-
Communication lower bounds for distributed-memory matrix multiplication
-
DOI 10.1016/j.jpdc.2004.03.021
-
D. IRONY, S. TOLEDO, AND A. TISKIN, Communication lower bounds for distributed-memory matrix multiplication, J. Parallel Distrib. Comput., 64 (2004), pp. 1017-1026. (Pubitemid 40000755)
-
(2004)
Journal of Parallel and Distributed Computing
, vol.64
, Issue.9
, pp. 1017-1026
-
-
Irony, D.1
Toledo, S.2
Tiskin, A.3
-
52
-
-
0001289565
-
An inequality related to the isoperimetric inequality
-
L. H. LOOMIS AND H. WHITNEY, An inequality related to the isoperimetric inequality, Bull. Am. Math. Soc., 55 (1949), pp. 961-962.
-
(1949)
Bull. Am. Math. Soc.
, vol.55
, pp. 961-962
-
-
Loomis, L.H.1
Whitney, H.2
-
53
-
-
84966587111
-
Optimizing graph algorithms for improved cache performance
-
Fort Lauderdale, FL
-
J. P. MICHAEL, M. PENNER, AND V. K. PRASANNA, Optimizing graph algorithms for improved cache performance, in Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 2002), Fort Lauderdale, FL, 2002, pp. 769-782.
-
(2002)
Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 2002)
, pp. 769-782
-
-
Michael, J.P.1
Penner, M.2
Prasanna, V.K.3
-
54
-
-
0000743020
-
Memory-efficient matrix multiplication in the BSP model
-
W. F. MCCOLL AND A. TISKIN, Memory-efficient matrix multiplication in the BSP model, Algorithmica, 24 (1999), pp. 287-297. (Pubitemid 129715337)
-
(1999)
Algorithmica (New York)
, vol.24
, Issue.3-4
, pp. 287-297
-
-
McColl, W.F.1
Tiskin, A.2
-
55
-
-
0012032244
-
Modification of the Householder method based on compact WY representation
-
C. PUGLISI, Modification of the Householder method based on compact WY representation, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 723-726.
-
(1992)
SIAM J. Sci. Statist. Comput.
, vol.13
, pp. 723-726
-
-
Puglisi, C.1
-
56
-
-
0042385409
-
Communication complexity of the Gaussian elimination algorithm on multiprocessors
-
Y. SAAD, Communication complexity of the Gaussian elimination algorithm on multiprocessors, Linear Algebra Appl., 77 (1986), pp. 315-340.
-
(1986)
Linear Algebra Appl.
, vol.77
, pp. 315-340
-
-
Saad, Y.1
-
58
-
-
84957579840
-
Extending the Hong-Kung model to memory hierarchies
-
Springer, Berlin
-
J. E. SAVAGE, Extending the Hong-Kung model to memory hierarchies, in Computing and Combinatorics, Lecture Notes in Comput. Sci. 959, Springer, Berlin, 1995, pp. 270-281.
-
(1995)
Computing and Combinatorics, Lecture Notes in Comput. Sci.
, vol.959
, pp. 270-281
-
-
Savage, J.E.1
-
59
-
-
80052305141
-
Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms
-
University of California-Berkeley to appear in EURO-PAR 2011
-
E. SOLOMONIK AND J. DEMMEL, Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms, EECS Technical Report EECS-2011-10, University of California-Berkeley, 2011, to appear in EURO-PAR 2011.
-
(2011)
EECS Technical Report EECS-2011-10
-
-
Solomonik, E.1
Demmel, J.2
-
60
-
-
0003078924
-
A storage-efficient WY representation for products of Householder transformations
-
R. SCHREIBER AND C. VAN LOAN, A storage-efficient WY representation for products of Householder transformations, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 53-57.
-
(1989)
SIAM J. Sci. Statist. Comput.
, vol.10
, pp. 53-57
-
-
Schreiber, R.1
Van Loan, C.2
-
61
-
-
0031496750
-
Locality of reference in LU decomposition with partial pivoting
-
S. TOLEDO, Locality of reference in LU decomposition with partial pivoting, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 1065-1081.
-
(1997)
SIAM J. Matrix Anal. Appl.
, vol.18
, pp. 1065-1081
-
-
Toledo, S.1
-
63
-
-
24344485098
-
OSKI: A library of automatically tuned sparse matrix kernels
-
J. of Physics: Conference Series Institute of Physics Publishing, London
-
R. VUDUC, J. DEMMEL, AND K. YELICK, OSKI: A library of automatically tuned sparse matrix kernels, in Proceedings of SciDAC 2005, J. of Physics: Conference Series, Institute of Physics Publishing, London, 2005.
-
(2005)
Proceedings of SciDAC 2005
-
-
Vuduc, R.1
Demmel, J.2
Yelick, K.3
-
64
-
-
34250883179
-
Fast sparse matrix multiplication
-
R. YUSTER AND U. ZWICK, Fast sparse matrix multiplication, ACM Trans. Algorithms, 1 (2005), pp. 2-13.
-
(2005)
ACM Trans. Algorithms
, vol.1
, pp. 2-13
-
-
Yuster, R.1
Zwick, U.2