-
1
-
-
0242343480
-
LAPACK for distributed memory architectures: Progress report
-
SIAM, Philadelphia
-
E. Anderson, A. Benzoni, J. Dongarra, S. Moulton, S. Ostrouchov, B. Tourancheau and R. van de Geijn, 'LAPACK for distributed memory architectures: progress report', in Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing, SIAM, Philadelphia, 1992, pp. 625-630.
-
(1992)
Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing
, pp. 625-630
-
-
Anderson, E.1
Benzoni, A.2
Dongarra, J.3
Moulton, S.4
Ostrouchov, S.5
Tourancheau, B.6
Van De Geijn, R.7
-
2
-
-
0002924772
-
Scalapack: A scalable linear algebra library for distributed memory concurrent computers
-
IEEE Comput. Soc. Press
-
J. Choi, J. J. Dongarra, R. Pozo and D. W. Walker, 'Scalapack: A scalable linear algebra library for distributed memory concurrent computers', Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, IEEE Comput. Soc. Press, 1992, pp. 120-127.
-
(1992)
Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation
, pp. 120-127
-
-
Choi, J.1
Dongarra, J.J.2
Pozo, R.3
Walker, D.W.4
-
3
-
-
10444267815
-
LAPACK for distributed memory architectures: The next generation
-
Norfolk, March
-
J. Demmel, J. Dongarra, R. van de Geijn and D. Walker, 'LAPACK for distributed memory architectures: The next generation', in Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, Norfolk, March 1993.
-
(1993)
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing
-
-
Demmel, J.1
Dongarra, J.2
Van De Geijn, R.3
Walker, D.4
-
5
-
-
0000778168
-
Scalability issues affecting the design of a dense linear algebra library
-
Scalability of Parallel Algorithms
-
J. Dongarra, R. van de Geijn and D. Walker, 'Scalability issues affecting the design of a dense linear algebra library', Special Issue on Scalability of Parallel Algorithms, J. Parallel Distrib. Comput., 22, (3), (1994).
-
(1994)
J. Parallel Distrib. Comput.
, vol.22
, Issue.3 SPEC. ISSUE
-
-
Dongarra, J.1
Van De Geijn, R.2
Walker, D.3
-
6
-
-
0003506603
-
-
Prentice Hall, Englewood Cliffs, N.J.
-
G. C. Fox, M. A. Johnson, G. A. Lyzenga, S. W. Otto, J. K. Salmon and D. W. Walker, Solving Problems on Concurrent Processors, Vol. 1, Prentice Hall, Englewood Cliffs, N.J., 1988.
-
(1988)
Solving Problems on Concurrent Processors
, vol.1
-
-
Fox, G.C.1
Johnson, M.A.2
Lyzenga, G.A.3
Otto, S.W.4
Salmon, J.K.5
Walker, D.W.6
-
7
-
-
12444284722
-
-
Harvard University, Center for Research in Computing Technology, TR-04-92, Jan.
-
W. Lichtenstein and S. L. Johnsson, 'Block-cyclic dense linear algebra', Harvard University, Center for Research in Computing Technology, TR-04-92, Jan. 1992.
-
(1992)
Block-cyclic Dense Linear Algebra
-
-
Lichtenstein, W.1
Johnsson, S.L.2
-
9
-
-
0025536635
-
Lapack: A portable linear algebra library for high performance computers
-
IEEE Press
-
E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. DuCroz, A. Greenbaum, S. Hammarling, A. McKenney and D. Sorensen, 'Lapack: A portable linear algebra library for high performance computers', Proceedings of Supercomputing '90, IEEE Press, 1990, pp. 1-10.
-
(1990)
Proceedings of Supercomputing '90
, pp. 1-10
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Demmel, J.4
Dongarra, J.5
DuCroz, J.6
Greenbaum, A.7
Hammarling, S.8
McKenney, A.9
Sorensen, D.10
-
10
-
-
0003706460
-
-
SIAM, Philadelphia
-
E. Anderson, Z. Bai, J. Demmel, J. Dongarra, J. DuCroz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov and D. Sorensen, LAPACK Users' Guide, SIAM, Philadelphia, 1992.
-
(1992)
LAPACK Users' Guide
-
-
Anderson, E.1
Bai, Z.2
Demmel, J.3
Dongarra, J.4
DuCroz, J.5
Greenbaum, A.6
Hammarling, S.7
McKenney, A.8
Ostrouchov, S.9
Sorensen, D.10
-
11
-
-
0018515759
-
Basic linear algebra subprograms for Fortran usage
-
C. L. Lawson, R. J. Hanson, D. R. Kincaid and F. T. Krogh, 'Basic linear algebra subprograms for Fortran usage', TOMS, 5, (3), 308-323 (1979).
-
(1979)
TOMS
, vol.5
, Issue.3
, pp. 308-323
-
-
Lawson, C.L.1
Hanson, R.J.2
Kincaid, D.R.3
Krogh, F.T.4
-
12
-
-
0023983122
-
An extended set of FORTRAN basic linear algebra subprograms
-
J. J. Dongarra, J. Du Croz, S. Hammarling and R. J. Hanson, 'An extended set of FORTRAN basic linear algebra subprograms', TOMS, 14, (1), 1-17 (1988).
-
(1988)
TOMS
, vol.14
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Hanson, R.J.4
-
13
-
-
0025402476
-
A set of Level 3 basic linear algebra subprograms
-
J. J. Dongarra, J. Du Croz, S. Hammarling and I. Duff, 'A set of Level 3 basic linear algebra subprograms', TOMS, 16, (1), 1-16 (1990).
-
(1990)
TOMS
, vol.16
, Issue.1
, pp. 1-16
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
14
-
-
0003978709
-
-
LAPACK Working Note 100, University of Tennessee, CS-95-292, May
-
J. Choi, J. Dongarra, S. Ostrouchov, A. Petitet, D. Walker and R. C. Whaley, 'A proposal for a set of parallel basic linear algebra subprograms', LAPACK Working Note 100, University of Tennessee, CS-95-292, May 1995.
-
(1995)
A Proposal for a Set of Parallel Basic Linear Algebra Subprograms
-
-
Choi, J.1
Dongarra, J.2
Ostrouchov, S.3
Petitet, A.4
Walker, D.5
Whaley, R.C.6
-
15
-
-
0031123769
-
-
TR-95-13, Department of Computer Sciences, University of Texas, April
-
R. van de Geijn and J. Watts, 'SUMMA: Scalable universal matrix multiplication algorithm', TR-95-13, Department of Computer Sciences, University of Texas, April 1995. Also: LAPACK Working Note 96, May 1, Concurrency: Pract. Exp., 9, (4), 255-274 (1997).
-
(1995)
SUMMA: Scalable Universal Matrix Multiplication Algorithm
-
-
Van De Geijn, R.1
Watts, J.2
-
16
-
-
0031123769
-
-
Also: LAPACK Working Note 96, May 1
-
R. van de Geijn and J. Watts, 'SUMMA: Scalable universal matrix multiplication algorithm', TR-95-13, Department of Computer Sciences, University of Texas, April 1995. Also: LAPACK Working Note 96, May 1, Concurrency: Pract. Exp., 9, (4), 255-274 (1997).
-
(1997)
Concurrency: Pract. Exp.
, vol.9
, Issue.4
, pp. 255-274
-
-
-
17
-
-
0028530654
-
PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers
-
J. Choi, J. J. Dongarra and D. W. Walker, 'PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers', Concurrency: Pract. Exp., 6, (7), 543-570 (1994).
-
(1994)
Concurrency: Pract. Exp.
, vol.6
, Issue.7
, pp. 543-570
-
-
Choi, J.1
Dongarra, J.J.2
Walker, D.W.3
-
18
-
-
0037970044
-
Comparison of scalable parallel matrix multiplication libraries
-
Starkville, MS, Oct.
-
S. Huss-Lederman, E. Jacobson and A. Tsao, 'Comparison of scalable parallel matrix multiplication libraries', in Proceedings of the Scalable Parallel Libraries Conference, Starkville, MS, Oct. 1993.
-
(1993)
Proceedings of the Scalable Parallel Libraries Conference
-
-
Huss-Lederman, S.1
Jacobson, E.2
Tsao, A.3
-
19
-
-
0028529387
-
Matrix multiplication on the Intel Touchstone DELTA
-
S. Huss-Lederman, E. Jacobson, A. Tsao and G. Zhang, 'Matrix multiplication on the Intel Touchstone DELTA', Concurrency: Pract. Exp., 6, (7), 571-594 (1994).
-
(1994)
Concurrency: Pract. Exp.
, vol.6
, Issue.7
, pp. 571-594
-
-
Huss-Lederman, S.1
Jacobson, E.2
Tsao, A.3
Zhang, G.4
-
21
-
-
14744301600
-
Fast collective communication libraries, please
-
P. Mitra, D. Payne, L. Shuler, R. van de Geijn and J. Watts, 'Fast collective communication libraries, please', in the Proceedings of the Intel Supercomputing Users' Group Meeting 1995.
-
Proceedings of the Intel Supercomputing Users' Group Meeting 1995
-
-
Mitra, P.1
Payne, D.2
Shuler, L.3
Van De Geijn, R.4
Watts, J.5
-
22
-
-
12444294511
-
-
IBM T.J. Watson Research Center
-
R. C. Agarwal, F. G. Gustavson, S. M. Balle, M. Joshi and P. Palkar, 'A high performance matrix multiplication algorithm for MPPs', IBM T.J. Watson Research Center, 1995.
-
(1995)
A High Performance Matrix Multiplication Algorithm for MPPs
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Balle, S.M.3
Joshi, M.4
Palkar, P.5
-
23
-
-
85033307657
-
-
TR-96-09, Department of Computer Sciences, University of Texas, May
-
A. Chtchelkanova, C. Edwards, J. Gunnels, G. Morrow, J. Overfelt and R. A. van de Geijn, 'Towards usable and lean parallel linear algebra libraries', TR-96-09, Department of Computer Sciences, University of Texas, May 1996.
-
(1996)
Towards Usable and Lean Parallel Linear Algebra Libraries
-
-
Chtchelkanova, A.1
Edwards, C.2
Gunnels, J.3
Morrow, G.4
Overfelt, J.5
Van De Geijn, R.A.6
-
26
-
-
0029312007
-
A pipelined broadcast for multidimensional meshes
-
J. Watts and R. van de Geijn, 'A pipelined broadcast for multidimensional meshes', Parallel Process. Lett., 5, (2), 281-292 (1995).
-
(1995)
Parallel Process. Lett.
, vol.5
, Issue.2
, pp. 281-292
-
-
Watts, J.1
Van De Geijn, R.2
-
27
-
-
4744342117
-
-
TR-95-40, Department of Computer Sciences, University of Texas, Oct.
-
A. Chtchelkanova, J. Gunnels, G. Morrow, J. Overfelt and R. A. van de Geijn, 'Parallel implementation of BLAS: General techniques for Level 3 BLAS', TR-95-40, Department of Computer Sciences, University of Texas, Oct. 1995.
-
(1995)
Parallel Implementation of BLAS: General Techniques for Level 3 BLAS
-
-
Chtchelkanova, A.1
Gunnels, J.2
Morrow, G.3
Overfelt, J.4
Van De Geijn, R.A.5
-
28
-
-
12444256113
-
-
Department of Computer Sciences, UT-Austin, Report TR95-39, Oct.
-
C. Edwards, P. Geng, A. Patra, and R. van de Geijn, 'Parallel matrix distributions: have we been doing it all wrong?', Department of Computer Sciences, UT-Austin, Report TR95-39, Oct. 1995.
-
(1995)
Parallel Matrix Distributions: Have We Been Doing It All Wrong?
-
-
Edwards, C.1
Geng, P.2
Patra, A.3
Van De Geijn, R.4
-
29
-
-
0000262001
-
On parallelizable eigensolvers
-
L. Auslander and A. Tsao, 'On parallelizable eigensolvers', Adv. Appl. Math., 13, 253-261 (1992).
-
(1992)
Adv. Appl. Math.
, vol.13
, pp. 253-261
-
-
Auslander, L.1
Tsao, A.2
-
30
-
-
0001175581
-
Design of a parallel nonsymmetric eigenroutine toolbox, Part I
-
R. Sincovec, D. Keyes, M. Leuze, L. Petzold and D. Reed (Eds.), SIAM Publications, Philadelphia, PA
-
Z. Bai and J. Demmel, 'Design of a parallel nonsymmetric eigenroutine toolbox, Part I', Parallel Processing for Scientific Computing, R. Sincovec, D. Keyes, M. Leuze, L. Petzold and D. Reed (Eds.), SIAM Publications, Philadelphia, PA, 1993, pp. 391-398.
-
(1993)
Parallel Processing for Scientific Computing
, pp. 391-398
-
-
Bai, Z.1
Demmel, J.2
-
31
-
-
0040176467
-
-
LAPACK working note 91, University of Tennessee, Jan.
-
Z. Bai, J. Demmel, J. Dongarra, A. Petitet, H. Robinson and K. Stanley, The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Parallel Computers, LAPACK working note 91, University of Tennessee, Jan. 1995.
-
(1995)
The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Parallel Computers
-
-
Bai, Z.1
Demmel, J.2
Dongarra, J.3
Petitet, A.4
Robinson, H.5
Stanley, K.6
-
32
-
-
0039122444
-
A parallelizable eigensolver for real diagonalizable matrices with real eigenvalues
-
Supercomputing Research Center
-
S. Lederman, A. Tsao and T. Turnbull, 'A parallelizable eigensolver for real diagonalizable matrices with real eigenvalues', Technical Report TR-91-042, Supercomputing Research Center, 1991.
-
(1991)
Technical Report TR-91-042
-
-
Lederman, S.1
Tsao, A.2
Turnbull, T.3
-
33
-
-
0039740892
-
Anatomy of an out-of-core dense linear solver
-
K. Klimkowski and R. van de Geijn, 'Anatomy of an out-of-core dense linear solver', Vol III, Algorithms and Applications, Proceedings of the 1995 International Conference on Parallel Processing, pp. 29-33.
-
Algorithms and Applications, Proceedings of the 1995 International Conference on Parallel Processing
, vol.3
, pp. 29-33
-
-
Klimkowski, K.1
Van De Geijn, R.2
-
34
-
-
85033305549
-
A high performance parallel Strassen implementation
-
to be published
-
B. Grayson and R. van de Geijn, 'A high performance parallel Strassen implementation', Parallel Process. Lett., to be published.
-
Parallel Process. Lett.
-
-
Grayson, B.1
Van De Geijn, R.2
-
35
-
-
0028545949
-
A high performance matrix multiplication algorithm on a distributed-memory parallel computer, using overlapped communication
-
R. C. Agarwal, F. G. Gustavson and M. Zubair, 'A high performance matrix multiplication algorithm on a distributed-memory parallel computer, using overlapped communication', IBM J. Res. Dev., 673-681 (1994).
-
(1994)
IBM J. Res. Dev.
, pp. 673-681
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
37
-
-
0005269376
-
Level 3 BLAS for distributed memory concurrent computers
-
Saint Hilaire du Touvet, France, 7-8 Sept. 1992, Elsevier Science Publishers
-
J. Choi, J. J. Dongarra and D. W. Walker, 'Level 3 BLAS for distributed memory concurrent computers', CNRS-NSF Workshop on Environments and Tools for Parallel Scientific Computing, Saint Hilaire du Touvet, France, 7-8 Sept. 1992, Elsevier Science Publishers, 1992.
-
(1992)
CNRS-NSF Workshop on Environments and Tools for Parallel Scientific Computing
-
-
Choi, J.1
Dongarra, J.J.2
Walker, D.W.3
-
38
-
-
0003793981
-
-
SIAM, Philadelphia
-
J. J. Dongarra, I. S. Duff, D. C. Sorensen and H. A. van der Vorst, Solving Linear Systems on Vector and Shared Memory Computers, SIAM, Philadelphia, 1991.
-
(1991)
Solving Linear Systems on Vector and Shared Memory Computers
-
-
Dongarra, J.J.1
Duff, I.S.2
Sorensen, D.C.3
Van Der Vorst, H.A.4
-
40
-
-
0023288009
-
Matrix algorithms on a hypercube I: Matrix multiplication
-
G. Fox, S. Otto and A. Hey, 'Matrix algorithms on a hypercube I: Matrix multiplication', Parallel Comput., 3, 17-31 (1987).
-
(1987)
Parallel Comput.
, vol.3
, pp. 17-31
-
-
Fox, G.1
Otto, S.2
Hey, A.3
-
43
-
-
12444336579
-
Level 2 and 3 BLAS routines for the IBM 3090 VF/400: Implementation and experiences
-
Information Processing, University of Umeå, S-901 87 Umeå, Sweden
-
B. Kågström and P. Ling, 'Level 2 and 3 BLAS routines for the IBM 3090 VF/400: Implementation and experiences', Technical Report UMINF-154.88, Information Processing, University of Umeå, S-901 87 Umeå, Sweden, 1988.
-
(1988)
Technical Report UMINF-154.88
-
-
Kågström, B.1
Ling, P.2
-
44
-
-
85033287874
-
Implementing matrix-vector multiplication and conjugate gradient algorithms on distributed memory multicomputers
-
J. G. Lewis and R. A. van de Geijn, 'Implementing matrix-vector multiplication and conjugate gradient algorithms on distributed memory multicomputers', Supercomputing '93.
-
Supercomputing '93
-
-
Lewis, J.G.1
Van De Geijn, R.A.2
-
46
-
-
0026973156
-
A matrix product algorithm and its comparative performance on hypercubes
-
Q. Stout and M. Wolfe (eds.), IEEE Press, Los Alamitos, CA
-
C. Lin and L. Snyder, 'A matrix product algorithm and its comparative performance on hypercubes', in Proceedings of Scalable High Performance Computing Conference, Q. Stout and M. Wolfe (eds.), IEEE Press, Los Alamitos, CA, 1992, pp. 190-193.
-
(1992)
Proceedings of Scalable High Performance Computing Conference
, pp. 190-193
-
-
Lin, C.1
Snyder, L.2