-
1
-
-
0028513316
-
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
-
Sept.
-
AGARWAL, R., GUSTAVSON, F., AND ZUBAIR, Z. 1994a. Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms. IBM J. Res. Develop. 38, 5 (Sept.), 563-576.
-
(1994)
IBM J. Res. Develop.
, vol.38
, Issue.5
, pp. 563-576
-
-
Agarwal, R.1
Gustavson, F.2
Zubair, Z.3
-
2
-
-
0028427170
-
Improving performance of linear algebra algorithms for dense matrices using algorithmic prefetching
-
May
-
AGARWAL, R., GUSTAVSON, F., AND ZUBAIR, Z. 1994b. Improving performance of linear algebra algorithms for dense matrices using algorithmic prefetching. IBM J. Res. Develop. 38, 3 (May), 265-275.
-
(1994)
IBM J. Res. Develop.
, vol.38
, Issue.3
, pp. 265-275
-
-
Agarwal, R.1
Gustavson, F.2
Zubair, Z.3
-
3
-
-
0003706460
-
-
SIAM Publications
-
ANDERSON, E., BAI, Z., BISCHOF, C., DEMMEL, J., DONGARRA, J., DUCROZ, J., GREENBAUM, A., HAMMARLING, S., MCKENNY, A., OSTROUCHOV, S., AND SORENSEN, D. 1992. LAPACK Users Guide. SIAM Publications.
-
(1992)
LAPACK Users Guide
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Demmel, J.4
Dongarra, J.5
DuCroz, J.6
Greenbaum, A.7
Hammarling, S.8
McKenny, A.9
Ostrouchov, S.10
Sorensen, D.11
-
4
-
-
0031223129
-
Compiler blockability of dense matrix factorizations
-
Sept.
-
CARR, S. AND LEHOUCQ, R. 1997. Compiler blockability of dense matrix factorizations. ACM Trans. Math. Softw. 23, 3 (Sept.), 336-361.
-
(1997)
ACM Trans. Math. Softw.
, vol.23
, Issue.3
, pp. 336-361
-
-
Carr, S.1
Lehoucq, R.2
-
5
-
-
0346864565
-
Design issues and the performance of level 1 and level 2 kernels on Intel i860-based platforms
-
Department of Computing Science, Umeå University, Umeå, Sweden
-
DACKLAND, K. 1995. Design issues and the performance of level 1 and level 2 kernels on Intel i860-based platforms. Report UMINF-95.xx, Department of Computing Science, Umeå University, Umeå, Sweden.
-
(1995)
Report UMINF-95.xx
-
-
Dackland, K.1
-
6
-
-
0028443077
-
A parallel block implementation of level-3 BLAS for MIMD vector processors
-
June
-
DAYDÉ, M. J., DUFF, I. S., AND PETITET, A. 1994. A parallel block implementation of level-3 BLAS for MIMD vector processors. ACM Trans. Math. Softw. 20, 2 (June), 178-193.
-
(1994)
ACM Trans. Math. Softw.
, vol.20
, Issue.2
, pp. 178-193
-
-
Daydé, M.J.1
Duff, I.S.2
Petitet, A.3
-
7
-
-
0023983122
-
An extended set of Fortran Basic Linear Algebra Subprograms
-
DONGARRA, J., DUCROZ, J. D., HAMMARLING, S., AND HANSON, R. 1988. An extended set of Fortran Basic Linear Algebra Subprograms. ACM Trans. Math. Software 14, 1-17, 18-32.
-
(1988)
ACM Trans. Math. Software
, vol.14
, Issue.1
, pp. 1-17, 18-32
-
-
Dongarra, J.1
DuCroz, J.D.2
Hammarling, S.3
Hanson, R.4
-
8
-
-
0025402476
-
A set of level 3 Basic Linear Algebra Subprograms
-
Mar.
-
DONGARRA, J., DUCROZ, J., DUFF, I., AND HAMMARLING, S. 1990a. A set of level 3 Basic Linear Algebra Subprograms. ACM Trans. Math. Software 16, 1 (Mar.), 1-17.
-
(1990)
ACM Trans. Math. Software
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.1
DuCroz, J.2
Duff, I.3
Hammarling, S.4
-
9
-
-
0025401417
-
Algorithm 679: A set of level 3 Basic Linear Algebra Subprograms: Model implementation and test programs
-
Mar.
-
DONGARRA, J., DUCROZ, J., DUFF, I., AND HAMMARLING, S. 1990b. Algorithm 679: A set of level 3 Basic Linear Algebra Subprograms: Model implementation and test programs. ACM Trans. Math. Software 16, 1 (Mar.), 18-28.
-
(1990)
ACM Trans. Math. Software
, vol.16
, Issue.1
, pp. 18-28
-
-
Dongarra, J.1
DuCroz, J.2
Duff, I.3
Hammarling, S.4
-
10
-
-
0040354150
-
The IBM RISC System 6000 and linear algebra operations
-
DONGARRA, J., MAYES, P., AND RADICATI DI BROZOLO, G. 1991. The IBM RISC System 6000 and linear algebra operations. Supercomput. 8, 4, 15-30.
-
(1991)
Supercomput.
, vol.8
, Issue.4
, pp. 15-30
-
-
Dongarra, J.1
Mayes, P.2
Radicati Di Brozolo, G.3
-
11
-
-
0002663082
-
GEMMV: A portable level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm
-
DOUGLAS, C., HEROUX, M., SLISHMAN, G., AND SMITH, R. 1994. GEMMV: A portable level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm. J. Comput. Phys. 110, 1-10.
-
(1994)
J. Comput. Phys.
, vol.110
, pp. 1-10
-
-
Douglas, C.1
Heroux, M.2
Slishman, G.3
Smith, R.4
-
12
-
-
84972622535
-
Impact of hierarchical memory systems on linear algebra algorithm design
-
GALLIVAN, K., JALBY, W., MEIER, U., AND SAMEH, A. 1988. Impact of hierarchical memory systems on linear algebra algorithm design. Int. J. Supercomput. Appl. 2, 12-48.
-
(1988)
Int. J. Supercomput. Appl.
, vol.2
, pp. 12-48
-
-
Gallivan, K.1
Jalby, W.2
Meier, U.3
Sameh, A.4
-
14
-
-
0348125138
-
-
Working Note (April), Department of Mathematics, University of Manchester, Manchester, UK
-
GREEN, M. 1994. High performance level 3 BLAS. A KSR implementation. Working Note (April), Department of Mathematics, University of Manchester, Manchester, UK.
-
(1994)
High Performance Level 3 BLAS. A KSR Implementation
-
-
Green, M.1
-
15
-
-
0025637437
-
Exploiting fast matrix multiplication within the level 3 BLAS
-
HIGHAM, N. 1990. Exploiting fast matrix multiplication within the level 3 BLAS. ACM Trans. Math. Softw. 16, 4, 352-368.
-
(1990)
ACM Trans. Math. Softw.
, vol.16
, Issue.4
, pp. 352-368
-
-
Higham, N.1
-
17
-
-
0040000454
-
-
Technical Report. 312936-001 (Oct.), Intel Supercomputer Division. Beaverton, Ore.
-
INTEL. 1993. Paragon Basic Math Library performance report. Technical Report. 312936-001 (Oct.), Intel Supercomputer Division. Beaverton, Ore.
-
(1993)
Paragon Basic Math Library Performance Report
-
-
-
18
-
-
10844292223
-
-
Technical Report CTC91TR47 (Dec.), Department of Computer Science, Cornell University
-
KÅGSTRÖM, B. AND VAN LOAN, C. 1989. GEMM-based level 3 BLAS. Technical Report CTC91TR47 (Dec.), Department of Computer Science, Cornell University.
-
(1989)
GEMM-based Level 3 BLAS
-
-
Kågström, B.1
Van Loan, C.2
-
19
-
-
0346234145
-
High performance GEMM-based level 3 BLAS: Sample routines for double precision real data
-
(Amsterdam, 1991). North-Holland
-
KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1991. High performance GEMM-based level 3 BLAS: Sample routines for double precision real data. In High Performance Computing II (Amsterdam, 1991). North-Holland, 269-281.
-
(1991)
High Performance Computing II
, pp. 269-281
-
-
Kågström, B.1
Ling, P.2
Van Loan, C.3
-
20
-
-
10844275231
-
Portable high performance GEMM-based level 3 BLAS
-
(Philadelphia, 1993). SIAM Publications
-
KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1993. Portable high performance GEMM-based level 3 BLAS. In Parallel Processing for Scientific Computing (Philadelphia, 1993). SIAM Publications, 339-346.
-
(1993)
Parallel Processing for Scientific Computing
, pp. 339-346
-
-
Kågström, B.1
Ling, P.2
Van Loan, C.3
-
21
-
-
26744449251
-
-
Report UMINF-94.13 (December), Department of Computing Science, Umeå University, Umeå, Sweden. Revised, December
-
KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1994. GEMM-based level 3 BLAS: Algorithms for the model implementations. Report UMINF-94.13 (December), Department of Computing Science, Umeå University, Umeå, Sweden. Revised, December 1995.
-
(1994)
GEMM-based Level 3 BLAS: Algorithms for the Model Implementations
-
-
Kågström, B.1
Ling, P.2
Van Loan, C.3
-
22
-
-
0032155342
-
Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues
-
This issue
-
KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1998. Algorithm 784: GEMM-based level 3 BLAS: Portability and optimization issues. ACM Trans. Math. Software. This issue.
-
(1998)
ACM Trans. Math. Software
-
-
Kågström, B.1
Ling, P.2
Van Loan, C.3
-
23
-
-
0018515759
-
Basic Linear Algebra Subprograms for Fortran usage
-
LAWSON, C., HANSON, R., KINCAID, R., AND KROGH, F. 1979. Basic Linear Algebra Subprograms for Fortran usage. ACM Trans. Math. Softw. 5, 308-323.
-
(1979)
ACM Trans. Math. Softw.
, vol.5
, pp. 308-323
-
-
Lawson, C.1
Hanson, R.2
Kincaid, R.3
Krogh, F.4
-
24
-
-
0027656965
-
A set of high performance level-3 BLAS structured and tuned for the IBM 3090 VF and implemented in Fortran 77
-
Sept.
-
LING, P. 1993. A set of high performance level-3 BLAS structured and tuned for the IBM 3090 VF and implemented in Fortran 77. J. Supercomput. 7, 3 (Sept.), 323-355.
-
(1993)
J. Supercomput.
, vol.7
, Issue.3
, pp. 323-355
-
-
Ling, P.1
-
25
-
-
0347495130
-
Implementation of the level 2 and 3 BLAS on the CRAY Y-MP and the CRAY-2
-
Feb.
-
SHEIK, Q., PHUONG, V., CHAO, Y., AND MERCHANT, M. 1992. Implementation of the level 2 and 3 BLAS on the CRAY Y-MP and the CRAY-2. J. Supercomput. 5, 4 (Feb.), 291-305.
-
(1992)
J. Supercomput.
, vol.5
, Issue.4
, pp. 291-305
-
-
Sheik, Q.1
Phuong, V.2
Chao, Y.3
Merchant, M.4
-
26
-
-
34250487811
-
Gaussian elimination is not optimal
-
STRASSEN, V. 1969. Gaussian elimination is not optimal. Numer. Math. 13, 354-356.
-
(1969)
Numer. Math.
, vol.13
, pp. 354-356
-
-
Strassen, V.1