-
1
-
-
0028513316
-
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
-
Agarwal, R.C., Gustavson, F.G., Zubair, M.: Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms. IBM Journal of Research and Development 38(5), 563-576 (1994)
-
(1994)
IBM Journal of Research and Development
, vol.38
, Issue.5
, pp. 563-576
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
2
-
-
18044400448
-
A Recursive Formulation of Cholesky Factorization of a Matrix in Packed Storage
-
Andersen, B.S., Gustavson, F.G., Waśniewski, J.: A Recursive Formulation of Cholesky Factorization of a Matrix in Packed Storage. ACM TOMS 27(2), 214-244 (2001)
-
(2001)
ACM TOMS
, vol.27
, Issue.2
, pp. 214-244
-
-
Andersen, B.S.1
Gustavson, F.G.2
Waśniewski, J.3
-
3
-
-
30544437857
-
A Fully Portable High Performance Minimal Storage Hybrid Cholesky Algorithm
-
Andersen, B.S., Gunnels, J.A., Gustavson, F.G., Reid, J.K., Waśniewski, J.: A Fully Portable High Performance Minimal Storage Hybrid Cholesky Algorithm. ACM TOMS 31(2), 201-227 (2005)
-
(2005)
ACM TOMS
, vol.31
, Issue.2
, pp. 201-227
-
-
Andersen, B.S.1
Gunnels, J.A.2
Gustavson, F.G.3
Reid, J.K.4
Waśniewski, J.5
-
4
-
-
0003706460
-
-
3.0, SIAM, Philadelphia
-
Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., Sorensen, D.: LAPACK Users' Guide Release 3.0, SIAM, Philadelphia (1999), http://www.netlib.org/lapack/lug/lapack.lug.html
-
(1999)
LAPACK Users' Guide Release
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Demmel, J.4
Dongarra, J.5
Du Croz, J.6
Greenbaum, A.7
Hammarling, S.8
McKenney, A.9
Ostrouchov, S.10
Sorensen, D.11
-
5
-
-
0030661485
-
Optimizing Matrix Multiply Using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology
-
Vienna, Austria
-
Bilmes, J., Asanovic, K., Whye Chin, C., Demmel, J.: Optimizing Matrix Multiply Using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology. In: Proceedings of International Conference on Supercomputing, Vienna, Austria (1997)
-
(1997)
Proceedings of International Conference on Supercomputing
-
-
Bilmes, J.1
Asanovic, K.2
Whye Chin, C.3
Demmel, J.4
-
6
-
-
21044454029
-
Design and Exploitation of a High-performance SIMD Floating-point Unit for Blue Gene/L
-
Chatterjee, S., et al.: Design and Exploitation of a High-performance SIMD Floating-point Unit for Blue Gene/L. IBM Journal of Research and Development 49(2-3), 377-391 (2005)
-
(2005)
IBM Journal of Research and Development
, vol.49
, Issue.2-3
, pp. 377-391
-
-
Chatterjee, S.1
-
7
-
-
0003555195
-
-
2.0. SIAM, Philadelphia
-
Dongarra, J.J., Moler, C.B., Bunch, J.R., Stewart, G.W.: LINPACK Users' Guide Release 2.0. SIAM, Philadelphia (1979)
-
(1979)
LINPACK Users' Guide Release
-
-
Dongarra, J.J.1
Moler, C.B.2
Bunch, J.R.3
Stewart, G.W.4
-
8
-
-
0021310295
-
Implementing Linear Algebra Algorithms for Dense Matrices on a Vector Pipeline Machine
-
Dongarra, J.J., Gustavson, F.G., Karp, A.: Implementing Linear Algebra Algorithms for Dense Matrices on a Vector Pipeline Machine. SIAM Review 26(1), 91-112 (1984)
-
(1984)
SIAM Review
, vol.26
, Issue.1
, pp. 91-112
-
-
Dongarra, J.J.1
Gustavson, F.G.2
Karp, A.3
-
9
-
-
0023983122
-
An Extended Set of FORTRAN Basic Linear Algebra Subprograms
-
Dongarra, J.J., Du Croz, J., Hammarling, S., Hanson, R.J.: An Extended Set of FORTRAN Basic Linear Algebra Subprograms. TOMS 14(1), 1-17 (1988)
-
(1988)
TOMS
, vol.14
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Hanson, R.J.4
-
10
-
-
0025402476
-
A Set of Level 3 Basic Linear Algebra Subprograms
-
Dongarra, J.J., Du Croz, J., Hammarling, S., Duff, I.: A Set of Level 3 Basic Linear Algebra Subprograms. TOMS 16(1), 1-17 (1990)
-
(1990)
TOMS
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
11
-
-
1842832833
-
Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software
-
Elmroth, E., Gustavson, F.G., Kågström, B., Jonsson, I.: Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software. SIAM Review 46(1), 3-45 (2004)
-
(2004)
SIAM Review
, vol.46
, Issue.1
, pp. 3-45
-
-
Elmroth, E.1
Gustavson, F.G.2
Kågström, B.3
Jonsson, I.4
-
12
-
-
0039435412
-
Formal linear algebra methods environment (FLAME)
-
Gunnels, J., Gustavson, F.G., Henry, G., van de Geijn, R.: Formal linear algebra methods environment (FLAME). ACM TOMS 27(4), 422-455 (2001)
-
(2001)
ACM TOMS
, vol.27
, Issue.4
, pp. 422-455
-
-
Gunnels, J.1
Gustavson, F.G.2
Henry, G.3
van de Geijn, R.4
-
13
-
-
33745301710
-
-
Gunnels, J.A., Gustavson, F.G.: A New Array Format for Symmetric and Triangular Matrices. In: Dongarra, J.J., Madsen, K., Waśniewski, J. (eds.) PARA 2004. LNCS, vol. 3732, pp. 247-255. Springer, Heidelberg (2006)
-
-
-
-
14
-
-
33745303272
-
-
Gunnels, J.A., Gustavson, F.G., Henry, G.M., van de Geijn, R.A.: A Family of High-Performance Matrix Multiplication Algorithms. In: Dongarra, J.J., Madsen, K., Waśniewski, J. (eds.) PARA 2004. LNCS, vol. 3732, pp. 256-265. Springer, Heidelberg (2006)
-
-
-
-
15
-
-
0031273280
-
Recursion Leads to Automatic Variable Blocking for Dense Linear-Algebra Algorithms
-
Gustavson, F.G.: Recursion Leads to Automatic Variable Blocking for Dense Linear-Algebra Algorithms. IBM Journal of Research and Development 41(6), 737-755 (1997)
-
(1997)
IBM Journal of Research and Development
, vol.41
, Issue.6
, pp. 737-755
-
-
Gustavson, F.G.1
-
16
-
-
0034312453
-
Minimal Storage High Performance Cholesky via Blocking and Recursion
-
Gustavson, F.G., Jonsson, I.: Minimal Storage High Performance Cholesky via Blocking and Recursion. IBM Journal of Research and Development 44(6), 823-849 (2000)
-
(2000)
IBM Journal of Research and Development
, vol.44
, Issue.6
, pp. 823-849
-
-
Gustavson, F.G.1
Jonsson, I.2
-
17
-
-
0037230301
-
High Performance Linear Algebra Algorithms using New Generalized Data Structures for Matrices
-
Gustavson, F.G.: High Performance Linear Algebra Algorithms using New Generalized Data Structures for Matrices. IBM Journal of Research and Development 47(1), 31-55 (2003)
-
(2003)
IBM Journal of Research and Development
, vol.47
, Issue.1
, pp. 31-55
-
-
Gustavson, F.G.1
-
18
-
-
33745312312
-
-
Gustavson, F.G.: New Generalized Data Structures for Matrices Lead to a Variety of High Performance Dense Linear Algebra Algorithms. In: Dongarra, J.J., Madsen, K., Waśniewski, J. (eds.) PARA 2004. LNCS, vol. 3732, pp. 11-20. Springer, Heidelberg (2006)
-
-
-
-
19
-
-
38049016587
-
-
Gustavson, F.G., Waśniewski, J.: Rectangular Full Packed Format for LAPACK Algorithms Timings on Several Computers. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds.) PARA 2006. LNCS, vol. 4699, pp. 570-579. Springer, Heidelberg (2007)
-
-
-
-
20
-
-
38049035184
-
-
IBM: IBM Engineering and Scientific Subroutine Library for AIX Version 3, Release 3. IBM Pub. No. SA22-7272-04 (December 2001)
-
-
-
-
21
-
-
38049026531
-
-
August 17-19, Stanford, CA
-
Kalla, R., Sinharoy, B., Tendler, J.: Power 5. HotChips-15, August 17-19, 2003, Stanford, CA (2003)
-
(2003)
Power 5. HotChips-15
-
-
Kalla, R.1
Sinharoy, B.2
Tendler, J.3
-
22
-
-
0018515759
-
Basic Linear Algebra Subprograms for Fortran Usage
-
Lawson, C.L., Hanson, R.J., Kincaid, D.R., Krogh, F.T.: Basic Linear Algebra Subprograms for Fortran Usage. TOMS 5(3), 308-323 (1979)
-
(1979)
TOMS
, vol.5
, Issue.3
, pp. 308-323
-
-
Lawson, C.L.1
Hanson, R.J.2
Kincaid, D.R.3
Krogh, F.T.4
-
23
-
-
0042235298
-
Tiling, Block Data Layout, and Memory Hierarchy Performance
-
Park, N., Hong, B., Prasanna, V.K.: Tiling, Block Data Layout, and Memory Hierarchy Performance. IEEE Trans. Parallel and Distributed Systems 14(7), 640-654 (2003)
-
(2003)
IEEE Trans. Parallel and Distributed Systems
, vol.14
, Issue.7
, pp. 640-654
-
-
Park, N.1
Hong, B.2
Prasanna, V.K.3
-
24
-
-
25844437046
-
POWER5 System Microarchitecture
-
Sinharoy, B., Kalla, R.N., Tendler, J.M., Kovacs, R.G., Eickemeyer, R.J., Joyner, J.B.: POWER5 System Microarchitecture. IBM Journal of Research and Development 49(4/5), 505-521 (2005)
-
(2005)
IBM Journal of Research and Development
, vol.49
, Issue.4-5
, pp. 505-521
-
-
Sinharoy, B.1
Kalla, R.N.2
Tendler, J.M.3
Kovacs, R.G.4
Eickemeyer, R.J.5
Joyner, J.B.6
-
25
-
-
0343462141
-
Automated Empirical Optimization of Software and the ATLAS Project
-
Whaley, R.C., Petitet, A., Dongarra, J.J.: Automated Empirical Optimization of Software and the ATLAS Project. Parallel Computing 27(1-2), 3-35 (2001)
-
(2001)
Parallel Computing
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.J.3