-
1
-
-
0028513316
-
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
-
Agarwal, R.C., Gustavson, F.G., Zubair, M.: Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms. IBM Journal of Research and Development 38(5), 563-576 (1994)
-
(1994)
IBM Journal of Research and Development
, vol.38
, Issue.5
, pp. 563-576
-
-
Agarwal, R.C.1
Gustavson, F.G.2
Zubair, M.3
-
2
-
-
0003003638
-
A study of replacement algorithms for a virtual-storage computer
-
Belady, L.A.: A study of replacement algorithms for a virtual-storage computer. IBM Systems Journal 5(2), 78-101 (1966)
-
(1966)
IBM Systems Journal
, vol.5
, Issue.2
, pp. 78-101
-
-
Belady, L.A.1
-
3
-
-
21044454029
-
Design and Exploitation of a High-performance SIMD Floating-point Unit for Blue Gene/L
-
Chatterjee, S., et al.: Design and Exploitation of a High-performance SIMD Floating-point Unit for Blue Gene/L. IBM Journal of Research and Development 49(2-3), 377-391 (2005)
-
(2005)
IBM Journal of Research and Development
, vol.49
, Issue.2-3
, pp. 377-391
-
-
Chatterjee, S.1
-
4
-
-
0033350255
-
Cache-oblivious Algorithms
-
IEEE Computer Society Press, Los Alamitos
-
Frigo, M., Leiserson, C., Prokop, H., Ramachandran, S.: Cache-oblivious Algorithms. In: FOCS '99: Proceedings of the 40th Annual Symposium on Foundations of Computer Science, p. 285. IEEE Computer Society Press, Los Alamitos (1999)
-
(1999)
FOCS '99: Proceedings of the 40th Annual Symposium on Foundations of Computer Science
, p. 285
-
-
Frigo, M.1
Leiserson, C.2
Prokop, H.3
Ramachandran, S.4
-
5
-
-
0021310295
-
Implementing Linear Algebra Algorithms for Dense Matrices on a Vector Pipeline Machine
-
Dongarra, J.J., Gustavson, F.G., Karp, A.: Implementing Linear Algebra Algorithms for Dense Matrices on a Vector Pipeline Machine. SIAM Review 26(1), 91-112 (1984)
-
(1984)
SIAM Review
, vol.26
, Issue.1
, pp. 91-112
-
-
Dongarra, J.J.1
Gustavson, F.G.2
Karp, A.3
-
6
-
-
1842832833
-
Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software
-
Elmroth, E., Gustavson, F.G., Kågström, B., Jonsson, I.: Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software. SIAM Review 46(1), 3-45 (2004)
-
(2004)
SIAM Review
, vol.46
, Issue.1
, pp. 3-45
-
-
Elmroth, E.1
Gustavson, F.G.2
Kågström, B.3
Jonsson, I.4
-
8
-
-
33745303272
-
-
Gunnels, J.A., Gustavson, F.G., Henry, G.M., van de Geijn, R.A.: A Family of High-Performance Matrix Multiplication Algorithms. In: Dongarra, J.J., Madsen, K., Waśniewski, J. (eds.) PARA 2004. LNCS, vol. 3732, pp. 256-265. Springer, Heidelberg (2006)
-
-
-
-
9
-
-
0031273280
-
Recursion Leads to Automatic Variable Blocking for Dense Linear-Algebra Algorithms
-
Gustavson, F.G.: Recursion Leads to Automatic Variable Blocking for Dense Linear-Algebra Algorithms. IBM Journal of Research and Development 41(6), 737-755 (1997)
-
(1997)
IBM Journal of Research and Development
, vol.41
, Issue.6
, pp. 737-755
-
-
Gustavson, F.G.1
-
10
-
-
0037230301
-
High Performance Linear Algebra Algorithms using New Generalized Data Structures for Matrices
-
Gustavson, F.G.: High Performance Linear Algebra Algorithms using New Generalized Data Structures for Matrices. IBM Journal of Research and Development 47(1), 31-55 (2003)
-
(2003)
IBM Journal of Research and Development
, vol.47
, Issue.1
, pp. 31-55
-
-
Gustavson, F.G.1
-
11
-
-
38049054439
-
-
Gustavson, F.G., Gunnels, J.A., Sexton, J.C.: Minimal Data Copy for Dense Linear Algebra Factorization. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds.) PARA 2006. LNCS, vol. 4699, pp. 540-549. Springer, Heidelberg (2007)
-
-
-
-
12
-
-
84947926251
-
-
Gustavson, F.G., Henriksson, A., Jonsson, I., Kågström, B., Ling, P.: Recursive blocked data formats and BLAS's for dense linear algebra algorithms. In: Kågström, B., Elmroth, E., Waśniewski, J., Dongarra, J.J. (eds.) PARA 1998. LNCS, vol. 1541, pp. 195-206. Springer, Heidelberg (1998)
-
-
-
-
13
-
-
84947907655
-
-
Gustavson, F.G., Henriksson, A., Jonsson, I., Kågström, B., Ling, P.: Super-scalar GEMM-based level 3 BLAS - the on-going evolution of a portable and high-performance library. In: Kågström, B., Elmroth, E., Waśniewski, J., Dongarra, J.J. (eds.) PARA 1998. LNCS, vol. 1541, pp. 207-215. Springer, Heidelberg (1998)
-
-
-
-
14
-
-
0042235298
-
Tiling, Block Data Layout, and Memory Hierarchy Performance
-
Park, N., Hong, B., Prasanna, V.K.: Tiling, Block Data Layout, and Memory Hierarchy Performance. IEEE Trans. Parallel and Distributed Systems 14(7), 640-654 (2003)
-
(2003)
IEEE Trans. Parallel and Distributed Systems
, vol.14
, Issue.7
, pp. 640-654
-
-
Park, N.1
Hong, B.2
Prasanna, V.K.3
-
15
-
-
38049010611
-
The Price of Cache Obliviousness
-
Technical Report CS-TR-06-43
-
Roeder, T., Yotov, K., Pingali, K., Gunnels, J., Gustavson, F.: The Price of Cache Obliviousness. Technical Report CS-TR-06-43, Department of Computer Science, University of Texas at Austin (September 2006)
-
(2006)
Department of Computer Science, University of Texas at Austin
-
-
Roeder, T.1
Yotov, K.2
Pingali, K.3
Gunnels, J.4
Gustavson, F.5
-
16
-
-
25844437046
-
POWER5 System Microarchitecture
-
Sinharoy, B., Kalla, R.N., Tendler, J.M., Kovacs, R.G., Eickemeyer, R.J., Joyner, J.B.: POWER5 System Microarchitecture. IBM Journal of Research and Development 49(4/5), 505-521 (2005)
-
(2005)
IBM Journal of Research and Development
, vol.49
, Issue.4-5
, pp. 505-521
-
-
Sinharoy, B.1
Kalla, R.N.2
Tendler, J.M.3
Kovacs, R.G.4
Eickemeyer, R.J.5
Joyner, J.B.6
-
17
-
-
0343462141
-
Automated Empirical Optimization of Software and the ATLAS Project
-
Whaley, R.C., Petitet, A., Dongarra, J.J.: Automated Empirical Optimization of Software and the ATLAS Project. Parallel Computing 27(1-2), 3-35 (2001)
-
(2001)
Parallel Computing
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.J.3