-
4
-
-
0028549474
-
Improving the ratio of memory operations to floating-point operations in loops
-
S. Carr and K. Kennedy. Improving the ratio of memory operations to floating-point operations in loops. ACM Transactions on Programming Languages and Systems, 16(6):1768-1810, 1994.
-
(1994)
ACM Transactions on Programming Languages and Systems
, vol.16
, Issue.6
, pp. 1768-1810
-
-
Carr, S.1
Kennedy, K.2
-
6
-
-
0029235623
-
Hierarchical Tiling for Improved Superscalar Performance
-
Santa Barbara, CA, Apr.
-
L. Carter, J. Ferrante, and S. F. Hummel. Hierarchical Tiling for Improved Superscalar Performance. In Proc. 9th International Parallel Processing Symposium, Santa Barbara, CA, Apr. 1995.
-
(1995)
Proc. 9th International Parallel Processing Symposium
-
-
Carter, L.1
Ferrante, J.2
Hummel, S.F.3
-
7
-
-
0032652980
-
Nonlinear Array Layouts for Hierarchical Memory Systems
-
S. Chatterjee, V. V. Jain, A. R. Lebeck, S. Mundhra, and M. Thottethodi. Nonlinear Array Layouts For Hierarchical Memory Systems. In Proc. 13th ACM Int'l Conference on Supercomputing, Rhodes, Greece, 1999.
-
(1999)
Proc. 13th ACM Int'l Conference on Supercomputing, Rhodes, Greece
-
-
Chatterjee, S.1
Jain, V.V.2
Lebeck, A.R.3
Mundhra, S.4
Thottethodi, M.5
-
9
-
-
0005042318
-
Applying Recursion to Serial and Parallel QR Factorization Leads to Better Performance
-
IBM T.J. Watson Research Center
-
E. Elmroth and F. Gustavson. Applying Recursion to Serial and Parallel QR Factorization Leads to Better Performance. Technical report, IBM T.J. Watson Research Center.
-
Technical Report
-
-
Elmroth, E.1
Gustavson, F.2
-
11
-
-
0001366267
-
Strategies for cache and local memory management by global program transformation
-
Oct.
-
D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformation. Journal of Parallel and Distributed Computing, 5(5):587-616, Oct. 1988.
-
(1988)
Journal of Parallel and Distributed Computing
, vol.5
, Issue.5
, pp. 587-616
-
-
Gannon, D.1
Jalby, W.2
Gallivan, K.3
-
13
-
-
0031273280
-
Recursion Leads to Automatic Variable Blocking for Dense Linear-algebra Algorithms
-
Nov
-
F. G. Gustavson. Recursion Leads To Automatic Variable Blocking For Dense Linear-algebra Algorithms. IBM J. Res. Develop, 41(6), Nov 1997.
-
(1997)
IBM J. Res. Develop
, vol.41
, Issue.6
-
-
Gustavson, F.G.1
-
14
-
-
0003904906
-
The Omega Library Interface Guide
-
Dept. of Computer Science, Univ. of Maryland, College Park, Apr.
-
W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman, and D. Wonnacott. The Omega Library Interface Guide. Technical report, Dept. of Computer Science, Univ. of Maryland, College Park, Apr. 1996.
-
(1996)
Technical Report
-
-
Kelly, W.1
Maslov, V.2
Pugh, W.3
Rosser, E.4
Shpeisman, T.5
Wonnacott, D.6
-
15
-
-
0029207952
-
Code generation for multiple mappings
-
McLean, VA, Feb.
-
W. Kelly, W. Pugh, and E. Rosser. Code generation for multiple mappings. In Frontiers '95: The 5th Symposium on the Frontiers of Massively Parallel Computation, McLean, VA, Feb. 1995.
-
(1995)
Frontiers '95: The 5th Symposium on the Frontiers of Massively Parallel Computation
-
-
Kelly, W.1
Pugh, W.2
Rosser, E.3
-
16
-
-
0030685988
-
Data-centric multi-level blocking
-
Las Vegas, NV, June
-
I. Kodukula, N. Ahmed, and K. Pingali. Data-centric multi-level blocking. In Proceedings of the SIGPLAN '97 Conference on Programming Language Design and Implementation, Las Vegas, NV, June 1997.
-
(1997)
Proceedings of the SIGPLAN '97 Conference on Programming Language Design and Implementation
-
-
Kodukula, I.1
Ahmed, N.2
Pingali, K.3
-
17
-
-
0026137116
-
The cache performance and optimizations of blocked algorithms
-
Santa Clara, CA, Apr.
-
M. Lam, E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. In Proc. Fourth Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IV), Santa Clara, CA, Apr. 1991.
-
(1991)
Proc. Fourth Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IV)
-
-
Lam, M.1
Rothberg, E.2
Wolf, M.E.3
-
18
-
-
0030190854
-
Improving data locality with loop transformations
-
July
-
K. S. McKinley, S. Carr, and C.-W. Tseng. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems, 18(4):424-453, July 1996.
-
(1996)
ACM Transactions on Programming Languages and Systems
, vol.18
, Issue.4
, pp. 424-453
-
-
McKinley, K.S.1
Carr, S.2
Tseng, C.-W.3
-
19
-
-
0005062333
-
MHSIM: A Configurable Simulator for Multi-level Memory Hierarchies
-
Rice University
-
J. Mellor-Crummey and D. Whalley. MHSIM: A Configurable Simulator for Multi-level Memory Hierarchies. Technical Report TR-00-357, Rice University.
-
Technical Report TR-00-357
-
-
Mellor-Crummey, J.1
Whalley, D.2
-
20
-
-
0032684978
-
Improving Memory Hierarchy Performance for Irregular Applications
-
J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving Memory Hierarchy Performance For Irregular Applications. In Proc. 13th ACM Int'l Conference on Supercomputing, Rhodes, Greece, 1999.
-
(1999)
Proc. 13th ACM Int'l Conference on Supercomputing, Rhodes, Greece
-
-
Mellor-Crummey, J.1
Whalley, D.2
Kennedy, K.3
-
21
-
-
0030387154
-
An Analysis of Dag-Consistent Distributed Shared-Memory Algorithms
-
June
-
R. D. Blumofe, M. Frigo, C. F. Joerg, C. E. Leiserson, and K. H. Randall. An Analysis Of Dag-Consistent Distributed Shared-Memory Algorithms. In Proc. Eighth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), Padua, Italy, June 1996.
-
(1996)
Proc. Eighth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), Padua, Italy
-
-
Blumofe, R.D.1
Frigo, M.2
Joerg, C.F.3
Leiserson, C.E.4
Randall, K.H.5
-
28
-
-
0005042320
-
Iteration Space Slicing for Locality
-
July
-
W. Pugh and E. Rosser. Iteration Space Slicing For Locality. In LCPC 99, July 1999.
-
(1999)
LCPC 99
-
-
Pugh, W.1
Rosser, E.2