-
1
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
-
J. Bilmes et al. Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology. In Proc. ICS'97, pages 340-347, 1997.
-
(1997)
Proc. ICS'97
, pp. 340-347
-
-
Bilmes, J.1
-
2
-
-
0042697458
-
A multigrid tutorial
-
W. L. Briggs. A Multigrid Tutorial. SIAM, 1987.
-
(1987)
SIAM
-
-
Briggs, W.L.1
-
4
-
-
0002741087
-
-
UCSD Technical Report November
-
Larry Carter, Jeanne Ferrante, Susan Flynn Hummel, Bowen Alpern, Kang-Su Gatlin. Hierarchical Tiling: A Methodology for High Performance. UCSD Technical Report CS96-508, November 1996.
-
(1996)
Hierarchical Tiling: A Methodology for High Performance
-
-
Carter, L.1
Ferrante, J.2
Hummel, S.F.3
Alpern, B.4
Gatlin, K.-S.5
-
5
-
-
84981274540
-
Improving effective bandwidth through compiler enhancement of global cache reuse
-
San Francisco, CA
-
Chen Ding and Ken Kennedy. Improving Effective Bandwidth through Compiler Enhancement of Global Cache Reuse. In Proc. IPDPS 2001, San Francisco, CA, 2001.
-
(2001)
Proc. IPDPS 2001
-
-
Ding, C.1
Kennedy, K.2
-
6
-
-
33745205180
-
Maximizing cache memory usage for multigrid algorithms
-
Z. Chen, R. E. Ewing and Z.-C. Shi, editors, Springer-Verlag, Lecture Notes in Physics, Berlin
-
C. C. Douglas et al. Maximizing Cache Memory Usage for Multigrid Algorithms. In Z. Chen, R. E. Ewing and Z.-C. Shi, editors, Multiphase Flows and Transport in Porous Media: State of the Art, Springer-Verlag, Lecture Notes in Physics, Berlin, 2000.
-
(2000)
Multiphase Flows and Transport in Porous Media: State of the Art
-
-
Douglas, C.C.1
-
7
-
-
85117191258
-
-
FFTW. http://www.fftw.org/.
-
-
-
-
8
-
-
1142307058
-
-
Technical Report Computer Science Division, University of California, Berkeley
-
P. N. Hilfinger et al. Titanium Language Reference Manual. Technical Report CSD-01-1163, Computer Science Division, University of California, Berkeley, 2001.
-
(2001)
Titanium Language Reference Manual
-
-
Hilfinger, P.N.1
-
9
-
-
84875636475
-
Load balancing and data locality via fractiling: An experimental study
-
Boleslaw K. Szymanski and Balaram Sinharoy, editors, Kluwer Academic Publishers, Boston, MA
-
S. Flynn Hummel, I. Banicescu, C. Wang, and J. Wein. Load Balancing and Data Locality via Fractiling: An Experimental Study. In Boleslaw K. Szymanski and Balaram Sinharoy, editors, Proc. Third Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers, pages 85-89. Kluwer Academic Publishers, Boston, MA, 1995.
-
(1995)
Proc. Third Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
, pp. 85-89
-
-
Hummel, S.F.1
Banicescu, I.2
Wang, C.3
Wein, J.4
-
12
-
-
34547524504
-
Increasing temporal locality with skewing and recursive blocking
-
Denver, Colorado, November
-
G. Jin, J. Mellor-Crummey, and R. Fowler. Increasing Temporal Locality with Skewing and Recursive Blocking. In Proc. SC2001, Denver, Colorado, November 2001.
-
(2001)
Proc. SC2001
-
-
Jin, G.1
Mellor-Crummey, J.2
Fowler, R.3
-
13
-
-
0003904906
-
-
Technical Report Dept. of Computer Science, University of Maryland, College Park, March
-
Wayne Kelly, Vadim Maslov, William Pugh, Evan Rosser, Tatiana Shpeisman, and David Wonnacott. The Omega Library Interface Guide. Technical Report CS-TR-3445, Dept. of Computer Science, University of Maryland, College Park, March 1995.
-
(1995)
The Omega Library Interface Guide
-
-
Kelly, W.1
Maslov, V.2
Pugh, W.3
Rosser, E.4
Shpeisman, T.5
Wonnacott, D.6
-
14
-
-
0013103243
-
The effect of cache models on iterative compilation for combined tiling and unrolling
-
T. Kisuki, P. M. W. Knijnenburg, K. Gallivan, and M. F. P. O'Boyle. The Effect of Cache Models on Iterative Compilation for Combined Tiling and Unrolling. In Proc. FDDO-3, pages 31-40, 2000.
-
(2000)
Proc. FDDO-3
, pp. 31-40
-
-
Kisuki, T.1
Knijnenburg, P.M.W.2
Gallivan, K.3
O'Boyle, M.F.P.4
-
16
-
-
84949235179
-
Iterative compilation
-
P. M. W. Knijnenburg, T. Kisuki, and M. F. P. O'Boyle. Iterative Compilation. In Embedded Processor Design Challenges-System Architecture, Modeling and Simulation (SAMOS), Springer Lecture Notes in Computer Science vol. 2268, pages 171-187, 2002.
-
(2002)
Embedded Processor Design Challenges-System Architecture, Modeling and Simulation (SAMOS), Springer Lecture Notes in Computer Science
, vol.2268
, pp. 171-187
-
-
Knijnenburg, P.M.W.1
Kisuki, T.2
O'Boyle, M.F.P.3
-
19
-
-
0032067773
-
Maximizing parallelism and minimizing synchronization with affine partitions
-
Amy W. Lim and Monica S. Lam. Maximizing parallelism and minimizing synchronization with affine partitions. Parallel Computing, 24:445-475, 1998.
-
(1998)
Parallel Computing
, vol.24
, pp. 445-475
-
-
Lim, A.W.1
Lam, M.S.2
-
23
-
-
35248876385
-
Parallel 3D adaptive mesh refinement in Titanium
-
San Antonio, TX, March
-
G. Pike, L. Semenzato, P. Colella, P. Hilfinger. Parallel 3D Adaptive Mesh Refinement in Titanium. In Proceedings of the SIAM Conference on Parallel Processing for Scientific Computing, San Antonio, TX, March 1999.
-
(1999)
Proceedings of the SIAM Conference on Parallel Processing for Scientific Computing
-
-
Pike, G.1
Semenzato, L.2
Colella, P.3
Hilfinger, P.4
-
28
-
-
1842843480
-
Statistical models for automatic performance tuning
-
San Francisco, CA, May
-
R. Vuduc, J. Demmel, and J. Bilmes. Statistical Models for Automatic Performance Tuning. In Proceedings of the 2001 International Conference on Computational Science (ICCS 2001), San Francisco, CA, May 2001.
-
(2001)
Proceedings of the 2001 International Conference on Computational Science (ICCS 2001)
-
-
Vuduc, R.1
Demmel, J.2
Bilmes, J.3