-
2
-
-
0033700781
-
Synthesizing transformations for locality enhancement of imperfectly-nested loop nests
-
Santa Fe, NM
-
AHMED, N., MATEEV, N., AND PINGALI, K. 2000. Synthesizing transformations for locality enhancement of imperfectly-nested loop nests. In Proceedings of the 2000 International Conference on Supercomputing (Santa Fe, NM). 141-152.
-
(2000)
Proceedings of the 2000 International Conference on Supercomputing
, pp. 141-152
-
-
Ahmed, N.1
Mateev, N.2
Pingali, K.3
-
3
-
-
0003515463
-
-
Prentice-Hall Inc., Englewood Cliffs, NJ
-
AHUJA, R., MAGNANTI, T., AND ORLIN, J. 1993. Network Flows: Theory, Algorithms, and Applications. Prentice-Hall Inc., Englewood Cliffs, NJ.
-
(1993)
Network Flows: Theory, Algorithms, and Applications
-
-
Ahuja, R.1
Magnanti, T.2
Orlin, J.3
-
4
-
-
84976725287
-
Software pipelining
-
ALLAN, V., JONES, R., LEE, R., AND ALLAN, S. 1995. Software pipelining. ACM Comput. Surv. 27, 3 (Sept.), 367-432.
-
(1995)
ACM Comput. Surv.
, vol.27
, Issue.3 SEPT.
, pp. 367-432
-
-
Allan, V.1
Jones, R.2
Lee, R.3
Allan, S.4
-
5
-
-
0023438847
-
Automatic translation of FORTRAN programs to vector form
-
ALLEN, J. R. AND KENNEDY, K. 1984. Automatic translation of FORTRAN programs to vector form. ACM Trans. Programm. Lang. Syst. 9, 4 (Oct.), 491-542.
-
(1984)
ACM Trans. Programm. Lang. Syst.
, vol.9
, Issue.4 OCT.
, pp. 491-542
-
-
Allen, J.R.1
Kennedy, K.2
-
6
-
-
84948647315
-
Recursive formulation of some dense linear algebra algorithms
-
San Antonio, TX
-
ANDERSEN, B. S., GUSTAVSON, F. G., WASNIEWSKI, J., AND YALAMOV, P. Y. 1999. Recursive formulation of some dense linear algebra algorithms. In Proceedings of the SIAM Conference on Parallel Processing for Scientific Computing (San Antonio, TX).
-
(1999)
Proceedings of the SIAM Conference on Parallel Processing for Scientific Computing
-
-
Andersen, B.S.1
Gustavson, F.G.2
Wasniewski, J.3
Yalamov, P.Y.4
-
7
-
-
0029181140
-
Data and computation transformation for multiprocessors
-
Santa Barbara, CA
-
ANDERSON, J. M., AMARASINGHE, S. P., AND LAM, M. S. 1995. Data and computation transformation for multiprocessors. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Santa Barbara, CA). 166-178.
-
(1995)
Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 166-178
-
-
Anderson, J.M.1
Amarasinghe, S.P.2
Lam, M.S.3
-
8
-
-
1242313972
-
A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness
-
Toronto, Ont., Canada
-
BACON, D., CHOW, J.-H., JU, D., MUTHUKUMAR, K., AND SARKAR, V. 1994. A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness. In Proceedings of CASCON'94 (Toronto, Ont., Canada).
-
(1994)
Proceedings of CASCON'94
-
-
Bacon, D.1
Chow, J.-H.2
Ju, D.3
Muthukumar, K.4
Sarkar, V.5
-
9
-
-
0032313172
-
Non-linear and symbolic data dependence testing
-
BLUME, W. AND EIGENMANN, R. 1998. Non-linear and symbolic data dependence testing. IEEE Trans. Parall. Distrib. Syst. 9, 12 (Dec.), 1180-1194.
-
(1998)
IEEE Trans. Parall. Distrib. Syst.
, vol.9
, Issue.12 DEC.
, pp. 1180-1194
-
-
Blume, W.1
Eigenmann, R.2
-
10
-
-
0032648736
-
Static tiling for heterogeneous computing platforms
-
BOULET, P., DONGARRA, J., ROBERT, Y., AND VIVIEN, F. 1999. Static tiling for heterogeneous computing platforms. Parall. Comput. 25, 547-568.
-
(1999)
Parall. Comput.
, vol.25
, pp. 547-568
-
-
Boulet, P.1
Dongarra, J.2
Robert, Y.3
Vivien, F.4
-
11
-
-
0024701521
-
Coloring heuristics for register allocation
-
BRIGGS, P., COOPER, K., KENNEDY, K., AND TORCZON, L. 1989. Coloring heuristics for register allocation. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation. 275-284.
-
(1989)
Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 275-284
-
-
Briggs, P.1
Cooper, K.2
Kennedy, K.3
Torczon, L.4
-
12
-
-
0029666646
-
Memory bandwidth limitations of future microprocessors
-
Philadelphia, PA
-
BURGER, D. C., GOODMAN, J. R., AND KÄGI, A. 1996. Memory bandwidth limitations of future microprocessors. In Proceedings of the 23rd International Symposium on Computer Architecture (Philadelphia, PA). 78-89.
-
(1996)
Proceedings of the 23rd International Symposium on Computer Architecture
, pp. 78-89
-
-
Burger, D.C.1
Goodman, J.R.2
Kägi, A.3
-
14
-
-
0032652980
-
Nonlinear array layouts for hierarchical memory systems
-
Rhodes, Greece
-
CHATTERJEE, S., JAIN, V., LEBECK, A., MUNDHRA, S., AND THOTTETHODI, M. 1999a. Nonlinear array layouts for hierarchical memory systems. In Proceedings of the Thirteenth ACM International Conference on Supercomputing (Rhodes, Greece). 444-453.
-
(1999)
Proceedings of the Thirteenth ACM International Conference on Supercomputing
, pp. 444-453
-
-
Chatterjee, S.1
Jain, V.2
Lebeck, A.3
Mundhra, S.4
Thottethodi, M.5
-
15
-
-
0032659795
-
Recursive array layouts and fast parallel matrix multiplication
-
Saint Malo, France
-
CHATTERJEE, S., LEBECK, A., PATNALA, P. K., AND THOTTETHODI, M. 1999b. Recursive array layouts and fast parallel matrix multiplication. In Proceedings of the 11th ACM Symposium on Parallel Algorithms and Architectures (Saint Malo, France).
-
(1999)
Proceedings of the 11th ACM Symposium on Parallel Algorithms and Architectures
-
-
Chatterjee, S.1
Lebeck, A.2
Patnala, P.K.3
Thottethodi, M.4
-
16
-
-
0034836237
-
Loop optimization for a class of memory-constrained computations
-
Naples, Italy
-
COCIORVA, D., WILKINS, J. W., LAM, C., BAUMGARTNER, G., RAMANUJAM, J., AND SADAYAPPAN, P. 2001. Loop optimization for a class of memory-constrained computations. In Proceedings of the 15th ACM International Conference on Supercomputing (Naples, Italy).
-
(2001)
Proceedings of the 15th ACM International Conference on Supercomputing
-
-
Cociorva, D.1
Wilkins, J.W.2
Lam, C.3
Baumgartner, G.4
Ramanujam, J.5
Sadayappan, P.6
-
19
-
-
0004116989
-
-
MIT Press, Cambridge, MA, and McGraw-Hill Book Company, New York, NY
-
CORMEN, T., LEISERSON, C., AND RIVEST, R. 1990. Introduction to Algorithms. MIT Press, Cambridge, MA, and McGraw-Hill Book Company, New York, NY.
-
(1990)
Introduction to Algorithms
-
-
Cormen, T.1
Leiserson, C.2
Rivest, R.3
-
21
-
-
85015240805
-
On estimating and enhancing cache effectiveness
-
Lecture Notes in Computer Science, Springer-Verlag, Berlin, Germany. August 1991
-
FERRANTE, J., SARKAR, V., AND THRASH, W. 1991. On estimating and enhancing cache effectiveness. In Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing. Lecture Notes in Computer Science, vol. 1863. Springer-Verlag, Berlin, Germany, 328-341. August 1991.
-
(1991)
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
, vol.1863
, pp. 328-341
-
-
Ferrante, J.1
Sarkar, V.2
Thrash, W.3
-
23
-
-
0031611719
-
Precise miss analysis for program transformations with caches of arbitrary associativity
-
San Jose, CA
-
GHOSH, S., MARTONOSI, M., AND MALIK, S. 1998. Precise miss analysis for program transformations with caches of arbitrary associativity. In Proceedings of the Eighth ACM Conference on Architectural Support for Programming Languages and Operating Systems (San Jose, CA). 228-239.
-
(1998)
Proceedings of the Eighth ACM Conference on Architectural Support for Programming Languages and Operating Systems
, pp. 228-239
-
-
Ghosh, S.1
Martonosi, M.2
Malik, S.3
-
24
-
-
0030678732
-
Experience with efficient array data flow analysis for array privatization
-
Las Vegas, NV
-
GU, J., LI, Z., AND LEE, G. 1997. Experience with efficient array data flow analysis for array privatization. In Proceedings of the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Las Vegas, NV). 157-167.
-
(1997)
Proceedings of the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 157-167
-
-
Gu, J.1
Li, Z.2
Lee, G.3
-
27
-
-
84858693885
-
Increasing temporal locality with skewing and recursive blocking
-
Denver, CO
-
JIN, G., MELLOR-CRUMMEY, J., AND FOWLER, R. 2001. Increasing temporal locality with skewing and recursive blocking. In Proceedings of IEEE/ACM SC 2001 (Denver, CO).
-
(2001)
Proceedings of IEEE/ACM SC 2001
-
-
Jin, G.1
Mellor-Crummey, J.2
Fowler, R.3
-
28
-
-
0037722074
-
A matrix-based approach to the global locality optimization problem
-
PACT'98, Paris, France
-
KANDEMIR, M., CHOUDHARY, A., RAMANUJAM, J., AND BANERJEE, P. 1998. A matrix-based approach to the global locality optimization problem. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT'98, Paris, France).
-
(1998)
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques
-
-
Kandemir, M.1
Choudhary, A.2
Ramanujam, J.3
Banerjee, P.4
-
30
-
-
0001465739
-
Maximizing loop parallelism and improving data locality via loop fusion and distribution
-
Portland, OR, Aug. 1993. Lecture Notes in Computer Science, Springer-Verlag, Berlin, Germany
-
KENNEDY, K. AND MCKINLEY, K. S. 1993. Maximizing loop parallelism and improving data locality via loop fusion and distribution. In Proceedings of the Sixth Workshop on Languages and Compilers for Parallel Computing (Portland, OR, Aug. 1993). Lecture Notes in Computer Science, vol. 768, Springer-Verlag, Berlin, Germany.
-
(1993)
Proceedings of the Sixth Workshop on Languages and Compilers for Parallel Computing
, vol.768
-
-
Kennedy, K.1
McKinley, K.S.2
-
31
-
-
0347304618
-
Data-centric multi-level blocking
-
Las Vegas, NV
-
KODUKULA, I., AHMED, N., AND PINGALI, K. 1997. Data-centric multi-level blocking. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (Las Vegas, NV). 346-357.
-
(1997)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 346-357
-
-
Kodukula, I.1
Ahmed, N.2
Pingali, K.3
-
33
-
-
0026137116
-
The cache performance and optimizations of blocked algorithms
-
Santa Clara, CA
-
LAM, M. S., ROTHBERG, E. E., AND WOLF, M. E. 1991. The cache performance and optimizations of blocked algorithms. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (Santa Clara, CA). 63-74.
-
(1991)
Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
, pp. 63-74
-
-
Lam, M.S.1
Rothberg, E.E.2
Wolf, M.E.3
-
35
-
-
3142754802
-
Smallest-last ordering and clustering and graph coloring algorithms
-
Department of Computer Science and Engineering, Southern Methodist University, Dallas, TX
-
MATULA, D. AND BECK, L. 1981. Smallest-last ordering and clustering and graph coloring algorithms. Tech. rep. TR CSE 8104. Department of Computer Science and Engineering, Southern Methodist University, Dallas, TX.
-
(1981)
Tech. Rep.
, vol.TR CSE 8104
-
-
Matula, D.1
Beck, L.2
-
36
-
-
0032308685
-
Quantifying the multi-level nature of tiling interactions
-
MITCHELL, N., HÖGSTEDT, K., CARTER, L., AND FERRANTE, J. 1998. Quantifying the multi-level nature of tiling interactions. Int. J. Parall. Programm. 26, 6 (Dec.), 641-670.
-
(1998)
Int. J. Parall. Programm.
, vol.26
, Issue.6 DEC.
, pp. 641-670
-
-
Mitchell, N.1
Högstedt, K.2
Carter, L.3
Ferrante, J.4
-
37
-
-
0032064896
-
Interprocedural analysis for loop scheduling and data allocation
-
NGUYEN, T. AND LI, Z. 1998. Interprocedural analysis for loop scheduling and data allocation. Parall. Comput. 24, 3, 477-504.
-
(1998)
Parall. Comput.
, vol.24
, Issue.3
, pp. 477-504
-
-
Nguyen, T.1
Li, Z.2
-
40
-
-
0033076195
-
Augmenting loop tiling with data alignment for improved cache performance
-
PANDA, P., NAKAMURA, H., DUTT, N., AND NICOLAU, A. 1999. Augmenting loop tiling with data alignment for improved cache performance. IEEE Trans. Comput. 48, 2 (Feb.), 142-149.
-
(1999)
IEEE Trans. Comput.
, vol.48
, Issue.2 FEB.
, pp. 142-149
-
-
Panda, P.1
Nakamura, H.2
Dutt, N.3
Nicolau, A.4
-
41
-
-
24644482622
-
Analysis of memory hierarchy performance of block data layout
-
Vancouver, B.C., Canada
-
PARK, N., HONG, B., AND PRASANNA, V. K. 2002. Analysis of memory hierarchy performance of block data layout. In Proceedings of the International Conference on Parallel Processing (Vancouver, B.C., Canada). 34-44.
-
(2002)
Proceedings of the International Conference on Parallel Processing
, pp. 34-44
-
-
Park, N.1
Hong, B.2
Prasanna, V.K.3
-
42
-
-
84976676720
-
A practical algorithm for exact array dependence analysis
-
PUGH, W. 1992. A practical algorithm for exact array dependence analysis. Commun. ACM 35, 8 (Aug.), 102-114.
-
(1992)
Commun. ACM
, vol.35
, Issue.8 AUG.
, pp. 102-114
-
-
Pugh, W.1
-
44
-
-
17244382508
-
Exploiting monotone convergence functions in parallel programs
-
University of Maryland, College Park, MD
-
PUGH, W., ROSSER, E., AND SHPEISMAN, T. 1996. Exploiting monotone convergence functions in parallel programs. Tech. rep. CS-TR-3636. University of Maryland, College Park, MD.
-
(1996)
Tech. Rep.
, vol.CS-TR-3636
-
-
Pugh, W.1
Rosser, E.2
Shpeisman, T.3
-
47
-
-
0005045396
-
-
Ph.D. dissertation. Department of Computer Science, University of Maryland at College Park, MD
-
ROSSER, E. 1998. Fine-grained analysis of array computations. Ph.D. dissertation. Department of Computer Science, University of Maryland at College Park, MD.
-
(1998)
Fine-grained Analysis of Array Computations
-
-
Rosser, E.1
-
50
-
-
0034825667
-
Data locality enhancement by memory reduction
-
Naples, Italy
-
SONG, Y., XU, R., WANG, C., AND LI, Z. 2001. Data locality enhancement by memory reduction. In Proceedings of the 15th ACM International Conference on Supercomputing (Naples, Italy).
-
(2001)
Proceedings of the 15th ACM International Conference on Supercomputing
-
-
Song, Y.1
Xu, R.2
Wang, C.3
Li, Z.4
-
51
-
-
0031612767
-
Schedule-independent storage mapping for loops
-
San Jose, CA
-
STROUT, M., CARTER, L., FERRANTE, J., AND SIMON, B. 1998. Schedule-independent storage mapping for loops. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (San Jose, CA). 24-33.
-
(1998)
Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems
, pp. 24-33
-
-
Strout, M.1
Carter, L.2
Ferrante, J.3
Simon, B.4
-
52
-
-
0028429842
-
Cache interference phenomena
-
Nashville, TN
-
TEMAM, O., FRICKER, C., AND JALBY, W. 1994. Cache interference phenomena. In Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems (Nashville, TN). 261-271.
-
(1994)
Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems
, pp. 261-271
-
-
Temam, O.1
Fricker, C.2
Jalby, W.3
-
53
-
-
0034819362
-
Language support for Morton-order matrices
-
Snowbird, UT
-
WISE, D. S., ALEXANDER, G. A., FRENS, J. D., AND GU, Y. 2001. Language support for Morton-order matrices. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Snowbird, UT).
-
(2001)
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
-
-
Wise, D.S.1
Alexander, G.A.2
Frens, J.D.3
Gu, Y.4
-
54
-
-
0003553286
-
-
Ph.D. dissertation. Department of Computer Science, Stanford University, Stanford, CA
-
WOLF, M. 1992. Improving locality and parallelism in nested loops. Ph.D. dissertation. Department of Computer Science, Stanford University, Stanford, CA.
-
(1992)
Improving Locality and Parallelism in Nested Loops
-
-
Wolf, M.1
-
56
-
-
1542392248
-
Achieving scalable locality with time skewing
-
WONNACOTT, D. 2002. Achieving scalable locality with time skewing. Int. J. Parall. Programm. 30, 3 (June), 181-221.
-
(2002)
Int. J. Parall. Programm.
, vol.30
, Issue.3 JUNE
, pp. 181-221
-
-
Wonnacott, D.1
-
57
-
-
0442303278
-
-
Kluwer Academic Publishers, Dordrecht, The Netherlands
-
XUE, J. 2000. Loop Tiling for Parallelism. Kluwer Academic Publishers, Dordrecht, The Netherlands.
-
(2000)
Loop Tiling for Parallelism
-
-
Xue, J.1