-
1
-
-
0019567795
-
On the performance enhancement of paging systems through program analysis and transformations
-
May
-
W. Abu-Sufah, D. Kuck, D. Lawrie, On the performance enhancement of paging systems through program analysis and transformations, IEEE Trans. Comput. C 30 (5) (May 1981) 341-356.
-
(1981)
IEEE Trans. Comput. C
, vol.30
, Issue.5
, pp. 341-356
-
-
Abu-Sufah, W.1
Kuck, D.2
Lawrie, D.3
-
2
-
-
0346246696
-
Using integer sets for data-parallel program analysis and optimization
-
Montreal, Canada, June
-
V. Adve, J. Mellor-Crummey, Using integer sets for data-parallel program analysis and optimization, in: Proceedings of the SIGPLAN '98 Conference on Programming Language Design and Implementation, Montreal, Canada, June 1998.
-
(1998)
Proceedings of the SIGPLAN '98 Conference on Programming Language Design and Implementation
-
-
Adve, V.1
Mellor-Crummey, J.2
-
3
-
-
85027615245
-
Automatic decomposition of scientific programs for parallel execution
-
Munich, Germany, January
-
J.R. Allen, D. Callahan, K. Kennedy, Automatic decomposition of scientific programs for parallel execution, in: Proceedings of the Fourteenth Annual ACM Symposium on the Principles of Programming Languages, Munich, Germany, January 1987.
-
(1987)
Proceedings of the Fourteenth Annual ACM Symposium on the Principles of Programming Languages
-
-
Allen, J.R.1
Callahan, D.2
Kennedy, K.3
-
4
-
-
0026938452
-
Vector register allocation
-
October
-
J.R. Allen, K. Kennedy, Vector register allocation, IEEE Trans. Comput. 41 (10) (October 1992) 1290-1317.
-
(1992)
IEEE Trans. Comput.
, vol.41
, Issue.10
, pp. 1290-1317
-
-
Allen, J.R.1
Kennedy, K.2
-
5
-
-
0037952146
-
-
Morgan Kaufmann, Los Altos, CA
-
R. Allen, K. Kennedy, Optimizing Compilers for Modern Architectures, Morgan Kaufmann, Los Altos, CA, 2001.
-
(2001)
Optimizing Compilers for Modern Architectures
-
-
Allen, R.1
Kennedy, K.2
-
6
-
-
0029181140
-
Data and computation transformation for multiprocessors
-
Santa Barbara, CA, July
-
J. Anderson, S. Amarasinghe, M. Lam, Data and computation transformation for multiprocessors, in: Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Santa Barbara, CA, July 1995.
-
(1995)
Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
-
-
Anderson, J.1
Amarasinghe, S.2
Lam, M.3
-
7
-
-
0002664653
-
The history of Fortran I, II, and III
-
Wexelblat (Ed.), Academic Press, New York
-
J. Backus, The history of Fortran I, II, and III, in: Wexelblat (Ed.), History of Programming Languages, Academic Press, New York, 1981, pp. 25-45.
-
(1981)
History of Programming Languages
, pp. 25-45
-
-
Backus, J.1
-
8
-
-
0039227905
-
Unfavorable strides in cache memory systems
-
Technical Report RNR-92-015, NASA Ames Research Center
-
D. Bailey, Unfavorable strides in cache memory systems, Technical Report RNR-92-015, NASA Ames Research Center, 1992.
-
(1992)
-
-
Bailey, D.1
-
9
-
-
84949042257
-
Efficient interprocedural data placement optimisation in a parallel library
-
May
-
O. Beckmann, P.H.J. Kelly, Efficient interprocedural data placement optimisation in a parallel library, in: Proceedings of the Fourth Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers, May 1998.
-
(1998)
Proceedings of the Fourth Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers
-
-
Beckmann, O.1
Kelly, P.H.J.2
-
10
-
-
0003003638
-
A study of replacement algorithms for a virtual-storage computer
-
L.A. Belady, A study of replacement algorithms for a virtual-storage computer, IBM Systems J. 5 (2) (1966) 78-101.
-
(1966)
IBM Systems J.
, vol.5
, Issue.2
, pp. 78-101
-
-
Belady, L.A.1
-
11
-
-
0029666646
-
Memory bandwidth limitations of future microprocessors
-
Philadelphia, PA, May
-
D.C. Burger, J.R. Goodman, A. Kagi, Memory bandwidth limitations of future microprocessors, in: Proceedings of the 23rd International Symposium on Computer Architecture, Philadelphia, PA, May 1996.
-
(1996)
Proceedings of the 23rd International Symposium on Computer Architecture
-
-
Burger, D.C.1
Goodman, J.R.2
Kagi, A.3
-
13
-
-
0000493064
-
Estimating interlock and improving balance for pipelined machines
-
August
-
D. Callahan, J. Cocke, K. Kennedy, Estimating interlock and improving balance for pipelined machines, Journal of Parallel and Distributed Computing 5 (4) (August 1988) 334-358.
-
(1988)
Journal of Parallel and Distributed Computing
, vol.5
, Issue.4
, pp. 334-358
-
-
Callahan, D.1
Cocke, J.2
Kennedy, K.3
-
15
-
-
17244376579
-
Cache-conscious structure definition
-
Atlanta, GA, May
-
T.M. Chilimbi, B. Davidson, J.R. Larus, Cache-conscious structure definition, in: Proceedings of SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, GA, May 1999.
-
(1999)
Proceedings of SIGPLAN Conference on Programming Language Design and Implementation
-
-
Chilimbi, T.M.1
Davidson, B.2
Larus, J.R.3
-
16
-
-
84976859799
-
Unifying data and control transformations for distributed shared-memory machines
-
La Jolla, CA, June
-
M. Cierniak, W. Li, Unifying data and control transformations for distributed shared-memory machines, in: Proceedings of the SIGPLAN '95 Conference on Programming Language Design and Implementation, La Jolla, CA, June 1995.
-
(1995)
Proceedings of the SIGPLAN '95 Conference on Programming Language Design and Implementation
-
-
Cierniak, M.1
Li, W.2
-
17
-
-
0038220597
-
Profitability computations on program flow graphs
-
Technical Report RC 5123, IBM
-
J. Cocke, K. Kennedy, Profitability computations on program flow graphs, Technical Report RC 5123, IBM, 1974.
-
(1974)
-
-
Cocke, J.1
Kennedy, K.2
-
18
-
-
0026966832
-
The complexity of multiway cuts
-
May
-
E. Dahlhaus, D.S. Johnson, C.H. Papadimitriou, P.D. Seymour, M. Yannakakis, The complexity of multiway cuts, in: Proceedings of the 24th Annual ACM Symposium on the Theory of Computing, May 1992.
-
(1992)
Proceedings of the 24th Annual ACM Symposium on the Theory of Computing
-
-
Dahlhaus, E.1
Johnson, D.S.2
Papadimitriou, C.H.3
Seymour, P.D.4
Yannakakis, M.5
-
21
-
-
1442313416
-
Predicting whole-program locality using reuse-distance analysis
-
San Diego, CA, June
-
C. Ding, Y. Zhong, Predicting whole-program locality using reuse-distance analysis, in: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, San Diego, CA, June 2003.
-
(2003)
Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation
-
-
Ding, C.1
Zhong, Y.2
-
22
-
-
0002678692
-
On estimating and enhancing cache effectiveness
-
U. Banerjee, D. Gelernter, A. Nicolau, D. Padua (Eds.), Fourth International Workshop, Springer, Santa Clara, CA, August
-
J. Ferrante, V. Sarkar, W. Thrash, On estimating and enhancing cache effectiveness, in: U. Banerjee, D. Gelernter, A. Nicolau, D. Padua (Eds.), Languages and Compilers for Parallel Computing, Fourth International Workshop, Springer, Santa Clara, CA, August 1991.
-
(1991)
Languages and Compilers for Parallel Computing
-
-
Ferrante, J.1
Sarkar, V.2
Thrash, W.3
-
23
-
-
0001366267
-
Strategies for cache and local memory management by global program transformation
-
October
-
D. Gannon, W. Jalby, K. Gallivan, Strategies for cache and local memory management by global program transformation, J. Parallel Distrib. Comput. 5 (5) (October 1988) 587-616.
-
(1988)
J. Parallel Distrib. Comput.
, vol.5
, Issue.5
, pp. 587-616
-
-
Gannon, D.1
Jalby, W.2
Gallivan, K.3
-
24
-
-
0003154599
-
Collective loop fusion for array contraction
-
New Haven, CT, August
-
G. Gao, R. Olsen, V. Sarkar, R. Thekkath, Collective loop fusion for array contraction, in: Proceedings of the Fifth Workshop on Languages and Compilers for Parallel Computing, New Haven, CT, August 1992.
-
(1992)
Proceedings of the Fifth Workshop on Languages and Compilers for Parallel Computing
-
-
Gao, G.1
Olsen, R.2
Sarkar, V.3
Thekkath, R.4
-
25
-
-
0026186967
-
An implementation of interprocedural bounded regular section analysis
-
July
-
P. Havlak, K. Kennedy, An implementation of interprocedural bounded regular section analysis, IEEE Trans. Parallel Distrib. Systems 2 (3) (July 1991) 350-360.
-
(1991)
IEEE Trans. Parallel Distrib. Systems
, vol.2
, Issue.3
, pp. 350-360
-
-
Havlak, P.1
Kennedy, K.2
-
26
-
-
0029192199
-
Reducing false sharing on shared memory multiprocessors through compile time data transformations
-
Santa Barbara, CA, July
-
T.E. Jeremiassen, S.J. Eggers, Reducing false sharing on shared memory multiprocessors through compile time data transformations, in: Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Santa Barbara, CA, July 1995, pp. 179-188.
-
(1995)
Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 179-188
-
-
Jeremiassen, T.E.1
Eggers, S.J.2
-
29
-
-
85030909152
-
Resource constrained loop fusion
-
Technical Report TR03-424, Department of Computer Science, Rice University, September
-
K. Kennedy, C. Ding, Resource constrained loop fusion, Technical Report TR03-424, Department of Computer Science, Rice University, September 2003.
-
(2003)
-
-
Kennedy, K.1
Ding, C.2
-
31
-
-
1242268977
-
Typed fusion with applications to parallel and sequential code generation
-
Technical Report TR93-208, Dept. of Computer Science, Rice University, August (also available as CRPC-TR94370)
-
K. Kennedy, K.S. McKinley, Typed fusion with applications to parallel and sequential code generation, Technical Report TR93-208, Dept. of Computer Science, Rice University, August 1993 (also available as CRPC-TR94370).
-
(1993)
-
-
Kennedy, K.1
McKinley, K.S.2
-
33
-
-
84983965442
-
An empirical study of FORTRAN programs
-
D. Knuth, An empirical study of FORTRAN programs, Software - Practice Experience 1 (1971) 105-133.
-
(1971)
Software - Practice Experience
, vol.1
, pp. 105-133
-
-
Knuth, D.1
-
34
-
-
0030685988
-
Data-centric multi-level blocking
-
Las Vegas, NV, June
-
I. Kodukula, N. Ahmed, K. Pingali, Data-centric multi-level blocking, in: Proceedings of the SIGPLAN '97 Conference on Programming Language Design and Implementation, Las Vegas, NV, June 1997.
-
(1997)
Proceedings of the SIGPLAN '97 Conference on Programming Language Design and Implementation
-
-
Kodukula, I.1
Ahmed, N.2
Pingali, K.3
-
35
-
-
0026137116
-
The cache performance and optimizations of blocked algorithms
-
(ASPLOS-IV), Santa Clara, CA, April
-
M. Lam, E. Rothberg, M.E. Wolf, The cache performance and optimizations of blocked algorithms, in: Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IV), Santa Clara, CA, April 1991.
-
(1991)
Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
-
-
Lam, M.1
Rothberg, E.2
Wolf, M.E.3
-
36
-
-
0006712810
-
Array restructuring for cache locality
-
Ph.D. Thesis, Technical Report UW-CSE-96-08-01, University of Washington
-
S. Leung, Array restructuring for cache locality, Ph.D. Thesis, Technical Report UW-CSE-96-08-01, University of Washington, 1996.
-
(1996)
-
-
Leung, S.1
-
37
-
-
0346877102
-
The implementation and evaluation of fusion and contraction in array languages
-
Montreal, Canada, June
-
E. Lewis, C. Lin, L. Snyder, The implementation and evaluation of fusion and contraction in array languages, in: Proceedings of the SIGPLAN '98 Conference on Programming Language Design and Implementation, Montreal, Canada, June 1998.
-
(1998)
Proceedings of the SIGPLAN '98 Conference on Programming Language Design and Implementation
-
-
Lewis, E.1
Lin, C.2
Snyder, L.3
-
38
-
-
85088887125
-
Design and evaluation of locality optimizations using affine partitioning
-
Snowbird, UT, June
-
A. Lim, S. Liao, M. Lam, Design and evaluation of locality optimizations using affine partitioning, in: Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Snowbird, UT, June 2001.
-
(2001)
Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
-
-
Lim, A.1
Liao, S.2
Lam, M.3
-
39
-
-
85030902984
-
-
SimpleScalar LLC, SimpleScalar tool set, www.simplescalar.com
-
SimpleScalar LLC, SimpleScalar tool set, www.simplescalar.com.
-
-
-
-
40
-
-
0003475248
-
Memory storage patterns in parallel processing
-
Kluwer Academic, Boston
-
M.E. Mace, Memory storage patterns in parallel processing, Kluwer Academic, Boston, 1987.
-
(1987)
-
-
Mace, M.E.1
-
42
-
-
0014701246
-
Evaluation techniques for storage hierarchies
-
R.L. Mattson, J. Gecsei, D. Slutz, I.L. Traiger, Evaluation techniques for storage hierarchies, IBM Systems J. 9 (2) (1970) 78-117.
-
(1970)
IBM Systems J.
, vol.9
, Issue.2
, pp. 78-117
-
-
Mattson, R.L.1
Gecsei, J.2
Slutz, D.3
Traiger, I.L.4
-
43
-
-
0030190854
-
Improving data locality with loop transformations
-
July
-
K.S. McKinley, S. Carr, C.-W. Tseng, Improving data locality with loop transformations, ACM Trans. Programming Languages Systems 18 (4) (July 1996) 424-453.
-
(1996)
ACM Trans. Programming Languages Systems
, vol.18
, Issue.4
, pp. 424-453
-
-
McKinley, K.S.1
Carr, S.2
Tseng, C.-W.3
-
44
-
-
0003665539
-
Quantifying loop nest locality using SPEC'95 and the perfect benchmarks
-
November
-
K.S. McKinley, O. Temam, Quantifying loop nest locality using SPEC'95 and the perfect benchmarks, ACM Transactions on Computer Systems 17 (4) (November 1999) 288-336.
-
(1999)
ACM Transactions on Computer Systems
, vol.17
, Issue.4
, pp. 288-336
-
-
McKinley, K.S.1
Temam, O.2
-
45
-
-
0036040497
-
The hardness of cache conscious data placement
-
Portland, OR, January
-
E. Petrank, D. Rawitz, The hardness of cache conscious data placement, in: Proceedings of ACM Symposium on Principles of Programming Languages, Portland, OR, January 2002.
-
(2002)
Proceedings of ACM Symposium on Principles of Programming Languages
-
-
Petrank, E.1
Rawitz, D.2
-
47
-
-
0003690938
-
Software methods for improvement of cache performance
-
Ph.D. Thesis, Dept. of Computer Science, Rice University, May
-
A. Porterfield, Software methods for improvement of cache performance, Ph.D. Thesis, Dept. of Computer Science, Rice University, May 1989.
-
(1989)
-
-
Porterfield, A.1
-
48
-
-
84976676720
-
A practical algorithm for exact array dependence analysis
-
August
-
W. Pugh, A practical algorithm for exact array dependence analysis, Comm. ACM 35 (8) (August 1992) 102-114.
-
(1992)
Comm. ACM
, vol.35
, Issue.8
, pp. 102-114
-
-
Pugh, W.1
-
51
-
-
17244374581
-
New tiling techniques to improve cache temporal locality
-
Atlanta, GA, May
-
Y. Song, Z. Li, New tiling techniques to improve cache temporal locality, in: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, GA, May 1999.
-
(1999)
Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation
-
-
Song, Y.1
Li, Z.2
-
52
-
-
0034825667
-
Data locality enhancement by memory reduction
-
June
-
Y. Song, R. Xu, C. Wang, Z. Li, Data locality enhancement by memory reduction, in: Proceedings of ACM International Conference on Supercomputing, June 2001.
-
(2001)
Proceedings of ACM International Conference on Supercomputing
-
-
Song, Y.1
Xu, R.2
Wang, C.3
Li, Z.4
-
53
-
-
0013009642
-
Multi-configuration simulation algorithms for the evaluation of computer architecture designs
-
Technical Report, University of Michigan
-
R.A. Sugumar, S.G. Abraham, Multi-configuration simulation algorithms for the evaluation of computer architecture designs, Technical Report, University of Michigan, 1993.
-
(1993)
-
-
Sugumar, R.A.1
Abraham, S.G.2
-
54
-
-
0037882891
-
-
Ph.D. Thesis, Dept. of Computer Science, Rice University
-
K.O. Thabit, Cache Management by the Compiler, Ph.D. Thesis, Dept. of Computer Science, Rice University, 1981.
-
(1981)
Cache Management by the Compiler
-
-
Thabit, K.O.1
-
55
-
-
84976827033
-
A data locality optimizing algorithm
-
Toronto, Canada, June
-
M.E. Wolf, M. Lam, A data locality optimizing algorithm, in: Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Canada, June 1991.
-
(1991)
Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation
-
-
Wolf, M.E.1
Lam, M.2
-
56
-
-
0011452853
-
-
Ph.D. Thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign, October
-
M.J. Wolfe, Optimizing Supercompilers for Supercomputers, Ph.D. Thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign, October 1982.
-
(1982)
Optimizing Supercompilers for Supercomputers
-
-
Wolfe, M.J.1
-
58
-
-
1542392248
-
Achieving scalable locality with time skewing
-
June
-
D. Wonnacott, Achieving scalable locality with time skewing, Internat. J. Parallel Programming 30 (3) (June 2002).
-
(2002)
Internat. J. Parallel Programming
, vol.30
, Issue.3
-
-
Wonnacott, D.1
-
59
-
-
17144398455
-
Transforming loops to recursion for multi-level memory hierarchies
-
Vancouver, Canada, June
-
Q. Yi, V. Adve, K. Kennedy, Transforming loops to recursion for multi-level memory hierarchies, in: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, Vancouver, Canada, June 2000.
-
(2000)
Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation
-
-
Yi, Q.1
Adve, V.2
Kennedy, K.3
-
60
-
-
0037882892
-
Reuse distance analysis for scientific programs
-
Washington DC, March
-
Y. Zhong, C. Ding, K. Kennedy, Reuse distance analysis for scientific programs, in: Proceedings of Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers, Washington DC, March 2002.
-
(2002)
Proceedings of Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers
-
-
Zhong, Y.1
Ding, C.2
Kennedy, K.3