SCOPUS Information Retrieval Platform

Journal of Parallel and Distributed Computing

Volume 60, Issue 8, 2000, Pages 924-965

Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed-Memory Machines

(3) Kandemir, M. (a); Ramanujam, J. (b); Choudhary, A. (c)

a The Pennsylvania State University (United States)

b Louisiana State University (United States)

c Northwestern University (United States)

Author keywords

Array restructuring; Data reuse; Locality optimizations; Loop transformations; Memory performance; Parallelism; Spatial locality; Temporal locality

Indexed keywords

EID: 0000563616
PISSN: 07437315
EISSN: None
Source Type: Journal
DOI: 10.1006/jpdc.2000.1639
Document Type: Article

Times cited: 6

References (51)

1
- 0029180378
- The MIT Alewife machine: Architecture and performance
- Agarwal A., Bianchini R., Chaiken D., Johnson K. L., Kranz D., Kubiatowicz J., Lim B., Mackenzie K., Yeung D. The MIT Alewife machine: architecture and performance. Proc. 22nd International Symposium on Computer Architecture. 1995.
- (1995) Proc. 22nd International Symposium on Computer Architecture
- Agarwal, A.¹ Bianchini, R.² Chaiken, D.³ Johnson, K.L.⁴ Kranz, D.⁵ Kubiatowicz, J.⁶ Lim, B.⁷ Mackenzie, K.⁸ Yeung, D.⁹

2
- 0029181140
- Data and computation transformations for multiprocessors
- Anderson J. M., Amarasinghe S. P., Lam M. S. Data and computation transformations for multiprocessors. Proc. 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. July 1995.
- (1995) Proc. 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
- Anderson, J.M.¹ Amarasinghe, S.P.² Lam, M.S.³

3
- 0027870804
- Global optimizations for parallelism and locality on scalable parallel machines
- p. 112-125
- Anderson J. M., Lam M. S. Global optimizations for parallelism and locality on scalable parallel machines. Proc. SIGPLAN Conference on Programming Language Design and Implementation. June 1993; p. 112-125.
- (1993) Proc. SIGPLAN Conference on Programming Language Design and Implementation
- Anderson, J.M.¹ Lam, M.S.²

4
- 85067798127
- An interactive environment for data partitioning and distribution
- Charleston, SC, April
- V. Balasundaram, G. Fox, K. Kennedy, and U. Kremer, An interactive environment for data partitioning and distribution, in 5th Distributed Memory Computing Conference, Charleston, SC, April 1990.
- (1990) In 5th Distributed Memory Computing Conference
- Balasundaram, V.¹ Fox, G.² Kennedy, K.³ Kremer, U.⁴

5
- 0029394470
- The PARADIGM compiler for distributed-memory multicomputers
- Banerjee P., Chandy J. A., Gupta M., Hodges E. W. IV, Holm J. G., Lain A., Palermo D. J., Ramaswamy S., Su E. The PARADIGM compiler for distributed-memory multicomputers. IEEE Comput. 28:October 1995;37-47.
- (1995) IEEE Comput. , vol.28 , pp. 37-47
- Banerjee, P.¹ Chandy, J.A.² Gupta, M.³ Hodges E.W. IV⁴ Holm, J.G.⁵ Lain, A.⁶ Palermo, D.J.⁷ Ramaswamy, S.⁸ Su, E.⁹

6
- 0041776644
- Bik A. J. C., Wijshoff H. A. G. Technical Report. 1994.
- (1994) Technical Report
- Bik, A.J.C.¹ Wijshoff, H.A.G.²

7
- 0030382364
- Parallel programming with Polaris
- Blume W., Doallo R., Eigenmann R., Grout J., Hoeflinger J., Lawrence T., Lee J., Padua D., Paek Y., Pottenger B., Rauchwerger L., Tu P. Parallel programming with Polaris. IEEE Comput. 29:December 1996;78-82.
- (1996) IEEE Comput. , vol.29 , pp. 78-82
- Blume, W.¹ Doallo, R.² Eigenmann, R.³ Grout, J.⁴ Hoeflinger, J.⁵ Lawrence, T.⁶ Lee, J.⁷ Padua, D.⁸ Paek, Y.⁹ Pottenger, B.¹⁰ Rauchwerger, L.¹¹ Tu, P.¹²

8
- 84900310806
- Techniques for compiling and executing HPF programs on shared-memory and distributed-memory parallel systems
- Bangalore, India, December
- Z. Bozkus, L. Meadows, D. Miles, S. Nakamoto, V. Schuster, and M. Young, Techniques for compiling and executing HPF programs on shared-memory and distributed-memory parallel systems, in Proc. 1st International Workshop on Parallel Processing, Bangalore, India, December 1994.
- (1994) In Proc. 1st International Workshop on Parallel Processing
- Bozkus, Z.¹ Meadows, L.² Miles, D.³ Nakamoto, S.⁴ Schuster, V.⁵ Young, M.⁶

9
- 0030672574
- The evolution of the HP/Convex Exemplar
- p. 81-86
- Brewer T., Astfalk G. The evolution of the HP/Convex Exemplar. Proc. COMPCON Spring'97: 42nd IEEE Computer Society International Conference. February 1997; p. 81-86.
- (1997) Proc. COMPCON Spring'97: 42nd IEEE Computer Society International Conference
- Brewer, T.¹ Astfalk, G.²

10
- 3142734175
- Blocking linear algebra codes for memory hierarchies
- Chicago, IL, December
- S. Carr and K. Kennedy, Blocking linear algebra codes for memory hierarchies, in Proc. 4th SIAM Conference on Parallel Processing for Scientific Computing, Chicago, IL, December 1989.
- (1989) In Proc. 4th SIAM Conference on Parallel Processing for Scientific Computing
- Carr, S.¹ Kennedy, K.²

11
- 0030651789
- Data-distribution support on distributed-shared memory multiprocessors
- Las Vegas, NV
- R. Chandra, D. Chen, R. Cox, D. Maydan, N. Nedeljkovic, and J. M. Anderson, Data-distribution support on distributed-shared memory multiprocessors, in Proc. Programming Language Design and Implementation (PLDI), Las Vegas, NV, 1997.
- (1997) In Proc. Programming Language Design and Implementation (PLDI)
- Chandra, R.¹ Chen, D.² Cox, R.³ Maydan, D.⁴ Nedeljkovic, N.⁵ Anderson, J.M.⁶

12
- 84976859799
- Unifying data and control transformations for distributed shared memory machines
- La Jolla, CA, June
- M. Cierniak and W. Li, Unifying data and control transformations for distributed shared memory machines, in Proc. SIGPLAN'95 Conference on Programming Language Design and Implementation, La Jolla, CA, June 1995.
- (1995) In Proc. SIGPLAN'95 Conference on Programming Language Design and Implementation
- Cierniak, M.¹ Li, W.²

13
- 84976745804
- Tile size selection using cache organization and data layout
- La Jolla, CA, June
- S. Coleman and K. McKinley, Tile size selection using cache organization and data layout, in Proc. SIGPLAN'95 Conference on Programming Language Design and Implementation, La Jolla, CA, June 1995.
- (1995) In Proc. SIGPLAN'95 Conference on Programming Language Design and Implementation
- Coleman, S.¹ McKinley, K.²

14
- 0004116989
- Cambridge: The MIT Press
- Cormen T., Leiserson C., Rivest R. Introduction to Algorithms. 1990;The MIT Press, Cambridge.
- (1990) Introduction to Algorithms
- Cormen, T.¹ Leiserson, C.² Rivest, R.³

15
- 0026821098
- New CPU benchmark suites from SPEC
- San Francisco, CA, February
- K. M. Dixit, New CPU benchmark suites from SPEC, in Proc. COMPCON'92 - 37th IEEE Computer Society International Conference, San Francisco, CA, February 1992.
- (1992) In Proc. COMPCON'92 - 37th IEEE Computer Society International Conference
- Dixit, K.M.¹

16
- 0025402476
- A set of level 3 basic linear algebra subprograms
- Dongarra J. J., Croz J. D., Hammarling S., Duff I. A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Software. 16:March 1990;1-17.
- (1990) ACM Trans. Math. Software , vol.16 , pp. 1-17
- Dongarra, J.J.¹ Croz, J.D.² Hammarling, S.³ Duff, I.⁴

17
- 0001366267
- Strategies for cache and local memory management by global program transformations
- Gannon D., Jalby W., Gallivan K. Strategies for cache and local memory management by global program transformations. J. Parallel Distrib. Comput. 1988;587-616.
- (1988) J. Parallel Distrib. Comput. , pp. 587-616
- Gannon, D.¹ Jalby, W.² Gallivan, K.³

18
- 0029430244
- A novel approach towards automatic data distribution
- San Diego, December
- J. Garcia, E. Ayguade, and J. Labarta, A novel approach towards automatic data distribution, in Proc. Supercomputing'95, San Diego, December 1995.
- (1995) In Proc. Supercomputing'95
- Garcia, J.¹ Ayguade, E.² Labarta, J.³

19
- 0003603813
- New York: Freeman
- Garey M. R., Johnson D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness. 1979;Freeman, New York.
- (1979) Computers and Intractability: A Guide to the Theory of NP-Completeness
- Garey, M.R.¹ Johnson, D.S.²

20
- 84990709846
- Updating distributed variables in local computations
- Gerndt M. Updating distributed variables in local computations. Concurrency Practice Experience. 2:September 1990;171-193.
- (1990) Concurrency Practice Experience , vol.2 , pp. 171-193
- Gerndt, M.¹

21
- 0026823950
- Demonstration of automatic data partitioning techniques for parallelizing compilers on multicomputers
- Gupta M., Banerjee P. Demonstration of automatic data partitioning techniques for parallelizing compilers on multicomputers. IEEE Trans. Parallel Distrib. Systems. 3:March 1992;179-193.
- (1992) IEEE Trans. Parallel Distrib. Systems , vol.3 , pp. 179-193
- Gupta, M.¹ Banerjee, P.²

22
- 0004302191
- San Mateo: Morgan Kaufmann
- Hennessy J. L., Patterson D. A. Computer Architecture: A Quantitative Approach. 1995;Morgan Kaufmann, San Mateo.
- (1995) Computer Architecture: A Quantitative Approach
- Hennessy, J.L.¹ Patterson, D.A.²

23
- 0004341270
- Richland: Pacific Northwest Laboratory
- NWChem: A Computational Chemistry Package for Parallel Computers. 1995;Pacific Northwest Laboratory, Richland.
- (1995) NWChem: A Computational Chemistry Package for Parallel Computers

24
- 84976813879
- Compiling Fortran D for MIMD distributed-memory machines
- Hiranandani S., Kennedy K., Tseng C.-W. Compiling Fortran D for MIMD distributed-memory machines. Commun. Assoc. Comput. Mach. 35:August 1992;66-88.
- (1992) Commun. Assoc. Comput. Mach. , vol.35 , pp. 66-88
- Hiranandani, S.¹ Kennedy, K.² Tseng, C.-W.³

25
- 0002065001
- Reduction of cache coherence overhead by compiler data layout and loop transformations
- Santa Clara, CA, August
- Y.-J. Ju and H. Dietz, Reduction of cache coherence overhead by compiler data layout and loop transformations, in Proc. 4th Workshop on Languages and Compilers for Parallel Computing, Santa Clara, CA, August 1991.
- (1991) In Proc. 4th Workshop on Languages and Compilers for Parallel Computing
- Ju, Y.-J.¹ Dietz, H.²

26
- 0003328017
- Automatic data layout for High Performance Fortran
- San Diego, CA, December
- K. Kennedy and U. Kremer, Automatic data layout for High Performance Fortran, in Proceedings of Supercomputing'95, San Diego, CA, December 1995.
- (1995) In Proceedings of Supercomputing'95
- Kennedy, K.¹ Kremer, U.²

27
- 85027602455
- Optimizing for parallelism and data locality
- Washington, D.C, July
- K. Kennedy and K. S. McKinley, Optimizing for parallelism and data locality, in Proc. 1992 ACM International Conference on Supercomputing (ICS'92), Washington, D.C., July 1992.
- (1992) In Proc. 1992 ACM International Conference on Supercomputing (ICS'92)
- Kennedy, K.¹ McKinley, K.S.²

28
- 0003927041
- Houston: Rice University
- Kremer U. Automatic Data Layout for Distributed Memory Machines. 1995;Rice University, Houston.
- (1995) Automatic Data Layout for Distributed Memory Machines
- Kremer, U.¹

29
- 0026137116
- The cache performance and optimizations of blocked algorithms
- April
- M. S. Lam, E. Rothberg, and M. E. Wolf, The cache performance and optimizations of blocked algorithms, in Proc. 4th International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991.
- (1991) In Proc. 4th International Conference on Architectural Support for Programming Languages and Operating Systems
- Lam, M.S.¹ Rothberg, E.² Wolf, M.E.³

30
- 0030685588
- The SGI Origin: A CC-NUMA highly scalable server
- May
- J. Laudon and D. Lenoski, The SGI Origin: A CC-NUMA highly scalable server, in Proc. 24th Annual International Symposium on Computer Architecture, May 1997.
- (1997) In Proc. 24th Annual International Symposium on Computer Architecture
- Laudon, J.¹ Lenoski, D.²

31
- 0026865505
- The DASH prototype: Implementation and performance
- Gold Coast, Australia, May
- D. Lenoski, J. Laudon, T. Joe, D. Nakahira, L. Stevens, A. Gupta, and J. Hennessy, The DASH prototype: implementation and performance, in Proc. 19th International Symposium on Computer Architecture, Gold Coast, Australia, May 1992, pp. 92-103.
- (1992) In Proc. 19th International Symposium on Computer Architecture , pp. 92-103
- Lenoski, D.¹ Laudon, J.² Joe, T.³ Nakahira, D.⁴ Stevens, L.⁵ Gupta, A.⁶ Hennessy, J.⁷

32
- 0003888396
- Ithaca: Cornell University
- Li W. Compiling for NUMA Parallel Machines. 1993;Cornell University, Ithaca.
- (1993) Compiling for NUMA Parallel Machines
- Li, W.¹

33
- 0343865588
- McMahon F. Technical Report. 1986.
- (1986) Technical Report
- McMahon, F.¹

34
- 0030190854
- Improving data locality with loop transformations
- McKinley K., Carr S., Tseng C. W. Improving data locality with loop transformations. ACM Trans. Progr. Languages Systems. 18:July 1996;424-453.
- (1996) ACM Trans. Progr. Languages Systems , vol.18 , pp. 424-453
- McKinley, K.¹ Carr, S.² Tseng, C.W.³

35
- 0002848657
- Non-singular data transformations: Definition, validity, applications
- Aachen, Germany
- M. O'Boyle and P. Knijnenburg, Non-singular data transformations: definition, validity, applications, in Proc. 6th Workshop on Compilers for Parallel Computers, Aachen, Germany, 1996, pp. 287-297.
- (1996) In Proc. 6th Workshop on Compilers for Parallel Computers , pp. 287-297
- O'Boyle, M.¹ Knijnenburg, P.²

36
- 0001787125
- Automatic selection of dynamic data partitioning schemes for distributed-memory multicomputers
- Columbus, OH
- D. Palermo and P. Banerjee, Automatic selection of dynamic data partitioning schemes for distributed-memory multicomputers, in Proc. 8th Workshop on Languages and Compilers for Parallel Computing, Columbus, OH, 1995, pp. 392-406.
- (1995) In Proc. 8th Workshop on Languages and Compilers for Parallel Computing , pp. 392-406
- Palermo, D.¹ Banerjee, P.²

37
- 0024874341
- Parafrase-2: An environment for parallelizing, partitioning, synchronizing, and scheduling programs on multiprocessors
- St. Charles, IL, August
- C. Polychronopoulos, M. B. Girkar, M. R. Haghighat, C. L. Lee, B. P. Leung, and D. A. Schouten, Parafrase-2: an environment for parallelizing, partitioning, synchronizing, and scheduling programs on multiprocessors, in Proc. the International Conference on Parallel Processing, St. Charles, IL, August 1989, pp. 39-48.
- (1989) In Proc. the International Conference on Parallel Processing , pp. 39-48
- Polychronopoulos, C.¹ Girkar, M.B.² Haghighat, M.R.³ Lee, C.L.⁴ Leung, B.P.⁵ Schouten, D.A.⁶

38
- 85031707006
- Non-unimodular transformations of nested loops
- Minneapolis, MN, November
- J. Ramanujam, Non-unimodular transformations of nested loops, in Proc. Supercomputing 92, Minneapolis, MN, November 1992, pp. 214-223.
- (1992) In Proc. Supercomputing 92 , pp. 214-223
- Ramanujam, J.¹

39
- 0041776638
- Integrating data distribution and loop transformations for distributed memory machines
- February (D. Bailey et al., Eds.)
- J. Ramanujam and A. Narayan, Integrating data distribution and loop transformations for distributed memory machines, in Proc. 7th SIAM Conference on Parallel Processing for Scientific Computing, February 1995 (D. Bailey et al., Eds.), pp. 668-673.
- (1995) In Proc. 7th SIAM Conference on Parallel Processing for Scientific Computing , pp. 668-673
- Ramanujam, J.¹ Narayan, A.²

40
- 0043279904
- Automatic data mapping and program transformations
- Houston, TX, April
- J. Ramanujam and A. Narayan, Automatic data mapping and program transformations, in Proc. Workshop on Automatic Data Layout and Performance Prediction, Houston, TX, April 1995.
- (1995) In Proc. Workshop on Automatic Data Layout and Performance Prediction
- Ramanujam, J.¹ Narayan, A.²

41
- 84958797356
- Locality analysis for distributed shared-memory multiprocessors
- Santa Clara, CA, August
- V. Sarkar, G. R. Gao, and S. Han, Locality analysis for distributed shared-memory multiprocessors, in Proc. the Ninth International Workshop on Languages and Compilers for Parallel Computing, Santa Clara, CA, August 1996.
- (1996) In Proc. the Ninth International Workshop on Languages and Compilers for Parallel Computing
- Sarkar, V.¹ Gao, G.R.² Han, S.³

42
- 0003690189
- Theory of Linear and Integer Programming
- New York: Wiley
- Schrijver A. Theory of Linear and Integer Programming. Wiley-Interscience Series in Discrete Mathematics and Optimization. 1986;Wiley, New York.
- (1986) Wiley-Interscience Series in Discrete Mathematics and Optimization
- Schrijver, A.¹

43
- 0029205969
- Aligning parallel arrays to reduce communication
- McLean, VA, February
- T. J. Sheffler, R. Schreiber, J. R. Gilbert, and S. Chatterjee, Aligning parallel arrays to reduce communication, in Frontiers '95: The 5th Symposium on the Frontiers of Massively Parallel Computing, McLean, VA, February 1995, pp. 324-331.
- (1995) In Frontiers '95: The 5th Symposium on the Frontiers of Massively Parallel Computing , pp. 324-331
- Sheffler, T.J.¹ Schreiber, R.² Gilbert, J.R.³ Chatterjee, S.⁴

44
- 0029194311
- Unified compilation techniques for shared and distributed address space machines
- Barcelona, Spain, July
- C.-W. Tseng, J. Anderson, S. Amarasinghe, and M. Lam, Unified compilation techniques for shared and distributed address space machines, in Proc. 1995 International Conference on Supercomputing (ICS'95), Barcelona, Spain, July 1995.
- (1995) In Proc. 1995 International Conference on Supercomputing (ICS'95)
- Tseng, C.-W.¹ Anderson, J.² Amarasinghe, S.³ Lam, M.⁴

45
- 0030671818
- An evaluation of a commercial CC-NUMA architecture: The CONVEX Exemplar SPP-1200
- Geneva, Switzerland, April
- R. Thekkath, A. P. Singh, J. P. Singh, S. John, and J. Hennessy, An evaluation of a commercial CC-NUMA architecture: the CONVEX Exemplar SPP-1200, in Proc. 11th International Parallel Processing Symposium, Geneva, Switzerland, April 1997.
- (1997) In Proc. 11th International Parallel Processing Symposium
- Thekkath, R.¹ Singh, A.P.² Singh, J.P.³ John, S.⁴ Hennessy, J.⁵

46
- 0029194338
- Evaluating the impact of advanced memory systems on compiler-parallelized codes
- Limassol, Cyprus, June
- E. Torrie, C.-W. Tseng, M. Martonosi, and M. W. Hall, Evaluating the impact of advanced memory systems on compiler-parallelized codes, in Proc. International Conference on Parallel Architectures and Compilation Techniques (PACT), Limassol, Cyprus, June 1995.
- (1995) In Proc. International Conference on Parallel Architectures and Compilation Techniques (PACT)
- Torrie, E.¹ Tseng, C.-W.² Martonosi, M.³ Hall, M.W.⁴

47
- 84976692695
- SUIF: An infrastructure for research on parallelizing and optimizing compilers
- Wilson R. P., French R. S., Wilson C. S., Amarasinghe S. P., Anderson J. M., Tjiang S. W. K., Liao S.-W., Tseng C.-W., Hall M. W., Lam M. S., Hennessy J. L. SUIF: An infrastructure for research on parallelizing and optimizing compilers. ACM SIGPLAN Notices. 29:December 1994;31-37.
- (1994) ACM SIGPLAN Notices , vol.29 , pp. 31-37
- Wilson, R.P.¹ French, R.S.² Wilson, C.S.³ Amarasinghe, S.P.⁴ Anderson, J.M.⁵ Tjiang, S.W.K.⁶ Liao, S.-W.⁷ Tseng, C.-W.⁸ Hall, M.W.⁹ Lam, M.S.¹⁰ Hennessy, J.L.¹¹

48
- 0026232450
- A loop transformation theory and an algorithm to maximize parallelism
- Wolf M., Lam M. A loop transformation theory and an algorithm to maximize parallelism. IEEE Trans. Parallel Distrib. Systems. 2:October 1991;452-471.
- (1991) IEEE Trans. Parallel Distrib. Systems , vol.2 , pp. 452-471
- Wolf, M.¹ Lam, M.²

49
- 84976827033
- A data locality optimizing algorithm
- June
- M. Wolf and M. Lam, A data locality optimizing algorithm, in Proc. ACM SIGPLAN 91 Conf. Programming Language Design and Implementation, June 1991, pp. 30-44.
- (1991) In Proc. ACM SIGPLAN 91 Conf. Programming Language Design and Implementation , pp. 30-44
- Wolf, M.¹ Lam, M.²

50
- 0003927035
- Reading: Addison-Wesley
- Wolfe M. High Performance Compilers for Parallel Computing. 1996;Addison-Wesley, Reading.
- (1996) High Performance Compilers for Parallel Computing
- Wolfe, M.¹

51
- 0027543560
- Compiling for distributed-memory systems
- Zima H., Chapman B. Compiling for distributed-memory systems. Proc. IEEE. 81:1993;264-287.
- (1993) Proc. IEEE , vol.81 , pp. 264-287
- Zima, H.¹ Chapman, B.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.