SCOPUS Information Search Platform

International Journal of Parallel Programming

Volume 29, Issue 3, 2001, Pages 217-247

Improving memory hierarchy performance for irregular applications using data and computation reorderings

(3) Mellor-Crummey, John a; Whalley, David b; Kennedy, Ken a

a Rice University (United States)

b Florida State University (United States)

Author keywords

Computation reordering; Data reordering; Memory hierarchy optimization; Multi-level blocking; Space-filling curves

Indexed keywords

COMPUTATION REORDERING; DATA REORDERING; MEMORY HIERARCHY OPTIMIZATION; MULTI-LEVEL BLOCKING; SPACE-FILLING CURVES;

BANDWIDTH; COMPUTATION THEORY; COMPUTER SIMULATION; COPYRIGHTS; HIERARCHICAL SYSTEMS; OPTIMIZATION; PROGRAM PROCESSORS; RANDOM PROCESSES; SOFTWARE ENGINEERING; SUPERVISORY AND EXECUTIVE PROGRAMS;

DATA STORAGE EQUIPMENT;

EID: 1542601822; PISSN: 08857458; EISSN: None; Source Type: Journal
DOI: None; Document Type: Article

Times cited : (107)

References (37)

1
- 0025447908
- Improving Register Allocation for Subscripted Variables
- June
- D. Callahan, S. Carr, and K. Kennedy, Improving Register Allocation for Subscripted Variables, Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation, pp. 53-65 (June 1990).
- (1990) Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation , pp. 53-65
- Callahan, D.¹ Carr, S.² Kennedy, K.³

2
- 0001366267
- Strategies for Cache and Local Memory Management by Global Program Transformation
- D. Gannon, W. Jalby, and K. Gallivan, Strategies for Cache and Local Memory Management by Global Program Transformation, J. Parallel Distributed Computing, 5:587-616 (1988).
- (1988) J. Parallel Distributed Computing , vol.5 , pp. 587-616
- Gannon, D.¹ Jalby, W.² Gallivan, K.³

3
- 0026137116
- The Cache Performance and Optimizations of Blocked Algorithms
- April
- M. S. Lam, E. E. Rothberg, and M. E. Wolf, The Cache Performance and Optimizations of Blocked Algorithms, Proc. Fourth Int'l. Conf. Architectural Support Progr. Lang. Oper. Syst., pp. 63-74 (April 1991).
- (1991) Proc. Fourth Int'l. Conf. Architectural Support Progr. Lang. Oper. Syst. , pp. 63-74
- Lam, M.S.¹ Rothberg, E.E.² Wolf, M.E.³

4
- 0003690936
- Ph.D. Dissertation, Rice University, Houston, Texas May
- A. K. Porterfield, Software Methods for Improvement of Cache Performance on Supercomputer Applications, Ph.D. Dissertation, Rice University, Houston, Texas (May 1989).
- (1989) Software Methods for Improvement of Cache Performance on Supercomputer Applications
- Porterfield, A.K.¹

5
- 84976827033
- A Data Locality Optimizing Algorithm
- June
- M. E. Wolf and M. S. Lam, A Data Locality Optimizing Algorithm, Proc. SIGPLAN Conf. Progr. Lang. Design and Implementation, pp. 30-44 (June 1991).
- (1991) Proc. SIGPLAN Conf. Progr. Lang. Design and Implementation , pp. 30-44
- Wolf, M.E.¹ Lam, M.S.²

6
- 0002678692
- On Estimating and Enhancing Cache Effectiveness
- August
- J. Ferrante, V. Sarkar, and W. Thrash, On Estimating and Enhancing Cache Effectiveness, Proc. Fourth Workshop on Lang. Compilers for Parallel Computing (August 1991).
- (1991) Proc. Fourth Workshop on Lang. Compilers for Parallel Computing
- Ferrante, J.¹ Sarkar, V.² Thrash, W.³

7
- 84976656398
- Effective Cache Prefetching on Bus-Based Multiprocessors
- February
- D. M. Tullsen and S. J. Eggers, Effective Cache Prefetching on Bus-Based Multiprocessors, ACM Trans. Computer Syst., 13(1):57-88 (February 1995).
- (1995) ACM Trans. Computer Syst. , vol.13 , Issue.1 , pp. 57-88
- Tullsen, D.M.¹ Eggers, S.J.²

8
- 0026918402
- Design and Evaluation of a Compiler Algorithm for Prefetching
- October
- T. C. Mowry, M. S. Lam, and A. Gupta, Design and Evaluation of a Compiler Algorithm for Prefetching, Proc. Fifth Int'l. Conf. Architectural Support Progr. Lang. Oper. Syst., pp. 62-73 (October 1992).
- (1992) Proc. Fifth Int'l. Conf. Architectural Support Progr. Lang. Oper. Syst. , pp. 62-73
- Mowry, T.C.¹ Lam, M.S.² Gupta, A.³

9
- 84945709131
- The Organization of Matrices and Matrix Operations in a Paged Multiprogramming Environment
- A. C. McKellar and E. G. Coffman, The Organization of Matrices and Matrix Operations in a Paged Multiprogramming Environment, Commun. ACM, 12(3):153-165 (1969).
- (1969) Commun. ACM , vol.12 , Issue.3 , pp. 153-165
- McKellar, A.C.¹ Coffman, E.G.²

10
- 85072516160
- Automatic Program Transformations for Virtual Memory Computers
- June
- W. Abu-Sufah, D. J. Kuck, and D. H. Lawrie, Automatic Program Transformations for Virtual Memory Computers, Proc. Nat'l. Computer Conf., pp. 969-974 (June 1979).
- (1979) Proc. Nat'l. Computer Conf. , pp. 969-974
- Abu-Sufah, W.¹ Kuck, D.J.² Lawrie, D.H.³

11
- 57649189259
- J. J. Navarro, E. Garcia, and J. R. Herrero, Proc. Tenth ACM Int'l. Conf. Supercomputing (ICS) (1996).
- (1996) Proc. Tenth ACM Int'l. Conf. Supercomputing (ICS)
- Navarro, J.J.¹ Garcia, E.² Herrero, J.R.³

12
- 0030685988
- Data-Centric Multi-level Blocking
- June
- I. Kodukula, N. Ahmed, and K. Pingali, Data-Centric Multi-level Blocking, Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation, pp. 346-357 (June 1997).
- (1997) Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation , pp. 346-357
- Kodukula, I.¹ Ahmed, N.² Pingali, K.³

13
- 84976795078
- Automatic Loop Interchange
- June
- J. R. Allen and K. Kennedy, Automatic Loop Interchange, Proc. SIGPLAN Symp. Compiler Construction SIGPLAN Notices, 19(6):233-246 (June 1984).
- (1984) Proc. SIGPLAN Symp. Compiler Construction SIGPLAN Notices , vol.19 , Issue.6 , pp. 233-246
- Allen, J.R.¹ Kennedy, K.²

14
- 0030190854
- Improving Data Locality with Loop Transformations
- July
- K. S. McKinley, S. Carr, and C.-W. Tseng, Improving Data Locality with Loop Transformations, ACM Trans. Progr. Lang. Syst., 18(4):424-453 (July 1996).
- (1996) ACM Trans. Progr. Lang. Syst. , vol.18 , Issue.4 , pp. 424-453
- McKinley, K.S.¹ Carr, S.² Tseng, C.-W.³

15
- 0032667957
- Improving Cache Performance of Dynamic Applications with Computation and Data Layout Transformations
- May
- C. Ding and K. Kennedy, Improving Cache Performance of Dynamic Applications with Computation and Data Layout Transformations, Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation, pp. 229-241 (May 1999).
- (1999) Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation , pp. 229-241
- Ding, C.¹ Kennedy, K.²

16
- 0028386843
- The Design and Implementation of a Parallel Unstructured Euler Solver Using Software Primitives
- R. Das, D. Mavriplis, J. Saltz, S. Gupta, and R. Ponnusamy, The Design and Implementation of a Parallel Unstructured Euler Solver Using Software Primitives, AIAA J., 32:489-496 (1994).
- (1994) AIAA J. , vol.32 , pp. 489-496
- Das, R.¹ Mavriplis, D.² Saltz, J.³ Gupta, S.⁴ Ponnusamy, R.⁵

17
- 0003596534
- Springer-Verlag, New York
- H. Sagan, Space-Filling Curves, Springer-Verlag, New York (1994).
- (1994) Space-Filling Curves
- Sagan, H.¹

18
- 0003715695
- Addison-Wesley, New York
- H. Samet, Applications of Spatial Data Structures: Computer Graphics, Image Processing and GIS, Addison-Wesley, New York (1989).
- (1989) Applications of Spatial Data Structures: Computer Graphics, Image Processing and GIS
- Samet, H.¹

19
- 0043005053
- Load Balancing and Data Locality in Adaptive Hierarchical N-body Methods: Barnes-Hut, Fast Multipole, and Radiosity
- June
- J. P. Singh, C. Holt, T. Totsuka, A. Gupta, and J. Hennessy, Load Balancing and Data Locality in Adaptive Hierarchical N-body Methods: Barnes-Hut, Fast Multipole, and Radiosity, J. Parallel Distributed Computing (June 1995).
- (1995) J. Parallel Distributed Computing
- Singh, J.P.¹ Holt, C.² Totsuka, T.³ Gupta, A.⁴ Hennessy, J.⁵

20
- 0027747808
- A Parallel Hashed Oct-Tree N-Body Algorithm
- November
- M. S. Warren and J. K. Salmon, A Parallel Hashed Oct-Tree N-Body Algorithm, Proc. Supercomputing (November 1993).
- (1993) Proc. Supercomputing
- Warren, M.S.¹ Salmon, J.K.²

21
- 0029181785
- Architecture-Independent Locality-Improving Transformations of Computational Graphs Embedded in k-Dimensions
- C. Ou, M. Gunwani, and S. Ranka, Architecture-Independent Locality-Improving Transformations of Computational Graphs Embedded in k-Dimensions, Proc. Int'l. Conf. Supercomputing (1995).
- (1995) Proc. Int'l. Conf. Supercomputing
- Ou, C.¹ Gunwani, M.² Ranka, S.³

22
- 60349083022
- On Partitioning Dynamic Adaptive Grid Hierarchies
- January
- M. Parashar and J. C. Browne, On Partitioning Dynamic Adaptive Grid Hierarchies, Proc. Hawaii Conf. Syst. Sci. (January 1996).
- (1996) Proc. Hawaii Conf. Syst. Sci.
- Parashar, M.¹ Browne, J.C.²

23
- 0010227680
- Tuning Strassen's Matrix Multiplication Algorithm for Memory Efficiency
- November
- M. Thottethodi, S. Chatterjee, and A. R. Lebeck, Tuning Strassen's Matrix Multiplication Algorithm for Memory Efficiency, Proc. SC98: High Performance Computing and Networking (November 1998).
- (1998) Proc. SC98: High Performance Computing and Networking
- Thottethodi, M.¹ Chatterjee, S.² Lebeck, A.R.³

24
- 0030688479
- Auto-blocking Matrix Multiplication or Tracking BLAS3 Performance from Source Code
- June
- J. Frens and D. Wise, Auto-blocking Matrix Multiplication or Tracking BLAS3 Performance from Source Code, Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation, pp. 206-216 (June 1997).
- (1997) Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation , pp. 206-216
- Frens, J.¹ Wise, D.²

25
- 0031650717
- Memory Hierarchy Management for Iterative Graph Structures
- March
- I. Al-Furaih and S. Ranka, Memory Hierarchy Management for Iterative Graph Structures, Proc. Int'l. Parallel Processing Symp. (March 1998).
- (1998) Proc. Int'l. Parallel Processing Symp.
- Al-Furaih, I.¹ Ranka, S.²

26
- 0003645035
- Prentice Hall, Englewood Cliffs, New Jersey
- A. George and G. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice Hall, Englewood Cliffs, New Jersey (1981).
- (1981) Computer Solution of Large Sparse Positive Definite Systems
- George, A.¹ Liu, G.²

27
- 0014612601
- Reducing the Bandwidth of Sparse Symmetric Matrices
- Association for Computing Machinery
- E. Cuthill and J. McKee, Reducing the Bandwidth of Sparse Symmetric Matrices, Proc. ACM National Conf., Association for Computing Machinery (1969).
- (1969) Proc. ACM National Conf.
- Cuthill, E.¹ McKee, J.²

28
- 0022661211
- An Algorithm for Profile and Wavefront Reduction of Sparse Matrices
- S. Sloan, An Algorithm for Profile and Wavefront Reduction of Sparse Matrices, Int'l. J. Numerical Methods Engng., 23:239-251 (1986).
- (1986) Int'l. J. Numerical Methods Engng. , vol.23 , pp. 239-251
- Sloan, S.¹

29
- 0033362479
- Localizing Nonaffine Array References
- October
- N. Mitchell, L. Carter, and J. Ferrante, Localizing Nonaffine Array References, Proc. Parallel Architectures and Compilation Techniques (October 1999).
- (1999) Proc. Parallel Architectures and Compilation Techniques
- Mitchell, N.¹ Carter, L.² Ferrante, J.³

30
- 0032684978
- Improving Memory Hierarchy Performance for Irregular Applications
- June
- J. Mellor-Crummey, D. Whalley, and K. Kennedy, Improving Memory Hierarchy Performance for Irregular Applications, Proc. ACM Int'l. Conf. Supercomputing, pp. 425-433 (June 1999).
- (1999) Proc. ACM Int'l. Conf. Supercomputing , pp. 425-433
- Mellor-Crummey, J.¹ Whalley, D.² Kennedy, K.³

31
- 0008198155
- Master's thesis, MIT Department of Electrical Engineering and Computer Science June
- H. Prokop, Cache-Oblivious Algorithms, Master's thesis, MIT Department of Electrical Engineering and Computer Science (June 1999).
- (1999) Cache-Oblivious Algorithms
- Prokop, H.¹

32
- 0003657590
- Addison-Wesley, New York
- D. Knuth, The Art of Computer Programming Volume 3: Sorting and Searching, Addison-Wesley, New York (1973).
- (1973) The Art of Computer Programming Volume 3: Sorting and Searching , vol.3
- Knuth, D.¹

33
- 84986512474
- CHARMM: A Program for Macromolecular Energy, Minimization and Dynamics Calculations
- B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, CHARMM: A Program for Macromolecular Energy, Minimization and Dynamics Calculations, J. Computational Chemistry, 4:187-217 (1983).
- (1983) J. Computational Chemistry , vol.4 , pp. 187-217
- Brooks, B.R.¹ Bruccoleri, R.E.² Olafson, B.D.³ States, D.J.⁴ Swaminathan, S.⁵ Karplus, M.⁶

34
- 0032650115
- Parallel Multilevel k-way Partition Scheme for Irregular Graphs
- G. Karypis and V. Kumar, Parallel Multilevel k-way Partition Scheme for Irregular Graphs, SIAM Review, 41:278-300 (1999).
- (1999) SIAM Review , vol.41 , pp. 278-300
- Karypis, G.¹ Kumar, V.²

35
- 57649182142
- Personal Communication September
- R. Robey, Personal Communication (September 2000).
- (2000)
- Robey, R.¹

36
- 17644383969
- Improving Fine-Grained Irregular Shared-Memory Benchmarks by Data Reordering
- November
- Y. C. Hu, A. Cox, and W. Zwaenepoel, Improving Fine-Grained Irregular Shared-Memory Benchmarks by Data Reordering, Proc. Supercomputing (November 2000).
- (2000) Proc. Supercomputing
- Hu, Y.C.¹ Cox, A.² Zwaenepoel, W.³

37
- 0002596621
- Code Transformations to Improve Memory Parallelism
- November
- V. Pai and S. Adve, Code Transformations to Improve Memory Parallelism, Proc. MICRO-32 (November 1999).
- (1999) Proc. MICRO-32
- Pai, V.¹ Adve, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.