Volume 29, Issue 3, 2001, Pages 217-247

Improving memory hierarchy performance for irregular applications using data and computation reorderings

Author keywords

Computation reordering; Data reordering; Memory hierarchy optimization; Multi level blocking; Space filling curves

Indexed keywords

COMPUTATION REORDERING; DATA REORDERING; MEMORY HIERARCHY OPTIMIZATION; MULTI-LEVEL BLOCKING; SPACE-FILLING CURVES

EID: 1542601822     PISSN: 08857458     EISSN: None     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 107

References (37)
  • 2
    • 0001366267
    • Strategies for Cache and Local Memory Management by Global Program Transformation
    • D. Gannon, W. Jalby, and K. Gallivan, Strategies for Cache and Local Memory Management by Global Program Transformation, J. Parallel Distributed Computing, 5:587-616 (1988).
    • (1988) J. Parallel Distributed Computing , vol.5 , pp. 587-616
    • Gannon, D.1    Jalby, W.2    Gallivan, K.3
  • 7
    • 84976656398
    • Effective Cache Prefetching on Bus-Based Multiprocessors
    • February
    • D. M. Tullsen and S. J. Eggers, Effective Cache Prefetching on Bus-Based Multiprocessors, ACM Trans. Computer Syst., 13(1):57-88 (February 1995).
    • (1995) ACM Trans. Computer Syst. , vol.13 , Issue.1 , pp. 57-88
    • Tullsen, D.M.1    Eggers, S.J.2
  • 9
    • 84945709131
    • The Organization of Matrices and Matrix Operations in a Paged Multiprogramming Environment
    • A. C. McKellar and E. G. Coffman, The Organization of Matrices and Matrix Operations in a Paged Multiprogramming Environment, Commun. ACM, 12(3):153-165 (1969).
    • (1969) Commun. ACM , vol.12 , Issue.3 , pp. 153-165
    • McKellar, A.C.1    Coffman, E.G.2
  • 10
    • 85072516160
    • Automatic Program Transformations for Virtual Memory Computers
    • June
    • W. Abu-Sufah, D. J. Kuck, and D. H. Lawrie, Automatic Program Transformations for Virtual Memory Computers, Proc. Nat'l. Computer Conf., pp. 969-974 (June 1979).
    • (1979) Proc. Nat'l. Computer Conf. , pp. 969-974
    • Abu-Sufah, W.1    Kuck, D.J.2    Lawrie, D.H.3
  • 14
    • 0030190854
    • Improving Data Locality with Loop Transformations
    • July
    • K. S. McKinley, S. Carr, and C.-W. Tseng, Improving Data Locality with Loop Transformations, ACM Trans. Progr. Lang. Syst., 18(4):424-453 (July 1996).
    • (1996) ACM Trans. Progr. Lang. Syst. , vol.18 , Issue.4 , pp. 424-453
    • McKinley, K.S.1    Carr, S.2    Tseng, C.-W.3
  • 15
    • 0032667957
    • Improving Cache Performance of Dynamic Applications with Computation and Data Layout Transformations
    • May
    • C. Ding and K. Kennedy, Improving Cache Performance of Dynamic Applications with Computation and Data Layout Transformations, Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation, pp. 229-241 (May 1999).
    • (1999) Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation , pp. 229-241
    • Ding, C.1    Kennedy, K.2
  • 16
    • 0028386843
    • The Design and Implementation of a Parallel Unstructured Euler Solver Using Software Primitives
    • R. Das, D. Mavriplis, J. Saltz, S. Gupta, and R. Ponnusamy, The Design and Implementation of a Parallel Unstructured Euler Solver Using Software Primitives, AIAA J., 32:489-496 (1994).
    • (1994) AIAA J. , vol.32 , pp. 489-496
    • Das, R.1    Mavriplis, D.2    Saltz, J.3    Gupta, S.4    Ponnusamy, R.5
  • 19
    • 0043005053
    • Load Balancing and Data Locality in Adaptive Hierarchical N-body Methods: Barnes-Hut, Fast Multipole, and Radiosity
    • June
    • J. P. Singh, C. Holt, T. Totsuka, A. Gupta, and J. Hennessy, Load Balancing and Data Locality in Adaptive Hierarchical N-body Methods: Barnes-Hut, Fast Multipole, and Radiosity, J. Parallel Distributed Computing (June 1995).
    • (1995) J. Parallel Distributed Computing
    • Singh, J.P.1    Holt, C.2    Totsuka, T.3    Gupta, A.4    Hennessy, J.5
  • 20
    • 0027747808
    • A Parallel Hashed Oct-Tree N-Body Algorithm
    • November
    • M. S. Warren and J. K. Salmon, A Parallel Hashed Oct-Tree N-Body Algorithm, Proc. Supercomputing (November 1993).
    • (1993) Proc. Supercomputing
    • Warren, M.S.1    Salmon, J.K.2
  • 21
    • 0029181785
    • Architecture-Independent Locality-Improving Transformations of Computational Graphs Embedded in k-Dimensions
    • C. Ou, M. Gunwani, and S. Ranka, Architecture-Independent Locality-Improving Transformations of Computational Graphs Embedded in k-Dimensions, Proc. Int'l. Conf. Supercomputing (1995).
    • (1995) Proc. Int'l. Conf. Supercomputing
    • Ou, C.1    Gunwani, M.2    Ranka, S.3
  • 24
    • 0030688479
    • Auto-blocking Matrix Multiplication or Tracking BLAS3 Performance from Source Code
    • June
    • J. Frens and D. Wise, Auto-blocking Matrix Multiplication or Tracking BLAS3 Performance from Source Code, Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation, pp. 206-216 (June 1997).
    • (1997) Proc. ACM SIGPLAN Conf. Progr. Lang. Design Implementation , pp. 206-216
    • Frens, J.1    Wise, D.2
  • 27
    • 0014612601
    • Reducing the Bandwidth of Sparse Symmetric Matrices
    • Association for Computing Machinery
    • E. Cuthill and J. McKee, Reducing the Bandwidth of Sparse Symmetric Matrices, Proc. ACM National Conf., Association for Computing Machinery (1969).
    • (1969) Proc. ACM National Conf.
    • Cuthill, E.1    McKee, J.2
  • 28
    • 0022661211
    • An Algorithm for Profile and Wavefront Reduction of Sparse Matrices
    • S. Sloan, An Algorithm for Profile and Wavefront Reduction of Sparse Matrices, Int'l. J. Numerical Methods Engng., 23:239-251 (1986).
    • (1986) Int'l. J. Numerical Methods Engng. , vol.23 , pp. 239-251
    • Sloan, S.1
  • 31
    • 0008198155
    • Cache-Oblivious Algorithms
    • Master's thesis, MIT Department of Electrical Engineering and Computer Science, June
    • H. Prokop, Cache-Oblivious Algorithms, Master's thesis, MIT Department of Electrical Engineering and Computer Science (June 1999).
    • (1999) Cache-Oblivious Algorithms
    • Prokop, H.1
  • 34
    • 0032650115
    • Parallel Multilevel k-way Partition Scheme for Irregular Graphs
    • G. Karypis and V. Kumar, Parallel Multilevel k-way Partition Scheme for Irregular Graphs, SIAM Review, 41:278-300 (1999).
    • (1999) SIAM Review , vol.41 , pp. 278-300
    • Karypis, G.1    Kumar, V.2
  • 35
    • 57649182142
    • Personal Communication September
    • R. Robey, Personal Communication (September 2000).
    • (2000)
    • Robey, R.1
  • 36
    • 17644383969
    • Improving Fine-Grained Irregular Shared-Memory Benchmarks by Data Reordering
    • November
    • Y. C. Hu, A. Cox, and W. Zwaenepoel, Improving Fine-Grained Irregular Shared-Memory Benchmarks by Data Reordering, Proc. Supercomputing (November 2000).
    • (2000) Proc. Supercomputing
    • Hu, Y.C.1    Cox, A.2    Zwaenepoel, W.3
  • 37
    • 0002596621
    • Code Transformations to Improve Memory Parallelism
    • November
    • V. Pai and S. Adve, Code Transformations to Improve Memory Parallelism, Proc. MICRO-32 (November 1999).
    • (1999) Proc. MICRO-32
    • Pai, V.1    Adve, S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.