Volume 10, Issue 2, 1999, Pages 115-135

A linear algebra framework for automatic determination of optimal data layouts

Author keywords

Array restructuring; Data reuse; Locality optimizations; Memory performance; Parallelism; Spatial locality

Indexed keywords

COMPUTER SYSTEMS PROGRAMMING; DATA STORAGE EQUIPMENT; DATA STRUCTURES; MATRIX ALGEBRA; OPTIMIZATION; RESPONSE TIME (COMPUTER SYSTEMS)

EID: 0033077834     PISSN: 10459219     EISSN: None     Source Type: Journal    
DOI: 10.1109/71.752779     Document Type: Article
Times cited: 46

References (44)
  • 1
    • 84976772979
    • Optimizing Parallel Programs Using Affinity Regions
    • St. Charles, Ill., Aug.
    • B. Appelbe and B. Lakshmanan, "Optimizing Parallel Programs Using Affinity Regions," Proc. 1993 Int'l Conf. Parallel Processing, pp. 246-249, St. Charles, Ill., Aug. 1993.
    • (1993) Proc. 1993 Int'l Conf. Parallel Processing , pp. 246-249
    • Appelbe, B.1    Lakshmanan, B.2
  • 3
    • 0027870804
    • Global Optimizations for Parallelism and Locality on Scalable Parallel Machines
    • Albuquerque, N.M., June
    • J. Anderson and M. Lam, "Global Optimizations for Parallelism and Locality on Scalable Parallel Machines," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 112-125, Albuquerque, N.M., June 1993.
    • (1993) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 112-125
    • Anderson, J.1    Lam, M.2
  • 8
    • 84976859799
    • Unifying Data and Control Transformations for Distributed Shared Memory Machines
    • La Jolla, Calif., June
    • M. Cierniak and W. Li, "Unifying Data and Control Transformations for Distributed Shared Memory Machines," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 205-217, La Jolla, Calif., June 1995.
    • (1995) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 205-217
    • Cierniak, M.1    Li, W.2
  • 10
    • 0001366267
    • Strategies for Cache and Local Memory Management by Global Program Transformations
    • Oct.
    • D. Gannon, W. Jalby, and K. Gallivan, "Strategies for Cache and Local Memory Management by Global Program Transformations," J. Parallel and Distributed Computing, vol. 5, no. 5, pp. 587-616, Oct. 1988.
    • (1988) J. Parallel and Distributed Computing , vol.5 , Issue.5 , pp. 587-616
    • Gannon, D.1    Jalby, W.2    Gallivan, K.3
  • 11
    • 0029430244
    • A Novel Approach Towards Automatic Data Distribution
    • San Diego, Calif., Dec.
    • J. Garcia, E. Ayguade, and J. Labarta, "A Novel Approach Towards Automatic Data Distribution," Proc. Supercomputing'95, San Diego, Calif., Dec. 1995.
    • (1995) Proc. Supercomputing'95
    • Garcia, J.1    Ayguade, E.2    Labarta, J.3
  • 12
    • 0003202116
    • Dynamic Data Distribution with Control Flow Analysis
    • Pittsburgh, Penn., Nov.
    • J. Garcia, E. Ayguade, and J. Labarta, "Dynamic Data Distribution with Control Flow Analysis," Proc. Supercomputing'96, Pittsburgh, Penn., Nov. 1996.
    • (1996) Proc. Supercomputing'96
    • Garcia, J.1    Ayguade, E.2    Labarta, J.3
  • 13
    • 0026823950
    • Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers
    • Mar.
    • M. Gupta and P. Banerjee, "Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers," IEEE Trans. Parallel and Distributed Systems, vol. 3, no. 2, pp. 179-193, Mar. 1992.
    • (1992) IEEE Trans. Parallel and Distributed Systems , vol.3 , Issue.2 , pp. 179-193
    • Gupta, M.1    Banerjee, P.2
  • 14
    • 0024903997
    • Evaluating Associativity in CPU Caches
    • Dec.
    • M. Hill and A. Smith, "Evaluating Associativity in CPU Caches," IEEE Trans. Computers, vol. 38, no. 12, pp. 1,612-1,630, Dec. 1989.
    • (1989) IEEE Trans. Computers , vol.38 , Issue.12
    • Hill, M.1    Smith, A.2
  • 17
    • 0029192199
    • Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations
    • Santa Barbara, Calif., July
    • T. Jeremiassen and S. Eggers, "Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations," Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, pp. 179-188, Santa Barbara, Calif., July 1995.
    • (1995) Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming , pp. 179-188
    • Jeremiassen, T.1    Eggers, S.2
  • 18
    • 85029492648
    • Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation
    • U. Banerjee et al., eds.
    • Y. Ju and H. Dietz, "Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation," Languages and Compilers for Parallel Computing, U. Banerjee et al., eds., pp. 344-358, 1992.
    • (1992) Languages and Compilers for Parallel Computing , pp. 344-358
    • Ju, Y.1    Dietz, H.2
  • 19
    • 0032025292
    • Locality Optimization Algorithms for Compilation of Out-of-Core Codes
    • Mar.
    • M. Kandemir, A. Choudhary, J. Ramanujam, and M. Kandaswamy, "Locality Optimization Algorithms for Compilation of Out-of-Core Codes," J. Information Science and Eng., vol. 14, no. 1, pp. 107-138, Mar. 1998.
    • (1998) J. Information Science and Eng. , vol.14 , Issue.1 , pp. 107-138
    • Kandemir, M.1    Choudhary, A.2    Ramanujam, J.3    Kandaswamy, M.4
  • 20
    • 0032066688
    • Compilation Techniques for Out-of-Core Parallel Computations
    • June
    • M. Kandemir, A. Choudhary, J. Ramanujam, and R. Bordawekar, "Compilation Techniques for Out-of-Core Parallel Computations," Parallel Computing, vol. 24, nos. 3-4, pp. 597-628, June 1998.
    • (1998) Parallel Computing , vol.24 , Issue.3-4 , pp. 597-628
    • Kandemir, M.1    Choudhary, A.2    Ramanujam, J.3    Bordawekar, R.4
  • 23
    • 0003328017
    • Automatic Data Layout for High Performance Fortran
    • San Diego, Calif., Dec.
    • K. Kennedy and U. Kremer, "Automatic Data Layout for High Performance Fortran," Proc. Supercomputing '95, San Diego, Calif., Dec. 1995.
    • (1995) Proc. Supercomputing '95
    • Kennedy, K.1    Kremer, U.2
  • 25
    • 0003582055
    • Technical Report TR 95-09-01, Dept. of Computer Science and Eng., Univ. of Washington, Sept.
    • S.-T. Leung and J. Zahorjan, "Optimizing Data Locality by Array Restructuring," Technical Report TR 95-09-01, Dept. of Computer Science and Eng., Univ. of Washington, Sept. 1995.
    • (1995) Optimizing Data Locality by Array Restructuring
    • Leung, S.-T.1    Zahorjan, J.2
  • 27
    • 0026187669
    • Compiling Communication Efficient Programs for Massively Parallel Machines
    • J. Li and M. Chen, "Compiling Communication Efficient Programs for Massively Parallel Machines," J. Parallel and Distributed Computing, vol. 2, no. 3, pp. 361-376, 1991.
    • (1991) J. Parallel and Distributed Computing , vol.2 , Issue.3 , pp. 361-376
    • Li, J.1    Chen, M.2
  • 29
    • 0026971052
    • Delinearization: An Efficient Way to Break Multi-Loop Dependence Equations
    • San Francisco, June
    • V. Maslov, "Delinearization: An Efficient Way to Break Multi-Loop Dependence Equations," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 152-161, San Francisco, June 1992.
    • (1992) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 152-161
    • Maslov, V.1
  • 32
    • 0001787125
    • Automatic Selection of Dynamic Data Partitioning Schemes for Distributed-Memory Multicomputers
    • Columbus, Ohio
    • D. Palermo and P. Banerjee, "Automatic Selection of Dynamic Data Partitioning Schemes for Distributed-Memory Multicomputers," Proc. Eighth Workshop Languages and Compilers for Parallel Computing, Columbus, Ohio, pp. 392-406, 1995.
    • (1995) Proc. Eighth Workshop Languages and Compilers for Parallel Computing , pp. 392-406
    • Palermo, D.1    Banerjee, P.2
  • 34
    • 85031707006
    • Non-Unimodular Transformations of Nested Loops
    • Minneapolis, Minn., Nov.
    • J. Ramanujam, "Non-Unimodular Transformations of Nested Loops," Proc. Supercomputing 92, Minneapolis, Minn., pp. 214-223, Nov. 1992.
    • (1992) Proc. Supercomputing 92 , pp. 214-223
    • Ramanujam, J.1
  • 35
    • 0041776638
    • Integrating Data Distribution and Loop Transformations for Distributed Memory Machines
    • D. Bailey et al., eds., Feb.
    • J. Ramanujam and A. Narayan, "Integrating Data Distribution and Loop Transformations for Distributed Memory Machines," Proc. Seventh SIAM Conf. Parallel Processing for Scientific Computing, D. Bailey et al., eds., pp. 668-673, Feb. 1995.
    • (1995) Proc. Seventh SIAM Conf. Parallel Processing for Scientific Computing , pp. 668-673
    • Ramanujam, J.1    Narayan, A.2
  • 36
    • 0026231056
    • Compile-Time Techniques for Data Distribution in Distributed Memory Machines
    • Oct.
    • J. Ramanujam and P. Sadayappan, "Compile-Time Techniques for Data Distribution in Distributed Memory Machines," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 472-482, Oct. 1991.
    • (1991) IEEE Trans. Parallel and Distributed Systems , vol.2 , Issue.4 , pp. 472-482
    • Ramanujam, J.1    Sadayappan, P.2
  • 38
    • 0030652844
    • Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors
    • Bloomingdale, Ill., Aug.
    • S. Tandri and T. Abdelrahman, "Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors," Proc. 1997 Int'l Conf. Parallel Processing, pp. 64-73, Bloomingdale, Ill., Aug. 1997.
    • (1997) Proc. 1997 Int'l Conf. Parallel Processing , pp. 64-73
    • Tandri, S.1    Abdelrahman, T.2
  • 39
    • 0028446907
    • False Sharing and Spatial Locality in Multiprocessor Caches
    • June
    • J. Torrellas, M. Lam, and J. Hennessy, "False Sharing and Spatial Locality in Multiprocessor Caches," IEEE Trans. Computers, vol. 43, no. 6, pp. 651-663, June 1994.
    • (1994) IEEE Trans. Computers , vol.43 , Issue.6 , pp. 651-663
    • Torrellas, J.1    Lam, M.2    Hennessy, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.