SCOPUS Information Retrieval Platform

IEEE Transactions on Parallel and Distributed Systems

Volume 10, Issue 2, 1999, Pages 115-135

A linear algebra framework for automatic determination of optimal data layouts

(5) Kandemir, Mahmut a,b; Choudhary, Alok a,c; Shenoy, Nagaraj c; Banerjee, Prithviraj a,c; Ramanujam, J. a,d

a IEEE (United States)

b Syracuse University (United States)

c Northwestern University (United States)

d Louisiana State University (United States)

Author keywords

Array restructuring; Data reuse; Locality optimizations; Memory performance; Parallelism; Spatial locality

Indexed keywords

COMPUTER SYSTEMS PROGRAMMING; DATA STORAGE EQUIPMENT; DATA STRUCTURES; MATRIX ALGEBRA; OPTIMIZATION; RESPONSE TIME (COMPUTER SYSTEMS);

DATA REUSE; OPTIMAL DATA LAYOUTS;

PARALLEL PROCESSING SYSTEMS;

EID: 0033077834 PISSN: 10459219 EISSN: None Source Type: Journal
DOI: 10.1109/71.752779 Document Type: Article

Times cited: (46)

References (44)

1
- 84976772979
- Optimizing Parallel Programs Using Affinity Regions
- St. Charles, Ill., Aug.
- B. Appelbe and B. Lakshmanan, "Optimizing Parallel Programs Using Affinity Regions," Proc. 1993 Int'l Conf. Parallel Processing, pp. 246-249, St. Charles, Ill., Aug. 1993.
- (1993) Proc. 1993 Int'l Conf. Parallel Processing , pp. 246-249
- Appelbe, B.¹ Lakshmanan, B.²

2
- 0029181140
- Data and Computation Transformations for Multiprocessors
- Santa Barbara, Calif., July
- J. Anderson, S. Amarasinghe, and M. Lam, "Data and Computation Transformations for Multiprocessors," Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, pp. 166-178, Santa Barbara, Calif., July 1995.
- (1995) Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming , pp. 166-178
- Anderson, J.¹ Amarasinghe, S.² Lam, M.³

3
- 0027870804
- Global Optimizations for Parallelism and Locality on Scalable Parallel Machines
- Albuquerque, N.M., June
- J. Anderson and M. Lam, "Global Optimizations for Parallelism and Locality on Scalable Parallel Machines," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 112-125, Albuquerque, N.M., June 1993.
- (1993) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 112-125
- Anderson, J.¹ Lam, M.²

4
- 0004270780
- Norwell, Mass.: Kluwer Academic
- U. Banerjee, Dependence Analysis for Supercomputing. Norwell, Mass.: Kluwer Academic, 1988.
- (1988) Dependence Analysis for Supercomputing
- Banerjee, U.¹

5
- 0003207812
- Unimodular Transformations of Double Loops
- A. Nicolau et al., eds. MIT Press
- U. Banerjee, "Unimodular Transformations of Double Loops," Advances in Languages and Compilers for Parallel Processing, A. Nicolau et al., eds. MIT Press, 1991.
- (1991) Advances in Languages and Compilers for Parallel Processing
- Banerjee, U.¹

6
- 0030651789
- Data-Distribution Support on Distributed-Shared Memory Multiprocessors
- Las Vegas, Nev.
- R. Chandra, D. Chen, R. Cox, D. Maydan, N. Nedeljkovic, and J. Anderson, "Data-Distribution Support on Distributed-Shared Memory Multiprocessors," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 334-345, Las Vegas, Nev., 1997.
- (1997) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 334-345
- Chandra, R.¹ Chen, D.² Cox, R.³ Maydan, D.⁴ Nedeljkovic, N.⁵ Anderson, J.⁶

7
- 0029238937
- Optimal Evaluation of Array Expressions on Massively Parallel Machines
- Jan.
- S. Chatterjee, J. Gilbert, R. Schreiber, and S. Teng, "Optimal Evaluation of Array Expressions on Massively Parallel Machines," ACM Trans. Programming Languages and Systems, vol. 17, no. 1, pp. 123-156, Jan. 1995.
- (1995) ACM Trans. Programming Languages and Systems , vol.17 , Issue.1 , pp. 123-156
- Chatterjee, S.¹ Gilbert, J.² Schreiber, R.³ Teng, S.⁴

8
- 84976859799
- Unifying Data and Control Transformations for Distributed Shared Memory Machines
- La Jolla, Calif., June
- M. Cierniak and W. Li, "Unifying Data and Control Transformations for Distributed Shared Memory Machines," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 205-217, La Jolla, Calif., June 1995.
- (1995) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 205-217
- Cierniak, M.¹ Li, W.²

9
- 0025402476
- A Set of Level 3 Basic Linear Algebra Subprograms
- Mar.
- J.J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff, "A Set of Level 3 Basic Linear Algebra Subprograms," ACM Trans. Mathematical Software, vol. 16, no. 1, pp. 1-17, Mar. 1990.
- (1990) ACM Trans. Mathematical Software , vol.16 , Issue.1 , pp. 1-17
- Dongarra, J.J.¹ Du Croz, J.² Hammarling, S.³ Duff, I.⁴

10
- 0001366267
- Strategies for Cache and Local Memory Management by Global Program Transformations
- Oct.
- D. Gannon, W. Jalby, and K. Gallivan, "Strategies for Cache and Local Memory Management by Global Program Transformations," J. Parallel and Distributed Computing, vol. 5, no. 5, pp. 587-616, Oct. 1988.
- (1988) J. Parallel and Distributed Computing , vol.5 , Issue.5 , pp. 587-616
- Gannon, D.¹ Jalby, W.² Gallivan, K.³

11
- 0029430244
- A Novel Approach Towards Automatic Data Distribution
- San Diego, Calif., Dec.
- J. Garcia, E. Ayguade, and J. Labarta, "A Novel Approach Towards Automatic Data Distribution," Proc. Supercomputing'95, San Diego, Calif., Dec. 1995.
- (1995) Proc. Supercomputing'95
- Garcia, J.¹ Ayguade, E.² Labarta, J.³

12
- 0003202116
- Dynamic Data Distribution with Control Flow Analysis
- Pittsburgh, Penn., Nov.
- J. Garcia, E. Ayguade, and J. Labarta, "Dynamic Data Distribution with Control Flow Analysis," Proc. Supercomputing'96, Pittsburgh, Penn., Nov. 1996.
- (1996) Proc. Supercomputing'96
- Garcia, J.¹ Ayguade, E.² Labarta, J.³

13
- 0026823950
- Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers
- Mar.
- M. Gupta and P. Banerjee, "Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers," IEEE Trans. Parallel and Distributed Systems, vol. 3, no. 2, pp. 179-193, Mar. 1992.
- (1992) IEEE Trans. Parallel and Distributed Systems , vol.3 , Issue.2 , pp. 179-193
- Gupta, M.¹ Banerjee, P.²

14
- 0024903997
- Evaluating Associativity in CPU Caches
- Dec.
- M. Hill and A. Smith, "Evaluating Associativity in CPU Caches," IEEE Trans. Computers, vol. 38, no. 12, pp. 1,612-1,630, Dec. 1989.
- (1989) IEEE Trans. Computers , vol.38 , Issue.12
- Hill, M.¹ Smith, A.²

15
- 0004341270
- High Performance Computational Chemistry Group, Pacific Northwest Laboratory, Richland, Wash.
- "NWChem: A Computational Chemistry Package for Parallel Computers," version 1.1, High Performance Computational Chemistry Group, Pacific Northwest Laboratory, Richland, Wash., 1995.
- (1995) "NWChem: A Computational Chemistry Package for Parallel Computers," Version 1.1

16
- 38249000489
- Communication-Free Partitioning of Nested Loops
- C. Huang and P. Sadayappan, "Communication-Free Partitioning of Nested Loops," J. Parallel and Distributed Computing, vol. 19, pp. 90-102, 1993.
- (1993) J. Parallel and Distributed Computing , vol.19 , pp. 90-102
- Huang, C.¹ Sadayappan, P.²

17
- 0029192199
- Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations
- Santa Barbara, Calif., July
- T. Jeremiassen and S. Eggers, "Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations," Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, pp. 179-188, Santa Barbara, Calif., July 1995.
- (1995) Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming , pp. 179-188
- Jeremiassen, T.¹ Eggers, S.²

18
- 85029492648
- Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation
- U. Banerjee et al., eds.
- Y. Ju and H. Dietz, "Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation," Languages and Compilers for Parallel Computing, U. Banerjee et al., eds., pp. 344-358, 1992.
- (1992) Languages and Compilers for Parallel Computing , pp. 344-358
- Ju, Y.¹ Dietz, H.²

19
- 0032025292
- Locality Optimization Algorithms for Compilation of Out-of-Core Codes
- Mar.
- M. Kandemir, A. Choudhary, J. Ramanujam, and M. Kandaswamy "Locality Optimization Algorithms for Compilation of Out-of-Core Codes," J. Information Science and Eng., vol. 14, no. 1, pp. 107-138, Mar. 1998.
- (1998) J. Information Science and Eng. , vol.14 , Issue.1 , pp. 107-138
- Kandemir, M.¹ Choudhary, A.² Ramanujam, J.³ Kandaswamy, M.⁴

20
- 0032066688
- Compilation Techniques for Out-of-Core Parallel Computations
- June
- M. Kandemir, A. Choudhary, J. Ramanujam, and R. Bordawekar, "Compilation Techniques for Out-of-Core Parallel Computations," Parallel Computing, vol. 24, nos. 3-4, pp. 597-628, June 1998.
- (1998) Parallel Computing , vol.24 , Issue.3-4 , pp. 597-628
- Kandemir, M.¹ Choudhary, A.² Ramanujam, J.³ Bordawekar, R.⁴

21
- 0030662867
- A Compiler Algorithm for Optimizing Locality in Loop Nests
- Vienna, July
- M. Kandemir, J. Ramanujam, and A. Choudhary, "A Compiler Algorithm for Optimizing Locality in Loop Nests," Proc. 11th ACM Int'l Conf. Supercomputing, pp. 269-276, Vienna, July 1997.
- (1997) Proc. 11th ACM Int'l Conf. Supercomputing , pp. 269-276
- Kandemir, M.¹ Ramanujam, J.² Choudhary, A.³

22
- 0031334865
- Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed Memory Machines
- San Francisco, Nov.
- M. Kandemir, J. Ramanujam, and A. Choudhary, "Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed Memory Machines," Proc. 1997 Int'l Conf. Parallel Architectures and Compilation Techniques (PACT '97), pp. 236-247, San Francisco, Nov. 1997.
- (1997) Proc. 1997 Int'l Conf. Parallel Architectures and Compilation Techniques (PACT '97) , pp. 236-247
- Kandemir, M.¹ Ramanujam, J.² Choudhary, A.³

23
- 0003328017
- Automatic Data Layout for High Performance Fortran
- San Diego, Calif., Dec.
- K. Kennedy and U. Kremer, "Automatic Data Layout for High Performance Fortran," Proc. Supercomputing '95, San Diego, Calif., Dec. 1995.
- (1995) Proc. Supercomputing '95
- Kennedy, K.¹ Kremer, U.²

24
- 0030685588
- The SGI Origin: A cc-NUMA Highly Scalable Server
- May
- J. Laudon and D. Lenoski, "The SGI Origin: A cc-NUMA Highly Scalable Server," Proc. 24th Ann. Int'l Symp. Computer Architecture, May 1997.
- (1997) Proc. 24th Ann. Int'l Symp. Computer Architecture
- Laudon, J.¹ Lenoski, D.²

25
- 0003582055
- Technical Report TR 95-09-01, Dept. of Computer Science and Eng., Univ. of Washington, Sept.
- S.-T. Leung and J. Zahorjan, "Optimizing Data Locality by Array Restructuring," Technical Report TR 95-09-01, Dept. of Computer Science and Eng., Univ. of Washington, Sept. 1995.
- (1995) Optimizing Data Locality by Array Restructuring
- Leung, S.-T.¹ Zahorjan, J.²

26
- 0003888396
- PhD thesis, Cornell Univ., Ithaca, N.Y.
- W. Li, "Compiling for NUMA Parallel Machines," PhD thesis, Cornell Univ., Ithaca, N.Y., 1993.
- (1993) Compiling for NUMA Parallel Machines
- Li, W.¹

27
- 0026187669
- Compiling Communication Efficient Programs for Massively Parallel Machines
- J. Li and M. Chen, "Compiling Communication Efficient Programs for Massively Parallel Machines," J. Parallel and Distributed Computing, vol. 2, no. 3, pp. 361-376, 1991.
- (1991) J. Parallel and Distributed Computing , vol.2 , Issue.3 , pp. 361-376
- Li, J.¹ Chen, M.²

28
- 0003475248
- Boston: Kluwer Academic
- M. Mace, Memory Storage Patterns in Parallel Processing. Boston: Kluwer Academic, 1987.
- (1987) Memory Storage Patterns in Parallel Processing
- Mace, M.¹

29
- 0026971052
- Delinearization: An Efficient Way to Break Multi-Loop Dependence Equations
- San Francisco, June
- V. Maslov, "Delinearization: An Efficient Way to Break Multi-Loop Dependence Equations," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 152-161, San Francisco, June 1992.
- (1992) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 152-161
- Maslov, V.¹

30
- 0030190854
- Improving Data Locality with Loop Transformations
- July
- K. McKinley, S. Carr, and C.W. Tseng, "Improving Data Locality with Loop Transformations," ACM Trans. Programming Languages and Systems, vol. 18, no. 4, pp. 424-453, July 1996.
- (1996) ACM Trans. Programming Languages and Systems , vol.18 , Issue.4 , pp. 424-453
- McKinley, K.¹ Carr, S.² Tseng, C.W.³

31
- 0002848657
- Non-Singular Data Transformations: Definition, Validity, Applications
- Aachen, Germany
- M. O'Boyle and P. Knijnenburg, "Non-Singular Data Transformations: Definition, Validity, Applications," Proc. Sixth Workshop Compilers for Parallel Computers, pp. 287-297, Aachen, Germany, 1996.
- (1996) Proc. Sixth Workshop Compilers for Parallel Computers , pp. 287-297
- O'Boyle, M.¹ Knijnenburg, P.²

32
- 0001787125
- Automatic Selection of Dynamic Data Partitioning Schemes for Distributed-Memory Multicomputers
- Columbus, Ohio
- D. Palermo and P. Banerjee, "Automatic Selection of Dynamic Data Partitioning Schemes for Distributed-Memory Multicomputers," Proc. Eighth Workshop Languages and Compilers for Parallel Computing, Columbus, Ohio, pp. 392-406, 1995.
- (1995) Proc. Eighth Workshop Languages and Compilers for Parallel Computing , pp. 392-406
- Palermo, D.¹ Banerjee, P.²

33
- 33748145775
- PhD thesis, The Ohio State Univ., Columbus, Ohio, Also available from University Microfilms Inc. as Document 91-11789
- J. Ramanujam, "Compile-Time Techniques for Parallel Execution of Loops on Distributed Memory Multiprocessors," PhD thesis, The Ohio State Univ., Columbus, Ohio, 1990. Also available from University Microfilms Inc. as Document 91-11789.
- (1990) Compile-Time Techniques for Parallel Execution of Loops on Distributed Memory Multiprocessors
- Ramanujam, J.¹

34
- 85031707006
- Non-Unimodular Transformations of Nested Loops
- Minneapolis, Minn., Nov.
- J. Ramanujam, "Non-Unimodular Transformations of Nested Loops," Proc. Supercomputing 92, Minneapolis, Minn., pp. 214-223, Nov. 1992.
- (1992) Proc. Supercomputing 92 , pp. 214-223
- Ramanujam, J.¹

35
- 0041776638
- Integrating Data Distribution and Loop Transformations for Distributed Memory Machines
- D. Bailey et al., eds., Feb.
- J. Ramanujam and A. Narayan, "Integrating Data Distribution and Loop Transformations for Distributed Memory Machines," Proc. Seventh SIAM Conf. Parallel Processing for Scientific Computing, D. Bailey et al., eds., pp. 668-673, Feb. 1995.
- (1995) Proc. Seventh SIAM Conf. Parallel Processing for Scientific Computing , pp. 668-673
- Ramanujam, J.¹ Narayan, A.²

36
- 0026231056
- Compile-Time Techniques for Data Distribution in Distributed Memory Machines
- Oct.
- J. Ramanujam and P. Sadayappan, "Compile-Time Techniques for Data Distribution in Distributed Memory Machines," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 472-482, Oct. 1991.
- (1991) IEEE Trans. Parallel and Distributed Systems , vol.2 , Issue.4 , pp. 472-482
- Ramanujam, J.¹ Sadayappan, P.²

37
- 0003690189
- John Wiley
- A. Schrijver, Theory of Linear and Integer Programming. John Wiley, 1986.
- (1986) Theory of Linear and Integer Programming
- Schrijver, A.¹

38
- 0030652844
- Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors
- Bloomingdale, Ill., Aug.
- S. Tandri and T. Abdelrahman, "Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors," Proc. 1997 Int'l Conf. Parallel Processing, pp. 64-73, Bloomingdale, Ill., Aug. 1997.
- (1997) Proc. 1997 Int'l Conf. Parallel Processing , pp. 64-73
- Tandri, S.¹ Abdelrahman, T.²

39
- 0028446907
- False Sharing and Spatial Locality in Multiprocessor Caches
- June
- J. Torrellas, M. Lam, and J. Hennessy, "False Sharing and Spatial Locality in Multiprocessor Caches," IEEE Trans. Computers, vol. 43, no. 6, pp. 651-663, June 1994.
- (1994) IEEE Trans. Computers , vol.43 , Issue.6 , pp. 651-663
- Torrellas, J.¹ Lam, M.² Hennessy, J.³

40
- 0029194338
- Evaluating the Impact of Advanced Memory Systems on Compiler-Parallelized Codes
- Limassol, Cyprus, June
- E. Torrie, C. Tseng, M. Martonosi, and M. Hall, "Evaluating the Impact of Advanced Memory Systems on Compiler-Parallelized Codes," Proc. Int'l Conf. Parallel Architectures and Compilation Techniques, Limassol, Cyprus, June 1995.
- (1995) Proc. Int'l Conf. Parallel Architectures and Compilation Techniques
- Torrie, E.¹ Tseng, C.² Martonosi, M.³ Hall, M.⁴

41
- 0030263788
- Operating System Support for Improving Data Locality on cc-NUMA Compute Servers
- Cambridge, Mass., Oct.
- B. Verghese, S. Devine, A. Gupta, and M. Rosenblum, "Operating System Support for Improving Data Locality on cc-NUMA Compute Servers," Proc. Seventh Int'l Conf. Architectural Support for Programming Languages and Operating Systems, pp. 279-289, Cambridge, Mass., Oct. 1996.
- (1996) Proc. Seventh Int'l Conf. Architectural Support for Programming Languages and Operating Systems , pp. 279-289
- Verghese, B.¹ Devine, S.² Gupta, A.³ Rosenblum, M.⁴

42
- 84976692695
- SUIF: An Infrastructure for Research on Parallelizing and Optimizing Compilers
- Dec.
- R. Wilson, R. French, C. Wilson, S. Amarasinghe, J. Anderson, S. Tjiang, S. Liao, C. Tseng, M. Hall, M. Lam, and J. Hennessy, "SUIF: An Infrastructure for Research on Parallelizing and Optimizing Compilers," ACM SIGPLAN Notices, vol. 29, no. 12, pp. 31-37, Dec. 1994.
- (1994) ACM SIGPLAN Notices , vol.29 , Issue.12 , pp. 31-37
- Wilson, R.¹ French, R.² Wilson, C.³ Amarasinghe, S.⁴ Anderson, J.⁵ Tjiang, S.⁶ Liao, S.⁷ Tseng, C.⁸ Hall, M.⁹ Lam, M.¹⁰ Hennessy, J.¹¹

43
- 84976827033
- A Data Locality Optimizing Algorithm
- Toronto, Canada, June
- M. Wolf and M. Lam, "A Data Locality Optimizing Algorithm," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 30-44, Toronto, Canada, June 1991.
- (1991) Proc. SIGPLAN Conf. Programming Language Design and Implementation , pp. 30-44
- Wolf, M.¹ Lam, M.²

44
- 0003927035
- Addison-Wesley
- M. Wolfe, High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.
- (1996) High Performance Compilers for Parallel Computing
- Wolfe, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.