-
1
-
-
84976772979
-
Optimizing Parallel Programs Using Affinity Regions
-
St. Charles, Ill., Aug.
-
B. Appelbe and B. Lakshmanan, "Optimizing Parallel Programs Using Affinity Regions," Proc. 1993 Int'l Conf. Parallel Processing, pp. 246-249, St. Charles, Ill., Aug. 1993.
-
(1993)
Proc. 1993 Int'l Conf. Parallel Processing
, pp. 246-249
-
-
Appelbe, B.1
Lakshmanan, B.2
-
2
-
-
0029181140
-
Data and Computation Transformations for Multiprocessors
-
Santa Barbara, Calif., July
-
J. Anderson, S. Amarasinghe, and M. Lam, "Data and Computation Transformations for Multiprocessors," Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, pp. 166-178, Santa Barbara, Calif., July 1995.
-
(1995)
Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming
, pp. 166-178
-
-
Anderson, J.1
Amarasinghe, S.2
Lam, M.3
-
3
-
-
0027870804
-
Global Optimizations for Parallelism and Locality on Scalable Parallel Machines
-
Albuquerque, N.M., June
-
J. Anderson and M. Lam, "Global Optimizations for Parallelism and Locality on Scalable Parallel Machines," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 112-125, Albuquerque, N.M., June 1993.
-
(1993)
Proc. SIGPLAN Conf. Programming Language Design and Implementation
, pp. 112-125
-
-
Anderson, J.1
Lam, M.2
-
6
-
-
0030651789
-
Data-Distribution Support on Distributed-Shared Memory Multiprocessors
-
Las Vegas, Nev.
-
R. Chandra, D. Chen, R. Cox, D. Maydan, N. Nedeljkovic, and J. Anderson, "Data-Distribution Support on Distributed-Shared Memory Multiprocessors," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 334-345, Las Vegas, Nev., 1997.
-
(1997)
Proc. SIGPLAN Conf. Programming Language Design and Implementation
, pp. 334-345
-
-
Chandra, R.1
Chen, D.2
Cox, R.3
Maydan, D.4
Nedeljkovic, N.5
Anderson, J.6
-
7
-
-
0029238937
-
Optimal Evaluation of Array Expressions on Massively Parallel Machines
-
Jan.
-
S. Chatterjee, J. Gilbert, R. Schreiber, and S. Teng, "Optimal Evaluation of Array Expressions on Massively Parallel Machines," ACM Trans. Programming Languages and Systems, vol. 17, no. 1, pp. 123-156, Jan. 1995.
-
(1995)
ACM Trans. Programming Languages and Systems
, vol.17
, Issue.1
, pp. 123-156
-
-
Chatterjee, S.1
Gilbert, J.2
Schreiber, R.3
Teng, S.4
-
8
-
-
84976859799
-
Unifying Data and Control Transformations for Distributed Shared Memory Machines
-
La Jolla, Calif., June
-
M. Cierniak and W. Li, "Unifying Data and Control Transformations for Distributed Shared Memory Machines," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 205-217, La Jolla, Calif., June 1995.
-
(1995)
Proc. SIGPLAN Conf. Programming Language Design and Implementation
, pp. 205-217
-
-
Cierniak, M.1
Li, W.2
-
9
-
-
0025402476
-
A Set of Level 3 Basic Linear Algebra Subprograms
-
Mar.
-
J.J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff, "A Set of Level 3 Basic Linear Algebra Subprograms," ACM Trans. Mathematical Software, vol. 16, no. 1, pp. 1-17, Mar. 1990.
-
(1990)
ACM Trans. Mathematical Software
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
10
-
-
0001366267
-
Strategies for Cache and Local Memory Management by Global Program Transformations
-
Oct.
-
D. Gannon, W. Jalby, and K. Gallivan, "Strategies for Cache and Local Memory Management by Global Program Transformations," J. Parallel and Distributed Computing, vol. 5, no. 5, pp. 587-616, Oct. 1988.
-
(1988)
J. Parallel and Distributed Computing
, vol.5
, Issue.5
, pp. 587-616
-
-
Gannon, D.1
Jalby, W.2
Gallivan, K.3
-
11
-
-
0029430244
-
A Novel Approach Towards Automatic Data Distribution
-
San Diego, Calif., Dec.
-
J. Garcia, E. Ayguade, and J. Labarta, "A Novel Approach Towards Automatic Data Distribution," Proc. Supercomputing '95, San Diego, Calif., Dec. 1995.
-
(1995)
Proc. Supercomputing '95
-
-
Garcia, J.1
Ayguade, E.2
Labarta, J.3
-
12
-
-
0003202116
-
Dynamic Data Distribution with Control Flow Analysis
-
Pittsburgh, Penn., Nov.
-
J. Garcia, E. Ayguade, and J. Labarta, "Dynamic Data Distribution with Control Flow Analysis," Proc. Supercomputing '96, Pittsburgh, Penn., Nov. 1996.
-
(1996)
Proc. Supercomputing '96
-
-
Garcia, J.1
Ayguade, E.2
Labarta, J.3
-
13
-
-
0026823950
-
Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers
-
Mar.
-
M. Gupta and P. Banerjee, "Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers," IEEE Trans. Parallel and Distributed Systems, vol. 3, no. 2, pp. 179-193, Mar. 1992.
-
(1992)
IEEE Trans. Parallel and Distributed Systems
, vol.3
, Issue.2
, pp. 179-193
-
-
Gupta, M.1
Banerjee, P.2
-
14
-
-
0024903997
-
Evaluating Associativity in CPU Caches
-
Dec.
-
M. Hill and A. Smith, "Evaluating Associativity in CPU Caches," IEEE Trans. Computers, vol. 38, no. 12, pp. 1,612-1,630, Dec. 1989.
-
(1989)
IEEE Trans. Computers
, vol.38
, Issue.12
-
-
Hill, M.1
Smith, A.2
-
15
-
-
0004341270
-
-
High Performance Computational Chemistry Group, Pacific Northwest Laboratory, Richland, Wash.
-
"NWChem: A Computational Chemistry Package for Parallel Computers," version 1.1, High Performance Computational Chemistry Group, Pacific Northwest Laboratory, Richland, Wash., 1995.
-
(1995)
"NWChem: A Computational Chemistry Package for Parallel Computers," Version 1.1
-
-
-
17
-
-
0029192199
-
Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations
-
Santa Barbara, Calif., July
-
T. Jeremiassen and S. Eggers, "Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations," Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, pp. 179-188, Santa Barbara, Calif., July 1995.
-
(1995)
Proc. Fifth ACM SIGPLAN Symp. Principles and Practice of Parallel Programming
, pp. 179-188
-
-
Jeremiassen, T.1
Eggers, S.2
-
18
-
-
85029492648
-
Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation
-
U. Banerjee et al., eds.
-
Y. Ju and H. Dietz, "Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation," Languages and Compilers for Parallel Computing, U. Banerjee et al., eds., pp. 344-358, 1992.
-
(1992)
Languages and Compilers for Parallel Computing
, pp. 344-358
-
-
Ju, Y.1
Dietz, H.2
-
19
-
-
0032025292
-
Locality Optimization Algorithms for Compilation of Out-of-Core Codes
-
Mar.
-
M. Kandemir, A. Choudhary, J. Ramanujam, and M. Kandaswamy, "Locality Optimization Algorithms for Compilation of Out-of-Core Codes," J. Information Science and Eng., vol. 14, no. 1, pp. 107-138, Mar. 1998.
-
(1998)
J. Information Science and Eng.
, vol.14
, Issue.1
, pp. 107-138
-
-
Kandemir, M.1
Choudhary, A.2
Ramanujam, J.3
Kandaswamy, M.4
-
20
-
-
0032066688
-
Compilation Techniques for Out-of-Core Parallel Computations
-
June
-
M. Kandemir, A. Choudhary, J. Ramanujam, and R. Bordawekar, "Compilation Techniques for Out-of-Core Parallel Computations," Parallel Computing, vol. 24, nos. 3-4, pp. 597-628, June 1998.
-
(1998)
Parallel Computing
, vol.24
, Issue.3-4
, pp. 597-628
-
-
Kandemir, M.1
Choudhary, A.2
Ramanujam, J.3
Bordawekar, R.4
-
21
-
-
0030662867
-
A Compiler Algorithm for Optimizing Locality in Loop Nests
-
Vienna, July
-
M. Kandemir, J. Ramanujam, and A. Choudhary, "A Compiler Algorithm for Optimizing Locality in Loop Nests," Proc. 11th ACM Int'l Conf. Supercomputing, pp. 269-276, Vienna, July 1997.
-
(1997)
Proc. 11th ACM Int'l Conf. Supercomputing
, pp. 269-276
-
-
Kandemir, M.1
Ramanujam, J.2
Choudhary, A.3
-
22
-
-
0031334865
-
Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed Memory Machines
-
San Francisco, Nov.
-
M. Kandemir, J. Ramanujam, and A. Choudhary, "Compiler Algorithms for Optimizing Locality and Parallelism on Shared and Distributed Memory Machines," Proc. 1997 Int'l Conf. Parallel Architectures and Compilation Techniques (PACT '97), pp. 236-247, San Francisco, Nov. 1997.
-
(1997)
Proc. 1997 Int'l Conf. Parallel Architectures and Compilation Techniques (PACT '97)
, pp. 236-247
-
-
Kandemir, M.1
Ramanujam, J.2
Choudhary, A.3
-
23
-
-
0003328017
-
Automatic Data Layout for High Performance Fortran
-
San Diego, Calif., Dec.
-
K. Kennedy and U. Kremer, "Automatic Data Layout for High Performance Fortran," Proc. Supercomputing '95, San Diego, Calif., Dec. 1995.
-
(1995)
Proc. Supercomputing '95
-
-
Kennedy, K.1
Kremer, U.2
-
25
-
-
0003582055
-
-
Technical Report TR 95-09-01, Dept. of Computer Science and Eng., Univ. of Washington, Sept.
-
S.-T. Leung and J. Zahorjan, "Optimizing Data Locality by Array Restructuring," Technical Report TR 95-09-01, Dept. of Computer Science and Eng., Univ. of Washington, Sept. 1995.
-
(1995)
Optimizing Data Locality by Array Restructuring
-
-
Leung, S.-T.1
Zahorjan, J.2
-
26
-
-
0003888396
-
-
PhD thesis, Cornell Univ., Ithaca, N.Y.
-
W. Li, "Compiling for NUMA Parallel Machines," PhD thesis, Cornell Univ., Ithaca, N.Y., 1993.
-
(1993)
Compiling for NUMA Parallel Machines
-
-
Li, W.1
-
27
-
-
0026187669
-
Compiling Communication Efficient Programs for Massively Parallel Machines
-
J. Li and M. Chen, "Compiling Communication Efficient Programs for Massively Parallel Machines," J. Parallel and Distributed Computing, vol. 2, no. 3, pp. 361-376, 1991.
-
(1991)
J. Parallel and Distributed Computing
, vol.2
, Issue.3
, pp. 361-376
-
-
Li, J.1
Chen, M.2
-
29
-
-
0026971052
-
Delinearization: An Efficient Way to Break Multi-Loop Dependence Equations
-
San Francisco, June
-
V. Maslov, "Delinearization: An Efficient Way to Break Multi-Loop Dependence Equations," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 152-161, San Francisco, June 1992.
-
(1992)
Proc. SIGPLAN Conf. Programming Language Design and Implementation
, pp. 152-161
-
-
Maslov, V.1
-
30
-
-
0030190854
-
Improving Data Locality with Loop Transformations
-
July
-
K. McKinley, S. Carr, and C.W. Tseng, "Improving Data Locality with Loop Transformations," ACM Trans. Programming Languages and Systems, vol. 18, no. 4, pp. 424-453, July 1996.
-
(1996)
ACM Trans. Programming Languages and Systems
, vol.18
, Issue.4
, pp. 424-453
-
-
McKinley, K.1
Carr, S.2
Tseng, C.W.3
-
31
-
-
0002848657
-
Non-Singular Data Transformations: Definition, Validity, Applications
-
Aachen, Germany
-
M. O'Boyle and P. Knijnenburg, "Non-Singular Data Transformations: Definition, Validity, Applications," Proc. Sixth Workshop Compilers for Parallel Computers, pp. 287-297, Aachen, Germany, 1996.
-
(1996)
Proc. Sixth Workshop Compilers for Parallel Computers
, pp. 287-297
-
-
O'Boyle, M.1
Knijnenburg, P.2
-
32
-
-
0001787125
-
Automatic Selection of Dynamic Data Partitioning Schemes for Distributed-Memory Multicomputers
-
Columbus, Ohio
-
D. Palermo and P. Banerjee, "Automatic Selection of Dynamic Data Partitioning Schemes for Distributed-Memory Multicomputers," Proc. Eighth Workshop Languages and Compilers for Parallel Computing, Columbus, Ohio, pp. 392-406, 1995.
-
(1995)
Proc. Eighth Workshop Languages and Compilers for Parallel Computing
, pp. 392-406
-
-
Palermo, D.1
Banerjee, P.2
-
33
-
-
33748145775
-
-
PhD thesis, The Ohio State Univ., Columbus, Ohio, Also available from University Microfilms Inc. as Document 91-11789
-
J. Ramanujam, "Compile-Time Techniques for Parallel Execution of Loops on Distributed Memory Multiprocessors," PhD thesis, The Ohio State Univ., Columbus, Ohio, 1990. Also available from University Microfilms Inc. as Document 91-11789.
-
(1990)
Compile-Time Techniques for Parallel Execution of Loops on Distributed Memory Multiprocessors
-
-
Ramanujam, J.1
-
34
-
-
85031707006
-
Non-Unimodular Transformations of Nested Loops
-
Minneapolis, Minn., Nov.
-
J. Ramanujam, "Non-Unimodular Transformations of Nested Loops," Proc. Supercomputing '92, Minneapolis, Minn., pp. 214-223, Nov. 1992.
-
(1992)
Proc. Supercomputing '92
, pp. 214-223
-
-
Ramanujam, J.1
-
35
-
-
0041776638
-
Integrating Data Distribution and Loop Transformations for Distributed Memory Machines
-
D. Bailey et al., eds., Feb.
-
J. Ramanujam and A. Narayan, "Integrating Data Distribution and Loop Transformations for Distributed Memory Machines," Proc. Seventh SIAM Conf. Parallel Processing for Scientific Computing, D. Bailey et al., eds., pp. 668-673, Feb. 1995.
-
(1995)
Proc. Seventh SIAM Conf. Parallel Processing for Scientific Computing
, pp. 668-673
-
-
Ramanujam, J.1
Narayan, A.2
-
36
-
-
0026231056
-
Compile-Time Techniques for Data Distribution in Distributed Memory Machines
-
Oct.
-
J. Ramanujam and P. Sadayappan, "Compile-Time Techniques for Data Distribution in Distributed Memory Machines," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 472-482, Oct. 1991.
-
(1991)
IEEE Trans. Parallel and Distributed Systems
, vol.2
, Issue.4
, pp. 472-482
-
-
Ramanujam, J.1
Sadayappan, P.2
-
38
-
-
0030652844
-
Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors
-
Bloomingdale, Ill., Aug.
-
S. Tandri and T. Abdelrahman, "Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors," Proc. 1997 Int'l Conf. Parallel Processing, pp. 64-73, Bloomingdale, Ill., Aug. 1997.
-
(1997)
Proc. 1997 Int'l Conf. Parallel Processing
, pp. 64-73
-
-
Tandri, S.1
Abdelrahman, T.2
-
39
-
-
0028446907
-
False Sharing and Spatial Locality in Multiprocessor Caches
-
June
-
J. Torrellas, M. Lam, and J. Hennessy, "False Sharing and Spatial Locality in Multiprocessor Caches," IEEE Trans. Computers, vol. 43, no. 6, pp. 651-663, June 1994.
-
(1994)
IEEE Trans. Computers
, vol.43
, Issue.6
, pp. 651-663
-
-
Torrellas, J.1
Lam, M.2
Hennessy, J.3
-
40
-
-
0029194338
-
Evaluating the Impact of Advanced Memory Systems on Compiler-Parallelized Codes
-
Limassol, Cyprus, June
-
E. Torrie, C. Tseng, M. Martonosi, and M. Hall, "Evaluating the Impact of Advanced Memory Systems on Compiler-Parallelized Codes," Proc. Int'l Conf. Parallel Architectures and Compilation Techniques, Limassol, Cyprus, June 1995.
-
(1995)
Proc. Int'l Conf. Parallel Architectures and Compilation Techniques
-
-
Torrie, E.1
Tseng, C.2
Martonosi, M.3
Hall, M.4
-
41
-
-
0030263788
-
Operating System Support for Improving Data Locality on cc-NUMA Compute Servers
-
Cambridge, Mass., Oct.
-
B. Verghese, S. Devine, A. Gupta, and M. Rosenblum, "Operating System Support for Improving Data Locality on cc-NUMA Compute Servers," Proc. Seventh Int'l Conf. Architectural Support for Programming Languages and Operating Systems, pp. 279-289, Cambridge, Mass., Oct. 1996.
-
(1996)
Proc. Seventh Int'l Conf. Architectural Support for Programming Languages and Operating Systems
, pp. 279-289
-
-
Verghese, B.1
Devine, S.2
Gupta, A.3
Rosenblum, M.4
-
42
-
-
84976692695
-
SUIF: An Infrastructure for Research on Parallelizing and Optimizing Compilers
-
Dec.
-
R. Wilson, R. French, C. Wilson, S. Amarasinghe, J. Anderson, S. Tjiang, S. Liao, C. Tseng, M. Hall, M. Lam, and J. Hennessy, "SUIF: An Infrastructure for Research on Parallelizing and Optimizing Compilers," ACM SIGPLAN Notices, vol. 29, no. 12, pp. 31-37, Dec. 1994.
-
(1994)
ACM SIGPLAN Notices
, vol.29
, Issue.12
, pp. 31-37
-
-
Wilson, R.1
French, R.2
Wilson, C.3
Amarasinghe, S.4
Anderson, J.5
Tjiang, S.6
Liao, S.7
Tseng, C.8
Hall, M.9
Lam, M.10
Hennessy, J.11
-
43
-
-
84976827033
-
A Data Locality Optimizing Algorithm
-
Toronto, Canada, June
-
M. Wolf and M. Lam, "A Data Locality Optimizing Algorithm," Proc. SIGPLAN Conf. Programming Language Design and Implementation, pp. 30-44, Toronto, Canada, June 1991.
-
(1991)
Proc. SIGPLAN Conf. Programming Language Design and Implementation
, pp. 30-44
-
-
Wolf, M.1
Lam, M.2