-
1
-
-
0346032593
-
Advanced Code Generation for High Performance Fortran
-
chapter 18
-
V. Adve and J. Mellor-Crummey, "Advanced Code Generation for High Performance Fortran," Languages, Compilation Techniques, and Run Time Systems for Scalable Parallel Systems, chapter 18, 1997.
-
(1997)
Languages, Compilation Techniques, and Run Time Systems for Scalable Parallel Systems
-
-
Adve, V.1
Mellor-Crummey, J.2
-
2
-
-
0029373981
-
Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors
-
Sept.
-
A. Agarwal, D. Kranz, and V. Natarajan, "Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors," IEEE Trans. Parallel and Distributed Systems, vol. 6, no. 9, pp. 943-962, Sept. 1995.
-
(1995)
IEEE Trans. Parallel and Distributed Systems
, vol.6
, Issue.9
, pp. 943-962
-
-
Agarwal, A.1
Kranz, D.2
Natarajan, V.3
-
5
-
-
0005784992
-
First Steps Towards Optimal Oblique Tile Sizing
-
Jan.
-
R. Andonov, P. Calland, S. Niar, S. Rajopadhye, and N. Yanev, "First Steps Towards Optimal Oblique Tile Sizing," Proc. Eighth Int'l Workshop Compilers for Parallel Computers, pp. 351-366, Jan. 2000.
-
(2000)
Proc. Eighth Int'l Workshop Compilers for Parallel Computers
, pp. 351-366
-
-
Andonov, R.1
Calland, P.2
Niar, S.3
Rajopadhye, S.4
Yanev, N.5
-
7
-
-
0028482686
-
(Pen)-Ultimate Tiling?
-
P. Boulet, A. Darte, T. Risset, and Y. Robert, "(Pen)-Ultimate Tiling?" INTEGRATION, The VLSI J., vol. 17, pp. 33-51, 1994.
-
(1994)
INTEGRATION, the VLSI J.
, vol.17
, pp. 33-51
-
-
Boulet, P.1
Darte, A.2
Risset, T.3
Robert, Y.4
-
8
-
-
0242537076
-
Programming in Vienna Fortran
-
July
-
B. Chapman, P. Mehrotra, and H. Zima, "Programming in Vienna Fortran," Proc. Third Workshop Compilers for Parallel Computers, pp. 121-160, July 1992.
-
(1992)
Proc. Third Workshop Compilers for Parallel Computers
, pp. 121-160
-
-
Chapman, B.1
Mehrotra, P.2
Zima, H.3
-
9
-
-
0032028841
-
Determining the Idle Time of a Tiling: New Results
-
Mar.
-
F. Desprez, J. Dongarra, and Y. Robert, "Determining the Idle Time of a Tiling: New Results," J. Information Science and Eng., vol. 14, pp. 167-190, Mar. 1997.
-
(1997)
J. Information Science and Eng.
, vol.14
, pp. 167-190
-
-
Desprez, F.1
Dongarra, J.2
Robert, Y.3
-
10
-
-
0026891897
-
Partitioning and Labeling of Loops by Unimodular Transformations
-
July
-
E. D'Hollander, "Partitioning and Labeling of Loops by Unimodular Transformations," IEEE Trans. Parallel and Distributed Systems, vol. 3, no. 4, pp. 465-476, July 1992.
-
(1992)
IEEE Trans. Parallel and Distributed Systems
, vol.3
, Issue.4
, pp. 465-476
-
-
D'Hollander, E.1
-
11
-
-
84949487986
-
Evaluation of Loop Grouping Methods Based on Orthogonal Projection Spaces
-
Aug.
-
I. Drossiris, G. Goumas, N. Koziris, G. Papakonstantinou, and P. Tsanakas, "Evaluation of Loop Grouping Methods Based on Orthogonal Projection Spaces," Proc. Int'l Conf. Parallel Processing, pp. 469-476, Aug. 2000.
-
(2000)
Proc. Int'l Conf. Parallel Processing
, pp. 469-476
-
-
Drossiris, I.1
Goumas, G.2
Koziris, N.3
Papakonstantinou, G.4
Tsanakas, P.5
-
12
-
-
0029354475
-
Loop Transformations Using Nonunimodular Matrices
-
Aug.
-
A. Fernandez, J. Llaberia, and M. Valero, "Loop Transformations Using Nonunimodular Matrices," IEEE Trans. Parallel and Distributed Systems, vol. 6, no. 8, pp. 832-840, Aug. 1995.
-
(1995)
IEEE Trans. Parallel and Distributed Systems
, vol.6
, Issue.8
, pp. 832-840
-
-
Fernandez, A.1
Llaberia, J.2
Valero, M.3
-
13
-
-
0003734214
-
Fortran-D Language Specification
-
Dept. of Computer Science, Rice Univ., Dec.
-
G. Fox, S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, C. Tseng, and M. Wu, "Fortran-D Language Specification," Technical Report TR-91-170, Dept. of Computer Science, Rice Univ., Dec. 1991.
-
(1991)
Technical Report
, vol.TR-91-170
-
-
Fox, G.1
Hiranandani, S.2
Kennedy, K.3
Koelbel, C.4
Kremer, U.5
Tseng, C.6
Wu, M.7
-
14
-
-
84948965758
-
Compiling Tiled Iteration Spaces for Clusters
-
Sept.
-
G. Goumas, N. Drosinos, M. Athanasaki, and N. Koziris, "Compiling Tiled Iteration Spaces for Clusters," Proc. IEEE Int'l Conf. Cluster Computing, pp. 360-369, Sept. 2002.
-
(2002)
Proc. IEEE Int'l Conf. Cluster Computing
, pp. 360-369
-
-
Goumas, G.1
Drosinos, N.2
Athanasaki, M.3
Koziris, N.4
-
16
-
-
0032069399
-
On Supernode Transformation with Minimized Total Running Time
-
May
-
E. Hodzic and W. Shang, "On Supernode Transformation with Minimized Total Running Time," IEEE Trans. Parallel and Distributed Systems, vol. 9, no. 5, pp. 417-428, May 1998.
-
(1998)
IEEE Trans. Parallel and Distributed Systems
, vol.9
, Issue.5
, pp. 417-428
-
-
Hodzic, E.1
Shang, W.2
-
17
-
-
0036958653
-
On Time Optimal Supernode Shape
-
Dec.
-
E. Hodzic and W. Shang, "On Time Optimal Supernode Shape," IEEE Trans. Parallel and Distributed Systems, vol. 13, no. 12, pp. 1220-1233, Dec. 2002.
-
(2002)
IEEE Trans. Parallel and Distributed Systems
, vol.13
, Issue.12
, pp. 1220-1233
-
-
Hodzic, E.1
Shang, W.2
-
18
-
-
0002549765
-
Determining the Idle Time of a Tiling
-
Jan.
-
K. Hogstedt, L. Carter, and J. Ferrante, "Determining the Idle Time of a Tiling," Principles of Programming Languages, pp. 319-323, Jan. 1997.
-
(1997)
Principles of Programming Languages
, pp. 319-323
-
-
Hogstedt, K.1
Carter, L.2
Ferrante, J.3
-
19
-
-
0032642196
-
Selecting Tile Shape for Minimal Execution Time
-
K. Hogstedt, L. Carter, and J. Ferrante, "Selecting Tile Shape for Minimal Execution Time," Proc. ACM Symp. Parallel Algorithms and Architectures, pp. 201-211, 1999.
-
(1999)
Proc. ACM Symp. Parallel Algorithms and Architectures
, pp. 201-211
-
-
Hogstedt, K.1
Carter, L.2
Ferrante, J.3
-
20
-
-
0037962984
-
On the Parallel Execution Time of Tiled Loops
-
Mar.
-
K. Hogstedt, L. Carter, and J. Ferrante, "On the Parallel Execution Time of Tiled Loops," IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 3, pp. 307-321, Mar. 2003.
-
(2003)
IEEE Trans. Parallel and Distributed Systems
, vol.14
, Issue.3
, pp. 307-321
-
-
Hogstedt, K.1
Carter, L.2
Ferrante, J.3
-
23
-
-
0003904906
-
The Omega Library Interface Guide
-
Computer Science Dept., Univ. of Maryland, College Park, Mar.
-
W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman, and D. Wonnacott, "The Omega Library Interface Guide," Technical Report CS-TR-3445, Computer Science Dept., Univ. of Maryland, College Park, Mar. 1995.
-
(1995)
Technical Report
, vol.CS-TR-3445
-
-
Kelly, W.1
Maslov, V.2
Pugh, W.3
Rosser, E.4
Shpeisman, T.5
Wonnacott, D.6
-
24
-
-
0011269693
-
Pipelined Data-Parallel Algorithms: Part II Design
-
Oct.
-
C.-T. King, W.-H. Chou, and L. Ni, "Pipelined Data-Parallel Algorithms: Part II Design," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 430-439, Oct. 1991.
-
(1991)
IEEE Trans. Parallel and Distributed Systems
, vol.2
, Issue.4
, pp. 430-439
-
-
King, C.-T.1
Chou, W.-H.2
Ni, L.3
-
25
-
-
0003888396
-
Compiling for NUMA Parallel Machines
-
PhD dissertation, Cornell Univ., Ithaca, New York
-
W. Li, "Compiling for NUMA Parallel Machines," PhD dissertation, Cornell Univ., Ithaca, New York, 1993.
-
(1993)
Compiling for NUMA Parallel Machines
-
-
Li, W.1
-
26
-
-
85031707006
-
Non-Unimodular Loop Transformations of Nested Loops
-
Nov.
-
J. Ramanujam, "Non-Unimodular Loop Transformations of Nested Loops," Proc. Supercomputing '92 Conf., pp. 214-223, Nov. 1992.
-
(1992)
Proc. Supercomputing '92 Conf.
, pp. 214-223
-
-
Ramanujam, J.1
-
27
-
-
0029518016
-
Beyond Unimodular Transformations
-
Oct.
-
J. Ramanujam, "Beyond Unimodular Transformations," J. Supercomputing, vol. 9, no. 4, pp. 365-389, Oct. 1995.
-
(1995)
J. Supercomputing
, vol.9
, Issue.4
, pp. 365-389
-
-
Ramanujam, J.1
-
28
-
-
38249009019
-
Tiling Multidimensional Iteration Spaces for Multicomputers
-
J. Ramanujam and P. Sadayappan, "Tiling Multidimensional Iteration Spaces for Multicomputers," J. Parallel and Distributed Computing, vol. 16, pp. 108-120, 1992.
-
(1992)
J. Parallel and Distributed Computing
, vol.16
, pp. 108-120
-
-
Ramanujam, J.1
Sadayappan, P.2
-
29
-
-
0026821247
-
Independent Partitioning of Algorithms with Uniform Dependencies
-
Feb.
-
W. Shang and J. Fortes, "Independent Partitioning of Algorithms with Uniform Dependencies," IEEE Trans. Computers, vol. 41, no. 2, pp. 190-206, Feb. 1992.
-
(1992)
IEEE Trans. Computers
, vol.41
, Issue.2
, pp. 190-206
-
-
Shang, W.1
Fortes, J.2
-
30
-
-
0029191426
-
Partitioning and Mapping Nested Loops for Linear Array Multicomputers
-
J.-P. Sheu and T.-S. Chen, "Partitioning and Mapping Nested Loops for Linear Array Multicomputers," J. Supercomputing, vol. 9, pp. 183-202, 1995.
-
(1995)
J. Supercomputing
, vol.9
, pp. 183-202
-
-
Sheu, J.-P.1
Chen, T.-S.2
-
31
-
-
0026231051
-
Partitioning and Mapping Nested Loops on Multiprocessor Systems
-
Oct.
-
J.-P. Sheu and T.-H. Tai, "Partitioning and Mapping Nested Loops on Multiprocessor Systems," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 430-439, Oct. 1991.
-
(1991)
IEEE Trans. Parallel and Distributed Systems
, vol.2
, Issue.4
, pp. 430-439
-
-
Sheu, J.-P.1
Tai, T.-H.2
-
32
-
-
84966549146
-
Enhancing the Performance of Tiled Loop Execution onto Clusters Using Memory Mapped Network Interfaces and Pipelined Schedules
-
Apr.
-
A. Sotiropoulos, G. Tsoukalas, and N. Koziris, "Enhancing the Performance of Tiled Loop Execution onto Clusters Using Memory Mapped Network Interfaces and Pipelined Schedules," Proc. 2002 Workshop Comm. Architecture for Clusters, and Int'l Parallel and Distributed Processing Symp., Apr. 2002.
-
(2002)
Proc. 2002 Workshop Comm. Architecture for Clusters, and Int'l Parallel and Distributed Processing Symp.
-
-
Sotiropoulos, A.1
Tsoukalas, G.2
Koziris, N.3
-
33
-
-
0029190371
-
Advanced Compilation Techniques in the PARADIGM Compiler for Distributed Memory Multicomputers
-
July
-
E. Su, A. Lain, S. Ramaswamy, D.J. Palermo, E.W. Hodges, and P. Banerjee, "Advanced Compilation Techniques in the PARADIGM Compiler for Distributed Memory Multicomputers," Proc. ACM. Int'l Conf. Supercomputing, July 1995.
-
(1995)
Proc. ACM. Int'l Conf. Supercomputing
-
-
Su, E.1
Lain, A.2
Ramaswamy, S.3
Palermo, D.J.4
Hodges, E.W.5
Banerjee, P.6
-
34
-
-
0000778059
-
Generating Efficient Tiled Code for Distributed Memory Machines
-
P. Tang and J. Xue, "Generating Efficient Tiled Code for Distributed Memory Machines," Parallel Computing, vol. 26, no. 11, pp. 1369-1410, 2000.
-
(2000)
Parallel Computing
, vol.26
, Issue.11
, pp. 1369-1410
-
-
Tang, P.1
Xue, J.2
-
35
-
-
0034262560
-
Chain Grouping: A Method for Partitioning Loops onto Mesh-Connected Processor Arrays
-
Sept.
-
P. Tsanakas, N. Koziris, and G. Papakonstantinou, "Chain Grouping: A Method for Partitioning Loops onto Mesh-Connected Processor Arrays," IEEE Trans. Parallel and Distributed Systems, vol. 11, no. 9, pp. 941-955, Sept. 2000.
-
(2000)
IEEE Trans. Parallel and Distributed Systems
, vol.11
, Issue.9
, pp. 941-955
-
-
Tsanakas, P.1
Koziris, N.2
Papakonstantinou, G.3
-
37
-
-
0026232450
-
A Loop Transformation Theory and an Algorithm to Maximize Parallelism
-
Oct.
-
M. Wolf and M. Lam, "A Loop Transformation Theory and an Algorithm to Maximize Parallelism," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 452-471, Oct. 1991.
-
(1991)
IEEE Trans. Parallel and Distributed Systems
, vol.2
, Issue.4
, pp. 452-471
-
-
Wolf, M.1
Lam, M.2
-
38
-
-
0028434044
-
Automatic Non-Unimodular Loop Transformations for Massive Parallelism
-
J. Xue, "Automatic Non-Unimodular Loop Transformations for Massive Parallelism," Parallel Computing, vol. 20, no. 5, pp. 711-728, 1994.
-
(1994)
Parallel Computing
, vol.20
, Issue.5
, pp. 711-728
-
-
Xue, J.1
-
39
-
-
0003125942
-
Communication-Minimal Tiling of Uniform Dependence Loops
-
J. Xue, "Communication-Minimal Tiling of Uniform Dependence Loops," J. Parallel and Distributed Computing, vol. 42, no. 1, pp. 42-59, 1997.
-
(1997)
J. Parallel and Distributed Computing
, vol.42
, Issue.1
, pp. 42-59
-
-
Xue, J.1
-
40
-
-
0036601528
-
Time-Minimal Tiling when Rise is Larger than Zero
-
J. Xue and W. Cai, "Time-Minimal Tiling when Rise is Larger than Zero," Parallel Computing, vol. 28, no. 6, pp. 915-939, 2002.
-
(2002)
Parallel Computing
, vol.28
, Issue.6
, pp. 915-939
-
-
Xue, J.1
Cai, W.2