SCOPUS Information Retrieval Platform

IEEE Transactions on Parallel and Distributed Systems

Volume 14, Issue 10, 2003, Pages 1021-1034

An Efficient Code Generation Technique for Tiled Iteration Spaces

Authors (3): Goumas, Georgios (a); Athanasaki, Maria (a); Koziris, Nectarios (a)

(a) National Technical University of Athens (Greece)

Author keywords

Code generation; Fourier Motzkin elimination; Loop tiling; Nonunimodular transformations; Supernodes

Indexed keywords

CODES (SYMBOLS); COMPUTATIONAL COMPLEXITY; DIGITAL STORAGE; ITERATIVE METHODS; MATHEMATICAL TRANSFORMATIONS; MATRIX ALGEBRA; PARALLEL PROCESSING SYSTEMS;

CODE GENERATION; MULTILEVEL TILING;

PROGRAM COMPILERS;

EID: 0242578173 PISSN: 10459219 EISSN: None Source Type: Journal
DOI: 10.1109/TPDS.2003.1239870 Document Type: Article

Times cited : (25)

References (40)

1
- 0346032593
- Advanced Code Generation for High Performance Fortran
- chapter 18
- V. Adve and J. Mellor-Crummey, "Advanced Code Generation for High Performance Fortran," Languages, Compilation Techniques, and Run Time Systems for Scalable Parallel Systems, chapter 18, 1997.
- (1997) Languages, Compilation Techniques, and Run Time Systems for Scalable Parallel Systems
- Adve, V.¹ Mellor-Crummey, J.²

2
- 0029373981
- Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors
- Sept.
- A. Agarwal, D. Kranz, and V. Natarajan, "Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors," IEEE Trans. Parallel and Distributed Systems, vol. 6, no. 9, pp. 943-962, Sept. 1995.
- (1995) IEEE Trans. Parallel and Distributed Systems , vol.6 , Issue.9 , pp. 943-962
- Agarwal, A.¹ Kranz, D.² Natarajan, V.³

3
- 0027802136
- Communication Optimization and Code Generation for Distributed Memory Machines
- June
- S.P. Amarasinghe and M.S. Lam, "Communication Optimization and Code Generation for Distributed Memory Machines," Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation, June 1993.
- (1993) Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation
- Amarasinghe, S.P.¹ Lam, M.S.²

4
- 84976766536
- Scanning Polyhedra with DO Loops
- Apr.
- C. Ancourt and F. Irigoin, "Scanning Polyhedra with DO Loops," Proc. Third ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, pp. 39-50, Apr. 1991.
- (1991) Proc. Third ACM SIGPLAN Symp. Principles and Practice of Parallel Programming , pp. 39-50
- Ancourt, C.¹ Irigoin, F.²

5
- 0005784992
- First Steps Towards Optimal Oblique Tile Sizing
- Jan.
- R. Andonov, P. Calland, S. Niar, S. Rajopadhye, and N. Yanev, "First Steps Towards Optimal Oblique Tile Sizing," Proc. Eighth Int'l Workshop Compilers for Parallel Computers, pp. 351-366, Jan. 2000.
- (2000) Proc. Eighth Int'l Workshop Compilers for Parallel Computers , pp. 351-366
- Andonov, R.¹ Calland, P.² Niar, S.³ Rajopadhye, S.⁴ Yanev, N.⁵

6
- 0005314056
- Implementation of Fourier-Motzkin Elimination
- A. Bik and H. Wijshoff, "Implementation of Fourier-Motzkin Elimination," Proc. First Ann. Conf. Advanced School for Computing and Imaging, pp. 377-386, 1995.
- (1995) Proc. First Ann. Conf. Advanced School for Computing and Imaging , pp. 377-386
- Bik, A.¹ Wijshoff, H.²

7
- 0028482686
- (Pen)-Ultimate Tiling?
- P. Boulet, A. Darte, T. Risset, and Y. Robert, "(Pen)-Ultimate Tiling?" INTEGRATION, The VLSI J., vol. 17, pp. 33-51, 1994.
- (1994) INTEGRATION, the VLSI J. , vol.17 , pp. 33-51
- Boulet, P.¹ Darte, A.² Risset, T.³ Robert, Y.⁴

8
- 0242537076
- Programming in Vienna Fortran
- July
- B. Chapman, P. Mehrotra, and H. Zima, "Programming in Vienna Fortran," Proc. Third Workshop Compilers for Parallel Computers, pp. 121-160, July 1992.
- (1992) Proc. Third Workshop Compilers for Parallel Computers , pp. 121-160
- Chapman, B.¹ Mehrotra, P.² Zima, H.³

9
- 0032028841
- Determining the Idle Time of a Tiling: New Results
- Mar.
- F. Desprez, J. Dongarra, and Y. Robert, "Determining the Idle Time of a Tiling: New Results," J. Information Science and Eng., vol. 14, pp. 167-190, Mar. 1997.
- (1997) J. Information Science and Eng. , vol.14 , pp. 167-190
- Desprez, F.¹ Dongarra, J.² Robert, Y.³

10
- 0026891897
- Partitioning and Labeling of Loops by Unimodular Transformations
- July
- E. D'Hollander, "Partitioning and Labeling of Loops by Unimodular Transformations," IEEE Trans. Parallel and Distributed Systems, vol. 3, no. 4, pp. 465-476, July 1992.
- (1992) IEEE Trans. Parallel and Distributed Systems , vol.3 , Issue.4 , pp. 465-476
- D'Hollander, E.¹

11
- 84949487986
- Evaluation of Loop Grouping Methods Based on Orthogonal Projection Spaces
- Aug.
- I. Drossiris, G. Goumas, N. Koziris, G. Papakonstantinou, and P. Tsanakas, "Evaluation of Loop Grouping Methods Based on Orthogonal Projection Spaces," Proc. Int'l Conf. Parallel Processing, pp. 469-476, Aug. 2000.
- (2000) Proc. Int'l Conf. Parallel Processing , pp. 469-476
- Drossiris, I.¹ Goumas, G.² Koziris, N.³ Papakonstantinou, G.⁴ Tsanakas, P.⁵

12
- 0029354475
- Loop Transformations Using Nonunimodular Matrices
- Aug.
- A. Fernandez, J. Llaberia, and M. Valero, "Loop Transformations Using Nonunimodular Matrices," IEEE Trans. Parallel and Distributed Systems, vol. 6, no. 8, pp. 832-840, Aug. 1995.
- (1995) IEEE Trans. Parallel and Distributed Systems , vol.6 , Issue.8 , pp. 832-840
- Fernandez, A.¹ Llaberia, J.² Valero, M.³

13
- 0003734214
- Fortran-D Language Specification
- Dept. of Computer Science, Rice Univ., Dec.
- G. Fox, S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, C. Tseng, and M. Wu, "Fortran-D Language Specification," Technical Report TR-91-170, Dept. of Computer Science, Rice Univ., Dec. 1991.
- (1991) Technical Report , vol.TR-91-170
- Fox, G.¹ Hiranandani, S.² Kennedy, K.³ Koelbel, C.⁴ Kremer, U.⁵ Tseng, C.⁶ Wu, M.⁷

14
- 84948965758
- Compiling Tiled Iteration Spaces for Clusters
- Sept.
- G. Goumas, N. Drosinos, M. Athanasaki, and N. Koziris, "Compiling Tiled Iteration Spaces for Clusters," Proc. IEEE Int'l Conf. Cluster Computing, pp. 360-369, Sept. 2002.
- (2002) Proc. IEEE Int'l Conf. Cluster Computing , pp. 360-369
- Goumas, G.¹ Drosinos, N.² Athanasaki, M.³ Koziris, N.⁴

15
- 84981274197
- Minimizing Completion Time for Loop Tiling with Computation and Communication Overlapping
- Apr.
- G. Goumas, A. Sotiropoulos, and N. Koziris, "Minimizing Completion Time for Loop Tiling with Computation and Communication Overlapping," Proc. IEEE Int'l Parallel and Distributed Processing Symp., Apr. 2001.
- (2001) Proc. IEEE Int'l Parallel and Distributed Processing Symp.
- Goumas, G.¹ Sotiropoulos, A.² Koziris, N.³

16
- 0032069399
- On Supernode Transformation with Minimized Total Running Time
- May
- E. Hodzic and W. Shang, "On Supernode Transformation with Minimized Total Running Time," IEEE Trans. Parallel and Distributed Systems, vol. 9, no. 5, pp. 417-428, May 1998.
- (1998) IEEE Trans. Parallel and Distributed Systems , vol.9 , Issue.5 , pp. 417-428
- Hodzic, E.¹ Shang, W.²

17
- 0036958653
- On Time Optimal Supernode Shape
- Dec.
- E. Hodzic and W. Shang, "On Time Optimal Supernode Shape," IEEE Trans. Parallel and Distributed Systems, vol. 13, no. 12, pp. 1220-1233, Dec. 2002.
- (2002) IEEE Trans. Parallel and Distributed Systems , vol.13 , Issue.12 , pp. 1220-1233
- Hodzic, E.¹ Shang, W.²

18
- 0002549765
- Determining the Idle Time of a Tiling
- Jan.
- K. Hogstedt, L. Carter, and J. Ferrante, "Determining the Idle Time of a Tiling," Principles of Programming Languages, pp. 319-323, Jan. 1997.
- (1997) Principles of Programming Languages , pp. 319-323
- Hogstedt, K.¹ Carter, L.² Ferrante, J.³

19
- 0032642196
- Selecting Tile Shape for Minimal Execution Time
- K. Hogstedt, L. Carter, and J. Ferrante, "Selecting Tile Shape for Minimal Execution Time," Proc. ACM Symp. Parallel Algorithms and Architectures, pp. 201-211, 1999.
- (1999) Proc. ACM Symp. Parallel Algorithms and Architectures , pp. 201-211
- Hogstedt, K.¹ Carter, L.² Ferrante, J.³

20
- 0037962984
- On the Parallel Execution Time of Tiled Loops
- Mar.
- K. Hogstedt, L. Carter, and J. Ferrante, "On the Parallel Execution Time of Tiled Loops," IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 3, pp. 307-321, Mar. 2003.
- (2003) IEEE Trans. Parallel and Distributed Systems , vol.14 , Issue.3 , pp. 307-321
- Hogstedt, K.¹ Carter, L.² Ferrante, J.³

21
- 85026986651
- Supernode Partitioning
- Jan.
- F. Irigoin and R. Triolet, "Supernode Partitioning," Proc. 15th Ann. ACM SIGACT-SIGPLAN Symp. Principles of Programming Languages, pp. 319-329, Jan. 1988.
- (1988) Proc. 15th Ann. ACM SIGACT-SIGPLAN Symp. Principles of Programming Languages , pp. 319-329
- Irigoin, F.¹ Triolet, R.²

22
- 0003783762
- PhD dissertation, Univ. Politecnica de Catalunia
- M. Jimenez, "Multilevel Tiling for Non-Rectangular Iteration Spaces," PhD dissertation, Univ. Politecnica de Catalunia, 1999.
- (1999) Multilevel Tiling for Non-rectangular Iteration Spaces
- Jimenez, M.¹

23
- 0003904906
- The Omega Library Interface Guide
- Computer Science Dept., Univ. of Maryland, College Park, Mar.
- W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman, and D. Wonnacott, "The Omega Library Interface Guide," Technical Report CS-TR-3445, Computer Science Dept., Univ. of Maryland, College Park, Mar. 1995.
- (1995) Technical Report , vol.CS-TR-3445
- Kelly, W.¹ Maslov, V.² Pugh, W.³ Rosser, E.⁴ Shpeisman, T.⁵ Wonnacott, D.⁶

24
- 0011269693
- Pipelined Data-Parallel Algorithms: Part II Design
- Oct.
- C.-T. King, W.-H. Chou, and L. Ni, "Pipelined Data-Parallel Algorithms: Part II Design," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 430-439, Oct. 1991.
- (1991) IEEE Trans. Parallel and Distributed Systems , vol.2 , Issue.4 , pp. 430-439
- King, C.-T.¹ Chou, W.-H.² Ni, L.³

25
- 0003888396
- PhD dissertation, Cornell Univ., Ithaca, New York
- W. Li, "Compiling for NUMA Parallel Machines," PhD dissertation, Cornell Univ., Ithaca, New York, 1993.
- (1993) Compiling for NUMA Parallel Machines
- Li, W.¹

26
- 85031707006
- Non-Unimodular Loop Transformations of Nested Loops
- Nov.
- J. Ramanujam, "Non-Unimodular Loop Transformations of Nested Loops," Proc. Supercomputing '92 Conf., pp. 214-223, Nov. 1992.
- (1992) Proc. Supercomputing '92 Conf. , pp. 214-223
- Ramanujam, J.¹

27
- 0029518016
- Beyond Unimodular Transformations
- Oct.
- J. Ramanujam, "Beyond Unimodular Transformations," J. Supercomputing, vol. 9, no. 4, pp. 365-389, Oct. 1995.
- (1995) J. Supercomputing , vol.9 , Issue.4 , pp. 365-389
- Ramanujam, J.¹

28
- 38249009019
- Tiling Multidimensional Iteration Spaces for Multicomputers
- J. Ramanujam and P. Sadayappan, "Tiling Multidimensional Iteration Spaces for Multicomputers," J. Parallel and Distributed Computing, vol. 16, pp. 108-120, 1992.
- (1992) J. Parallel and Distributed Computing , vol.16 , pp. 108-120
- Ramanujam, J.¹ Sadayappan, P.²

29
- 0026821247
- Independent Partitioning of Algorithms with Uniform Dependencies
- Feb.
- W. Shang and J. Fortes, "Independent Partitioning of Algorithms with Uniform Dependencies," IEEE Trans. Computers, vol. 41, no. 2, pp. 190-206, Feb. 1992.
- (1992) IEEE Trans. Computers , vol.41 , Issue.2 , pp. 190-206
- Shang, W.¹ Fortes, J.²

30
- 0029191426
- Partitioning and Mapping Nested Loops for Linear Array Multicomputers
- J.-P. Sheu and T.-S. Chen, "Partitioning and Mapping Nested Loops for Linear Array Multicomputers," J. Supercomputing, vol. 9, pp. 183-202, 1995.
- (1995) J. Supercomputing , vol.9 , pp. 183-202
- Sheu, J.-P.¹ Chen, T.-S.²

31
- 0026231051
- Partitioning and Mapping Nested Loops on Multiprocessor Systems
- Oct.
- J.-P. Sheu and T.-H. Tai, "Partitioning and Mapping Nested Loops on Multiprocessor Systems," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 430-439, Oct. 1991.
- (1991) IEEE Trans. Parallel and Distributed Systems , vol.2 , Issue.4 , pp. 430-439
- Sheu, J.-P.¹ Tai, T.-H.²

32
- 84966549146
- Enhancing the Performance of Tiled Loop Execution onto Clusters Using Memory Mapped Network Interfaces and Pipelined Schedules
- Apr.
- A. Sotiropoulos, G. Tsoukalas, and N. Koziris, "Enhancing the Performance of Tiled Loop Execution onto Clusters Using Memory Mapped Network Interfaces and Pipelined Schedules," Proc. 2002 Workshop Comm. Architecture for Clusters, and Int'l Parallel and Distributed Processing Symp., Apr. 2002.
- (2002) Proc. 2002 Workshop Comm. Architecture for Clusters, and Int'l Parallel and Distributed Processing Symp.
- Sotiropoulos, A.¹ Tsoukalas, G.² Koziris, N.³

33
- 0029190371
- Advanced Compilation Techniques in the PARADIGM Compiler for Distributed Memory Multicomputers
- July
- E. Su, A. Lain, S. Ramaswamy, D.J. Palermo, E.W. Hodges, and P. Banerjee, "Advanced Compilation Techniques in the PARADIGM Compiler for Distributed Memory Multicomputers," Proc. ACM. Int'l Conf. Supercomputing, July 1995.
- (1995) Proc. ACM. Int'l Conf. Supercomputing
- Su, E.¹ Lain, A.² Ramaswamy, S.³ Palermo, D.J.⁴ Hodges, E.W.⁵ Banerjee, P.⁶

34
- 0000778059
- Generating Efficient Tiled Code for Distributed Memory Machines
- P. Tang and J. Xue, "Generating Efficient Tiled Code for Distributed Memory Machines," Parallel Computing, vol. 26, no. 11, pp. 1369-1410, 2000.
- (2000) Parallel Computing , vol.26 , Issue.11 , pp. 1369-1410
- Tang, P.¹ Xue, J.²

35
- 0034262560
- Chain Grouping: A Method for Partitioning Loops onto Mesh-Connected Processor Arrays
- Sept.
- P. Tsanakas, N. Koziris, and G. Papakonstantinou, "Chain Grouping: A Method for Partitioning Loops onto Mesh-Connected Processor Arrays," IEEE Trans. Parallel and Distributed Systems, vol. 11, no. 9, pp. 941-955, Sept. 2000.
- (2000) IEEE Trans. Parallel and Distributed Systems , vol.11 , Issue.9 , pp. 941-955
- Tsanakas, P.¹ Koziris, N.² Papakonstantinou, G.³

36
- 84976827033
- A Data Locality Optimizing Algorithm
- June
- M. Wolf and M. Lam, "A Data Locality Optimizing Algorithm," Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation, June 1991.
- (1991) Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation
- Wolf, M.¹ Lam, M.²

37
- 0026232450
- A Loop Transformation Theory and an Algorithm to Maximize Parallelism
- Oct.
- M. Wolf and M. Lam, "A Loop Transformation Theory and an Algorithm to Maximize Parallelism," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 4, pp. 452-471, Oct. 1991.
- (1991) IEEE Trans. Parallel and Distributed Systems , vol.2 , Issue.4 , pp. 452-471
- Wolf, M.¹ Lam, M.²

38
- 0028434044
- Automatic Non-Unimodular Loop Transformations for Massive Parallelism
- J. Xue, "Automatic Non-Unimodular Loop Transformations for Massive Parallelism," Parallel Computing, vol. 20, no. 5, pp. 711-728, 1994.
- (1994) Parallel Computing , vol.20 , Issue.5 , pp. 711-728
- Xue, J.¹

39
- 0003125942
- Communication-Minimal Tiling of Uniform Dependence Loops
- J. Xue, "Communication-Minimal Tiling of Uniform Dependence Loops," J. Parallel and Distributed Computing, vol. 42, no. 1, pp. 42-59, 1997.
- (1997) J. Parallel and Distributed Computing , vol.42 , Issue.1 , pp. 42-59
- Xue, J.¹

40
- 0036601528
- Time-Minimal Tiling when Rise is Larger than Zero
- J. Xue and W. Cai, "Time-Minimal Tiling when Rise is Larger than Zero," Parallel Computing, vol. 28, no. 6, pp. 915-939, 2002.
- (2002) Parallel Computing , vol.28 , Issue.6 , pp. 915-939
- Xue, J.¹ Cai, W.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.