SCOPUS Information Search Platform

International Journal of Parallel Programming

Volume 29, Issue 5, 2001, Pages 545-581

Optimized unrolling of nested loops

(1) Sarkar, Vivek a

a IBM T J WATSON RESEARCH CENTER (United States)

Author keywords

Loop transformations; Loop unrolling; Unroll factors; Unroll and jam

Indexed keywords

LOOP TRANSFORMATIONS; LOOP UNROLLING; UNROLL FACTORS; UNROLL-AND-JAM;

ALGORITHMS; C (PROGRAMMING LANGUAGE); CACHE MEMORY; CODES (SYMBOLS); COMPUTER PROGRAMMING; FORTRAN (PROGRAMMING LANGUAGE); FUNCTIONS; HIERARCHICAL SYSTEMS; ITERATIVE METHODS; JAVA PROGRAMMING LANGUAGE; MATHEMATICAL MODELS; MATHEMATICAL TRANSFORMATIONS; OPTIMIZATION;

PROGRAM COMPILERS;

EID: 0348126362 PISSN: 08857458 EISSN: None Source Type: Journal
DOI: None Document Type: Article

Times cited : (32)

References (31)

1
- 0001775038
- A catalogue of optimizing transformations
- Prentice-Hall
- F. E. Allen and J. Cocke, A catalogue of optimizing transformations, in Design and Optimization of Compilers, Prentice-Hall, pp. 1-30 (1972).
- (1972) Design and Optimization of Compilers , pp. 1-30
- Allen, F.E.¹ Cocke, J.²

2
- 0018440818
- Unrolling Loops in Fortran
- March
- J. J. Dongarra and A. R. Hinds, Unrolling Loops in Fortran, Software - Practice and Experience 9(3):219-226 (March 1979).
- (1979) Software - Practice and Experience , vol.9 , Issue.3 , pp. 219-226
- Dongarra, J.J.¹ Hinds, A.R.²

3
- 84976817232
- Parallel Processing: A Smart Compiler and a Dumb Machine
- June
- J. A. Fisher, J. R. Ellis, J. C. Ruttenberg, and A. Nicolau, Parallel Processing: A Smart Compiler and a Dumb Machine, Proc. ACM Symp. Compiler Construction, pp. 37-47 (June 1984).
- (1984) Proc. ACM Symp. Compiler Construction , pp. 37-47
- Fisher, J.A.¹ Ellis, J.R.² Ruttenberg, J.C.³ Nicolau, A.⁴

4
- 0028743437
- Compiler Transformations for High-Performance Computing
- December
- D. F. Bacon, S. L. Graham, and O. J. Sharp, Compiler Transformations for High-Performance Computing, ACM Computing Surveys 26(4):345-420 (December 1994).
- (1994) ACM Computing Surveys , vol.26 , Issue.4 , pp. 345-420
- Bacon, D.F.¹ Graham, S.L.² Sharp, O.J.³

5
- 0028277074
- Scalar Replacement in the Presence of Conditional Control Flow
- January
- Steve Carr and Ken Kennedy, Scalar Replacement in the Presence of Conditional Control Flow, Software - Practice and Experience (1):51-77 (January 1994).
- (1994) Software - Practice and Experience , Issue.1 , pp. 51-77
- Carr, S.¹ Kennedy, K.²

6
- 84976789640
- Memory bandwidth optimizations for wide-bus machines
- Wailea, Hawaii, January
- Michael J. Alexander, Mark W. Bailey, Bruce R. Childers, Jack W. Davidson, and Sanjay Jinturkar, Memory bandwidth optimizations for wide-bus machines, Proc. 26th Hawaii Int'l. Conf. Syst. Sci., Wailea, Hawaii, pp. 466-475 (January 1993).
- (1993) Proc. 26th Hawaii Int'l. Conf. Syst. Sci. , pp. 466-475
- Alexander, M.J.¹ Bailey, M.W.² Childers, B.R.³ Davidson, J.W.⁴ Jinturkar, S.⁵

7
- 0004033521
- Ph.D. thesis, Stanford University March
- T. C. Mowry, Tolerating Latency Through Software-Controlled Data Prefetching, Ph.D. thesis, Stanford University (March 1994).
- (1994) Tolerating Latency through Software-Controlled Data Prefetching
- Mowry, T.C.¹

8
- 10844249011
- Compiler Solutions for the Stale-Data and False-Sharing Problems
- IBM Santa Teresa Laboratory April
- Mauricio Breternitz, Michael Lai, Vivek Sarkar, and Barbara Simons, Compiler Solutions for the Stale-Data and False-Sharing Problems, Technical report, TR 03.466, IBM Santa Teresa Laboratory (April 1993).
- (1993) Technical Report , vol.TR 03.466
- Breternitz, M.¹ Lai, M.² Sarkar, V.³ Simons, B.⁴

9
- 0028549474
- Improving the Ratio of Memory Operations to Floating-Point Operations in Loops
- November
- Steve Carr and Ken Kennedy, Improving the Ratio of Memory Operations to Floating-Point Operations in Loops, ACM TOPLAS 16(4) (November 1994).
- (1994) ACM TOPLAS , vol.16 , Issue.4
- Carr, S.¹ Kennedy, K.²

10
- 84957718214
- Aggressive Loop Unrolling in a Retargetable, Optimizing Compiler
- Springer-Verlag, New York April
- Jack W. Davidson and Sanjay Jinturkar, Aggressive Loop Unrolling in a Retargetable, Optimizing Compiler, In Compiler Construction, Proc. Sixth Int'l. Conf. Linkoping, Sweden, Vol. 1060, Lecture Notes in Computer Science, Springer-Verlag, New York (April 1996).
- (1996) Compiler Construction, Proc. Sixth Int'l. Conf. Linkoping, Sweden, Vol. 1060, Lecture Notes in Computer Science , vol.1060
- Davidson, J.W.¹ Jinturkar, S.²

11
- 0025447908
- Improving Register Allocation for Subscripted Variables
- White Plains, New York, June
- David Callahan, Steve Carr, and Ken Kennedy, Improving Register Allocation for Subscripted Variables, Proc. ACM SIGPLAN Conf. Prog. Lang. Design and Implementation, White Plains, New York, pp. 53-65 (June 1990).
- (1990) Proc. ACM SIGPLAN Conf. Prog. Lang. Design and Implementation , pp. 53-65
- Callahan, D.¹ Carr, S.² Kennedy, K.³

12
- 0031380928
- Unroll-and-Jam Using Uniformly Generated Sets
- December
- S. Carr and Y. Guan, Unroll-and-Jam Using Uniformly Generated Sets, Proc. MICRO-30, pp. 349-357 (December 1997).
- (1997) Proc. MICRO-30 , pp. 349-357
- Carr, S.¹ Guan, Y.²

13
- 0003690936
- Ph.D. thesis, Rice University, Rice COMP TR89-93 May
- Allan K. Porterfield, Software Methods for Improvement of Cache Performance on Supercomputer Applications, Ph.D. thesis, Rice University, Rice COMP TR89-93 (May 1989).
- (1989) Software Methods for Improvement of Cache Performance on Supercomputer Applications
- Porterfield, A.K.¹

14
- 84976827033
- A Data Locality Optimization Algorithm
- June
- Michael E. Wolf and Monica S. Lam, A Data Locality Optimization Algorithm, Proc. ACM SIGPLAN Symp. Progr. Lang. Design and Implementation, pp. 30-44 (June 1991).
- (1991) Proc. ACM SIGPLAN Symp. Progr. Lang. Design and Implementation , pp. 30-44
- Wolf, M.E.¹ Lam, M.S.²

15
- 0031140581
- Automatic Selection of High Order Transformations in the IBM XL Fortran Compilers
- May
- Vivek Sarkar, Automatic Selection of High Order Transformations in the IBM XL Fortran Compilers. IBM J. Res. Dev. 41(3) (May 1997).
- (1997) IBM J. Res. Dev. , vol.41 , Issue.3
- Sarkar, V.¹

16
- 0004062640
- Pitman, London and The MIT Press, Cambridge, Massachusetts In the series, Research Monographs in Parallel and Distributed Computing
- Michael J. Wolfe, Optimizing Supercompilers for Supercomputers, Pitman, London and The MIT Press, Cambridge, Massachusetts (1989). In the series, Research Monographs in Parallel and Distributed Computing.
- (1989) Optimizing Supercompilers for Supercomputers
- Wolfe, M.J.¹

17
- 0026991030
- A General Framework for Iteration-Reordering Loop Transformations
- June
- Vivek Sarkar and Radhika Thekkath, A General Framework for Iteration-Reordering Loop Transformations, Proc. ACM SIGPLAN Conf. Prog. Lang. Design and Implementation, pp. 175-187 (June 1992).
- (1992) Proc. ACM SIGPLAN Conf. Prog. Lang. Design and Implementation , pp. 175-187
- Sarkar, V.¹ Thekkath, R.²

18
- 85015240805
- On Estimating and Enhancing Cache Effectiveness
- Jeanne Ferrante, Vivek Sarkar, and Wendy Thrash, On Estimating and Enhancing Cache Effectiveness, Lecture Notes in Computer Science (589):328-343 (1991).
- (1991) Lecture Notes in Computer Science , Issue.589 , pp. 328-343
- Ferrante, J.¹ Sarkar, V.² Thrash, W.³

19
- 57649161728
- Santa Clara, California August
- Proc. Fourth Int'l. Workshop Lang. Compilers for Parallel Computing, Santa Clara, California (August 1991).
- (1991) Proc. Fourth Int'l. Workshop Lang. Compilers for Parallel Computing

20
- 0028768013
- Iterative Modulo Scheduling: An Algorithm for Software Pipelining Loops
- San Jose, California, November
- B. Ramakrishna Rau, Iterative Modulo Scheduling: An Algorithm for Software Pipelining Loops, Proc. 27th Ann. Int'l. Symp. Microarchitecture, San Jose, California, pp. 63-74 (November 1994).
- (1994) Proc. 27th Ann. Int'l. Symp. Microarchitecture , pp. 63-74
- Ramakrishna Rau, B.¹

21
- 10844225028
- Don't Waste Those Cycles: An In-Depth Look at Scheduling Instructions in Basic Blocks and Loops
- August
- Vivek Sarkar and Barbara Simons, Don't Waste Those Cycles: An In-Depth Look at Scheduling Instructions in Basic Blocks and Loops, Video Lecture in University Video Communication's Distinguished Lecture Series IX (August 1994).
- (1994) Video Lecture in University Video Communication's Distinguished Lecture Series IX
- Sarkar, V.¹ Simons, B.²

22
- 0024700878
- Determining Average Program Execution Times and their Variance
- July
- Vivek Sarkar, Determining Average Program Execution Times and their Variance, Proc. SIGPLAN Conf. Prog. Lang. Design and Implementation 24(7):298-312 (July 1989).
- (1989) Proc. SIGPLAN Conf. Prog. Lang. Design and Implementation , vol.24 , Issue.7 , pp. 298-312
- Sarkar, V.¹

23
- 0026213832
- Automatic Partitioning of a Program Dependence Graph into Parallel Tasks
- Vivek Sarkar, Automatic Partitioning of a Program Dependence Graph into Parallel Tasks, IBM J. Res. Dev 35(5/6) (1991).
- (1991) IBM J. Res. Dev , vol.35 , Issue.5-6
- Sarkar, V.¹

24
- 10844279641
- The Standard Performance Evaluation Corporation, SPEC CPU95 Benchmarks, http://open.specbench.org/osg/cpu95/ (1997).
- (1997) SPEC CPU95 Benchmarks

25
- 3342982100
- POWER2 and PowerPC
- September
- IBM Corporation, POWER2 and PowerPC, Special issue of IBM J. Res. Dev. 38(5): 489-648 (September 1994).
- (1994) IBM J. Res. Dev. , vol.38 , Issue.5 SPEC. ISSUE , pp. 489-648

26
- 0027927574
- An Optimal Asynchronous Scheduling Algorithm for Software Cache Consistency
- January
- Barbara Simons, Vivek Sarkar, Mauricio Breternitz Jr., and Michael Lai, An Optimal Asynchronous Scheduling Algorithm for Software Cache Consistency, Proc. Hawaii Int'l. Conf. Syst. Sci. (January 1994).
- (1994) Proc. Hawaii Int'l. Conf. Syst. Sci.
- Simons, B.¹ Sarkar, V.² Breternitz Jr., M.³ Lai, M.⁴

27
- 84862485598
- Improving the Ratio of Memory Operations to Floating-Point operations in loops
- Copy of review can be found in the ACM digital library
- Max Hailperin, Improving the Ratio of Memory Operations to Floating-Point operations in loops, Computing Reviews. Copy of review can be found in the ACM digital library at http://www.acm.org/pubs/citations/journals/toplas/1994-16-6/p1768-carr/.
- Computing Reviews
- Hailperin, M.¹

28
- 0023601205
- A Study of Scalar Compilation Techniques for Pipelined Supercomputers
- October
- S. Weiss and J. E. Smith, A Study of Scalar Compilation Techniques for Pipelined Supercomputers, Proc. Second Int'l Conf. Architectural Support Progr. Lang. Oper. Syst. (ASPLOS), pp. 105-109 (October 1987).
- (1987) Proc. Second Int'l Conf. Architectural Support Progr. Lang. Oper. Syst. (ASPLOS) , pp. 105-109
- Weiss, S.¹ Smith, J.E.²

29
- 85045548264
- Software Pipelining: An Evaluation of Enhanced Pipelining
- December
- Reese B. Jones and Vicki H. Allan, Software Pipelining: An Evaluation of Enhanced Pipelining, Proc. 24th Ann. Int'l. Symp. Microarchitecture, pp. 82-92 (December 1990).
- (1990) Proc. 24th Ann. Int'l. Symp. Microarchitecture , pp. 82-92
- Jones, R.B.¹ Allan, V.H.²

30
- 0023562016
- GURPR - A Method for Global Software Pipelining
- December
- Bogong Su, Shiyuan Ding, Jian Wang, and Jinshi Xia, GURPR - A Method for Global Software Pipelining, Proc. 20th Ann. Int'l. Symp. Microarchitecture, pp. 88-96 (December 1986).
- (1986) Proc. 20th Ann. Int'l. Symp. Microarchitecture , pp. 88-96
- Su, B.¹ Ding, S.² Wang, J.³ Xia, J.⁴

31
- 0029487787
- Unrolling-Based Optimizations for Modulo Scheduling
- December
- Daniel M. Lavery and Wen-Mei W. Hwu, Unrolling-Based Optimizations for Modulo Scheduling, Proc. MICRO-28, pp. 327-337 (December 1995).
- (1995) Proc. MICRO-28 , pp. 327-337
- Lavery, D.M.¹ Hwu, W.-M.W.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.