SCOPUS Information Retrieval Platform

ACM Transactions on Programming Languages and Systems

Volume 19, Issue 6, 1997, Pages 853-898

Parallelizing Nonnumerical Code with Selective Scheduling and Software Pipelining

a Seoul National University (South Korea)

b IBM T J WATSON RESEARCH CENTER (United States)

Author keywords

Algorithms; Experimentation; Global instruction scheduling; Instruction level parallelism; Languages; Software pipelining; Speculative code motion; Superscalar; VLIW

Indexed keywords

ALGORITHMS; CODES (SYMBOLS); COMPUTER ARCHITECTURE; COMPUTER OPERATING PROCEDURES; COMPUTER PROGRAMMING LANGUAGES; PIPELINE PROCESSING SYSTEMS; PROGRAM COMPILERS; PROGRAM PROCESSORS;

INSTRUCTION LEVEL PARALLELISM; NONNUMERICAL CODE; SELECTIVE SCHEDULING; SOFTWARE PIPELINING; SPECULATIVE CODE MOTION; SUPERSCALAR;

PARALLEL PROCESSING SYSTEMS;

EID: 0031274169 PISSN: 01640925 EISSN: None Source Type: Journal
DOI: 10.1145/267959.269966 Document Type: Article

Times cited : (42)

References (43)

1
- 0004072686
- Addison-Wesley, Reading, Mass.
- AHO, A., SETHI, R., AND ULLMAN, J. 1986. Compilers: Principles, Techniques and Tools. Addison-Wesley, Reading, Mass.
- (1986) Compilers: Principles, Techniques and Tools
- Aho, A.¹ Sethi, R.² Ullman, J.³

2
- 0007941219
- A development environment for horizontal microcode
- AIKEN, A. AND NICOLAU, A. 1988. A development environment for horizontal microcode. IEEE Trans. Softw. Eng. 14, 5 (May), 584-594.
- (1988) IEEE Trans. Softw. Eng. , vol.14 , Issue.5 MAY , pp. 584-594
- Aiken, A.¹ Nicolau, A.²

3
- 0026242244
- Intel i860 processor
- ATKINS, M. 1991. Intel i860 processor. IEEE Micro 11, 24-28.
- (1991) IEEE Micro , vol.11 , pp. 24-28
- Atkins, M.¹

4
- 84976656897
- Global instruction scheduling for superscalar machines
- ACM Press, New York
- BERNSTEIN, D. AND RODEH, M. 1991. Global instruction scheduling for superscalar machines. In Proceedings of the SIGPLAN 1991 Conference on Programming Language Design and Implementation. ACM Press, New York, 241-255.
- (1991) Proceedings of the SIGPLAN 1991 Conference on Programming Language Design and Implementation , pp. 241-255
- Bernstein, D.¹ Rodeh, M.²

5
- 0024057252
- A VLIW architecture for a trace scheduling compiler
- COLWELL, R., NIX, R., O'DONNELL, J., PAPWORTH, D., AND RODMAN, P. 1988. A VLIW architecture for a trace scheduling compiler. IEEE Trans. Comput. 37, 8 (Aug.), 967-979.
- (1988) IEEE Trans. Comput. , vol.37 , Issue.8 AUG , pp. 967-979
- Colwell, R.¹ Nix, R.² O'Donnell, J.³ Papworth, D.⁴ Rodman, P.⁵

6
- 0026243790
- Efficiently computing static single assignment form and the control dependence graph
- CYTRON, R., FERRANTE, J., ROSEN, B., WEGMAN, M., AND ZADECK, F. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Program. Lang. Syst. 13, 4 (Oct.), 451-490.
- (1991) ACM Trans. Program. Lang. Syst. , vol.13 , Issue.4 OCT , pp. 451-490
- Cytron, R.¹ Ferrante, J.² Rosen, B.³ Wegman, M.⁴ Zadeck, F.⁵

7
- 0027590187
- Compiling for Cydra 5
- DEHNERT, J. AND TOWLE, R. 1993. Compiling for Cydra 5. J. Supercomput. 7, 1/2, 181-228.
- (1993) J. Supercomput. , vol.7 , Issue.1-2 , pp. 181-228
- Dehnert, J.¹ Towle, R.²

8
- 0002639275
- Some design ideas for a VLIW architecture for sequential natured software
- North Holland, Amsterdam
- EBCIOǦLU, K. 1988. Some design ideas for a VLIW architecture for sequential natured software. In Parallel Processing (Proceedings of IFIP WG 10.3 Working Conference on Parallel Processing). North Holland, Amsterdam, 3-21.
- (1988) Parallel Processing (Proceedings of IFIP WG 10.3 Working Conference on Parallel Processing) , pp. 3-21
- Ebcioǧlu, K.¹

9
- 0004726746
- Res. Rep. RC-16145, IBM T. J. Watson Research Center, Yorktown Heights, N.Y.
- EBCIOǦLU, K. AND GROVES, R. 1990. Some global compilation optimizations and architectural features for improving performance of superscalars. Res. Rep. RC-16145, IBM T. J. Watson Research Center, Yorktown Heights, N.Y.
- (1990) Some Global Compilation Optimizations and Architectural Features for Improving Performance of Superscalars
- Ebcioǧlu, K.¹ Groves, R.²

10
- 0027986342
- VLIW compilation techniques in a superscalar environment
- ACM Press, New York
- EBCIOǦLU, K., GROVES, R., KIM, K.-C., SILBERMAN, G., AND ZIV, I. 1994. VLIW compilation techniques in a superscalar environment. In Proceedings of the SIGPLAN 1994 conference on Programming Language Design and Implementation. ACM Press, New York, 36-48.
- (1994) Proceedings of the SIGPLAN 1994 Conference on Programming Language Design and Implementation , pp. 36-48
- Ebcioǧlu, K.¹ Groves, R.² Kim, K.-C.³ Silberman, G.⁴ Ziv, I.⁵

11
- 0002106131
- A new compilation technique for parallelizing loops with unpredictable branches on a VLIW architecture
- MIT Press, Cambridge, Mass.
- EBCIOǦLU, K. AND NAKATANI, T. 1989. A new compilation technique for parallelizing loops with unpredictable branches on a VLIW architecture. In Languages and Compilers for Parallel Computing. MIT Press, Cambridge, Mass., 213-229.
- (1989) Languages and Compilers for Parallel Computing , pp. 213-229
- Ebcioǧlu, K.¹ Nakatani, T.²

12
- 24844467991
- Tech. Rep. 89-31, Univ. of California, Irvine, Calif.
- EBCIOǦLU, K. AND NICOLAU, A. 1989. Percolation scheduling with resource constraints. Tech. Rep. 89-31, Univ. of California, Irvine, Calif.
- (1989) Percolation Scheduling with Resource Constraints
- Ebcioǧlu, K.¹ Nicolau, A.²

13
- 0003831259
- Ph.D. thesis, Yale Univ., New Haven, Conn.
- ELLIS, J. 1985. Bulldog: A compiler for VLIW architecture. Ph.D. thesis, Yale Univ., New Haven, Conn.
- (1985) Bulldog: A Compiler for VLIW Architecture
- Ellis, J.¹

14
- 0023385308
- The program dependency graph and its use in optimization
- FERRANTE, J., OTTENSTEIN, K., AND WARREN, J. 1987. The program dependency graph and its use in optimization. ACM Trans. Program. Lang. Syst. 9, 3, 319-349.
- (1987) ACM Trans. Program. Lang. Syst. , vol.9 , Issue.3 , pp. 319-349
- Ferrante, J.¹ Ottenstein, K.² Warren, J.³

15
- 0026918392
- Predicting conditional branch directions from previous runs of a program
- ACM Press, New York
- FISHER, J. AND FREUDENBERGER, S. 1992. Predicting conditional branch directions from previous runs of a program. In Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM Press, New York, 85-95.
- (1992) Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 85-95
- Fisher, J.¹ Freudenberger, S.²

16
- 0028461905
- Avoidance and suppression of compensation code in a trace scheduling compiler
- FREUDENBERGER, S. M., GROSS, T. R., AND LOWNEY, P. 1994. Avoidance and suppression of compensation code in a trace scheduling compiler. ACM Trans. Program. Lang. Syst. 16, 4, 1156-1214.
- (1994) ACM Trans. Program. Lang. Syst. , vol.16 , Issue.4 , pp. 1156-1214
- Freudenberger, S.M.¹ Gross, T.R.² Lowney, P.³

17
- 0029485275
- Performance issues in correlated branch schemes
- IEEE Computer Society Press, Los Alamitos, Calif.
- GLOY, N., SMITH, M., AND YOUNG, C. 1995. Performance issues in correlated branch schemes. In Proceedings of the 28th Annual International Symposium on Microarchitecture. IEEE Computer Society Press, Los Alamitos, Calif., 3-14.
- (1995) Proceedings of the 28th Annual International Symposium on Microarchitecture , pp. 3-14
- Gloy, N.¹ Smith, M.² Young, C.³

18
- 0025413768
- Region scheduling: An approach for detecting and redistributing parallelism
- GUPTA, R. AND SOFFA, M. 1990. Region scheduling: An approach for detecting and redistributing parallelism. IEEE Trans. Softw. Eng. 16, 4 (Apr.), 421-431.
- (1990) IEEE Trans. Softw. Eng. , vol.16 , Issue.4 APR , pp. 421-431
- Gupta, R.¹ Soffa, M.²

19
- 0027595384
- The superblock: An effective technique for VLIW and superscalar compilation
- HWU, W.-M., MAHLKE, S., CHEN, W., CHANG, P., WARTER, N., BRINGMANN, R., OUELLETTE, R., HANK, R., KIYOHARA, T., HAAB, G., HOLM, J., AND LAVERY, D. 1993. The superblock: An effective technique for VLIW and superscalar compilation. J. Supercomput. 7, 1/2, 229-248.
- (1993) J. Supercomput. , vol.7 , Issue.1-2 , pp. 229-248
- Hwu, W.-M.¹ Mahlke, S.² Chen, W.³ Chang, P.⁴ Warter, N.⁵ Bringmann, R.⁶ Ouellette, R.⁷ Hank, R.⁸ Kiyohara, T.⁹ Haab, G.¹⁰ Holm, J.¹¹ Lavery, D.¹²

20
- 0347095853
- A special issue on IBM RISC System/6000
- IBM. 1990. A special issue on IBM RISC System/6000. IBM J. Res. Devel. 34, 1 (Jan.).
- (1990) IBM J. Res. Devel. , vol.34 , Issue.1 JAN

21
- 84976816559
- Circular scheduling: A new technique to perform software pipelining
- ACM Press, New York
- JAIN, S. 1991. Circular scheduling: A new technique to perform software pipelining. In Proceedings of the SIGPLAN 1991 Conference on Programming Language Design and Implementation. ACM Press, New York, 219-228.
- (1991) Proceedings of the SIGPLAN 1991 Conference on Programming Language Design and Implementation , pp. 219-228
- Jain, S.¹

22
- 0024861035
- Available instruction-level parallelism for superscalar and superpipelined machines
- ACM Press, New York
- JOUPPI, N. AND WALL, D. 1989. Available instruction-level parallelism for superscalar and superpipelined machines. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems. ACM Press, New York, 272-282.
- (1989) Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 272-282
- Jouppi, N.¹ Wall, D.²

23
- 0042650298
- Software pipelining: An effective scheduling technique for VLIW machines
- ACM Press, New York
- LAM, M. 1988. Software pipelining: An effective scheduling technique for VLIW machines. In Proceedings of the SIGPLAN 1988 Conference on Programming Language Design and Implementation. ACM Press, New York, 318-328.
- (1988) Proceedings of the SIGPLAN 1988 Conference on Programming Language Design and Implementation , pp. 318-328
- Lam, M.¹

24
- 0347726340
- Ph.D. thesis, Univ. of Maryland, College Park, Md.
- MOON, S.-M. 1993. Compile-time parallelization of non-numerical code; VLIW and superscalar. Ph.D. thesis, Univ. of Maryland, College Park, Md.
- (1993) Compile-time Parallelization of Non-numerical Code; VLIW and Superscalar
- Moon, S.-M.¹

25
- 0031237555
- Increasing cache bandwidth using multiport caches for exploiting ILP in non-numerical codes
- MOON, S.-M. 1997. Increasing cache bandwidth using multiport caches for exploiting ILP in non-numerical codes. IEE Proceedings - Computers and Digital Techniques 144, 5 (Sept.), 295-303.
- (1997) IEE Proceedings - Computers and Digital Techniques , vol.144 , Issue.5 SEPT , pp. 295-303
- Moon, S.-M.¹

26
- 0029352611
- Generalized multiway branch unit for VLIW microprocessors
- MOON, S.-M. AND CARSON, S. 1995. Generalized multiway branch unit for VLIW microprocessors. IEEE Trans. Parall. Distrib. Syst. 6, 8 (Aug.), 850-862.
- (1995) IEEE Trans. Parall. Distrib. Syst. , vol.6 , Issue.8 AUG , pp. 850-862
- Moon, S.-M.¹ Carson, S.²

27
- 0028092560
- A study on the number of memory ports in multiple instruction issue machines
- IEEE, New York
- MOON, S.-M. AND EBCIOǦLU, K. 1993. A study on the number of memory ports in multiple instruction issue machines. In Proceedings of the 26th Annual International Symposium on Microarchitecture. IEEE, New York, 49-58.
- (1993) Proceedings of the 26th Annual International Symposium on Microarchitecture , pp. 49-58
- Moon, S.-M.¹ Ebcioǧlu, K.²

28
- 0030703876
- Performance analysis of tree VLIW architecture for exploiting branch ILP in non-numerical code
- ACM, New York
- MOON, S.-M. AND EBCIOǦLU, K. 1997. Performance analysis of tree VLIW architecture for exploiting branch ILP in non-numerical code. In Proceedings of the 1997 International Conference on Supercomputing. ACM, New York, 301-308.
- (1997) Proceedings of the 1997 International Conference on Supercomputing , pp. 301-308
- Moon, S.-M.¹ Ebcioǧlu, K.²

29
- 0346465540
- Tech. Rep. SNU-EE-TR-1997-7, Seoul National Univ., Seoul, Korea
- MOON, S.-M., KIM, S., PARK, J., AND EBCIOǦLU, K. 1997. Unrolling-based copy coalescing. Tech. Rep. SNU-EE-TR-1997-7, Seoul National Univ., Seoul, Korea.
- (1997) Unrolling-based Copy Coalescing
- Moon, S.-M.¹ Kim, S.² Park, J.³ Ebcioǧlu, K.⁴

30
- 84911474195
- Combining as a compilation technique for a VLIW architecture
- IEEE, New York
- NAKATANI, T. AND EBCIOǦLU, K. 1989. Combining as a compilation technique for a VLIW architecture. In Proceedings of the 22nd Annual Workshop on Microprogramming. IEEE, New York, 43-55.
- (1989) Proceedings of the 22nd Annual Workshop on Microprogramming , pp. 43-55
- Nakatani, T.¹ Ebcioǧlu, K.²

31
- 0027659775
- Making compaction based parallelization affordable
- NAKATANI, T. AND EBCIOǦLU, K. 1993. Making compaction based parallelization affordable. IEEE Trans. Parall. Distrib. Syst. 4, 9 (Sept.), 1014-1029.
- (1993) IEEE Trans. Parall. Distrib. Syst. , vol.4 , Issue.9 SEPT , pp. 1014-1029
- Nakatani, T.¹ Ebcioǧlu, K.²

32
- 0022874874
- Advanced compiler optimizations for supercomputers
- PADUA, D. AND WOLFE, M. 1986. Advanced compiler optimizations for supercomputers. Commun. ACM 29, 12 (Dec.), 1184-1201.
- (1986) Commun. ACM , vol.29 , Issue.12 DEC , pp. 1184-1201
- Padua, D.¹ Wolfe, M.²

33
- 0031378564
- Evaluation of scheduling techniques on a SPARC-based VLIW testbed
- IEEE, New York
- PARK, S., SHIM, S., AND MOON, S.-M. 1997. Evaluation of scheduling techniques on a SPARC-based VLIW testbed. In Proceedings of the 30th Annual International Symposium on Microarchitecture. IEEE, New York.
- (1997) Proceedings of the 30th Annual International Symposium on Microarchitecture
- Park, S.¹ Shim, S.² Moon, S.-M.³

34
- 0021817378
- Reduced instruction set computers
- PATTERSON, D. 1985. Reduced instruction set computers. Commun. ACM 28, 1 (Jan.), 8-21.
- (1985) Commun. ACM , vol.28 , Issue.1 JAN , pp. 8-21
- Patterson, D.¹

35
- 0024480706
- The Cydra 5 departmental supercomputer: Design philosophies, decisions, and trade-offs
- RAU, B. 1989. The Cydra 5 departmental supercomputer: Design philosophies, decisions, and trade-offs. IEEE Comput. 22, 1 (Jan.), 12-34.
- (1989) IEEE Comput. , vol.22 , Issue.1 JAN , pp. 12-34
- Rau, B.¹

36
- 0002017307
- Instruction-level parallel processing: History, overview, and perspective
- RAU, B. AND FISHER, J. 1993. Instruction-level parallel processing: History, overview, and perspective. J. Supercomput. 7, 1/2, 9-50.
- (1993) J. Supercomput. , vol.7 , Issue.1-2 , pp. 9-50
- Rau, B.¹ Fisher, J.²

37
- 0003015894
- Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing
- IEEE, New York
- RAU, B. AND GLAESER, C. 1981. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proceedings of the 14th Annual Workshop on Microprogramming. IEEE, New York, 183-198.
- (1981) Proceedings of the 14th Annual Workshop on Microprogramming , pp. 183-198
- Rau, B.¹ Glaeser, C.²

38
- 0347926043
- Tech. Rep. 17, Courant Inst. of Computer Science, New York Univ., New York
- SCHWARTZ, J. AND SHARIR, M. 1979. A design for optimizations of the bit vectoring class. Tech. Rep. 17, Courant Inst. of Computer Science, New York Univ., New York.
- (1979) A Design for Optimizations of the Bit Vectoring Class
- Schwartz, J.¹ Sharir, M.²

39
- 0002228438
- An architectural framework for supporting heterogeneous instruction set architectures
- SILBERMAN, G. AND EBCIOǦLU, K. 1993. An architectural framework for supporting heterogeneous instruction set architectures. IEEE Comput. 26, 6 (June), 39-56.
- (1993) IEEE Comput. , vol.26 , Issue.6 JUNE , pp. 39-56
- Silberman, G.¹ Ebcioǧlu, K.²

40
- 0002790769
- Alpha AXP architecture
- SITES, R. 1993. Alpha AXP architecture. Commun. ACM 36, 2 (Feb.), 33-44.
- (1993) Commun. ACM , vol.36 , Issue.2 FEB , pp. 33-44
- Sites, R.¹

41
- 0027028425
- Efficient superscalar performance through boosting
- ACM Press, New York
- SMITH, M., HOROWITZ, M., AND LAM, M. 1992. Efficient superscalar performance through boosting. In Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM Press, New York, 248-259.
- (1992) Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 248-259
- Smith, M.¹ Horowitz, M.² Lam, M.³

42
- 0024859960
- Limits on multiple instruction issue
- ACM Press, New York
- SMITH, M., JOHNSON, M., AND HOROWITZ, M. 1989. Limits on multiple instruction issue. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems. ACM Press, New York, 290-302.
- (1989) Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 290-302
- Smith, M.¹ Johnson, M.² Horowitz, M.³

43
- 0347726345
- Res. Rep. RC 11974, IBM T.J. Watson Research Center, Yorktown Heights, N.Y.
- WARREN, H., AUSLANDER, M., CHAITIN, G., CHIBIB, A., HOPKINS, M., AND MACKAY, A. J. 1986. Final code generation in the PL.8 compiler. Res. Rep. RC 11974, IBM T.J. Watson Research Center, Yorktown Heights, N.Y.
- (1986) Final Code Generation in the PL.8 Compiler
- Warren, H.¹ Auslander, M.² Chaitin, G.³ Chibib, A.⁴ Hopkins, M.⁵ MacKay, A.J.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.