2009, Pages 219-228

Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors

Author keywords

Compile time optimization; Dynamic scheduling; Runtime optimization

Indexed keywords

AUTOMATIC PARALLELIZATION; CHOLESKY; CHOLESKY DECOMPOSITION; COMPILATION TECHNOLOGY; COMPILE TIME; COMPILE-TIME OPTIMIZATION; COMPILER-ASSISTED; DYNAMIC EXTRACTION; DYNAMIC SCHEDULING; INPUT PROGRAMS; INPUT-AFFINE; LOAD IMBALANCE; LOAD-BALANCED; LOOP NESTS; LU DECOMPOSITION; MULTI-CORE PROCESSOR; MULTI-CORE SYSTEMS; PARALLEL CODE; PARALLEL EXECUTIONS; PARALLELIZATION; PROCESSOR CORES; RUNTIME; RUNTIME OPTIMIZATION;

EID: 67650069905     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1504176.1504209     Document Type: Conference Paper
Times cited : (35)

References (48)
  • 1
    • 0016313256
    • A comparison of list schedules for parallel processing systems
    • T. L. Adam, K. M. Chandy, and J. R. Dickson. A comparison of list schedules for parallel processing systems. Commun. ACM, 17(12):685-690, 1974.
    • (1974) Commun. ACM , vol.17 , Issue.12 , pp. 685-690
    • Adam, T.L.1    Chandy, K.M.2    Dickson, J.R.3
  • 2
    • 0023438847
    • Automatic translation of Fortran programs to vector form
    • DOI 10.1145/29873.29875
    • R. Allen and K. Kennedy. Automatic translation of Fortran programs to vector form. ACM Trans. on Programming Languages and Systems, 9(4):491-542, 1987. (Pubitemid 18531687)
    • (1987) ACM Transactions on Programming Languages and Systems , vol.9 , Issue.4 , pp. 491-542
    • Allen, R.1    Kennedy, K.2
  • 3
    • 84976766536
    • Scanning polyhedra with do loops
    • C. Ancourt and F. Irigoin. Scanning polyhedra with do loops. In PPoPP'91, pages 39-50, 1991.
    • (1991) PPoPP'91 , pp. 39-50
    • Ancourt, C.1    Irigoin, F.2
  • 4
    • 10444289646
    • Code generation in the polyhedral model is easier than you think
    • C. Bastoul. Code generation in the polyhedral model is easier than you think. In PACT'04, pages 7-16, 2004.
    • (2004) PACT'04 , pp. 7-16
    • Bastoul, C.1
  • 10
    • 0032066690
    • Loop parallelization algorithms: From parallelism extraction to code generation
    • PII S0167819198000209
    • P. Boulet, A. Darte, G.-A. Silber, and F. Vivien. Loop parallelization algorithms: From parallelism extraction to code generation. Parallel Computing, 24(3-4):421-444, 1998. (Pubitemid 128413646)
    • (1998) Parallel Computing , vol.24 , Issue.3-4 , pp. 421-444
    • Boulet, P.1    Darte, A.2    Silber, G.-A.3    Vivien, F.4
  • 16
    • 0342782260
    • Combining retiming and scheduling techniques for loop parallelization and loop tiling
    • A. Darte, G.-A. Silber, and F. Vivien. Combining retiming and scheduling techniques for loop parallelization and loop tiling. Parallel Processing Letters, 7(4):379-392, 1997. (Pubitemid 127732656)
    • (1997) Parallel Processing Letters , vol.7 , Issue.4 , pp. 379-392
    • Darte, A.1    Silber, G.-A.2    Vivien, F.3
  • 17
    • 0031358458
    • Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs
    • A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. IJPP, 25(6):447-496, Dec. 1997. (Pubitemid 127507526)
    • (1997) International Journal of Parallel Programming , vol.25 , Issue.6 , pp. 447-496
    • Darte, A.1    Vivien, F.2
  • 19
    • 0026109335
    • Dataflow analysis of array and scalar references
    • P. Feautrier. Dataflow analysis of array and scalar references. IJPP, 20(1):23-53, 1991.
    • (1991) IJPP , vol.20 , Issue.1 , pp. 23-53
    • Feautrier, P.1
  • 20
    • 0026933251
    • Some efficient solutions to the affine scheduling problem. I. One-dimensional time
    • P. Feautrier. Some efficient solutions to the affine scheduling problem, part I: one-dimensional time. IJPP, 21(5):313-348, 1992. (Pubitemid 23705312)
    • (1992) International Journal of Parallel Programming , vol.21 , Issue.5 , pp. 313-347
    • Feautrier, P.1
  • 21
    • 0001448065
    • Some efficient solutions to the affine scheduling problem, part II: Multidimensional time
    • P. Feautrier. Some efficient solutions to the affine scheduling problem, part II: multidimensional time. IJPP, 21(6):389-420, 1992.
    • (1992) IJPP , vol.21 , Issue.6 , pp. 389-420
    • Feautrier, P.1
  • 22
    • 84957027384
    • Automatic parallelization in the polytope model
    • P. Feautrier. Automatic parallelization in the polytope model. In The Data Parallel Programming Model, pages 79-103, 1996.
    • (1996) The Data Parallel Programming Model , pp. 79-103
    • Feautrier, P.1
  • 26
    • 0025539983
    • Parallel processing of near fine grain tasks using static scheduling on OSCAR (Optimally Scheduled Advanced Multiprocessor)
    • Proc Supercomput 90
    • H. Kasahara, H. Honda, and S. Narita. Parallel processing of near fine grain tasks using static scheduling on OSCAR (Optimally Scheduled Advanced Multiprocessor). In Supercomputing'90: Proceedings of the 1990 ACM/IEEE conference on Supercomputing, pages 856-864, Washington, DC, USA, 1990. IEEE Computer Society. (Pubitemid 21675205)
    • (1990) Supercomputing'90: Proceedings of the 1990 ACM/IEEE conference on Supercomputing , pp. 856-864
    • Kasahara, H.1    Honda, H.2    Narita, S.3
  • 27
    • 0002050141
    • Static scheduling algorithms for allocating directed task graphs to multiprocessors
    • Y.-K. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Comput. Surv., 31(4):406-471, 1999.
    • (1999) ACM Comput. Surv. , vol.31 , Issue.4 , pp. 406-471
    • Kwok, Y.-K.1    Ahmad, I.2
  • 28
    • 0027829921
    • Improving the performance of runtime parallelization
    • S.-T. Leung and J. Zahorjan. Improving the performance of runtime parallelization. SIGPLAN Not., 28(7):83-91, 1993.
    • (1993) SIGPLAN Not. , vol.28 , Issue.7 , pp. 83-91
    • Leung, S.-T.1    Zahorjan, J.2
  • 31
    • 0032662841
    • An affine partitioning algorithm to maximize parallelism and minimize communication
    • A. W. Lim, G. I. Cheong, and M. S. Lam. An affine partitioning algorithm to maximize parallelism and minimize communication. In ACM Intl. Conf. on Supercomputing, pages 228-237, 1999.
    • (1999) ACM Intl. Conf. on Supercomputing , pp. 228-237
    • Lim, A.W.1    Cheong, G.I.2    Lam, M.S.3
  • 32
    • 0032067773
    • Maximizing parallelism and minimizing synchronization with affine partitions
    • PII S0167819198000210
    • A. W. Lim and M. S. Lam. Maximizing parallelism and minimizing synchronization with affine partitions. Parallel Computing, 24(3-4):445-475, 1998. (Pubitemid 128413647)
    • (1998) Parallel Computing , vol.24 , Issue.3-4 , pp. 445-475
    • Lim, A.W.1    Lam, M.S.2
  • 36
    • 84976676720
    • The Omega test: A fast and practical integer programming algorithm for dependence analysis
    • Aug.
    • W. Pugh. The Omega test: a fast and practical integer programming algorithm for dependence analysis. Communications of the ACM, 35(8):102-114, Aug. 1992.
    • (1992) Communications of the ACM , vol.35 , Issue.8 , pp. 102-114
    • Pugh, W.1
  • 39
    • 84976823223
    • The LRPD test: Speculative runtime parallelization of loops with privatization and reduction parallelization
    • L. Rauchwerger and D. Padua. The LRPD test: speculative runtime parallelization of loops with privatization and reduction parallelization. SIGPLAN Not., 30(6):218-232, 1995.
    • (1995) SIGPLAN Not. , vol.30 , Issue.6 , pp. 218-232
    • Rauchwerger, L.1    Padua, D.2
  • 41
    • 34548045548
    • Sensitivity analysis for automatic parallelization on multi-cores
    • DOI 10.1145/1274971.1275008, Proceedings of ICS07: 21st ACM International Conference on Supercomputing
    • S. Rus, M. Pennings, and L. Rauchwerger. Sensitivity analysis for automatic parallelization on multi-cores. In ICS'07: Proceedings of the 21st annual international conference on Supercomputing, pages 263-273, New York, NY, USA, 2007. ACM. (Pubitemid 47281623)
    • (2007) Proceedings of the International Conference on Supercomputing , pp. 263-273
    • Rus, S.1    Pennings, M.2    Rauchwerger, L.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.