2011, Pages 161-170

On-chip cache hierarchy-aware tile scheduling for multicore machines

Author keywords

[No Author keywords available]

Indexed keywords

ALTERNATE METHOD; APPLICATION PROGRAMS; CACHE MISS; COMPUTATION KERNEL; DATA REUSE; EMBEDDED APPLICATION; EXECUTION TIME; INTEL MACHINES; ITERATION SPACES; KEY COMPONENT; MULTI-CORE MACHINES; MULTI-PROCESSOR PLATFORMS; MULTITHREADED CODE GENERATION; ON-CHIP CACHE; PARALLELIZATIONS; SCHEDULING STRATEGIES; SOURCE-TO-SOURCE TRANSLATIONS; UNIPROCESSORS;

EID: 79957454903     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CGO.2011.5764684     Document Type: Conference Paper
Times cited: 20

References (47)
  • 2
    • 38149022809
    • "Teraflops research chip," http://techresearch.intel.com/articles/Tera-Scale/1449.htm.
    • Teraflops Research Chip
  • 7
    • 0032662841
    • An affine partitioning algorithm to maximize parallelism and minimize communication
    • A. W. Lim et al., "An affine partitioning algorithm to maximize parallelism and minimize communication," Proc. of ICS, 1999.
    • Proc. of ICS, 1999
    • Lim, A.W.1
  • 10
    • 79957518643
    • Maximizing parallelism and minimizing synchronization with affine transforms
    • A. W. Lim and M. S. Lam, "Maximizing parallelism and minimizing synchronization with affine transforms," Proc. of POPL, 1997.
    • Proc. of POPL, 1997
    • Lim, A.W.1    Lam, M.S.2
  • 11
    • 0000563616
    • Compiler algorithms for optimizing locality and parallelism on shared and distributed-memory machines
    • M. Kandemir et al., "Compiler algorithms for optimizing locality and parallelism on shared and distributed-memory machines," J. Parallel Distrib. Comput., 2000.
    • (2000) J. Parallel Distrib. Comput.
    • Kandemir, M.1
  • 14
    • 84976827033
    • A data locality optimizing algorithm
    • M. E. Wolf and M. S. Lam, "A data locality optimizing algorithm," SIGPLAN Not., 1991.
    • (1991) SIGPLAN Not.
    • Wolf, M.E.1    Lam, M.S.2
  • 15
    • 79957502820
    • Selecting tile shape for minimal execution time
    • K. Högstedt et al., "Selecting tile shape for minimal execution time," Proc. of SPAA, 1999.
    • Proc. of SPAA, 1999
    • Högstedt, K.1
  • 16
    • 0032635362
    • New tiling techniques to improve cache temporal locality
    • Y. Song and Z. Li, "New tiling techniques to improve cache temporal locality," Proc. of PLDI, 1999.
    • Proc. of PLDI, 1999
    • Song, Y.1    Li, Z.2
  • 18
    • 79957448379
    • Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors
    • M. M. Baskaran et al., "Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors," Proc. of PPoPP, 2009.
    • Proc. of PPoPP, 2009
    • Baskaran, M.M.1
  • 22
    • 79957459923
    • Cache-aware partitioning of multidimensional iteration spaces
    • A. Kejariwal et al., "Cache-aware partitioning of multidimensional iteration spaces," Proc. of SYSTOR, 2009.
    • Proc. of SYSTOR, 2009
    • Kejariwal, A.1
  • 23
    • 33847108581
    • Hierarchically tiled arrays for parallelism and locality
    • J. Guo et al., "Hierarchically tiled arrays for parallelism and locality," Proc. of IPDPS, 2006.
    • Proc. of IPDPS, 2006
    • Guo, J.1
  • 24
    • 79957443193
    • Design and use of htalib: A library for hierarchically tiled arrays
    • G. Bikshandi et al., "Design and use of htalib: a library for hierarchically tiled arrays," Proc. of LCPC, 2007.
    • Proc. of LCPC, 2007
    • Bikshandi, G.1
  • 26
    • 33847150556
    • Selecting the tile shape to reduce the total communication volume
    • N. Drosinos et al., "Selecting the tile shape to reduce the total communication volume," Proc. of IPDPS, 2006.
    • Proc. of IPDPS, 2006
    • Drosinos, N.1
  • 27
    • 85009352487
    • Tile size selection using cache organization and data layout
    • S. Coleman and K. S. McKinley, "Tile size selection using cache organization and data layout," Proc. of PLDI, 1995.
    • Proc. of PLDI, 1995
    • Coleman, S.1    McKinley, K.S.2
  • 28
    • 33748307622
    • An analytical model for loop tiling and its solution
    • V. Sarkar and N. Megiddo, "An analytical model for loop tiling and its solution," Proc. of ISPASS, 2000.
    • (2000) Proc. of ISPASS
    • Sarkar, V.1    Megiddo, N.2
  • 29
    • 70449702074
    • Parametric multi-level tiling of imperfectly nested loops
    • A. Hartono et al., "Parametric multi-level tiling of imperfectly nested loops," Proc. of ICS, 2009.
    • Proc. of ICS, 2009
    • Hartono, A.1
  • 30
    • 79957480502
    • Iterative optimization in the polyhedral model: Part I, one-dimensional time
    • L.-N. Pouchet et al., "Iterative optimization in the polyhedral model: Part I, one-dimensional time," Proc. of CGO, 2007.
    • Proc. of CGO, 2007
    • Pouchet, L.-N.1
  • 31
    • 85088886364
    • Blocking and array contraction across arbitrarily nested loops using affine partitioning
    • A. W. Lim et al., "Blocking and array contraction across arbitrarily nested loops using affine partitioning," Proc. of PPoPP, 2001.
    • Proc. of PPoPP, 2001
    • Lim, A.W.1
  • 32
    • 74049164978
    • A practical automatic polyhedral parallelizer and locality optimizer
    • U. Bondhugula et al., "A practical automatic polyhedral parallelizer and locality optimizer," Proc. of PLDI, 2008.
    • Proc. of PLDI, 2008
    • Bondhugula, U.1
  • 34
    • 79957489791
    • Optimizing shared cache behavior of chip multiprocessors
    • M. Kandemir et al., "Optimizing shared cache behavior of chip multiprocessors," Proc. of MICRO, 2009.
    • Proc. of MICRO, 2009
    • Kandemir, M.1
  • 35
    • 79960161840
    • Cache topology aware computation mapping for multicores
    • M. Kandemir, T. Yemliha et al., "Cache topology aware computation mapping for multicores," Proc. of PLDI, 2010.
    • Proc. of PLDI, 2010
    • Kandemir, M.1    Yemliha, T.2
  • 36
    • 79957456798
    • Compilation for explicitly managed memory hierarchies
    • T. J. Knight et al., "Compilation for explicitly managed memory hierarchies," Proc. of PPoPP, 2007.
    • Proc. of PPoPP, 2007
    • Knight, T.J.1
  • 38
    • 85087537552
    • Facilitating the search for compositions of program transformations
    • A. Cohen et al., "Facilitating the search for compositions of program transformations," Proc. of ICS, 2005.
    • Proc. of ICS, 2005
    • Cohen, A.1
  • 41
    • 79957501879
    • Counting integer points in parametric polytopes using Barvinok's rational functions
    • S. Verdoolaege et al., "Counting integer points in parametric polytopes using Barvinok's rational functions," Journal Algorithmica, 2007.
    • (2007) Journal Algorithmica
    • Verdoolaege, S.1
  • 42
    • 1242331527
    • Approximation algorithms for minimum k-cut
    • N. Guttmann-Beck and R. Hassin, "Approximation algorithms for minimum k-cut," Algorithmica, 2000.
    • (2000) Algorithmica
    • Guttmann-Beck, N.1    Hassin, R.2
  • 43
    • 10444289646
    • Code generation in the polyhedral model is easier than you think
    • C. Bastoul, "Code generation in the polyhedral model is easier than you think," Proc. of PACT, 2004.
    • Proc. of PACT, 2004
    • Bastoul, C.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.