



Volume 46, Issue 8, 2011, Pages 103-112

ULCC: A user-level facility for optimizing shared cache performance on multicores

Author keywords

Cache; Multicore; Scientific Computing

Indexed keywords

APPLICATION EXECUTION; APPLICATION PERFORMANCE; CACHE; CACHE CONTROL; CACHE OPTIMIZATION; CACHE POLLUTION; CACHE-CONSCIOUS; CRITICAL ISSUES; DATA SETS; EXECUTION TIME; MEMORY ACCESS; MULTI CORE; MULTI-CORE PROCESSOR; MULTI-CORES; MULTI-THREADED PROGRAMS; MULTIPLE-CASE STUDY; PERFORMANCE IMPROVEMENTS; SCIENTIFIC APPLICATIONS; SHARED CACHE; SHARED SPACES; SOFTWARE RUNTIME; USER LEVELS;

EID: 80053979318     PISSN: 15232867     EISSN: None     Source Type: Journal    
DOI: 10.1145/2038037.1941568     Document Type: Conference Paper
Times cited: 27

References (35)
  • 1
    • 80053933239
    • NAS parallel benchmarks in OpenMP
    • NAS parallel benchmarks in OpenMP. URL http://phase.hpcc.jp/Omni/benchmarks/NPB/index.html.
  • 6
    • 80053954369
    • HP Corp. Perfmon project
    • HP Corp. Perfmon project. URL http://www.hpl.hp.com/research/linux/perfmon.
  • 7
    • 63549085110
    • Analysis and approximation of optimal co-scheduling on chip multiprocessors
    • Y. Jiang, X. Shen, J. Chen, and R. Tripathi. Analysis and approximation of optimal co-scheduling on chip multiprocessors. In PACT'08, pages 220-229, 2008.
    • (2008) PACT'08, pp. 220-229
    • Jiang, Y.; Shen, X.; Chen, J.; Tripathi, R.
  • 8
    • 0033894726
    • Dynamic data layouts for cache-conscious factorization of DFT
    • D. Kang. Dynamic data layouts for cache-conscious factorization of DFT. In IPDPS '00, page 693, 2000.
    • (2000) IPDPS '00, p. 693
    • Kang, D.
  • 9
    • 84976736383
    • Page placement algorithms for large real-indexed caches
    • R. E. Kessler and M. D. Hill. Page placement algorithms for large real-indexed caches. ACM Trans. Comput. Syst., 10(4), 1992.
    • (1992) ACM Trans. Comput. Syst., vol. 10, no. 4
    • Kessler, R.E.; Hill, M.D.
  • 10
    • 10444238444
    • Fair cache sharing and partitioning in a chip multiprocessor architecture
    • S. Kim, D. Chandra, and Y. Solihin. Fair cache sharing and partitioning in a chip multiprocessor architecture. In PACT'04, pages 111-122, 2004.
    • (2004) PACT'04, pp. 111-122
    • Kim, S.; Chandra, D.; Solihin, Y.
  • 11
    • 47249103334
    • Using OS observations to improve performance in multicore systems
    • R. Knauerhase, P. Brett, B. Hohlt, T. Li, and S. Hahn. Using OS observations to improve performance in multicore systems. IEEE Micro, 28(3):54-66, 2008.
    • (2008) IEEE Micro, vol. 28, no. 3, pp. 54-66
    • Knauerhase, R.; Brett, P.; Hohlt, B.; Li, T.; Hahn, S.
  • 12
    • 0026137116
    • The cache performance and optimizations of blocked algorithms
    • M. D. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. In ASPLOS '91, pages 63-74, 1991.
    • (1991) ASPLOS '91, pp. 63-74
    • Lam, M.D.; Rothberg, E.E.; Wolf, M.E.
  • 13
    • 0030733703
    • The influence of caches on the performance of sorting
    • A. LaMarca and R. E. Ladner. The influence of caches on the performance of sorting. In SODA '97, pages 370-379.
    • SODA '97, pp. 370-379
    • LaMarca, A.; Ladner, R.E.
  • 14
    • 77955032509
    • MCC-DB: Minimizing cache conflicts in multi-core processors for databases
    • R. Lee, X. Ding, F. Chen, Q. Lu, and X. Zhang. MCC-DB: Minimizing cache conflicts in multi-core processors for databases. In VLDB'09.
    • VLDB'09
    • Lee, R.; Ding, X.; Chen, F.; Lu, Q.; Zhang, X.
  • 15
    • 57749186047
    • Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems
    • Salt Lake City, UT
    • J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, and P. Sadayappan. Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems. In HPCA '08, pages 367-378, Salt Lake City, UT, 2008.
    • (2008) HPCA '08, pp. 367-378
    • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z.; Zhang, X.; Sadayappan, P.
  • 16
    • 79952791256
    • Enabling software multicore cache management with lightweight hardware support
    • J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, and P. Sadayappan. Enabling software multicore cache management with lightweight hardware support. In SC'09, 2009.
    • (2009) SC'09
    • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z.; Zhang, X.; Sadayappan, P.
  • 17
    • 2342468635
    • Organizing the last line of defense before hitting the memory wall for CMPs
    • C. Liu, A. Sivasubramaniam, and M. Kandemir. Organizing the last line of defense before hitting the memory wall for CMPs. In HPCA'04, pages 176-185, 2004.
    • (2004) HPCA'04, pp. 176-185
    • Liu, C.; Sivasubramaniam, A.; Kandemir, M.
  • 18
    • 70449652924
    • Soft-OLP: Improving hardware cache performance through software-controlled object-level partitioning
    • Q. Lu, J. Lin, X. Ding, Z. Zhang, X. Zhang, and P. Sadayappan. Soft-OLP: Improving hardware cache performance through software-controlled object-level partitioning. In PACT '09, pages 246-257, 2009.
    • (2009) PACT '09, pp. 246-257
    • Lu, Q.; Lin, J.; Ding, X.; Zhang, Z.; Zhang, X.; Sadayappan, P.
  • 20
    • 0035177611
    • Cache-friendly implementations of transitive closure
    • Barcelona, Spain
    • M. Penner and V. K. Prasanna. Cache-friendly implementations of transitive closure. In PACT '01, page 185, Barcelona, Spain, 2001.
    • (2001) PACT '01, p. 185
    • Penner, M.; Prasanna, V.K.
  • 21
    • 34548042910
    • Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches
    • DOI 10.1109/MICRO.2006.49, 4041865, Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-39
    • M. K. Qureshi and Y. N. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In MICRO'06, pages 423-432, 2006. (Pubitemid 351337015)
    • (2006) Proceedings of the Annual International Symposium on Microarchitecture, MICRO, pp. 423-432
    • Qureshi, M.K.; Patt, Y.N.
  • 22
    • 0036038691
    • Symbiotic jobscheduling with priorities for a simultaneous multithreading processor
    • A. Snavely, D. M. Tullsen, and G. Voelker. Symbiotic jobscheduling with priorities for a simultaneous multithreading processor. In SIGMETRICS '02, pages 66-76.
    • SIGMETRICS '02, pp. 66-76
    • Snavely, A.; Tullsen, D.M.; Voelker, G.
  • 23
    • 66749168716
    • Reducing the harmful effects of last-level cache polluters with an OS-level, software-only pollute buffer
    • L. Soares, D. Tam, and M. Stumm. Reducing the harmful effects of last-level cache polluters with an OS-level, software-only pollute buffer. In MICRO '08, pages 258-269, 2008.
    • (2008) MICRO '08, pp. 258-269
    • Soares, L.; Tam, D.; Stumm, M.
  • 24
    • 1642371317
    • Dynamic partitioning of shared cache memory
    • G. E. Suh, L. Rudolph, and S. Devadas. Dynamic partitioning of shared cache memory. J. Supercomputing, 28(1), 2002.
    • (2002) J. Supercomputing, vol. 28, no. 1
    • Suh, G.E.; Rudolph, L.; Devadas, S.
  • 25
    • 57749176037
    • Managing shared L2 caches on multicore systems in software
    • D. Tam, R. Azimi, L. Soares, and M. Stumm. Managing shared L2 caches on multicore systems in software. In WIOSCA '07, 2007.
    • (2007) WIOSCA '07
    • Tam, D.; Azimi, R.; Soares, L.; Stumm, M.
  • 26
    • 34548030923
    • Thread clustering: Sharing-aware scheduling on SMP-CMP-SMT multiprocessors
    • DOI 10.1145/1272996.1273004, Operating Systems Review - Proceedings of the 2007 EuroSys Conference
    • D. Tam, R. Azimi, and M. Stumm. Thread clustering: Sharing-aware scheduling on SMP-CMP-SMT multiprocessors. In EuroSys'07, pages 47-58, 2007. (Pubitemid 47281574)
    • (2007) Operating Systems Review (ACM), pp. 47-58
    • Tam, D.; Azimi, R.; Stumm, M.
  • 27
    • 80054007458
    • TOP500.Org
    • TOP500.Org. URL http://www.top500.org/lists/2010/06.
  • 28
    • 84863652586
    • A new approach to array redistribution: Strip mining redistribution
    • A. Wakatani and M. Wolfe. A new approach to array redistribution: Strip mining redistribution. In PARLE '94, pages 323-335, 1994.
    • (1994) PARLE '94, pp. 323-335
    • Wakatani, A.; Wolfe, M.
  • 29
    • 0003278639
    • Automatically tuned linear algebra software
    • R. C. Whaley and J. Dongarra. Automatically tuned linear algebra software. In SC '98, 1998.
    • (1998) SC '98
    • Whaley, R.C.; Dongarra, J.
  • 30
    • 0002433589
    • Iteration space tiling for memory hierarchies
    • Philadelphia, PA
    • M. Wolfe. Iteration space tiling for memory hierarchies. In PP '89, pages 357-361, Philadelphia, PA, 1989.
    • (1989) PP '89, pp. 357-361
    • Wolfe, M.
  • 31
    • 0024935630
    • More iteration space tiling
    • M. Wolfe. More iteration space tiling. In SC '89, pages 655-664, 1989. (Pubitemid 20665965)
    • (1989) Proc. Supercomput. '89, pp. 655-664
    • Wolfe, M.
  • 32
  • 34
    • 70349111334
    • Towards practical page coloring-based multicore cache management
    • X. Zhang, S. Dwarkadas, and K. Shen. Towards practical page coloring-based multicore cache management. In EuroSys'09, pages 89-102, 2009.
    • (2009) EuroSys'09, pp. 89-102
    • Zhang, X.; Dwarkadas, S.; Shen, K.
  • 35
    • 77952248898
    • Addressing shared resource contention in multicore processors via scheduling
    • S. Zhuravlev, S. Blagodurov, and A. Fedorova. Addressing shared resource contention in multicore processors via scheduling. In ASPLOS '10, pages 129-142, 2010.
    • (2010) ASPLOS '10, pp. 129-142
    • Zhuravlev, S.; Blagodurov, S.; Fedorova, A.


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.