Volume , Issue , 2009, Pages 282-288

Avoiding cache thrashing due to private data placement in last-level cache for manycore scaling

Author keywords

[No Author keywords available]

Indexed keywords

BASELINE CONFIGURATIONS; CACHE ORGANIZATION; CONFLICT MISS; DATA PARALLEL; DATA SHARING; DIRECTORY PROTOCOL; HIGH BANDWIDTH; MANY-CORE; MEMORY ALLOCATORS; PRIVATE DATA; SHARED DIRECTORIES; STACK RANDOMIZATION;

EID: 77950987305     PISSN: 10636404     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICCD.2009.5413143     Document Type: Conference Paper
Times cited : (27)

References (36)
  • 1
    • 33746683732
    • Maximizing CMP throughput with mediocre cores
    • J. D. Davis, J. Laudon, and K. Olukotun, "Maximizing CMP throughput with mediocre cores," in PACT, 2005, pp. 51-62.
    • (2005) PACT , pp. 51-62
    • Davis, J.D.1    Laudon, J.2    Olukotun, K.3
  • 2
    • 20344374162
    • Niagara: A 32-way multithreaded Sparc processor
    • P. Kongetira, K. Aingaran, and K. Olukotun, "Niagara: a 32-way multithreaded Sparc processor," IEEE Micro, vol.25, no.2, pp. 21-29, 2005.
    • (2005) IEEE Micro , vol.25 , Issue.2 , pp. 21-29
    • Kongetira, P.1    Aingaran, K.2    Olukotun, K.3
  • 4
    • 0028201665
    • Tradeoffs in two-level on-chip caching
    • Apr.
    • N. P. Jouppi and S. J. E. Wilton, "Tradeoffs in two-level on-chip caching," in ISCA, Apr. 1994, pp. 34-45.
    • (1994) ISCA , pp. 34-45
    • Jouppi, N.P.1    Wilton, S.J.E.2
  • 6
    • 34547282756
    • Reducing verification complexity of a multicore coherence protocol using assume/guarantee
    • X. Chen, Y. Yang, G. Gopalakrishnan, and C.-T. Chou, "Reducing verification complexity of a multicore coherence protocol using assume/guarantee," in FMCAD, 2006, pp. 81-88.
    • (2006) FMCAD , pp. 81-88
    • Chen, X.1    Yang, Y.2    Gopalakrishnan, G.3    Chou, C.-T.4
  • 9
    • 34247273005
    • Scalable locality-conscious multithreaded memory allocation
    • S. Schneider, C. D. Antonopoulos, and D. S. Nikolopoulos, "Scalable locality-conscious multithreaded memory allocation," in ISMM, 2006, pp. 84-94.
    • (2006) ISMM , pp. 84-94
    • Schneider, S.1    Antonopoulos, C.D.2    Nikolopoulos, D.S.3
  • 10
    • 17544362263
    • Hoard: A scalable memory allocator for multithreaded applications
    • E. D. Berger, K. S. McKinley, R. D. Blumofe, and P. R. Wilson, "Hoard: a scalable memory allocator for multithreaded applications," SIGPLAN Not, vol.35, no.11, pp. 117-128, 2000.
    • (2000) SIGPLAN Not , vol.35 , Issue.11 , pp. 117-128
    • Berger, E.D.1    McKinley, K.S.2    Blumofe, R.D.3    Wilson, P.R.4
  • 11
    • 84949769332
    • A new memory monitoring scheme for memory-aware scheduling and partitioning
    • G. E. Suh, S. Devadas, and L. Rudolph, "A new memory monitoring scheme for memory-aware scheduling and partitioning," in HPCA, 2002, p. 117.
    • (2002) HPCA , pp. 117
    • Suh, G.E.1    Devadas, S.2    Rudolph, L.3
  • 12
    • 0000444590
    • Evaluating the performance of cache-affinity scheduling in shared-memory multiprocessors
    • J. Torrellas, A. Tucker, and A. Gupta, "Evaluating the performance of cache-affinity scheduling in shared-memory multiprocessors," JPDC, vol.24, no.2, pp. 139-151, 1995.
    • (1995) JPDC , vol.24 , Issue.2 , pp. 139-151
    • Torrellas, J.1    Tucker, A.2    Gupta, A.3
  • 13
    • 0028754497
    • Affinity scheduling of unbalanced workloads
    • S. Subramaniam and D. L. Eager, "Affinity scheduling of unbalanced workloads," in SC, 1994, pp. 214-226.
    • (1994) SC , pp. 214-226
    • Subramaniam, S.1    Eager, D.L.2
  • 14
    • 14844328033
    • On the effectiveness of address-space randomization
    • H. Shacham, E.-J. Goh, N. Modadugu, B. Pfaff, and D. Boneh, "On the effectiveness of address-space randomization," in CCS, 2004, pp. 298-307.
    • (2004) CCS , pp. 298-307
    • Shacham, H.1    Goh, E.-J.2    Modadugu, N.3    Pfaff, B.4    Boneh, D.5
  • 16
    • 0034592592
    • Region-based caching: An energy-delay efficient memory architecture for embedded processors
    • H. S. Lee and G. S. Tyson, "Region-based caching: an energy-delay efficient memory architecture for embedded processors," in CASES, 2000, pp. 120-127.
    • (2000) CASES , pp. 120-127
    • Lee, H.S.1    Tyson, G.S.2
  • 17
    • 34548316872
    • A novel technique to use scratch-pad memory for stack management
    • DOI 10.1109/DATE.2007.364509, 4212019, Proceedings - 2007 Design, Automation and Test in Europe Conference and Exhibition, DATE 2007
    • S. Park, H.-W. Park, and S. Ha, "A novel technique to use scratch-pad memory for stack management," in DATE, 2007, pp. 1478-1483. (Pubitemid 47334172)
    • (2007) Proceedings - Design, Automation and Test in Europe, DATE , pp. 1478-1483
    • Park, S.1    Park, H.-W.2    Ha, S.3
  • 18
    • 23044524059
    • On-chip vs. off-chip memory: The data partitioning problem in embedded processor-based systems
    • July
    • P. R. Panda, N. D. Dutt, and A. Nicolau, "On-chip vs. off-chip memory: The data partitioning problem in embedded processor-based systems," ACM TODAES, vol.5, no.3, pp. 682-704, July 2000.
    • (2000) ACM TODAES , vol.5 , Issue.3 , pp. 682-704
    • Panda, P.R.1    Dutt, N.D.2    Nicolau, A.3
  • 19
    • 77951001644
    • A localizing directory coherence protocol
    • C. McCurdy and C. Fischer, "A localizing directory coherence protocol," in WMPI, 2004, pp. 23-29.
    • (2004) WMPI , pp. 23-29
    • McCurdy, C.1    Fischer, C.2
  • 20
    • 42549168687
    • Exploring the cache design space for large scale CMPs
    • L. Hsu, R. Iyer, S. Makineni, S. Reinhardt, and D. Newell, "Exploring the cache design space for large scale CMPs," dasCMP, vol.33, no.4, pp. 24-33, 2005.
    • (2005) DasCMP , vol.33 , Issue.4 , pp. 24-33
    • Hsu, L.1    Iyer, R.2    Makineni, S.3    Reinhardt, S.4    Newell, D.5
  • 21
    • 33845903561
    • Cooperative caching for Chip Multiprocessors
    • J. Chang and G. S. Sohi, "Cooperative caching for Chip Multiprocessors," in ISCA, 2006, pp. 264-276.
    • (2006) ISCA , pp. 264-276
    • Chang, J.1    Sohi, G.S.2
  • 22
    • 27544495466
    • Victim replication: Maximizing capacity while hiding wire delay in tiled Chip Multiprocessors
    • M. Zhang and K. Asanovic, "Victim replication: Maximizing capacity while hiding wire delay in tiled Chip Multiprocessors," in ISCA, 2005, pp. 336-345.
    • (2005) ISCA , pp. 336-345
    • Zhang, M.1    Asanovic, K.2
  • 23
    • 77950982560
    • Victim migration: Dynamically adapting between private and shared CMP caches
    • - "Victim migration: Dynamically adapting between private and shared CMP caches," in MIT Technical Report MIT-CSAIL-TR-2005-064, MIT-LCS-TR-1006, 2005.
    • (2005) MIT Technical Report MIT-CSAIL-TR-2005-064, MIT-LCS-TR-1006
  • 27
    • 33845423872
    • An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
    • C. Kim, D. Burger, and S. W. Keckler, "An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches," ASPLOS, vol.36, no.5, 2002.
    • (2002) ASPLOS , vol.36 , Issue.5
    • Kim, C.1    Burger, D.2    Keckler, S.W.3
  • 29
    • 0002255264
    • SPLASH: Stanford parallel applications for shared memory
    • Mar.
    • J. P. Singh, W.-D. Weber, and A. Gupta, "SPLASH: Stanford parallel applications for shared memory," ISCA, vol.20, no.1, pp. 5-44, Mar. 1995.
    • (1995) ISCA , vol.20 , Issue.1 , pp. 5-44
    • Singh, J.P.1    Weber, W.-D.2    Gupta, A.3
  • 30
    • 51449118065
    • A performance study of general purpose applications on graphics processors using CUDA
    • S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, and K. Skadron, "A performance study of general purpose applications on graphics processors using CUDA," JPDC, 2008.
    • (2008) JPDC
    • Che, S.1    Boyer, M.2    Meng, J.3    Tarjan, D.4    Sheaffer, J.W.5    Skadron, K.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.