Volume , Issue , 2012, Pages 230-241

Matching memory access patterns and data placement for NUMA systems

Author keywords

Data placement; NUMA; Scheduling

Indexed keywords

ACCESS PATTERNS; DATA ACCESS PATTERNS; DATA DISTRIBUTION; DATA PARTITIONING; DATA PLACEMENT; LOOP SCHEDULING; MEMORY ACCESS PATTERNS; MULTI CORE; NON-UNIFORM MEMORY ARCHITECTURE; NUMA; NUMA SYSTEMS; PARALLEL PROGRAM; REMOTE ACCESS; REMOTE MEMORY ACCESS;

EID: 84863469851     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2259016.2259046     Document Type: Conference Paper
Times cited: 47

References (19)
  • 2
    • 84863430235
    • A case for NUMA-aware contention management on multicore systems
    • Berkeley, CA, USA, USENIX Association
    • S. Blagodurov, S. Zhuravlev, M. Dashti, and A. Fedorova. A case for NUMA-aware contention management on multicore systems. In USENIX ATC'11, Berkeley, CA, USA, 2011. USENIX Association.
    • (2011) USENIX ATC'11
    • Blagodurov, S.1    Zhuravlev, S.2    Dashti, M.3    Fedorova, A.4
  • 4
    • 0242276320
    • Generalized multipartitioning of multi-dimensional arrays for parallelizing line-sweep computations
    • DOI 10.1016/S0743-7315(03)00103-5
    • A. Darte, J. Mellor-Crummey, R. Fowler, and D. Chavarría-Miranda. Generalized multipartitioning of multi-dimensional arrays for parallelizing line-sweep computations. J. Parallel Distrib. Comput., 63:887-911, September 2003. (Pubitemid 37364493)
    • (2003) Journal of Parallel and Distributed Computing , vol.63 , Issue.9 , pp. 887-911
    • Darte, A.1    Mellor-Crummey, J.2    Fowler, R.3    Chavarria-Miranda, D.4
  • 5
    • 77953995394
    • What can performance counters do for memory subsystem analysis?
    • New York, NY, USA, ACM
    • S. Eranian. What can performance counters do for memory subsystem analysis? In MSPC'08, pages 26-30, New York, NY, USA, 2008. ACM.
    • (2008) MSPC'08 , pp. 26-30
    • Eranian, S.1
  • 8
    • 85023182347
    • Locality and loop scheduling on NUMA multiprocessors
    • CRC Press, Inc.
    • H. Li, H. L. Sudarsan, M. Stumm, and K. C. Sevcik. Locality and loop scheduling on NUMA multiprocessors. In ICPP'93, pages 140-147. CRC Press, Inc, 1993.
    • (1993) ICPP'93 , pp. 140-147
    • Li, H.1    Sudarsan, H.L.2    Stumm, M.3    Sevcik, K.C.4
  • 9
    • 79959898692
    • Memory management in NUMA multicore systems: Trapped between cache contention and interconnect overhead
    • New York, NY, USA, ACM
    • Z. Majo and T. R. Gross. Memory management in NUMA multicore systems: trapped between cache contention and interconnect overhead. In ISMM'11, pages 11-20, New York, NY, USA, 2011. ACM.
    • (2011) ISMM'11 , pp. 11-20
    • Majo, Z.1    Gross, T.R.2
  • 10
    • 78650178980
    • Feedback-directed page placement for cc-NUMA via hardware-generated memory traces
    • December
    • J. Marathe, V. Thakkar, and F. Mueller. Feedback-directed page placement for cc-NUMA via hardware-generated memory traces. J. Par. Distrib. Comput., 70:1204-1219, December 2010.
    • (2010) J. Par. Distrib. Comput. , vol.70 , pp. 1204-1219
    • Marathe, J.1    Thakkar, V.2    Mueller, F.3
  • 11
    • 77952562600
    • Memphis: Finding and fixing NUMA-related performance problems on multi-core platforms
    • C. McCurdy and J. S. Vetter. Memphis: Finding and fixing NUMA-related performance problems on multi-core platforms. In ISPASS'10, pages 87-96, 2010.
    • (2010) ISPASS'10 , pp. 87-96
    • McCurdy, C.1    Vetter, J.S.2
  • 14
    • 34548030923
    • Thread clustering: Sharing-aware scheduling on SMP-CMP-SMT multiprocessors
    • DOI 10.1145/1272996.1273004, Operating Systems Review - Proceedings of the 2007 EuroSys Conference
    • D. Tam, R. Azimi, and M. Stumm. Thread clustering: sharing-aware scheduling on SMP-CMP-SMT multiprocessors. In EuroSys'07, pages 47-58, New York, NY, USA, 2007. ACM. (Pubitemid 47281574)
    • (2007) Operating Systems Review (ACM) , pp. 47-58
    • Tam, D.1    Azimi, R.2    Stumm, M.3
  • 15
    • 0030652844
    • Automatic partitioning of data and computations on scalable shared memory multiprocessors
    • August
    • S. Tandri and T. Abdelrahman. Automatic partitioning of data and computations on scalable shared memory multiprocessors. In ICPP'97, pages 64-73, August 1997.
    • (1997) ICPP'97 , pp. 64-73
    • Tandri, S.1    Abdelrahman, T.2
  • 16
    • 0028346514
    • Impact of sharing-based thread placement on multithreaded architectures
    • Los Alamitos, CA, USA, IEEE Computer Society Press
    • R. Thekkath and S. J. Eggers. Impact of sharing-based thread placement on multithreaded architectures. In ISCA'94, pages 176-186, Los Alamitos, CA, USA, 1994. IEEE Computer Society Press.
    • (1994) ISCA'94 , pp. 176-186
    • Thekkath, R.1    Eggers, S.J.2
  • 17
    • 84934274832
    • Using hardware counters to automatically improve memory performance
    • Washington, DC, USA, IEEE Computer Society
    • M. M. Tikir and J. K. Hollingsworth. Using hardware counters to automatically improve memory performance. In Supercomputing'04, Washington, DC, USA, 2004. IEEE Computer Society.
    • (2004) Supercomputing'04
    • Tikir, M.M.1    Hollingsworth, J.K.2
  • 18
    • 2842582520
    • Operating system support for improving data locality on CC-NUMA compute servers
    • New York, NY, USA, ACM
    • B. Verghese, S. Devine, A. Gupta, and M. Rosenblum. Operating system support for improving data locality on CC-NUMA compute servers. In ASPLOS'96, pages 279-289, New York, NY, USA, 1996. ACM.
    • (1996) ASPLOS'96 , pp. 279-289
    • Verghese, B.1    Devine, S.2    Gupta, A.3    Rosenblum, M.4
  • 19
    • 77749340037
    • Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?
    • New York, NY, USA, ACM
    • E. Z. Zhang, Y. Jiang, and X. Shen. Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs? In PPoPP'10, pages 203-212, New York, NY, USA, 2010. ACM.
    • (2010) PPoPP'10 , pp. 203-212
    • Zhang, E.Z.1    Jiang, Y.2    Shen, X.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.