SCOPUS Information Retrieval Platform

Proceedings - 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2012

Volume , Issue , 2012, Pages 603-611

Cache conscious task regrouping on multicore processors

(4) Xiang, Xiaoya (a); Bao, Bin (a); Ding, Chen (a); Shen, Kai (a)

(a) University of Rochester (United States)

Author keywords

lifetime sampling; multicore; online program locality analysis; task grouping

Indexed keywords

CACHE-CONSCIOUS; IMPROVING PERFORMANCE; JOB SCHEDULER; JOB SCHEDULING; MIXED INTEGER; MULTI CORE; MULTI-CORE PROCESSOR; ONLINE PROGRAMS; PARALLEL EXECUTIONS; PERFORMANCE VARIATIONS; RUN TO RUN; SHARED CACHE; TASK GROUPING;

BENCHMARKING; COMPUTER OPERATING SYSTEMS; DEGRADATION; DIGITAL ARITHMETIC; GRID COMPUTING; PROGRAM PROCESSORS; SCHEDULING; TRACE ANALYSIS;

MULTICORE PROGRAMMING;

EID: 84863700640 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CCGrid.2012.139 Document Type: Conference Paper

Times cited : (25)

References (39)

1
- 47249103334
- Using OS observations to improve performance in multicore systems
- R. Knauerhase, P. Brett, B. Hohlt, T. Li, and S. Hahn, "Using OS observations to improve performance in multicore systems," IEEE Micro, vol. 28, no. 3, pp. 54-66, 2008.
- (2008) IEEE Micro , vol.28 , Issue.3 , pp. 54-66
- Knauerhase, R.¹ Brett, P.² Hohlt, B.³ Li, T.⁴ Hahn, S.⁵

2
- 77952263946
- Request behavior variations
- K. Shen, "Request behavior variations," in Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, 2010, pp. 103-116.
- (2010) Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 103-116
- Shen, K.¹

3
- 77952248898
- Addressing shared resource contention in multicore processors via scheduling
- S. Zhuravlev, S. Blagodurov, and A. Fedorova, "Addressing shared resource contention in multicore processors via scheduling," in Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, 2010, pp. 129-142.
- Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, 2010 , pp. 129-142
- Zhuravlev, S.¹ Blagodurov, S.² Fedorova, A.³

4
- 21244474546
- Predicting inter-thread cache contention on a chip multi-processor architecture
- D. Chandra, F. Guo, S. Kim, and Y. Solihin, "Predicting inter-thread cache contention on a chip multi-processor architecture," in Proceedings of the International Symposium on High-Performance Computer Architecture, 2005, pp. 340-351.
- Proceedings of the International Symposium on High-Performance Computer Architecture, 2005 , pp. 340-351
- Chandra, D.¹ Guo, F.² Kim, S.³ Solihin, Y.⁴

5
- 80053993064
- All-window profiling and composable models of cache sharing
- X. Xiang, B. Bao, T. Bai, C. Ding, and T. M. Chilimbi, "All-window profiling and composable models of cache sharing," in Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011, pp. 91-102.
- Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011 , pp. 91-102
- Xiang, X.¹ Bao, B.² Bai, T.³ Ding, C.⁴ Chilimbi, T.M.⁵

6
- 84863053984
- Linear-time modeling of program working set in shared cache
- X. Xiang, B. Bao, C. Ding, and Y. Gao, "Linear-time modeling of program working set in shared cache," in Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2011, pp. 350-360.
- Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2011 , pp. 350-360
- Xiang, X.¹ Bao, B.² Ding, C.³ Gao, Y.⁴

7
- 84856557541
- Coherent profiles: Enabling efficient reuse distance analysis of multicore scaling for loop-based parallel programs
- M.-J. Wu and D. Yeung, "Coherent profiles: Enabling efficient reuse distance analysis of multicore scaling for loop-based parallel programs," in Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2011.
- Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2011
- Wu, M.-J.¹ Yeung, D.²

8
- 78149254514
- Accelerating multicore reuse distance analysis with sampling and parallelization
- D. L. Schuff, M. Kulkarni, and V. S. Pai, "Accelerating multicore reuse distance analysis with sampling and parallelization," in Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2010, pp. 53-64.
- Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2010 , pp. 53-64
- Schuff, D.L.¹ Kulkarni, M.² Pai, V.S.³

9
- 57349160281
- Sampling-based program locality approximation
- Y. Zhong and W. Chang, "Sampling-based program locality approximation," in Proceedings of the International Symposium on Memory Management, 2008, pp. 91-100.
- Proceedings of the International Symposium on Memory Management, 2008 , pp. 91-100
- Zhong, Y.¹ Chang, W.²

10
- 33750304084
- Discovery of locality-improving refactoring by reuse path analysis
- Proceedings of HPCC. Springer
- K. Beyls and E. D'Hollander, "Discovery of locality-improving refactoring by reuse path analysis," in Proceedings of HPCC. Springer. Lecture Notes in Computer Science Vol. 4208, 2006, pp. 220-229.
- (2006) Lecture Notes in Computer Science , vol.4208 , pp. 220-229
- Beyls, K.¹ D'Hollander, E.²

11
- 33646073716
- Multiple page size modeling and optimization
- C. Cascaval, E. Duesterwald, P. F. Sweeney, and R. W. Wisniewski, "Multiple page size modeling and optimization," in Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2005, pp. 339-349.
- Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2005 , pp. 339-349
- Cascaval, C.¹ Duesterwald, E.² Sweeney, P.F.³ Wisniewski, R.W.⁴

12
- 33244462442
- Fast data-locality profiling of native execution
- E. Berg and E. Hagersten, "Fast data-locality profiling of native execution," in Proceedings of the International Conference on Measurement and Modeling of Computer Systems, 2005, pp. 169-180.
- Proceedings of the International Conference on Measurement and Modeling of Computer Systems, 2005 , pp. 169-180
- Berg, E.¹ Hagersten, E.²

13
- 77952570425
- StatStack: Efficient modeling of LRU caches
- D. Eklov and E. Hagersten, "StatStack: Efficient modeling of LRU caches," in Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2010, pp. 55-65.
- Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2010 , pp. 55-65
- Eklov, D.¹ Hagersten, E.²

14
- 79952932476
- Fast modeling of shared caches in multicore systems
- best paper
- D. Eklov, D. Black-Schaffer, and E. Hagersten, "Fast modeling of shared caches in multicore systems," in Proceedings of the International Conference on High Performance Embedded Architectures and Compilers, 2011, pp. 147-157, best paper.
- Proceedings of the International Conference on High Performance Embedded Architectures and Compilers, 2011 , pp. 147-157
- Eklov, D.¹ Black-Schaffer, D.² Hagersten, E.³

15
- 84863467234
- Department of Computer Science, University of Rochester, Tech. Rep. URCS #972, December
- X. Xiang, B. Bao, and C. Ding, "Program locality sampling in shared cache: A theory and a real-time solution," Department of Computer Science, University of Rochester, Tech. Rep. URCS #972, December 2011.
- (2011) Program Locality Sampling in Shared Cache: A Theory and A Real-time Solution
- Xiang, X.¹ Bao, B.² Ding, C.³

16
- 77951615165
- All-window profiling of concurrent executions
- poster paper
- C. Ding and T. Chilimbi, "All-window profiling of concurrent executions," in Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008, poster paper.
- Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008
- Ding, C.¹ Chilimbi, T.²

17
- 84863707259
- Working sets past and present
- Jan.
- P. Denning, "Working sets past and present," IEEE Transactions on Software Engineering, vol. SE-6, no. 1, Jan. 1980.
- (1980) IEEE Transactions on Software Engineering , vol.SE-6 , Issue.1
- Denning, P.¹

18
- 0014701246
- Evaluation techniques for storage hierarchies
- R. L. Mattson, J. Gecsei, D. Slutz, and I. L. Traiger, "Evaluation techniques for storage hierarchies," IBM Systems Journal, vol. 9, no. 2, pp. 78-117, 1970.
- (1970) IBM Systems Journal , vol.9 , Issue.2 , pp. 78-117
- Mattson, R.L.¹ Gecsei, J.² Slutz, D.³ Traiger, I.L.⁴

19
- 33846547030
- On the effectiveness of set associative page mapping and its applications in main memory management
- A. J. Smith, "On the effectiveness of set associative page mapping and its applications in main memory management," in Proceedings of the 2nd International Conference on Software Engineering, 1976.
- Proceedings of the 2nd International Conference on Software Engineering, 1976
- Smith, A.J.¹

20
- 0024903997
- Evaluating associativity in CPU caches
- M. D. Hill and A. J. Smith, "Evaluating associativity in CPU caches," IEEE Transactions on Computers, vol. 38, no. 12, pp. 1612-1630, 1989.
- (1989) IEEE Transactions on Computers , vol.38 , Issue.12 , pp. 1612-1630
- Hill, M.D.¹ Smith, A.J.²

21
- 8344269521
- Cross architecture performance predictions for scientific applications using parameterized models
- G. Marin and J. Mellor-Crummey, "Cross architecture performance predictions for scientific applications using parameterized models," in Proceedings of the International Conference on Measurement and Modeling of Computer Systems, 2004, pp. 2-13.
- Proceedings of the International Conference on Measurement and Modeling of Computer Systems, 2004 , pp. 2-13
- Marin, G.¹ Mellor-Crummey, J.²

22
- 0034826142
- Analytical cache models with applications to cache partitioning
- G. E. Suh, S. Devadas, and L. Rudolph, "Analytical cache models with applications to cache partitioning," in Proceedings of the International Conference on Supercomputing, 2001, pp. 1-12.
- Proceedings of the International Conference on Supercomputing, 2001 , pp. 1-12
- Suh, G.E.¹ Devadas, S.² Rudolph, L.³

23
- 77954050277
- Microsoft Research, Tech. Rep. MSR-TR-2009-107, August
- C. Ding and T. Chilimbi, "A composable model for analyzing locality of multi-threaded programs," Microsoft Research, Tech. Rep. MSR-TR-2009-107, August 2009.
- (2009) A Composable Model for Analyzing Locality of Multi-threaded Programs
- Ding, C.¹ Chilimbi, T.²

24
- 77951616746
- Is reuse distance applicable to data locality analysis on chip multiprocessors?
- Y. Jiang, E. Z. Zhang, K. Tian, and X. Shen, "Is reuse distance applicable to data locality analysis on chip multiprocessors?" in Proceedings of the International Conference on Compiler Construction, 2010, pp. 264-282.
- Proceedings of the International Conference on Compiler Construction, 2010 , pp. 264-282
- Jiang, Y.¹ Zhang, E.Z.² Tian, K.³ Shen, X.⁴

25
- 77749340037
- Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?
- E. Z. Zhang, Y. Jiang, and X. Shen, "Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?" in Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010, pp. 203-212.
- Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010 , pp. 203-212
- Zhang, E.Z.¹ Jiang, Y.² Shen, X.³

26
- 70349743894
- Program locality analysis using reuse distance
- Aug.
- Y. Zhong, X. Shen, and C. Ding, "Program locality analysis using reuse distance," ACM Transactions on Programming Languages and Systems, vol. 31, no. 6, pp. 1-39, Aug. 2009.
- (2009) ACM Transactions on Programming Languages and Systems , vol.31 , Issue.6 , pp. 1-39
- Zhong, Y.¹ Shen, X.² Ding, C.³

27
- 31944440969
- Pin: Building customized program analysis tools with dynamic instrumentation
- C.-K. Luk et al., "Pin: Building customized program analysis tools with dynamic instrumentation," in Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Chicago, Illinois, Jun. 2005.
- Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Chicago, Illinois, Jun. 2005
- Luk, C.-K.¹

28
- 78349251674
- Comparing scalability prediction strategies on an SMP of CMPs
- K. Singh, M. Curtis-Maury, S. A. McKee, F. Blagojevic, D. S. Nikolopoulos, B. R. de Supinski, and M. Schulz, "Comparing scalability prediction strategies on an SMP of CMPs," in Proceedings of the Euro-Par Conference, 2010, pp. 143-155.
- Proceedings of the Euro-Par Conference, 2010 , pp. 143-155
- Singh, K.¹ Curtis-Maury, M.² McKee, S.A.³ Blagojevic, F.⁴ Nikolopoulos, D.S.⁵ De Supinski, B.R.⁶ Schulz, M.⁷

29
- 77954052050
- Analyzing the trade-off between multiple memory controllers and memory channels on multi-core processor performance
- J. C. Sancho, M. Lang, and D. J. Kerbyson, "Analyzing the trade-off between multiple memory controllers and memory channels on multi-core processor performance," in Proceedings of the LSPP Workshop, 2010, pp. 1-7.
- Proceedings of the LSPP Workshop, 2010 , pp. 1-7
- Sancho, J.C.¹ Lang, M.² Kerbyson, D.J.³

30
- 56349121765
- A prediction based CMP cache migration policy
- S. Hao, Z. Du, D. A. Bader, and M. Wang, "A prediction based CMP cache migration policy," in Proceedings of the IEEE International Conference on High Performance Computing and Communications, 2008, pp. 374-381.
- Proceedings of the IEEE International Conference on High Performance Computing and Communications, 2008 , pp. 374-381
- Hao, S.¹ Du, Z.² Bader, D.A.³ Wang, M.⁴

31
- 48849084701
- Open|SpeedShop: An open source infrastructure for parallel performance analysis
- M. Schulz, J. Galarowicz, D. Maghrak, W. Hachfeld, D. Montoya, and S. Cranford, "Open|SpeedShop: An open source infrastructure for parallel performance analysis," Scientific Programming, vol. 16, no. 2-3, pp. 105-121, 2008.
- (2008) Scientific Programming , vol.16 , Issue.2-3 , pp. 105-121
- Schulz, M.¹ Galarowicz, J.² Maghrak, D.³ Hachfeld, W.⁴ Montoya, D.⁵ Cranford, S.⁶

32
- 0036679608
- HPCView: A tool for top-down analysis of node performance
- J. Mellor-Crummey, R. Fowler, G. Marin, and N. Tallent, "HPCView: A tool for top-down analysis of node performance," Journal of Supercomputing, pp. 81-104, 2002.
- (2002) Journal of Supercomputing , pp. 81-104
- Mellor-Crummey, J.¹ Fowler, R.² Marin, G.³ Tallent, N.⁴

33
- 77950611743
- HPCToolkit: Tools for performance analysis of optimized parallel programs
- L. Adhianto, S. Banerjee, M. W. Fagan, M. Krentel, G. Marin, J. M. Mellor-Crummey, and N. R. Tallent, "HPCToolkit: Tools for performance analysis of optimized parallel programs," Concurrency and Computation: Practice and Experience, vol. 22, no. 6, pp. 685-701, 2010.
- (2010) Concurrency and Computation: Practice and Experience , vol.22 , Issue.6 , pp. 685-701
- Adhianto, L.¹ Banerjee, S.² Fagan, M.W.³ Krentel, M.⁴ Marin, G.⁵ Mellor-Crummey, J.M.⁶ Tallent, N.R.⁷

34
- 63549085110
- Analysis and approximation of optimal co-scheduling on chip multiprocessors
- Y. Jiang, X. Shen, J. Chen, and R. Tripathi, "Analysis and approximation of optimal co-scheduling on chip multiprocessors," in Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2008, pp. 220-229.
- Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, 2008 , pp. 220-229
- Jiang, Y.¹ Shen, X.² Chen, J.³ Tripathi, R.⁴

35
- 34548285855
- Locality approximation using time
- X. Shen, J. Shaw, B. Meeker, and C. Ding, "Locality approximation using time," in Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2007, pp. 55-61.
- Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2007 , pp. 55-61
- Shen, X.¹ Shaw, J.² Meeker, B.³ Ding, C.⁴

36
- 78649595661
- Characterizing the relation between Apex-Map synthetic probes and reuse distance distributions
- vol. 0
- K. Z. Ibrahim and E. Strohmaier, "Characterizing the relation between Apex-Map synthetic probes and reuse distance distributions," Proceedings of the International Conference on Parallel Processing, vol. 0, pp. 353-362, 2010.
- (2010) Proceedings of the International Conference on Parallel Processing , pp. 353-362
- Ibrahim, K.Z.¹ Strohmaier, E.²

37
- 84866839718
- FractalMRC: Online cache miss rate curve prediction on commodity systems
- L. He, Z. Yu, and H. Jin, "FractalMRC: Online cache miss rate curve prediction on commodity systems," in Proceedings of the International Parallel and Distributed Processing Symposium, 2012.
- Proceedings of the International Parallel and Distributed Processing Symposium, 2012
- He, L.¹ Yu, Z.² Jin, H.³

38
- 67650796123
- RapidMRC: Approximating L2 miss rate curves on commodity systems for online optimizations
- D. K. Tam, R. Azimi, L. Soares, and M. Stumm, "RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations," in Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, 2009, pp. 121-132.
- Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, 2009 , pp. 121-132
- Tam, D.K.¹ Azimi, R.² Soares, L.³ Stumm, M.⁴

39
- 79957472410
- Pinpointing data locality problems using data-centric analysis
- X. Liu and J. M. Mellor-Crummey, "Pinpointing data locality problems using data-centric analysis," in Proceedings of the International Symposium on Code Generation and Optimization, 2011, pp. 171-180.
- Proceedings of the International Symposium on Code Generation and Optimization, 2011 , pp. 171-180
- Liu, X.¹ Mellor-Crummey, J.M.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.