-
2
-
-
0034837177
-
A framework for reducing the cost of instrumented code
-
Snowbird, Utah, June
-
M. Arnold and B. G. Ryder. A framework for reducing the cost of instrumented code. In Proceedings of PLDI, pages 168-179, Snowbird, Utah, June 2001.
-
(2001)
Proceedings of PLDI
, pp. 168-179
-
-
Arnold, M.1
Ryder, B.G.2
-
3
-
-
33244462442
-
Fast data-locality profiling of native execution
-
E. Berg and E. Hagersten. Fast data-locality profiling of native execution. In Proceedings of SIGMETRICS, pages 169-180, 2005.
-
(2005)
Proceedings of SIGMETRICS
, pp. 169-180
-
-
Berg, E.1
Hagersten, E.2
-
4
-
-
14944380098
-
Generating cache hints for improved program efficiency
-
K. Beyls and E. D'Hollander. Generating cache hints for improved program efficiency. Journal of Systems Architecture, 51(4):223-250, 2005.
-
(2005)
Journal of Systems Architecture
, vol.51
, Issue.4
, pp. 223-250
-
-
Beyls, K.1
D'Hollander, E.2
-
5
-
-
33750304084
-
Discovery of locality-improving refactoring by reuse path analysis
-
Springer. Lecture Notes in Computer Science
-
K. Beyls and E. D'Hollander. Discovery of locality-improving refactoring by reuse path analysis. In Proceedings of HPCC. Springer. Lecture Notes in Computer Science Vol. 4208, pages 220-229, 2006.
-
(2006)
Proceedings of HPCC
, vol.4208
, pp. 220-229
-
-
Beyls, K.1
D'Hollander, E.2
-
6
-
-
63549095070
-
The PARSEC benchmark suite: Characterization and architectural implications
-
C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: characterization and architectural implications. In Proceedings of PACT, pages 72-81, 2008.
-
(2008)
Proceedings of PACT
, pp. 72-81
-
-
Bienia, C.1
Kumar, S.2
Singh, J.P.3
Li, K.4
-
7
-
-
33646073716
-
Multiple page size modeling and optimization
-
C. Cascaval, E. Duesterwald, P. F. Sweeney, and R. W. Wisniewski. Multiple page size modeling and optimization. In Proceedings of PACT, pages 339-349, 2005.
-
(2005)
Proceedings of PACT
, pp. 339-349
-
-
Cascaval, C.1
Duesterwald, E.2
Sweeney, P.F.3
Wisniewski, R.W.4
-
8
-
-
1142268809
-
Estimating cache misses and locality using stack distances
-
C. Cascaval and D. A. Padua. Estimating cache misses and locality using stack distances. In Proceedings of ICS, pages 150-159, 2003.
-
(2003)
Proceedings of ICS
, pp. 150-159
-
-
Cascaval, C.1
Padua, D.A.2
-
9
-
-
21244474546
-
Predicting inter-thread cache contention on a chip multi-processor architecture
-
D. Chandra, F. Guo, S. Kim, and Y. Solihin. Predicting inter-thread cache contention on a chip multi-processor architecture. In Proceedings of HPCA, pages 340-351, 2005.
-
(2005)
Proceedings of HPCA
, pp. 340-351
-
-
Chandra, D.1
Guo, F.2
Kim, S.3
Solihin, Y.4
-
10
-
-
77954699826
-
Static reuse distances for locality-based optimizations in MATLAB
-
A. Chauhan and C.-Y. Shei. Static reuse distances for locality-based optimizations in MATLAB. In Proceedings of ICS, pages 295-304, 2010.
-
(2010)
Proceedings of ICS
, pp. 295-304
-
-
Chauhan, A.1
Shei, C.-Y.2
-
11
-
-
0036038136
-
Dynamic hot data stream prefetching for general-purpose programs
-
Berlin, Germany, June
-
T. M. Chilimbi and M. Hirzel. Dynamic hot data stream prefetching for general-purpose programs. In Proceedings of PLDI, Berlin, Germany, June 2002.
-
(2002)
Proceedings of PLDI
-
-
Chilimbi, T.M.1
Hirzel, M.2
-
12
-
-
84866867353
-
A highly parallel reuse distance analysis algorithm on GPUs
-
H. Cui, Q. Yi, J. Xue, L. Wang, Y. Yang, and X. Feng. A highly parallel reuse distance analysis algorithm on GPUs. In Proceedings of IPDPS, 2012.
-
(2012)
Proceedings of IPDPS
-
-
Cui, H.1
Yi, Q.2
Xue, J.3
Wang, L.4
Yang, Y.5
Feng, X.6
-
13
-
-
79955715200
-
The working set model for program behaviour
-
P. J. Denning. The working set model for program behaviour. Communications of ACM, 11(5):323-333, 1968.
-
(1968)
Communications of ACM
, vol.11
, Issue.5
, pp. 323-333
-
-
Denning, P.J.1
-
15
-
-
0015316498
-
Properties of the working set model
-
P. J. Denning and S. C. Schwartz. Properties of the working set model. Communications of ACM, 15(3):191-198, 1972.
-
(1972)
Communications of ACM
, vol.15
, Issue.3
, pp. 191-198
-
-
Denning, P.J.1
Schwartz, S.C.2
-
16
-
-
0018018416
-
Generalized working sets for segment reference strings
-
P. J. Denning and D. R. Slutz. Generalized working sets for segment reference strings. Communications of ACM, 21(9):750-759, 1978.
-
(1978)
Communications of ACM
, vol.21
, Issue.9
, pp. 750-759
-
-
Denning, P.J.1
Slutz, D.R.2
-
17
-
-
77951615165
-
All-window profiling of concurrent executions
-
poster paper
-
C. Ding and T. Chilimbi. All-window profiling of concurrent executions. In Proceedings of PPoPP, 2008. poster paper.
-
(2008)
Proceedings of PPoPP
-
-
Ding, C.1
Chilimbi, T.2
-
19
-
-
79952932476
-
Fast modeling of shared caches in multicore systems
-
best paper
-
D. Eklov, D. Black-Schaffer, and E. Hagersten. Fast modeling of shared caches in multicore systems. In Proceedings of HiPEAC, pages 147-157, 2011. best paper.
-
(2011)
Proceedings of HiPEAC
, pp. 147-157
-
-
Eklov, D.1
Black-Schaffer, D.2
Hagersten, E.3
-
20
-
-
77952570425
-
StatStack: Efficient modeling of LRU caches
-
D. Eklov and E. Hagersten. StatStack: Efficient modeling of LRU caches. In Proceedings of ISPASS, pages 55-65, 2010.
-
(2010)
Proceedings of ISPASS
, pp. 55-65
-
-
Eklov, D.1
Hagersten, E.2
-
21
-
-
33745793237
-
Path-based reuse distance analysis
-
C. Fang, S. Carr, S. Önder, and Z. Wang. Path-based reuse distance analysis. In Proceedings of CC, pages 32-46, 2006.
-
(2006)
Proceedings of CC
, pp. 32-46
-
-
Fang, C.1
Carr, S.2
Önder, S.3
Wang, Z.4
-
22
-
-
84866870820
-
Locality principle revisited: A probability-based quantitative approach
-
S. Gupta, P. Xiang, Y. Yang, and H. Zhou. Locality principle revisited: A probability-based quantitative approach. In Proceedings of IPDPS, 2012.
-
(2012)
Proceedings of IPDPS
-
-
Gupta, S.1
Xiang, P.2
Yang, Y.3
Zhou, H.4
-
23
-
-
36849034066
-
SPEC CPU2006 benchmark descriptions
-
J. L. Henning. SPEC CPU2006 benchmark descriptions. SIGARCH Computer Architecture News, 34(4):1-17, 2006.
-
(2006)
SIGARCH Computer Architecture News
, vol.34
, Issue.4
, pp. 1-17
-
-
Henning, J.L.1
-
25
-
-
0024903997
-
Evaluating associativity in CPU caches
-
M. D. Hill and A. J. Smith. Evaluating associativity in CPU caches. IEEE Transactions on Computers, 38(12):1612-1630, 1989.
-
(1989)
IEEE Transactions on Computers
, vol.38
, Issue.12
, pp. 1612-1630
-
-
Hill, M.D.1
Smith, A.J.2
-
26
-
-
77949597137
-
Combining locality analysis with online proactive job co-scheduling in chip multiprocessors
-
Y. Jiang, K. Tian, and X. Shen. Combining locality analysis with online proactive job co-scheduling in chip multiprocessors. In Proceedings of HiPEAC, pages 201-215, 2010.
-
(2010)
Proceedings of HiPEAC
, pp. 201-215
-
-
Jiang, Y.1
Tian, K.2
Shen, X.3
-
27
-
-
77951616746
-
Is reuse distance applicable to data locality analysis on chip multiprocessors?
-
Y. Jiang, E. Z. Zhang, K. Tian, and X. Shen. Is reuse distance applicable to data locality analysis on chip multiprocessors? In Proceedings of CC, pages 264-282, 2010.
-
(2010)
Proceedings of CC
, pp. 264-282
-
-
Jiang, Y.1
Zhang, E.Z.2
Tian, K.3
Shen, X.4
-
28
-
-
0348220543
-
Flexible reference trace reduction for VM simulations
-
S. F. Kaplan, Y. Smaragdakis, and P. R. Wilson. Flexible reference trace reduction for VM simulations. ACM Transactions on Modeling and Computer Simulation, 13(1):1-38, 2003.
-
(2003)
ACM Transactions on Modeling and Computer Simulation
, vol.13
, Issue.1
, pp. 1-38
-
-
Kaplan, S.F.1
Smaragdakis, Y.2
Wilson, P.R.3
-
29
-
-
31944440969
-
Pin: Building customized program analysis tools with dynamic instrumentation
-
C.-K. Luk, R. S. Cohn, R. Muth, H. Patil, A. Klauser, P. G. Lowney, S. Wallace, V. J. Reddi, and K. M. Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In Proceedings of PLDI, pages 190-200, 2005.
-
(2005)
Proceedings of PLDI
, pp. 190-200
-
-
Luk, C.-K.1
Cohn, R.S.2
Muth, R.3
Patil, H.4
Klauser, A.5
Lowney, P.G.6
Wallace, S.7
Reddi, V.J.8
Hazelwood, K.M.9
-
30
-
-
8344269521
-
Cross architecture performance predictions for scientific applications using parameterized models
-
G. Marin and J. Mellor-Crummey. Cross architecture performance predictions for scientific applications using parameterized models. In Proceedings of SIGMETRICS, pages 2-13, 2004.
-
(2004)
Proceedings of SIGMETRICS
, pp. 2-13
-
-
Marin, G.1
Mellor-Crummey, J.2
-
31
-
-
0014701246
-
Evaluation techniques for storage hierarchies
-
R. L. Mattson, J. Gecsei, D. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 9(2):78-117, 1970.
-
(1970)
IBM Systems Journal
, vol.9
, Issue.2
, pp. 78-117
-
-
Mattson, R.L.1
Gecsei, J.2
Slutz, D.3
Traiger, I.L.4
-
32
-
-
34547673280
-
Shadow profiling: Hiding instrumentation costs with parallelism
-
T. Moseley, A. Shye, V. J. Reddi, D. Grunwald, and R. Peri. Shadow profiling: Hiding instrumentation costs with parallelism. In Proceedings of CGO, pages 198-208, 2007.
-
(2007)
Proceedings of CGO
, pp. 198-208
-
-
Moseley, T.1
Shye, A.2
Reddi, V.J.3
Grunwald, D.4
Peri, R.5
-
34
-
-
0006946256
-
Efficient methods for calculating the success function of fixed space replacement policies
-
F. Olken. Efficient methods for calculating the success function of fixed space replacement policies. Technical Report LBL-12370, Lawrence Berkeley Laboratory, 1981.
-
(1981)
Technical Report LBL-12370, Lawrence Berkeley Laboratory
-
-
Olken, F.1
-
35
-
-
78149254514
-
Accelerating multicore reuse distance analysis with sampling and parallelization
-
D. L. Schuff, M. Kulkarni, and V. S. Pai. Accelerating multicore reuse distance analysis with sampling and parallelization. In Proceedings of PACT, pages 53-64, 2010.
-
(2010)
Proceedings of PACT
, pp. 53-64
-
-
Schuff, D.L.1
Kulkarni, M.2
Pai, V.S.3
-
36
-
-
34548285855
-
Locality approximation using time
-
X. Shen, J. Shaw, B. Meeker, and C. Ding. Locality approximation using time. In Proceedings of POPL, pages 55-61, 2007.
-
(2007)
Proceedings of POPL
, pp. 55-61
-
-
Shen, X.1
Shaw, J.2
Meeker, B.3
Ding, C.4
-
37
-
-
33846547030
-
On the effectiveness of set associative page mapping and its applications in main memory management
-
A. J. Smith. On the effectiveness of set associative page mapping and its applications in main memory management. In Proceedings of ICSE, 1976.
-
(1976)
Proceedings of ICSE
-
-
Smith, A.J.1
-
38
-
-
85008189411
-
Efficient simulation of caches under optimal replacement with applications to miss characterization
-
Santa Clara, CA, May
-
R. A. Sugumar and S. G. Abraham. Efficient simulation of caches under optimal replacement with applications to miss characterization. In Proceedings of SIGMETRICS, Santa Clara, CA, May 1993.
-
(1993)
Proceedings of SIGMETRICS
-
-
Sugumar, R.A.1
Abraham, S.G.2
-
39
-
-
0034826142
-
Analytical cache models with applications to cache partitioning
-
G. E. Suh, S. Devadas, and L. Rudolph. Analytical cache models with applications to cache partitioning. In Proceedings of ICS, pages 1-12, 2001.
-
(2001)
Proceedings of ICS
, pp. 1-12
-
-
Suh, G.E.1
Devadas, S.2
Rudolph, L.3
-
40
-
-
67650796123
-
RapidMRC: Approximating L2 miss rate curves on commodity systems for online optimizations
-
D. K. Tam, R. Azimi, L. Soares, and M. Stumm. RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations. In Proceedings of ASPLOS, pages 121-132, 2009.
-
(2009)
Proceedings of ASPLOS
, pp. 121-132
-
-
Tam, D.K.1
Azimi, R.2
Soares, L.3
Stumm, M.4
-
42
-
-
34547683698
-
Superpin: Parallelizing dynamic instrumentation for real-time performance
-
S. Wallace and K. Hazelwood. Superpin: Parallelizing dynamic instrumentation for real-time performance. In Proceedings of CGO, pages 209-220, 2007.
-
(2007)
Proceedings of CGO
, pp. 209-220
-
-
Wallace, S.1
Hazelwood, K.2
-
43
-
-
84856557541
-
Coherent profiles: Enabling efficient reuse distance analysis of multicore scaling for loop-based parallel programs
-
M.-J. Wu and D. Yeung. Coherent profiles: Enabling efficient reuse distance analysis of multicore scaling for loop-based parallel programs. In Proceedings of PACT, pages 264-275, 2011.
-
(2011)
Proceedings of PACT
, pp. 264-275
-
-
Wu, M.-J.1
Yeung, D.2
-
45
-
-
79952804254
-
All-window profiling and composable models of cache sharing
-
X. Xiang, B. Bao, T. Bai, C. Ding, and T. M. Chilimbi. All-window profiling and composable models of cache sharing. In Proceedings of PPoPP, pages 91-102, 2011.
-
(2011)
Proceedings of PPoPP
, pp. 91-102
-
-
Xiang, X.1
Bao, B.2
Bai, T.3
Ding, C.4
Chilimbi, T.M.5
-
46
-
-
84863053984
-
Linear-time modeling of program working set in shared cache
-
X. Xiang, B. Bao, C. Ding, and Y. Gao. Linear-time modeling of program working set in shared cache. In Proceedings of PACT, pages 350-360, 2011.
-
(2011)
Proceedings of PACT
, pp. 350-360
-
-
Xiang, X.1
Bao, B.2
Ding, C.3
Gao, Y.4
-
47
-
-
84863700640
-
Cache conscious task regrouping on multicore processors
-
X. Xiang, B. Bao, C. Ding, and K. Shen. Cache conscious task regrouping on multicore processors. In Proceedings of CCGrid, pages 603-611, 2012.
-
(2012)
Proceedings of CCGrid
, pp. 603-611
-
-
Xiang, X.1
Bao, B.2
Ding, C.3
Shen, K.4
-
48
-
-
57349160281
-
Sampling-based program locality approximation
-
Y. Zhong and W. Chang. Sampling-based program locality approximation. In Proceedings of ISMM, pages 91-100, 2008.
-
(2008)
Proceedings of ISMM
, pp. 91-100
-
-
Zhong, Y.1
Chang, W.2
-
49
-
-
70349743894
-
Program locality analysis using reuse distance
-
Aug.
-
Y. Zhong, X. Shen, and C. Ding. Program locality analysis using reuse distance. ACM Transactions on Programming Languages and Systems, 31(6):1-39, Aug. 2009.
-
(2009)
ACM Transactions on Programming Languages and Systems
, vol.31
, Issue.6
, pp. 1-39
-
-
Zhong, Y.1
Shen, X.2
Ding, C.3
-
50
-
-
33746100320
-
Accurate, efficient, and adaptive calling context profiling
-
X. Zhuang, M. J. Serrano, H. W. Cain, and J.-D. Choi. Accurate, efficient, and adaptive calling context profiling. In Proceedings of PLDI, pages 263-271, 2006.
-
(2006)
Proceedings of PLDI
, pp. 263-271
-
-
Zhuang, X.1
Serrano, M.J.2
Cain, H.W.3
Choi, J.-D.4
-
51
-
-
77952248898
-
Addressing shared resource contention in multicore processors via scheduling
-
S. Zhuravlev, S. Blagodurov, and A. Fedorova. Addressing shared resource contention in multicore processors via scheduling. In Proceedings of ASPLOS, pages 129-142, 2010.
-
(2010)
Proceedings of ASPLOS
, pp. 129-142
-
-
Zhuravlev, S.1
Blagodurov, S.2
Fedorova, A.3