-
1
-
-
0034449842
-
Dynamo: A transparent dynamic optimization system
-
BALA, V., DUESTERWALD, E., AND BANERJIA, S. 2000. Dynamo: A transparent dynamic optimization system. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 1-12.
-
(2000)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 1-12
-
-
BALA, V.1
DUESTERWALD, E.2
BANERJIA, S.3
-
3
-
-
84877037316
-
Using hardware performance monitors to isolate memory bottlenecks
-
BUCK, B. AND HOLLINGSWORTH, J. 2000b. Using hardware performance monitors to isolate memory bottlenecks. In Supercomput., 64-65.
-
(2000)
In Supercomput
, pp. 64-65
-
-
BUCK, B.1
HOLLINGSWORTH, J.2
-
4
-
-
34247230067
-
-
A block-sorting lossless data compression algorithm. Tech. Rep. 124
-
BURROWS, M. AND WHEELER, D. J. 1994. A block-sorting lossless data compression algorithm. Tech. Rep. 124.
-
(1994)
-
-
BURROWS, M.1
WHEELER, D.J.2
-
6
-
-
34247198203
-
-
BURTSCHER, M. 2004b. Vpc3 source code, http://www.csl.cornell.edu/burtscher/research/tracecompression/.
-
(2004)
Vpc3 source code
-
-
BURTSCHER, M.1
-
7
-
-
18844387390
-
Exact analysis of the cache behavior of nested loops
-
CHATTERJEE, S., PARKER, E., HANLON, P., AND LEBECK, A. 2001. Exact analysis of the cache behavior of nested loops. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 286-297.
-
(2001)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 286-297
-
-
CHATTERJEE, S.1
PARKER, E.2
HANLON, P.3
LEBECK, A.4
-
9
-
-
0032630166
-
Cache-Conscious structure definition
-
CHILIMBI, T., DAVIDSON, B., AND LARUS, J. 1999. Cache-Conscious structure definition. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 13-24.
-
(1999)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 13-24
-
-
CHILIMBI, T.1
DAVIDSON, B.2
LARUS, J.3
-
10
-
-
0032667164
-
Cache-Conscious structure layout
-
CHILIMBI, T., HILL, M., AND LARUS, J. 1999b. Cache-Conscious structure layout. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 1-12.
-
(1999)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 1-12
-
-
CHILIMBI, T.1
HILL, M.2
LARUS, J.3
-
11
-
-
0033905645
-
-
CIFUENTES, C. AND EMMERIK, M. 2000. UQBT: Adaptable binary translation at low cost. Comput. 33, 3 (Mar.), 60-66.
-
-
-
-
12
-
-
84877046185
-
SIGMA: A simulator infrastructure to guide memory analysis
-
DEROSE, L., EKANADHAM, K., HOLLINGSWORTH, J. K., AND SBARAGLIA, S. 2002. SIGMA: A simulator infrastructure to guide memory analysis. In Proceedings of the ACM/IEEE SC Conference.
-
(2002)
Proceedings of the ACM/IEEE SC Conference
-
-
DEROSE, L.1
EKANADHAM, K.2
HOLLINGSWORTH, J.K.3
SBARAGLIA, S.4
-
14
-
-
0001714824
-
Cache miss equations: A compiler framework for analyzing and tuning memory behavior
-
GHOSH, S., MARTONOSI, M., AND MALIK, S. 1999. Cache miss equations: A compiler framework for analyzing and tuning memory behavior. ACM Trans. Program. Lang. Syst. 21, 4, 703-746.
-
(1999)
ACM Trans. Program. Lang. Syst
, vol.21
, Issue.4
, pp. 703-746
-
-
GHOSH, S.1
MARTONOSI, M.2
MALIK, S.3
-
15
-
-
17244370540
-
An evaluation of staged run-time optimizations in DyC
-
GRANT, B., PHILIPOSE, M., MOCK, M., CHAMBERS, C., AND EGGERS, S. 1999. An evaluation of staged run-time optimizations in DyC. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 293-304.
-
(1999)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 293-304
-
-
GRANT, B.1
PHILIPOSE, M.2
MOCK, M.3
CHAMBERS, C.4
EGGERS, S.5
-
16
-
-
0026186967
-
-
HAVLAK, P. AND KENNEDY, K. 1991. An implementation of interprocedural bounded regular section analysis. IEEE Trans. Parallel Distrib. Syst. 2, 3 (Jul.), 350-360.
-
-
-
-
17
-
-
0029666630
-
Informing memory operations: Providing memory performance feedback in modern processors
-
HOROWITZ, M., MARTONOSI, M., MOWRY, T., AND SMITH, M. 1996. Informing memory operations: Providing memory performance feedback in modern processors. In Proceedings of the International Symposium on Computer Architecture, 260-270.
-
(1996)
Proceedings of the International Symposium on Computer Architecture
, pp. 260-270
-
-
HOROWITZ, M.1
MARTONOSI, M.2
MOWRY, T.3
SMITH, M.4
-
18
-
-
34247239692
-
-
INTEL. 2004. Intel Itanium 2 Processor Reference Manual for Software Development and Optimization, Vol. 1. Intel, Santa Clara, CA.
-
-
-
-
19
-
-
0028380268
-
-
LARUS, J. AND BALL, T. 1994. Rewriting executable files to measure program behavior. Softw. Pract. Experi. 24, 2 (Feb.), 197-218.
-
-
-
-
21
-
-
0028517833
-
-
LEBECK, A. AND WOOD, D. 1994. Cache profiling and the SPEC benchmarks: A case study. Comput. 27, 10 (Oct.), 15-26.
-
-
-
-
22
-
-
0030717855
-
-
LEBECK, A. AND WOOD, D. 1997. Active memory: A new abstraction for memory system simulation. ACM Trans. Model. Comput. Simul. 7, 1 (Jan.), 42-77.
-
-
-
-
26
-
-
8344285405
-
METRIC: Tracking down inefficiencies in the memory hierarchy via binary rewriting
-
MARATHE, J., MUELLER, F., MOHAN, T., DE SUPINSKI, B. R., MCKEE, S. A., AND YOO, A. 2003. METRIC: Tracking down inefficiencies in the memory hierarchy via binary rewriting. In Proceedings of the International Symposium on Code Generation and Optimization, 289-300.
-
(2003)
Proceedings of the International Symposium on Code Generation and Optimization
, pp. 289-300
-
-
MARATHE, J.1
MUELLER, F.2
MOHAN, T.3
DE SUPINSKI, B.R.4
MCKEE, S.A.5
YOO, A.6
-
27
-
-
8344246921
-
Detailed cache coherence characterization for OpenMP benchmarks
-
MARATHE, J., NAGARAJAN, A., AND MUELLER, F. 2004. Detailed cache coherence characterization for OpenMP benchmarks. In Proceedings of the International Conference on Supercomputing, 287-297.
-
(2004)
Proceedings of the International Conference on Supercomputing
, pp. 287-297
-
-
MARATHE, J.1
NAGARAJAN, A.2
MUELLER, F.3
-
28
-
-
0034818669
-
Tools for application-oriented performance tuning
-
MELLOR-CRUMMEY, J., FOWLER, R., AND WHALLEY, D. 2001. Tools for application-oriented performance tuning. In Proceedings of the International Conference on Supercomputing, 154-165.
-
(2001)
Proceedings of the International Conference on Supercomputing
, pp. 154-165
-
-
MELLOR-CRUMMEY, J.1
FOWLER, R.2
WHALLEY, D.3
-
29
-
-
84877082695
-
Identifying and exploiting spatial regularity in data memory references
-
MOHAN, T., DE SUPINSKI, B. R., MCKEE, S. A., MUELLER, F., YOO, A., AND SCHULZ, M. 2003. Identifying and exploiting spatial regularity in data memory references. Supercomput.
-
(2003)
Supercomput
-
-
MOHAN, T.1
DE SUPINSKI, B.R.2
MCKEE, S.A.3
MUELLER, F.4
YOO, A.5
SCHULZ, M.6
-
30
-
-
0031357519
-
Predicting data cache misses in non-numeric applications through correlation profiling
-
MOWRY, T. AND LUK, C.-K. 1997. Predicting data cache misses in non-numeric applications through correlation profiling. In MICRO-30, 314-320.
-
(1997)
MICRO-30
, pp. 314-320
-
-
MOWRY, T.1
LUK, C.-K.2
-
31
-
-
34247255807
-
Partial data traces: Efficient generation and representation
-
IEEE Technical Committee on Computer Architecture Newsletter
-
MUELLER, F., MOHAN, T., DE SUPINSKI, B. R., MCKEE, S. A., AND YOO, A. 2001. Partial data traces: Efficient generation and representation. In Workshop on Binary Translation. IEEE Technical Committee on Computer Architecture Newsletter.
-
(2001)
Workshop on Binary Translation
-
-
MUELLER, F.1
MOHAN, T.2
DE SUPINSKI, B.R.3
MCKEE, S.A.4
YOO, A.5
-
32
-
-
0000523223
-
Compression and explanation using hierarchical grammars
-
NEVILL-MANNING, C. G. AND WITTEN, I. H. 1997a. Compression and explanation using hierarchical grammars. Comput. J. 40, 2-3.
-
(1997)
Comput. J
, vol.40
, pp. 2-3
-
-
NEVILL-MANNING, C.G.1
WITTEN, I.H.2
-
35
-
-
0002254859
-
-
SITES, R., CHERNOFF, A., KIRK, M., MARKS, M., AND ROBINSON, S. 1993. Binary translation. Commun. ACM 36, 2 (Feb.), 69-81.
-
-
-
-
37
-
-
0036298603
-
-
TENDLER, J. M., DODSON, J. S., FIELDS, JR., J. S., LE, H., AND SINHAROY, B. 2002. POWER4 system microarchitecture. IBM J. Res. Develop. 46, 1 (Jan.), 5-25.
-
-
-
-
39
-
-
0242308158
-
-
VETTER, J. AND MUELLER, F. 2003. Communication characteristics of large-scale scientific applications for contemporary cluster architectures. J. Parallel Distrib. Comput. 63, 9 (Sept.), 853-865.
-
-
-
-
40
-
-
34247232477
-
Caches as filters: A framework for the analysis of caching systems
-
WEIKLE, D., MCKEE, S. A., SKADRON, K., AND WULF, W. 2000. Caches as filters: A framework for the analysis of caching systems. In Proceedings of the Grace Murray Hopper Conference.
-
(2000)
Proceedings of the Grace Murray Hopper Conference
-
-
WEIKLE, D.1
MCKEE, S.A.2
SKADRON, K.3
WULF, W.4
-
42
-
-
8344272049
-
Array regrouping and structure splitting using whole-program reference affinity
-
ZHONG, Y., ORLOVICH, M., SHEN, X., AND DING, C. 2004. Array regrouping and structure splitting using whole-program reference affinity. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation.
-
(2004)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
-
-
ZHONG, Y.1
ORLOVICH, M.2
SHEN, X.3
DING, C.4