-
1
-
-
76749126201
-
-
AMD Athlon 64 X2 Dual-Core processor for desktop. http://www.amd.com /usen/Processors/ProductInformation/0,,30-118-9485-13041,00.html
-
AMD Athlon 64 X2 Dual-Core processor for desktop. http://www.amd.com /usen/Processors/ProductInformation/0,,30-118-9485-13041,00.html
-
-
-
-
2
-
-
76749162689
-
Data and computation transformations for multiprocessors
-
J. M. Anderson et al. Data and computation transformations for multiprocessors. In Proc. POPL, 1995.
-
(1995)
Proc. POPL
-
-
Anderson, J.M.1
-
3
-
-
0029373981
-
Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors
-
A. Agarwal et al. Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors. In TPDS, 1995.
-
(1995)
TPDS
-
-
Agarwal, A.1
-
4
-
-
76749087837
-
Precise automatable analytical modeling of the cache behavior of codes with indirections
-
D. Andrade et al. Precise automatable analytical modeling of the cache behavior of codes with indirections. In TACO, 2007.
-
(2007)
TACO
-
-
Andrade, D.1
-
5
-
-
76749103691
-
Exploiting access semantics and program behavior to reduce snoop power in chip multiprocessors
-
C.S. Ballapuram et al. Exploiting access semantics and program behavior to reduce snoop power in chip multiprocessors. In Proc. ASPLOS, 2008.
-
(2008)
Proc. ASPLOS
-
-
Ballapuram, C.S.1
-
6
-
-
63549135938
-
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
-
M. Baskaran et al. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories. In Proc. PPoPP, 2008.
-
(2008)
Proc. PPoPP
-
-
Baskaran, M.1
-
7
-
-
21644472427
-
Managing wire delay in large chip-multiprocessor caches
-
B. Beckmann, D. Wood. Managing wire delay in large chip-multiprocessor caches. In Proc. MICRO, 2004.
-
(2004)
Proc. MICRO
-
-
Beckmann, B.1
Wood, D.2
-
8
-
-
63549095070
-
The PARSEC benchmark suite: Characterization and architectural implications
-
October
-
C. Bienia, S. Kumar, J. P. Singh and K. Li. The PARSEC benchmark suite: characterization and architectural implications. In Proc. PACT, October 2008.
-
(2008)
Proc. PACT
-
-
Bienia, C.1
Kumar, S.2
Singh, J.P.3
Li, K.4
-
9
-
-
76749086882
-
Programming for parallelism and locality with hierarchically tiled arrays
-
G. Bikshandi et al. Programming for parallelism and locality with hierarchically tiled arrays. In Proc. PPOPP, 2006.
-
(2006)
Proc. PPOPP
-
-
Bikshandi, G.1
-
10
-
-
57349145904
-
Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
-
U. Bondhugula et al. Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In Proc. CC, 2008.
-
(2008)
Proc. CC
-
-
Bondhugula, U.1
-
11
-
-
85009364061
-
Compiler optimizations for improving data locality
-
S. Carr et al. Compiler optimizations for improving data locality. In Proc. ASPLOS, 1994.
-
(1994)
Proc. ASPLOS
-
-
Carr, S.1
-
12
-
-
0035338106
-
Code transformations for data transfer and storage exploration preprocessing in multimedia processors
-
F. Catthoor et al. Code transformations for data transfer and storage exploration preprocessing in multimedia processors. In IEEE Design Test, 2001.
-
(2001)
IEEE Design Test
-
-
Catthoor, F.1
-
13
-
-
0034832018
-
Exact analysis of the cache behavior of nested loops
-
S. Chatterjee et al. Exact analysis of the cache behavior of nested loops. In SIGPLAN Not., 2001.
-
(2001)
SIGPLAN
-
-
Chatterjee, S.1
-
14
-
-
76749152674
-
Dynamic partitioning of shared cache memory
-
J. Chang, G. Sohi. Dynamic partitioning of shared cache memory. In Proc. ICS, 2007.
-
(2007)
Proc. ICS
-
-
Chang, J.1
Sohi, G.2
-
15
-
-
0028499023
-
Communication-free data allocation techniques for parallelizing compilers on multicomputers
-
T. S. Chen, J. P. Sheu. Communication-free data allocation techniques for parallelizing compilers on multicomputers. In TPDS, 1994.
-
(1994)
TPDS
-
-
Chen, T.S.1
Sheu, J.P.2
-
16
-
-
35248852476
-
Scheduling threads for constructive cache sharing on CMPs
-
June
-
S. Chen et al. Scheduling threads for constructive cache sharing on CMPs. In Proc. ACM SPAA, June 2007.
-
(2007)
Proc. ACM SPAA
-
-
Chen, S.1
-
17
-
-
76749093491
-
A TDI system and its application to approximation algorithms
-
M. Cheng et al. A TDI system and its application to approximation algorithms. In Proc. FOCS, 1998.
-
(1998)
Proc. FOCS
-
-
Cheng, M.1
-
18
-
-
27544432313
-
Optimizing replication, communication, and capacity allocation in CMPs
-
Z. Chishti et al. Optimizing replication, communication, and capacity allocation in CMPs. In Proc. ISCA, 2005.
-
(2005)
Proc. ISCA
-
-
Chishti, Z.1
-
20
-
-
0003795618
-
Unifying data and control transformations for distributed shared memory machines
-
Rochester
-
M. Cierniak, W. Li. Unifying data and control transformations for distributed shared memory machines. In Tech. Rep. U. Rochester, 1994.
-
(1994)
Tech. Rep. U. Rochester
-
-
Cierniak, M.1
Li, W.2
-
21
-
-
76749100063
-
-
K. Cooper, L. Torczon. Engineering a compiler. 2008.
-
K. Cooper, L. Torczon. Engineering a compiler. 2008.
-
-
-
-
24
-
-
0026891897
-
Partitioning and labeling of loops by unimodular transformations
-
E. D'Hollander. Partitioning and labeling of loops by unimodular transformations. In TPDS, 1992.
-
(1992)
TPDS
-
-
D'Hollander, E.1
-
25
-
-
0030675463
-
Cache miss equations: An analytical representation of cache misses
-
S. Ghosh et al. Cache miss equations: An analytical representation of cache misses. In Proc. ICS, 1997.
-
(1997)
Proc. ICS
-
-
Ghosh, S.1
-
26
-
-
0030380793
-
Maximizing multiprocessor performance with the SUIF compiler
-
M. W. Hall et al. Maximizing multiprocessor performance with the SUIF compiler. In Computer, 1996.
-
(1996)
Computer
-
-
Hall, M.W.1
-
27
-
-
84868170244
-
-
http://www.intel.com/p/en US/products/server/processor/xeon7000?iid= servproc+body xeon7400subtitle
-
-
-
-
28
-
-
76749110184
-
-
Intel quad-core Xeon. http://www.intel.com/quad-core/?cid=cim:ggl|xeon us clovertown|k7449|s
-
Intel quad-core Xeon. http://www.intel.com/quad-core/?cid=cim:ggl|xeon us clovertown|k7449|s
-
-
-
-
29
-
-
84868176335
-
-
http://www.intel.com/idf/.
-
-
-
-
31
-
-
3042669130
-
IBM Power5 chip: A dual-core multithreaded processor
-
R. Kalla et al. IBM Power5 chip: a dual-core multithreaded processor. In IEEE Micro, 2004.
-
(2004)
IEEE Micro
-
-
Kalla, R.1
-
32
-
-
50249115185
-
Data locality enhancement for CMPs
-
M. Kandemir. Data locality enhancement for CMPs. In Proc. ICCAD, 2007.
-
(2007)
Proc. ICCAD
-
-
Kandemir, M.1
-
33
-
-
76749105972
-
Cache-aware iteration space partitioning
-
A. Kejariwal et al. Cache-aware iteration space partitioning. In Proc. PPoPP, 2008.
-
(2008)
Proc. PPoPP
-
-
Kejariwal, A.1
-
34
-
-
0346865818
-
Data-centric transformations for locality enhancement
-
I. Kodukula, K. Pingali. Data-centric transformations for locality enhancement. In IJPP, 2001.
-
(2001)
IJPP
-
-
Kodukula, I.1
Pingali, K.2
-
35
-
-
20344374162
-
Niagara: A 32-way multithreaded SPARC processor
-
P. Kongetira et al. Niagara: A 32-way multithreaded SPARC processor. In IEEE Micro, 2005.
-
(2005)
IEEE Micro
-
-
Kongetira, P.1
-
36
-
-
62349131952
-
The cache performance of blocked algorithms
-
M. Lam et al. The cache performance of blocked algorithms. In Proc. ASPLOS, 1991.
-
(1991)
Proc. ASPLOS
-
-
Lam, M.1
-
37
-
-
37549032725
-
IBM POWER6 microarchitecture
-
H. Q. Le et al. IBM POWER6 microarchitecture. In IBM Jrnl. of R&D, 2007.
-
(2007)
IBM Jrnl. of R&D
-
-
Le, H.Q.1
-
39
-
-
57349101237
-
Data and computation transformations for Brook streaming applications on multiprocessors
-
S. Liao et al. Data and computation transformations for Brook streaming applications on multiprocessors. In Proc. CGO, 2006.
-
(2006)
Proc. CGO
-
-
Liao, S.1
-
40
-
-
33748870886
-
Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset
-
September
-
M. Martin, D. Sorin, B. Beckmann, M. Marty, M. Xu, A. R. Alameldeen, K. Moore, M. Hill, and D. Wood. Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset. Computer Architecture News, September 2005.
-
(2005)
Computer Architecture News
-
-
Martin, M.1
Sorin, D.2
Beckmann, B.3
Marty, M.4
Xu, M.5
Alameldeen, A.R.6
Moore, K.7
Hill, M.8
Wood, D.9
-
41
-
-
33751424104
-
Adaptive designs for power and thermal optimization
-
R. McGowen. Adaptive designs for power and thermal optimization. In Proc. ICCAD, 2005.
-
(2005)
Proc. ICCAD
-
-
McGowen, R.1
-
42
-
-
34247326334
-
-
Omega library. http://www.cs.umd.edu/projects/omega.
-
Omega library
-
-
-
43
-
-
0028132512
-
Counting solutions to Presburger formulas: How and why
-
W. Pugh. Counting solutions to Presburger formulas: how and why. In Proc. PLDI, 1994.
-
(1994)
Proc. PLDI
-
-
Pugh, W.1
-
44
-
-
84868190036
-
-
Quad-core AMD Opteron. http://multicore.amd.com/us-en/quadcore/
-
Opteron
-
-
-
45
-
-
34548042910
-
Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches
-
M. K. Qureshi, Y. N. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In Proc. MICRO, 2006.
-
(2006)
Proc. MICRO
-
-
Qureshi, M.K.1
Patt, Y.N.2
-
46
-
-
64949109435
-
Architectural support for operating system-driven CMP cache management
-
N. Rafique et al. Architectural support for operating system-driven CMP cache management. In Proc. PACT, 2006.
-
(2006)
Proc. PACT
-
-
Rafique, N.1
-
47
-
-
43249127323
-
Dynamically configurable shared CMP helper engines for improved performance
-
A. Shayesteh et al. Dynamically configurable shared CMP helper engines for improved performance. In SIGARCH Comput. Archit., 2005.
-
(2005)
SIGARCH Comput. Archit
-
-
Shayesteh, A.1
-
49
-
-
84868170241
-
-
SIMICS
-
SIMICS. http://www.virtutech.com/simics/simics.html.
-
-
-
-
51
-
-
70350634177
-
Adaptive set pinning: Managing shared caches in CMPs
-
S. Srikantaiah et al. Adaptive set pinning: managing shared caches in CMPs. In Proc. ASPLOS, 2008.
-
(2008)
Proc. ASPLOS
-
-
Srikantaiah, S.1
-
52
-
-
1642371317
-
Dynamic partitioning of shared cache memory
-
G. E. Suh et al. Dynamic partitioning of shared cache memory. In Journal of Supercomputing, 2004.
-
(2004)
Journal of Supercomputing
-
-
Suh, G.E.1
-
54
-
-
1842635044
-
A fast and accurate framework to analyze and optimize cache memory behavior
-
X. Vera et al. A fast and accurate framework to analyze and optimize cache memory behavior. In TOPLAS, 2004.
-
(2004)
TOPLAS
-
-
Vera, X.1
-
55
-
-
85013942562
-
A data locality optimizing algorithm
-
M. Wolf, M. Lam. A data locality optimizing algorithm. In Proc. PLDI, 1991.
-
(1991)
Proc. PLDI
-
-
Wolf, M.1
Lam, M.2
-
56
-
-
0002375353
-
The SPLASH-2 programs: Characterization and methodological considerations
-
S. Woo et al. The SPLASH-2 programs: characterization and methodological considerations. In Proc. ISCA, 1995.
-
(1995)
Proc. ISCA
-
-
Woo, S.1
-
57
-
-
76749139374
-
A hierarchical model of data locality
-
C. Zhang et al. A hierarchical model of data locality. In Proc. POPL, 2006.
-
(2006)
Proc. POPL
-
-
Zhang, C.1