-
2
-
-
33947715600
-
IPC considered harmful for multiprocessor workloads
-
A. Alameldeen and D. Wood. IPC considered harmful for multiprocessor workloads. IEEE Micro, 26(4), 2006.
-
(2006)
IEEE Micro
, vol.26
, Issue.4
-
-
Alameldeen, A.1
Wood, D.2
-
3
-
-
47249127725
-
The case for energy-proportional computing
-
L. Barroso and U. Hölzle. The case for energy-proportional computing. IEEE Computer, 40(12):33-37, 2007.
-
(2007)
IEEE Computer
, vol.40
, Issue.12
, pp. 33-37
-
-
Barroso, L.1
Hölzle, U.2
-
4
-
-
84887440618
-
Jigsaw: Scalable software-defined caches
-
N. Beckmann and D. Sanchez. Jigsaw: Scalable Software-Defined Caches. In Proc. PACT-22, 2013.
-
(2013)
Proc. PACT-22
-
-
Beckmann, N.1
Sanchez, D.2
-
5
-
-
84887501582
-
PACORA: Performance aware convex optimization for resource allocation
-
S. Bird and B. Smith. PACORA: Performance aware convex optimization for resource allocation. In Proc. HotPar-3, 2011.
-
(2011)
Proc. HotPar
, vol.3
-
-
Bird, S.1
Smith, B.2
-
6
-
-
84880270753
-
Power struggles: Revisiting the RISC vs CISC debate on contemporary ARM and x86 architectures
-
E. Blem, J. Menon, and K. Sankaralingam. Power Struggles: Revisiting the RISC vs CISC Debate on Contemporary ARM and x86 Architectures. In Proc. HPCA-16, 2013.
-
(2013)
Proc. HPCA
, vol.16
-
-
Blem, E.1
Menon, J.2
Sankaralingam, K.3
-
7
-
-
84883366263
-
A 22nm high performance embedded DRAM SoC technology featuring tri-gate transistors and MIMCAP COB
-
R. Brain, A. Baran, N. Bisnik, et al. A 22nm High Performance Embedded DRAM SoC Technology Featuring Tri-Gate Transistors and MIMCAP COB. In Proc. of the Symposium on VLSI Technology, 2013.
-
(2013)
Proc. of the Symposium on VLSI Technology
-
-
Brain, R.1
Baran, A.2
Bisnik, N.3
-
8
-
-
53549130720
-
Impact of cache partitioning on multi-tasking real time embedded systems
-
B. D. Bui, M. Caccamo, L. Sha, and J. Martinez. Impact of cache partitioning on multi-tasking real time embedded systems. In Proc. RTCSA-14, 2008.
-
(2008)
Proc. RTCSA
, vol.14
-
-
Bui, B.D.1
Caccamo, M.2
Sha, L.3
Martinez, J.4
-
9
-
-
0033683314
-
Application-specific memory management for embedded systems using software-controlled caches
-
D. Chiou, P. Jain, L. Rudolph, and S. Devadas. Application-specific memory management for embedded systems using software-controlled caches. In Proc. DAC-37, 2000.
-
(2000)
Proc. DAC
, vol.37
-
-
Chiou, D.1
Jain, P.2
Rudolph, L.3
Devadas, S.4
-
10
-
-
84881160871
-
A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness
-
H. Cook, M. Moreto, S. Bird, et al. A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness. In Proc. ISCA-40, 2013.
-
(2013)
Proc. ISCA
, vol.40
-
-
Cook, H.1
Moreto, M.2
Bird, S.3
-
12
-
-
84875649537
-
Paragon: QoS-aware scheduling for heterogeneous datacenters
-
C. Delimitrou and C. Kozyrakis. Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters. In Proc. ASPLOS-18, 2013.
-
(2013)
Proc. ASPLOS
, vol.18
-
-
Delimitrou, C.1
Kozyrakis, C.2
-
13
-
-
77952285828
-
Fairness via source throttling: A configurable and high-performance fairness substrate for multi-core memory systems
-
E. Ebrahimi, C. J. Lee, O. Mutlu, and Y. N. Patt. Fairness via source throttling: A configurable and high-performance fairness substrate for multi-core memory systems. In Proc. ASPLOS-15, 2010.
-
(2010)
Proc. ASPLOS
, vol.15
-
-
Ebrahimi, E.1
Lee, C.J.2
Mutlu, O.3
Patt, Y.N.4
-
14
-
-
34249813667
-
A performance counter architecture for computing accurate CPI components
-
S. Eyerman, L. Eeckhout, T. Karkhanis, and J. E. Smith. A performance counter architecture for computing accurate CPI components. In Proc. ASPLOS-12, 2006.
-
(2006)
Proc. ASPLOS
, vol.12
-
-
Eyerman, S.1
Eeckhout, L.2
Karkhanis, T.3
Smith, J.E.4
-
15
-
-
84858791438
-
Clearing the clouds: A study of emerging scale-out workloads on modern hardware
-
M. Ferdman, A. Adileh, O. Kocberber, et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In Proc. ASPLOS-17, 2012.
-
(2012)
Proc. ASPLOS
, vol.17
-
-
Ferdman, M.1
Adileh, A.2
Kocberber, O.3
-
16
-
-
80052522708
-
Kilo-NOC: A heterogeneous network-on-chip architecture for scalability and service guarantees
-
B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu. Kilo-NOC: a heterogeneous network-on-chip architecture for scalability and service guarantees. In Proc. ISCA-38, 2011.
-
(2011)
Proc. ISCA
, vol.38
-
-
Grot, B.1
Hestness, J.2
Keckler, S.W.3
Mutlu, O.4
-
17
-
-
47349085427
-
A framework for providing quality of service in chip multi-processors
-
F. Guo, Y. Solihin, L. Zhao, and R. Iyer. A framework for providing quality of service in chip multi-processors. In Proc. MICRO-40, 2007.
-
(2007)
Proc. MICRO
, vol.40
-
-
Guo, F.1
Solihin, Y.2
Zhao, L.3
Iyer, R.4
-
18
-
-
70350601187
-
Reactive NUCA: Near-optimal block placement and replication in distributed caches
-
N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki. Reactive NUCA: near-optimal block placement and replication in distributed caches. In Proc. ISCA-36, 2009.
-
(2009)
Proc. ISCA
, vol.36
-
-
Hardavellas, N.1
Ferdman, M.2
Falsafi, B.3
Ailamaki, A.4
-
19
-
-
84910129119
-
FIESTA: A sample-balanced multi-program workload methodology
-
A. Hilton, N. Eswaran, and A. Roth. FIESTA: A sample-balanced multi-program workload methodology. In MoBS, 2009.
-
(2009)
MoBS
-
-
Hilton, A.1
Eswaran, N.2
Roth, A.3
-
20
-
-
47349095214
-
QoS policies and architecture for cache/memory in CMP platforms
-
R. Iyer, L. Zhao, F. Guo, et al. QoS policies and architecture for cache/memory in CMP platforms. In Proc. SIGMETRICS, 2007.
-
(2007)
Proc. SIGMETRICS
-
-
Iyer, R.1
Zhao, L.2
Guo, F.3
-
21
-
-
84863550145
-
A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC
-
M. K. Jeong, M. Erez, C. Sudanthi, and N. Paver. A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC. In Proc. DAC-49, 2012.
-
(2012)
Proc. DAC
, vol.49
-
-
Jeong, M.K.1
Erez, M.2
Sudanthi, C.3
Paver, N.4
-
23
-
-
70349141254
-
Shore-MT: A scalable storage manager for the multicore era
-
R. Johnson, I. Pandis, N. Hardavellas, et al. Shore-MT: A scalable storage manager for the multicore era. In Proc. EDBT-12, 2009.
-
(2009)
Proc. EDBT
, vol.12
-
-
Johnson, R.1
Pandis, I.2
Hardavellas, N.3
-
24
-
-
84870557554
-
Chronos: Predictable low latency for data center applications
-
R. Kapoor, G. Porter, M. Tewari, et al. Chronos: predictable low latency for data center applications. In Proc. SoCC-3, 2012.
-
(2012)
Proc. SoCC
, vol.3
-
-
Kapoor, R.1
Porter, G.2
Tewari, M.3
-
25
-
-
79951718838
-
Thread cluster memory scheduling: Exploiting differences in memory access behavior
-
Y. Kim, M. Papamichael, O. Mutlu, and M. Harchol-Balter. Thread cluster memory scheduling: Exploiting differences in memory access behavior. In Proc. MICRO-43, 2010.
-
(2010)
Proc. MICRO
, vol.43
-
-
Kim, Y.1
Papamichael, M.2
Mutlu, O.3
Harchol-Balter, M.4
-
26
-
-
85110867932
-
Moses: Open source toolkit for statistical machine translation
-
P. Koehn, H. Hoang, A. Birch, et al. Moses: Open source toolkit for statistical machine translation. In Proc. ACL-45, 2007.
-
(2007)
Proc. ACL
, vol.45
-
-
Koehn, P.1
Hoang, H.2
Birch, A.3
-
28
-
-
84897787167
-
PRETI: Partitioned REal-TIme shared cache for mixed-criticality real-time systems
-
B. Lesage, I. Puaut, and A. Seznec. PRETI: Partitioned REal-TIme shared cache for mixed-criticality real-time systems. In Proc. ICRTNS-20, 2012.
-
(2012)
Proc. ICRTNS
, vol.20
-
-
Lesage, B.1
Puaut, I.2
Seznec, A.3
-
29
-
-
79953203158
-
CoQoS: Coordinating QoS-aware shared resources in NoC-based SoCs
-
B. Li, L. Zhao, R. Iyer, et al. CoQoS: Coordinating QoS-aware shared resources in NoC-based SoCs. Journal of Parallel and Distributed Computing, 71(5), 2011.
-
(2011)
Journal of Parallel and Distributed Computing
, vol.71
, Issue.5
-
-
Li, B.1
Zhao, L.2
Iyer, R.3
-
30
-
-
84977144248
-
Refining the utility metric for utility-based cache partitioning
-
X. Lin and R. Balasubramonian. Refining the utility metric for utility-based cache partitioning. In Proc. WDDD, 2011.
-
(2011)
Proc. WDDD
-
-
Lin, X.1
Balasubramonian, R.2
-
31
-
-
85092783412
-
Tessellation: Space-time partitioning in a manycore client OS
-
R. Liu, K. Klues, S. Bird, et al. Tessellation: Space-time partitioning in a manycore client OS. In Proc. HotPar-1, 2009.
-
(2009)
Proc. HotPar
, vol.1
-
-
Liu, R.1
Klues, K.2
Bird, S.3
-
32
-
-
84860592643
-
Cache craftiness for fast multicore key-value storage
-
Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In Proc. EuroSys-7, 2012.
-
(2012)
Proc. EuroSys
, vol.7
-
-
Mao, Y.1
Kohler, E.2
Morris, R.T.3
-
33
-
-
84858783719
-
Bubble-Up: Increasing utilization in modern warehouse scale computers via sensible co-locations
-
J. Mars, L. Tang, R. Hundt, et al. Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations. In Proc. MICRO-44, 2011.
-
(2011)
Proc. MICRO
, vol.44
-
-
Mars, J.1
Tang, L.2
Hundt, R.3
-
34
-
-
84885629106
-
Stochastic queuing simulation for data center workloads
-
D. Meisner and T. F. Wenisch. Stochastic queuing simulation for data center workloads. EXERT, 2010.
-
(2010)
EXERT
-
-
Meisner, D.1
Wenisch, T.F.2
-
36
-
-
85084163128
-
Eliminating receive livelock in an interrupt-driven kernel
-
J. Mogul and K. Ramakrishnan. Eliminating receive livelock in an interrupt-driven kernel. In Proc. USENIX ATC, 1996.
-
(1996)
Proc. USENIX ATC
-
-
Mogul, J.1
Ramakrishnan, K.2
-
37
-
-
70449655189
-
FlexDCP: A QoS framework for CMP architectures
-
M. Moreto, F. J. Cazorla, A. Ramirez, et al. FlexDCP: A QoS framework for CMP architectures. SIGOPS Operating Systems Review, 43(2), 2009.
-
(2009)
SIGOPS Operating Systems Review
, vol.43
, Issue.2
-
-
Moreto, M.1
Cazorla, F.J.2
Ramirez, A.3
-
40
-
-
77954780208
-
The case for RAMClouds: Scalable high-performance storage entirely in DRAM
-
J. Ousterhout, P. Agrawal, D. Erickson, et al. The case for RAMClouds: scalable high-performance storage entirely in DRAM. SIGOPS Operating Systems Review, 43(4), 2010.
-
(2010)
SIGOPS Operating Systems Review
, vol.43
, Issue.4
-
-
Ousterhout, J.1
Agrawal, P.2
Erickson, D.3
-
41
-
-
34548304615
-
Scratchpad memories vs locked caches in hard real-time systems: A quantitative comparison
-
I. Puaut and C. Pais. Scratchpad memories vs locked caches in hard real-time systems: a quantitative comparison. In Proc. DATE, 2007.
-
(2007)
Proc. DATE
-
-
Puaut, I.1
Pais, C.2
-
42
-
-
34548042910
-
Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches
-
M. Qureshi and Y. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In Proc. MICRO-39, 2006.
-
(2006)
Proc. MICRO
, vol.39
-
-
Qureshi, M.1
Patt, Y.2
-
43
-
-
77954977639
-
Web search using mobile cores: Quantifying and mitigating the price of efficiency
-
V. Reddi, B. Lee, T. Chilimbi, and K. Vaid. Web search using mobile cores: quantifying and mitigating the price of efficiency. In Proc. ISCA-37, 2010.
-
(2010)
Proc. ISCA
, vol.37
-
-
Reddi, V.1
Lee, B.2
Chilimbi, T.3
Vaid, K.4
-
44
-
-
79951696261
-
The ZCache: Decoupling ways and associativity
-
D. Sanchez and C. Kozyrakis. The ZCache: Decoupling Ways and Associativity. In Proc. MICRO-43, 2010.
-
(2010)
Proc. MICRO
, vol.43
-
-
Sanchez, D.1
Kozyrakis, C.2
-
45
-
-
80052521720
-
Vantage: Scalable and efficient fine-grain cache partitioning
-
D. Sanchez and C. Kozyrakis. Vantage: Scalable and Efficient Fine-Grain Cache Partitioning. In Proc. ISCA-38, 2011.
-
(2011)
Proc. ISCA
, vol.38
-
-
Sanchez, D.1
Kozyrakis, C.2
-
46
-
-
84881154274
-
ZSim: Fast and accurate microarchitectural simulation of thousand-core systems
-
D. Sanchez and C. Kozyrakis. ZSim: Fast and Accurate Microarchitectural Simulation of Thousand-Core Systems. In Proc. ISCA-40, 2013.
-
(2013)
Proc. ISCA
, vol.40
-
-
Sanchez, D.1
Kozyrakis, C.2
-
48
-
-
0027307814
-
A case for two-way skewed-associative caches
-
A. Seznec. A case for two-way skewed-associative caches. In Proc. ISCA-20, 1993.
-
(1993)
Proc. ISCA
, vol.20
-
-
Seznec, A.1
-
49
-
-
84892655102
-
METE: Meeting end-to-end QoS in multicores through system-wide resource management
-
A. Sharifi, S. Srikantaiah, A. Mishra, et al. METE: meeting end-to-end QoS in multicores through system-wide resource management. In Proc. SIGMETRICS, 2011.
-
(2011)
Proc. SIGMETRICS
-
-
Sharifi, A.1
Srikantaiah, S.2
Mishra, A.3
-
50
-
-
77952200539
-
A 40nm 16-core 128-thread CMT SPARC SoC processor
-
J. Shin, K. Tam, D. Huang, et al. A 40nm 16-core 128-thread CMT SPARC SoC processor. In ISSCC, 2010.
-
(2010)
ISSCC
-
-
Shin, J.1
Tam, K.2
Huang, D.3
-
51
-
-
0034443570
-
Symbiotic jobscheduling for a simultaneous multithreading processor
-
A. Snavely and D. M. Tullsen. Symbiotic jobscheduling for a simultaneous multithreading processor. In Proc. ASPLOS-8, 2000.
-
(2000)
Proc. ASPLOS
, vol.8
-
-
Snavely, A.1
Tullsen, D.M.2
-
52
-
-
76749118968
-
SHARP control: Controlled shared cache management in chip multiprocessors
-
S. Srikantaiah, M. Kandemir, and Q. Wang. SHARP control: Controlled shared cache management in chip multiprocessors. In MICRO-42, 2009.
-
(2009)
MICRO
, vol.42
-
-
Srikantaiah, S.1
Kandemir, M.2
Wang, Q.3
-
54
-
-
84875673650
-
ReQoS: Reactive static/dynamic compilation for QoS in warehouse scale computers
-
L. Tang, J. Mars, W. Wang, et al. ReQoS: Reactive Static/Dynamic Compilation for QoS in Warehouse Scale Computers. In Proc. ASPLOS-18, 2013.
-
(2013)
Proc. ASPLOS
, vol.18
-
-
Tang, L.1
Mars, J.2
Wang, W.3
-
55
-
-
79959879840
-
C4: The continuously concurrent compacting collector
-
G. Tene, B. Iyengar, and M. Wolf. C4: The continuously concurrent compacting collector. In Proc. ISMM, 2011.
-
(2011)
Proc. ISMM
-
-
Tene, G.1
Iyengar, B.2
Wolf, M.3
-
57
-
-
0346935130
-
Data caches in multitasking hard real-time systems
-
X. Vera, B. Lisper, and J. Xue. Data caches in multitasking hard real-time systems. In Proc. RTSS-24, 2003.
-
(2003)
Proc. RTSS
, vol.24
-
-
Vera, X.1
Lisper, B.2
Xue, J.3
-
58
-
-
77952179543
-
The implementation of POWER7: A highly parallel and scalable multi-core high-end server processor
-
D. Wendel, R. Kalla, R. Cargoni, et al. The implementation of POWER7: A highly parallel and scalable multi-core high-end server processor. In ISSCC, 2010.
-
(2010)
ISSCC
-
-
Wendel, D.1
Kalla, R.2
Cargoni, R.3
-
59
-
-
70450279102
-
PIPP: Promotion/insertion pseudo-partitioning of multi-core shared caches
-
Y. Xie and G. H. Loh. PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches. In Proc. ISCA-36, 2009.
-
(2009)
Proc. ISCA
, vol.36
-
-
Xie, Y.1
Loh, G.H.2
-
60
-
-
84881190996
-
Bubble-Flux: Precise online QoS management for increased utilization in warehouse scale computers
-
H. Yang, A. Breslow, J. Mars, and L. Tang. Bubble-Flux: Precise Online QoS Management for Increased Utilization in Warehouse Scale Computers. In Proc. ISCA-40, 2013.
-
(2013)
Proc. ISCA
, vol.40
-
-
Yang, H.1
Breslow, A.2
Mars, J.3
Tang, L.4
-
61
-
-
85077083345
-
Hardware execution throttling for multi-core resource management
-
X. Zhang, S. Dwarkadas, and K. Shen. Hardware execution throttling for multi-core resource management. In Proc. of USENIX ATC, 2009.
-
(2009)
Proc. of USENIX ATC
-
-
Zhang, X.1
Dwarkadas, S.2
Shen, K.3