SCOPUS Information Retrieval Platform

International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS

Volume , Issue , 2014, Pages 729-742

Ubik: Efficient cache sharing with strict QoS for latency-critical workloads

(2) Kasture, Harshad (a); Sanchez, Daniel (a)

(a) Massachusetts Institute of Technology (United States)

Author keywords

Cache partitioning; Interference; Isolation; Multicore; Quality of service; Resource management; Tail latency

Indexed keywords

CACHE PARTITIONING; ISOLATION; MULTI CORE; RESOURCE MANAGEMENT; TAIL LATENCY; WAVE INTERFERENCE; QUALITY OF SERVICE

EID: 84897791436 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2541940.2541944 Document Type: Conference Paper

Times cited : (146)

References (61)

1
- 0024656760
- An analytical cache model
- A. Agarwal, J. Hennessy, and M. Horowitz. An analytical cache model. ACM Transactions on Computer Systems, 7(2), 1989.
- (1989) ACM Transactions on Computer Systems , vol.7 , Issue.2
- Agarwal, A.¹ Hennessy, J.² Horowitz, M.³

2
- 33947715600
- IPC considered harmful for multiprocessor workloads
- A. Alameldeen and D. Wood. IPC considered harmful for multiprocessor workloads. IEEE Micro, 26(4), 2006.
- (2006) IEEE Micro , vol.26 , Issue.4
- Alameldeen, A.¹ Wood, D.²

3
- 47249127725
- The case for energy-proportional computing
- L. Barroso and U. Hölzle. The case for energy-proportional computing. IEEE Computer, 40(12):33-37, 2007.
- (2007) IEEE Computer , vol.40 , Issue.12 , pp. 33-37
- Barroso, L.¹ Hölzle, U.²

4
- 84887440618
- Jigsaw: Scalable software-defined caches
- N. Beckmann and D. Sanchez. Jigsaw: Scalable Software-Defined Caches. In Proc. PACT-22, 2013.
- (2013) Proc. PACT-22
- Beckmann, N.¹ Sanchez, D.²

5
- 84887501582
- PACORA: Performance aware convex optimization for resource allocation
- S. Bird and B. Smith. PACORA: Performance aware convex optimization for resource allocation. In Proc. HotPar-3, 2011.
- (2011) Proc. HotPar , vol.3
- Bird, S.¹ Smith, B.²

6
- 84880270753
- Power struggles: Revisiting the RISC vs CISC debate on contemporary ARM and x86 architectures
- E. Blem, J. Menon, and K. Sankaralingam. Power Struggles: Revisiting the RISC vs CISC Debate on Contemporary ARM and x86 Architectures. In Proc. HPCA-16, 2013.
- (2013) Proc. HPCA , vol.16
- Blem, E.¹ Menon, J.² Sankaralingam, K.³

7
- 84883366263
- A 22nm high performance embedded DRAM SoC technology featuring tri-gate transistors and MIMCAP COB
- R. Brain, A. Baran, N. Bisnik, et al. A 22nm High Performance Embedded DRAM SoC Technology Featuring Tri-Gate Transistors and MIMCAP COB. In Proc. of the Symposium on VLSI Technology, 2013.
- (2013) Proc. of the Symposium on VLSI Technology
- Brain, R.¹ Baran, A.² Bisnik, N.³

8
- 53549130720
- Impact of cache partitioning on multi-tasking real time embedded systems
- B. D. Bui, M. Caccamo, L. Sha, and J. Martinez. Impact of cache partitioning on multi-tasking real time embedded systems. In Proc. RTCSA-14, 2008.
- (2008) Proc. RTCSA , vol.14
- Bui, B.D.¹ Caccamo, M.² Sha, L.³ Martinez, J.⁴

9
- 0033683314
- Application-specific memory management for embedded systems using software-controlled caches
- D. Chiou, P. Jain, L. Rudolph, and S. Devadas. Application-specific memory management for embedded systems using software-controlled caches. In Proc. DAC-37, 2000.
- (2000) Proc. DAC , vol.37
- Chiou, D.¹ Jain, P.² Rudolph, L.³ Devadas, S.⁴

10
- 84881160871
- A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness
- H. Cook, M. Moreto, S. Bird, et al. A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness. In Proc. ISCA-40, 2013.
- (2013) Proc. ISCA , vol.40
- Cook, H.¹ Moreto, M.² Bird, S.³

11
- 84873622276
- The tail at scale
- J. Dean and L. A. Barroso. The tail at scale. Communications of the ACM, 56(2):74-80, 2013.
- (2013) Communications of the ACM , vol.56 , Issue.2 , pp. 74-80
- Dean, J.¹ Barroso, L.A.²

12
- 84875649537
- Paragon: QoS-aware scheduling for heterogeneous datacenters
- C. Delimitrou and C. Kozyrakis. Paragon: QoS-Aware Scheduling for Heterogeneous Datacenters. In Proc. ASPLOS-18, 2013.
- (2013) Proc. ASPLOS , vol.18
- Delimitrou, C.¹ Kozyrakis, C.²

13
- 77952285828
- Fairness via source throttling: A configurable and high-performance fairness substrate for multi-core memory systems
- E. Ebrahimi, C. J. Lee, O. Mutlu, and Y. N. Patt. Fairness via source throttling: A configurable and high-performance fairness substrate for multi-core memory systems. In Proc. ASPLOS-15, 2010.
- (2010) Proc. ASPLOS , vol.15
- Ebrahimi, E.¹ Lee, C.J.² Mutlu, O.³ Patt, Y.N.⁴

14
- 34249813667
- A performance counter architecture for computing accurate CPI components
- S. Eyerman, L. Eeckhout, T. Karkhanis, and J. E. Smith. A performance counter architecture for computing accurate CPI components. In Proc. ASPLOS-12, 2006.
- (2006) Proc. ASPLOS , vol.12
- Eyerman, S.¹ Eeckhout, L.² Karkhanis, T.³ Smith, J.E.⁴

15
- 84858791438
- Clearing the clouds: A study of emerging scale-out workloads on modern hardware
- M. Ferdman, A. Adileh, O. Kocberber, et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In Proc. ASPLOS-17, 2012.
- (2012) Proc. ASPLOS , vol.17
- Ferdman, M.¹ Adileh, A.² Kocberber, O.³

16
- 80052522708
- Kilo-NOC: A heterogeneous network-on-chip architecture for scalability and service guarantees
- B. Grot, J. Hestness, S. W. Keckler, and O. Mutlu. Kilo-NOC: a heterogeneous network-on-chip architecture for scalability and service guarantees. In Proc. ISCA-38, 2011.
- (2011) Proc. ISCA , vol.38
- Grot, B.¹ Hestness, J.² Keckler, S.W.³ Mutlu, O.⁴

17
- 47349085427
- A framework for providing quality of service in chip multi-processors
- F. Guo, Y. Solihin, L. Zhao, and R. Iyer. A framework for providing quality of service in chip multi-processors. In Proc. MICRO-40, 2007.
- (2007) Proc. MICRO , vol.40
- Guo, F.¹ Solihin, Y.² Zhao, L.³ Iyer, R.⁴

18
- 70350601187
- Reactive NUCA: Near-optimal block placement and replication in distributed caches
- N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki. Reactive NUCA: near-optimal block placement and replication in distributed caches. In Proc. ISCA-36, 2009.
- (2009) Proc. ISCA , vol.36
- Hardavellas, N.¹ Ferdman, M.² Falsafi, B.³ Ailamaki, A.⁴

19
- 84910129119
- FIESTA: A sample-balanced multi-program workload methodology
- A. Hilton, N. Eswaran, and A. Roth. FIESTA: A sample-balanced multi-program workload methodology. In MoBS, 2009.
- (2009) MoBS
- Hilton, A.¹ Eswaran, N.² Roth, A.³

20
- 47349095214
- QoS policies and architecture for cache/memory in CMP platforms
- R. Iyer, L. Zhao, F. Guo, et al. QoS policies and architecture for cache/memory in CMP platforms. In Proc. SIGMETRICS, 2007.
- (2007) Proc. SIGMETRICS
- Iyer, R.¹ Zhao, L.² Guo, F.³

21
- 84863550145
- A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC
- M. K. Jeong, M. Erez, C. Sudanthi, and N. Paver. A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC. In Proc. DAC-49, 2012.
- (2012) Proc. DAC , vol.49
- Jeong, M.K.¹ Erez, M.² Sudanthi, C.³ Paver, N.⁴

22
- 84860338234
- Network congestion avoidance through speculative reservation
- N. Jiang, D. Becker, G. Michelogiannakis, and W. Dally. Network congestion avoidance through speculative reservation. In Proc. HPCA-18, 2012.
- (2012) Proc. HPCA , vol.18
- Jiang, N.¹ Becker, D.² Michelogiannakis, G.³ Dally, W.⁴

23
- 70349141254
- Shore-MT: A scalable storage manager for the multicore era
- R. Johnson, I. Pandis, N. Hardavellas, et al. Shore-MT: A scalable storage manager for the multicore era. In Proc. EDBT-12, 2009.
- (2009) Proc. EDBT , vol.12
- Johnson, R.¹ Pandis, I.² Hardavellas, N.³

24
- 84870557554
- Chronos: Predictable low latency for data center applications
- R. Kapoor, G. Porter, M. Tewari, et al. Chronos: predictable low latency for data center applications. In Proc. SoCC-3, 2012.
- (2012) Proc. SoCC , vol.3
- Kapoor, R.¹ Porter, G.² Tewari, M.³

25
- 79951718838
- Thread cluster memory scheduling: Exploiting differences in memory access behavior
- Y. Kim, M. Papamichael, O. Mutlu, and M. Harchol-Balter. Thread cluster memory scheduling: Exploiting differences in memory access behavior. In Proc. MICRO-43, 2010.
- (2010) Proc. MICRO , vol.43
- Kim, Y.¹ Papamichael, M.² Mutlu, O.³ Harchol-Balter, M.⁴

26
- 85110867932
- Moses: Open source toolkit for statistical machine translation
- P. Koehn, H. Hoang, A. Birch, et al. Moses: Open source toolkit for statistical machine translation. In Proc. ACL-45, 2007.
- (2007) Proc. ACL , vol.45
- Koehn, P.¹ Hoang, H.² Birch, A.³

27
- 77952125596
- Westmere: A family of 32nm IA processors
- N. Kurd, S. Bhamidipati, C. Mozak, et al. Westmere: A family of 32nm IA processors. In Proc. ISSCC, 2010.
- (2010) Proc. ISSCC
- Kurd, N.¹ Bhamidipati, S.² Mozak, C.³

28
- 84897787167
- PRETI: Partitioned REal-TIme shared cache for mixed-criticality real-time systems
- B. Lesage, I. Puaut, and A. Seznec. PRETI: Partitioned REal-TIme shared cache for mixed-criticality real-time systems. In Proc. ICRTNS-20, 2012.
- (2012) Proc. ICRTNS , vol.20
- Lesage, B.¹ Puaut, I.² Seznec, A.³

29
- 79953203158
- CoQoS: Coordinating QoS-aware shared resources in NoC-based SoCs
- B. Li, L. Zhao, R. Iyer, et al. CoQoS: Coordinating QoS-aware shared resources in NoC-based SoCs. Journal of Parallel and Distributed Computing, 71(5), 2011.
- (2011) Journal of Parallel and Distributed Computing , vol.71 , Issue.5
- Li, B.¹ Zhao, L.² Iyer, R.³

30
- 84977144248
- Refining the utility metric for utility-based cache partitioning
- X. Lin and R. Balasubramonian. Refining the utility metric for utility-based cache partitioning. In Proc. WDDD, 2011.
- (2011) Proc. WDDD
- Lin, X.¹ Balasubramonian, R.²

31
- 85092783412
- Tessellation: Space-time partitioning in a manycore client OS
- R. Liu, K. Klues, S. Bird, et al. Tessellation: Space-time partitioning in a manycore client OS. In Proc. HotPar-1, 2009.
- (2009) Proc. HotPar , vol.1
- Liu, R.¹ Klues, K.² Bird, S.³

32
- 84860592643
- Cache craftiness for fast multicore key-value storage
- Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In Proc. EuroSys-7, 2012.
- (2012) Proc. EuroSys , vol.7
- Mao, Y.¹ Kohler, E.² Morris, R.T.³

33
- 84858783719
- Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations
- J. Mars, L. Tang, R. Hundt, et al. Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations. In Proc. MICRO-44, 2011.
- (2011) Proc. MICRO , vol.44
- Mars, J.¹ Tang, L.² Hundt, R.³

34
- 84885629106
- Stochastic queuing simulation for data center workloads
- D. Meisner and T. F. Wenisch. Stochastic queuing simulation for data center workloads. EXERT, 2010.
- (2010) EXERT
- Meisner, D.¹ Wenisch, T.F.²

35
- 67650078267
- PowerNap: Eliminating server idle power
- D. Meisner, B. Gold, and T. Wenisch. PowerNap: Eliminating server idle power. Proc. ASPLOS-14, 2009.
- (2009) Proc. ASPLOS , vol.14
- Meisner, D.¹ Gold, B.² Wenisch, T.³

36
- 85084163128
- Eliminating receive livelock in an interrupt-driven kernel
- J. Mogul and K. Ramakrishnan. Eliminating receive livelock in an interrupt-driven kernel. In Proc. USENIX ATC, 1996.
- (1996) Proc. USENIX ATC
- Mogul, J.¹ Ramakrishnan, K.²

37
- 70449655189
- FlexDCP: A QoS framework for CMP architectures
- M. Moreto, F. J. Cazorla, A. Ramirez, et al. FlexDCP: A QoS framework for CMP architectures. SIGOPS Operating Systems Review, 43(2), 2009.
- (2009) SIGOPS Operating Systems Review , vol.43 , Issue.2
- Moreto, M.¹ Cazorla, F.J.² Ramirez, A.³

38
- 34548050337
- Fair queuing memory systems
- K. Nesbit, N. Aggarwal, J. Laudon, and J. Smith. Fair queuing memory systems. In Proc. MICRO-39, 2006.
- (2006) Proc. MICRO , vol.39
- Nesbit, K.¹ Aggarwal, N.² Laudon, J.³ Smith, J.⁴

39
- 35348816719
- Virtual private caches
- K. J. Nesbit, J. Laudon, and J. E. Smith. Virtual private caches. In Proc. ISCA-34, 2007.
- (2007) Proc. ISCA , vol.34
- Nesbit, K.J.¹ Laudon, J.² Smith, J.E.³

40
- 77954780208
- The case for RAMClouds: Scalable high-performance storage entirely in DRAM
- J. Ousterhout, P. Agrawal, D. Erickson, et al. The case for RAMClouds: scalable high-performance storage entirely in DRAM. SIGOPS Operating Systems Review, 43(4), 2010.
- (2010) SIGOPS Operating Systems Review , vol.43 , Issue.4
- Ousterhout, J.¹ Agrawal, P.² Erickson, D.³

41
- 34548304615
- Scratchpad memories vs locked caches in hard real-time systems: A quantitative comparison
- I. Puaut and C. Pais. Scratchpad memories vs locked caches in hard real-time systems: a quantitative comparison. In Proc. DATE, 2007.
- (2007) Proc. DATE
- Puaut, I.¹ Pais, C.²

42
- 34548042910
- Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches
- M. Qureshi and Y. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In Proc. MICRO-39, 2006.
- (2006) Proc. MICRO , vol.39
- Qureshi, M.¹ Patt, Y.²

43
- 77954977639
- Web search using mobile cores: Quantifying and mitigating the price of efficiency
- V. Reddi, B. Lee, T. Chilimbi, and K. Vaid. Web search using mobile cores: quantifying and mitigating the price of efficiency. In Proc. ISCA-37, 2010.
- (2010) Proc. ISCA , vol.37
- Reddi, V.¹ Lee, B.² Chilimbi, T.³ Vaid, K.⁴

44
- 79951696261
- The ZCache: Decoupling ways and associativity
- D. Sanchez and C. Kozyrakis. The ZCache: Decoupling Ways and Associativity. In Proc. MICRO-43, 2010.
- (2010) Proc. MICRO , vol.43
- Sanchez, D.¹ Kozyrakis, C.²

45
- 80052521720
- Vantage: Scalable and efficient fine-grain cache partitioning
- D. Sanchez and C. Kozyrakis. Vantage: Scalable and Efficient Fine-Grain Cache Partitioning. In Proc. ISCA-38, 2011.
- (2011) Proc. ISCA , vol.38
- Sanchez, D.¹ Kozyrakis, C.²

46
- 84881154274
- ZSim: Fast and accurate microarchitectural simulation of thousand-core systems
- D. Sanchez and C. Kozyrakis. ZSim: Fast and Accurate Microarchitectural Simulation of Thousand-Core Systems. In Proc. ISCA-40, 2013.
- (2013) Proc. ISCA , vol.40
- Sanchez, D.¹ Kozyrakis, C.²

47
- 62749108463
- Time-predictable computer architecture
- M. Schoeberl. Time-predictable computer architecture. EURASIP Journal on Embedded Systems, 2009.
- (2009) EURASIP Journal on Embedded Systems
- Schoeberl, M.¹

48
- 0027307814
- A case for two-way skewed-associative caches
- A. Seznec. A case for two-way skewed-associative caches. In Proc. ISCA-20, 1993.
- (1993) Proc. ISCA , vol.20
- Seznec, A.¹

49
- 84892655102
- METE: Meeting end-to-end QoS in multicores through system-wide resource management
- A. Sharifi, S. Srikantaiah, A. Mishra, et al. METE: meeting end-to-end QoS in multicores through system-wide resource management. In Proc. SIGMETRICS, 2011.
- (2011) Proc. SIGMETRICS
- Sharifi, A.¹ Srikantaiah, S.² Mishra, A.³

50
- 77952200539
- A 40nm 16-core 128-thread CMT SPARC SoC processor
- J. Shin, K. Tam, D. Huang, et al. A 40nm 16-core 128-thread CMT SPARC SoC processor. In ISSCC, 2010.
- (2010) ISSCC
- Shin, J.¹ Tam, K.² Huang, D.³

51
- 0034443570
- Symbiotic jobscheduling for a simultaneous multithreading processor
- A. Snavely and D. M. Tullsen. Symbiotic jobscheduling for a simultaneous multithreading processor. In Proc. ASPLOS-8, 2000.
- (2000) Proc. ASPLOS , vol.8
- Snavely, A.¹ Tullsen, D.M.²

52
- 76749118968
- SHARP control: Controlled shared cache management in chip multiprocessors
- S. Srikantaiah, M. Kandemir, and Q. Wang. SHARP control: Controlled shared cache management in chip multiprocessors. In MICRO-42, 2009.
- (2009) MICRO , vol.42
- Srikantaiah, S.¹ Kandemir, M.² Wang, Q.³

53
- 84976732647
- Transient behavior of cache memories
- W. D. Strecker. Transient behavior of cache memories. ACM Transactions on Computer Systems, 1(4), 1983.
- (1983) ACM Transactions on Computer Systems , vol.1 , Issue.4
- Strecker, W.D.¹

54
- 84875673650
- ReQoS: Reactive static/dynamic compilation for QoS in warehouse scale computers
- L. Tang, J. Mars, W. Wang, et al. ReQoS: Reactive Static/Dynamic Compilation for QoS in Warehouse Scale Computers. In Proc. ASPLOS-18, 2013.
- (2013) Proc. ASPLOS , vol.18
- Tang, L.¹ Mars, J.² Wang, W.³

55
- 79959879840
- C4: The continuously concurrent compacting collector
- G. Tene, B. Iyengar, and M. Wolf. C4: The continuously concurrent compacting collector. In Proc. ISMM, 2011.
- (2011) Proc. ISMM
- Tene, G.¹ Iyengar, B.² Wolf, M.³

56
- 84866456744
- Technical report
- Tilera. TILE-Gx 3000 Series Overview. Technical report, 2011.
- (2011) TILE-Gx 3000 Series Overview
- Tilera¹

57
- 0346935130
- Data caches in multitasking hard real-time systems
- X. Vera, B. Lisper, and J. Xue. Data caches in multitasking hard real-time systems. In Proc. RTSS-24, 2003.
- (2003) Proc. RTSS , vol.24
- Vera, X.¹ Lisper, B.² Xue, J.³

58
- 77952179543
- The implementation of POWER7: A highly parallel and scalable multi-core high-end server processor
- D. Wendel, R. Kalla, R. Cargoni, et al. The implementation of POWER7: A highly parallel and scalable multi-core high-end server processor. In ISSCC, 2010.
- (2010) ISSCC
- Wendel, D.¹ Kalla, R.² Cargoni, R.³

59
- 70450279102
- PIPP: Promotion/insertion pseudo-partitioning of multi-core shared caches
- Y. Xie and G. H. Loh. PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches. In Proc. ISCA-36, 2009.
- (2009) Proc. ISCA , vol.36
- Xie, Y.¹ Loh, G.H.²

60
- 84881190996
- Bubble-flux: Precise online QoS management for increased utilization in warehouse scale computers
- H. Yang, A. Breslow, J. Mars, and L. Tang. Bubble-Flux: Precise Online QoS Management for Increased Utilization in Warehouse Scale Computers. In Proc. ISCA-40, 2013.
- (2013) Proc. ISCA , vol.40
- Yang, H.¹ Breslow, A.² Mars, J.³ Tang, L.⁴

61
- 85077083345
- Hardware execution throttling for multi-core resource management
- X. Zhang, S. Dwarkadas, and K. Shen. Hardware execution throttling for multi-core resource management. In Proc. of USENIX ATC, 2009.
- (2009) Proc. of USENIX ATC
- Zhang, X.¹ Dwarkadas, S.² Shen, K.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.