SCOPUS Information Retrieval Platform

Proceedings - International Symposium on Computer Architecture

Volume , Issue , 2010, Pages 106-116

Aérgia: Exploiting packet latency slack in on-chip networks

(4) Das, Reetuparna (a); Mutlu, Onur (b); Moscibroda, Thomas (c); Das, Chita R. (a)

a PENNSYLVANIA STATE UNIVERSITY (United States)

b CARNEGIE MELLON UNIVERSITY (United States)

c MICROSOFT RESEARCH (United States)

Author keywords

Arbitration; Criticality; Memory systems; Multi core; On chip networks; Packet scheduling; Prioritization; Slack

Indexed keywords

MEMORY SYSTEMS; MULTI CORE; ON-CHIP NETWORKS; PACKET SCHEDULING; PRIORITIZATION;

COMPUTER ARCHITECTURE; COMPUTERS; CRITICALITY (NUCLEAR FISSION); MICROPROCESSOR CHIPS; MULTIPROGRAMMING; PACKET NETWORKS; SCHEDULING;

ROUTERS;

EID: 77954985868; PISSN: 10636897; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1145/1815961.1815976; Document Type: Conference Paper

Times cited : (107)

References (38)

1
- 0029254155
- Myrinet - A gigabit-per-second local-area network
- N. J. Boden, D. Cohen, R. E. Felderman, A. E. Kulawik, C. L. Seitz, J. N. Seizovic, and W.-K. Su. Myrinet - A Gigabit-per-Second Local-Area Network. IEEE Micro, 1995.
- (1995) IEEE Micro
- Boden, N.J.¹ Cohen, D.² Felderman, R.E.³ Kulawik, A.E.⁴ Seitz, C.L.⁵ Seizovic, J.N.⁶ Su, W.-K.⁷

2
- 1242309790
- QNoC: QoS architecture and design process for network on chip
- E. Bolotin, I. Cidon, R. Ginosar, and A. Kolodny. QNoC: QoS architecture and design process for network on chip. Journal of Systems Arch., 2004.
- (2004) Journal of Systems Arch.
- Bolotin, E.¹ Cidon, I.² Ginosar, R.³ Kolodny, A.⁴

3
- 36348965353
- The power of priority: NoC based distributed cache coherency
- E. Bolotin, Z. Guz, I. Cidon, R. Ginosar, and A. Kolodny. The Power of Priority: NoC Based Distributed Cache Coherency. In NOCS'07, 2007.
- (2007) NOCS'07
- Bolotin, E.¹ Guz, Z.² Cidon, I.³ Ginosar, R.⁴ Kolodny, A.⁵

4
- 21244474546
- Predicting inter-thread cache contention on a chip multi-processor architecture
- D. Chandra, F. Guo, S. Kim, and Y. Solihin. Predicting inter-thread cache contention on a chip multi-processor architecture. In HPCA-11, 2005.
- (2005) HPCA-11
- Chandra, D.¹ Guo, F.² Kim, S.³ Solihin, Y.⁴

5
- 34548023929
- Cooperative cache partitioning for chip multiprocessors
- J. Chang and G. S. Sohi. Cooperative cache partitioning for chip multiprocessors. In ICS-21, 2007.
- (2007) ICS-21
- Chang, J.¹ Sohi, G.S.²

6
- 0029666638
- Rotating combined queueing (RCQ): Bandwidth and latency guarantees in low-cost, high-performance networks
- A. A. Chien and J. H. Kim. Rotating Combined Queueing (RCQ): Bandwidth and Latency Guarantees in Low-Cost, High-Performance Networks. ISCA-23, 1996.
- (1996) ISCA-23
- Chien, A.A.¹ Kim, J.H.²

7
- 4043097206
- Morgan Kaufmann
- W. J. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann, 2003.
- (2003) Principles and Practices of Interconnection Networks
- Dally, W.J.¹ Towles, B.²

8
- 76749124429
- Application-aware prioritization mechanisms for on-chip networks
- R. Das, O. Mutlu, T. Moscibroda, and C. Das. Application-Aware Prioritization Mechanisms for On-Chip Networks. In MICRO-42, 2009.
- (2009) MICRO-42
- Das, R.¹ Mutlu, O.² Moscibroda, T.³ Das, C.⁴

9
- 0024889726
- Analysis and simulation of a fair queueing algorithm
- A. Demers, S. Keshav, and S. Shenker. Analysis and simulation of a fair queueing algorithm. In SIGCOMM, 1989.
- (1989) SIGCOMM
- Demers, A.¹ Keshav, S.² Shenker, S.³

10
- 0030662863
- Improving data cache performance by pre-executing instructions under a cache miss
- J. Dundas and T. Mudge. Improving data cache performance by pre-executing instructions under a cache miss. In ICS-11, 1997.
- (1997) ICS-11
- Dundas, J.¹ Mudge, T.²

11
- 77952285828
- Fairness via source throttling: A configurable and high-performance fairness substrate for multi-core memory systems
- E. Ebrahimi, C. J. Lee, O. Mutlu, and Y. N. Patt. Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems. In ASPLOS-XV, 2010.
- (2010) ASPLOS-XV
- Ebrahimi, E.¹ Lee, C.J.² Mutlu, O.³ Patt, Y.N.⁴

12
- 47249094055
- System-level performance metrics for multiprogram workloads
- May-June
- S. Eyerman and L. Eeckhout. System-level performance metrics for multiprogram workloads. IEEE Micro, May-June 2008.
- (2008) IEEE Micro
- Eyerman, S.¹ Eeckhout, L.²

13
- 0036296821
- Slack: Maximizing performance under technological constraints
- B. Fields, R. Bodík, and M. Hill. Slack: Maximizing performance under technological constraints. In ISCA-29, 2002.
- (2002) ISCA-29
- Fields, B.¹ Bodík, R.² Hill, M.³

14
- 0034844926
- Focusing processor policies via critical-path prediction
- B. Fields, S. Rubin, and R. Bodík. Focusing processor policies via critical-path prediction. In ISCA-28, 2001.
- (2001) ISCA-28
- Fields, B.¹ Rubin, S.² Bodík, R.³

15
- 0001442383
- Servernet II
- June
- D. Garcia and W. Watson. Servernet II. Parallel Computing, Routing, and Communication Workshop, June 1997.
- (1997) Parallel Computing Routing and Communication Workshop
- Garcia, D.¹ Watson, W.²

16
- 4644285853
- MLP Yes! ILP No! Memory level parallelism, or, why I no longer worry about IPC
- A. Glew. MLP Yes! ILP No! Memory Level Parallelism, or, Why I No Longer Worry About IPC. In ASPLOS Wild and Crazy Ideas Session, 1998.
- (1998) ASPLOS Wild and Crazy Ideas Session
- Glew, A.¹

17
- 76749160934
- Preemptive virtual clock: A flexible, efficient, and cost-effective QoS scheme for networks-on-chip
- B. Grot, S. W. Keckler, and O. Mutlu. Preemptive Virtual Clock: A Flexible, Efficient, and Cost-effective QOS Scheme for Networks-on-Chip. In MICRO-42, 2009.
- (2009) MICRO-42
- Grot, B.¹ Keckler, S.W.² Mutlu, O.³

18
- 34247143442
- Communist, utilitarian, and capitalist cache policies on CMPs: Caches as a shared resource
- L. R. Hsu, S. K. Reinhardt, R. Iyer, and S. Makineni. Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource. In PACT-15, 2006.
- (2006) PACT-15
- Hsu, L.R.¹ Reinhardt, S.K.² Iyer, R.³ Makineni, S.⁴

19
- 77952558442
- ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers
- Y. Kim, D. Han, O. Mutlu, and M. Harchol-Balter. ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers. In HPCA-16, 2010.
- (2010) HPCA-16
- Kim, Y.¹ Han, D.² Mutlu, O.³ Harchol-Balter, M.⁴

20
- 84904279959
- Lockup-free instruction fetch/prefetch cache organization
- D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In ISCA-8, 1981.
- (1981) ISCA-8
- Kroft, D.¹

21
- 52649094492
- Globally-synchronized frames for guaranteed quality-of-service in on-chip networks
- J. W. Lee, M. C. Ng, and K. Asanovic. Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks. In ISCA-35, 2008.
- (2008) ISCA-35
- Lee, J.W.¹ Ng, M.C.² Asanovic, K.³

22
- 33644903196
- Efficient runahead execution: Power-efficient memory latency tolerance
- O. Mutlu, H. Kim, and Y. N. Patt. Efficient runahead execution: Power-efficient memory latency tolerance. IEEE Micro, 2006.
- (2006) IEEE Micro
- Mutlu, O.¹ Kim, H.² Patt, Y.N.³

23
- 47349122373
- Stall-time fair memory access scheduling for chip multiprocessors
- O. Mutlu and T. Moscibroda. Stall-time fair memory access scheduling for chip multiprocessors. In MICRO-40, 2007.
- (2007) MICRO-40
- Mutlu, O.¹ Moscibroda, T.²

24
- 52649119398
- Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems
- O. Mutlu and T. Moscibroda. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems. In ISCA-35, 2008.
- (2008) ISCA-35
- Mutlu, O.¹ Moscibroda, T.²

25
- 84955506994
- Runahead execution: An alternative to very large instruction windows for out-of-order processors
- O. Mutlu, J. Stark, C. Wilkerson, and Y. Patt. Runahead execution: an alternative to very large instruction windows for out-of-order processors. In HPCA-9, 2003.
- (2003) HPCA-9
- Mutlu, O.¹ Stark, J.² Wilkerson, C.³ Patt, Y.⁴

26
- 34548050337
- Fair queuing memory systems
- K. J. Nesbit, N. Aggarwal, J. Laudon, and J. E. Smith. Fair queuing memory systems. In MICRO-39, 2006.
- (2006) MICRO-39
- Nesbit, K.J.¹ Aggarwal, N.² Laudon, J.³ Smith, J.E.⁴

27
- 38849178774
- chapter 6, Springer US
- V. G. Oklobdzija and R. K. Krishnamurthy. Energy-Delay Characteristics of CMOS Adders, High-Performance Energy-Efficient Microprocessor Design, chapter 6. Springer US, 2006.
- (2006) Energy-delay Characteristics of CMOS Adders, High-performance Energy-efficient Microprocessor Design
- Oklobdzija, V.G.¹ Krishnamurthy, R.K.²

28
- 21644454187
- Pinpointing representative portions of large Intel Itanium programs with dynamic instrumentation
- H. Patil, R. Cohn, M. Charney, R. Kapoor, A. Sun, and A. Karunanidhi. Pinpointing Representative Portions of Large Intel Itanium Programs with Dynamic Instrumentation. In MICRO-37, 2004.
- (2004) MICRO-37
- Patil, H.¹ Cohn, R.² Charney, M.³ Kapoor, R.⁴ Sun, A.⁵ Karunanidhi, A.⁶

29
- 33845874613
- A case for MLP-aware cache replacement
- M. Qureshi, D. Lynch, O. Mutlu, and Y. Patt. A Case for MLP-Aware Cache Replacement. In ISCA-33, 2006.
- (2006) ISCA-33
- Qureshi, M.¹ Lynch, D.² Mutlu, O.³ Patt, Y.⁴

30
- 34548042910
- Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches
- M. Qureshi and Y. Patt. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches. In MICRO-39, 2006.
- (2006) MICRO-39
- Qureshi, M.¹ Patt, Y.²

31
- 84893753441
- Trade-offs in the design of a router with both guaranteed and best-effort services for networks on chip
- E. Rijpkema, K. Goossens, A. Radulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander. Trade-offs in the design of a router with both guaranteed and best-effort services for networks on chip. DATE, 2003.
- (2003) DATE
- Rijpkema, E.¹ Goossens, K.² Radulescu, A.³ Dielissen, J.⁴ Van Meerbergen, J.⁵ Wielage, P.⁶ Waterlander, E.⁷

32
- 27544447688
- Load latency tolerance in dynamically scheduled processors
- S. T. Srinivasan and A. R. Lebeck. Load latency tolerance in dynamically scheduled processors. In MICRO-31, 1998.
- (1998) MICRO-31
- Srinivasan, S.T.¹ Lebeck, A.R.²

33
- 64949119281
- Criticality-based optimizations for efficient load processing
- S. Subramaniam, A. Bracy, H. Wang, and G. Loh. Criticality-based optimizations for efficient load processing. In HPCA-15, 2009.
- (2009) HPCA-15
- Subramaniam, S.¹ Bracy, A.² Wang, H.³ Loh, G.⁴

34
- 0015315796
- A comparative analysis of disk scheduling policies
- T. J. Teorey and T. B. Pinkerton. A comparative analysis of disk scheduling policies. Communications of the ACM, 1972.
- (1972) Communications of the ACM
- Teorey, T.J.¹ Pinkerton, T.B.²

35
- 0003081830
- An efficient algorithm for exploiting multiple arithmetic units
- R. M. Tomasulo. An efficient algorithm for exploiting multiple arithmetic units. IBM Journal of Research and Development, 1967.
- (1967) IBM Journal of Research and Development
- Tomasulo, R.M.¹

36
- 85034094146
- Two-level adaptive training branch prediction
- T. Y. Yeh and Y. N. Patt. Two-level adaptive training branch prediction. In MICRO-24, 1991.
- (1991) MICRO-24
- Yeh, T.Y.¹ Patt, Y.N.²

37
- 0034850359
- QoS provisioning in clusters: An investigation of router and NIC design
- K. H. Yum, E. J. Kim, and C. Das. QoS provisioning in clusters: an investigation of router and NIC design. In ISCA-28, 2001.
- (2001) ISCA-28
- Yum, K.H.¹ Kim, E.J.² Das, C.³

38
- 85030153179
- Virtual clock: A new traffic control algorithm for packet switching networks
- L. Zhang. Virtual clock: a new traffic control algorithm for packet switching networks. SIGCOMM, 1990.
- (1990) SIGCOMM
- Zhang, L.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.