2010, Pages 106-116

Aérgia: Exploiting packet latency slack in on-chip networks

Author keywords

Arbitration; Criticality; Memory systems; Multi core; On chip networks; Packet scheduling; Prioritization; Slack

Indexed keywords

MEMORY SYSTEMS; MULTI CORE; ON-CHIP NETWORKS; PACKET SCHEDULING; PRIORITIZATION;

EID: 77954985868     PISSN: 10636897     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1815961.1815976     Document Type: Conference Paper
Times cited: 107

References (38)
  • 4
    • 21244474546
    • Predicting inter-thread cache contention on a chip multi-processor architecture
    • D. Chandra, F. Guo, S. Kim, and Y. Solihin. Predicting inter-thread cache contention on a chip multi-processor architecture. In HPCA-11, 2005.
    • (2005) HPCA-11
    • Chandra, D.1    Guo, F.2    Kim, S.3    Solihin, Y.4
  • 5
    • 34548023929
    • Cooperative cache partitioning for chip multiprocessors
    • J. Chang and G. S. Sohi. Cooperative cache partitioning for chip multiprocessors. In ICS-21, 2007.
    • (2007) ICS-21
    • Chang, J.1    Sohi, G.S.2
  • 6
    • 0029666638
    • Rotating combined queueing (RCQ): Bandwidth and latency guarantees in low-cost, high-performance networks
    • A. A. Chien and J. H. Kim. Rotating Combined Queueing (RCQ): Bandwidth and Latency Guarantees in Low-Cost, High-Performance Networks. ISCA-23, 1996.
    • (1996) ISCA-23
    • Chien, A.A.1    Kim, J.H.2
  • 8
    • 76749124429
    • Application-aware prioritization mechanisms for on-chip networks
    • R. Das, O. Mutlu, T. Moscibroda, and C. Das. Application-Aware Prioritization Mechanisms for On-Chip Networks. In MICRO-42, 2009.
    • (2009) MICRO-42
    • Das, R.1    Mutlu, O.2    Moscibroda, T.3    Das, C.4
  • 9
    • 0024889726
    • Analysis and simulation of a fair queueing algorithm
    • A. Demers, S. Keshav, and S. Shenker. Analysis and simulation of a fair queueing algorithm. In SIGCOMM, 1989.
    • (1989) SIGCOMM
    • Demers, A.1    Keshav, S.2    Shenker, S.3
  • 10
    • 0030662863
    • Improving data cache performance by pre-executing instructions under a cache miss
    • J. Dundas and T. Mudge. Improving data cache performance by pre-executing instructions under a cache miss. In ICS-11, 1997.
    • (1997) ICS-11
    • Dundas, J.1    Mudge, T.2
  • 11
    • 77952285828
    • Fairness via source throttling: A configurable and high-performance fairness substrate for multi-core memory systems
    • E. Ebrahimi, C. J. Lee, O. Mutlu, and Y. N. Patt. Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multi-Core Memory Systems. In ASPLOS-XV, 2010.
    • (2010) ASPLOS-XV
    • Ebrahimi, E.1    Lee, C.J.2    Mutlu, O.3    Patt, Y.N.4
  • 12
    • 47249094055
    • System-level performance metrics for multiprogram workloads
    • May-June
    • S. Eyerman and L. Eeckhout. System-level performance metrics for multiprogram workloads. IEEE Micro, May-June 2008.
    • (2008) IEEE Micro
    • Eyerman, S.1    Eeckhout, L.2
  • 13
    • 0036296821
    • Slack: Maximizing performance under technological constraints
    • B. Fields, R. Bodík, and M. Hill. Slack: Maximizing performance under technological constraints. In ISCA-29, 2002.
    • (2002) ISCA-29
    • Fields, B.1    Bodík, R.2    Hill, M.3
  • 14
    • 0034844926
    • Focusing processor policies via critical-path prediction
    • B. Fields, S. Rubin, and R. Bodík. Focusing processor policies via critical-path prediction. In ISCA-28, 2001.
    • (2001) ISCA-28
    • Fields, B.1    Rubin, S.2    Bodík, R.3
  • 16
    • 4644285853
    • MLP Yes! ILP No! Memory level parallelism, or, why I no longer worry about IPC
    • A. Glew. MLP Yes! ILP No! Memory Level Parallelism, or, Why I No Longer Worry About IPC. In ASPLOS Wild and Crazy Ideas Session, 1998.
    • (1998) ASPLOS Wild and Crazy Ideas Session
    • Glew, A.1
  • 17
    • 76749160934
    • Preemptive virtual clock: A flexible, efficient, and cost-effective QoS scheme for networks-on-chip
    • B. Grot, S. W. Keckler, and O. Mutlu. Preemptive Virtual Clock: A Flexible, Efficient, and Cost-effective QOS Scheme for Networks-on-Chip. In MICRO-42, 2009.
    • (2009) MICRO-42
    • Grot, B.1    Keckler, S.W.2    Mutlu, O.3
  • 18
    • 34247143442
    • Communist, utilitarian, and capitalist cache policies on CMPs: Caches as a shared resource
    • L. R. Hsu, S. K. Reinhardt, R. Iyer, and S. Makineni. Communist, utilitarian, and capitalist cache policies on cmps: caches as a shared resource. In PACT-15, 2006.
    • (2006) PACT-15
    • Hsu, L.R.1    Reinhardt, S.K.2    Iyer, R.3    Makineni, S.4
  • 19
    • 77952558442
    • ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers
    • Y. Kim, D. Han, O. Mutlu, and M. Harchol-Balter. ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers. In HPCA-16, 2010.
    • (2010) HPCA-16
    • Kim, Y.1    Han, D.2    Mutlu, O.3    Harchol-Balter, M.4
  • 20
    • 84904279959
    • Lockup-free instruction fetch/prefetch cache organization
    • D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In ISCA-8, 1981.
    • (1981) ISCA-8
    • Kroft, D.1
  • 21
    • 52649094492
    • Globally-synchronized frames for guaranteed quality-of-service in on-chip networks
    • J. W. Lee, M. C. Ng, and K. Asanovic. Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks. In ISCA-35, 2008.
    • (2008) ISCA-35
    • Lee, J.W.1    Ng, M.C.2    Asanovic, K.3
  • 22
    • 33644903196
    • Efficient runahead execution: Power-efficient memory latency tolerance
    • O. Mutlu, H. Kim, and Y. N. Patt. Efficient runahead execution: Power-efficient memory latency tolerance. IEEE Micro, 2006.
    • (2006) IEEE Micro
    • Mutlu, O.1    Kim, H.2    Patt, Y.N.3
  • 23
    • 47349122373
    • Stall-time fair memory access scheduling for chip multiprocessors
    • O. Mutlu and T. Moscibroda. Stall-time fair memory access scheduling for chip multiprocessors. In MICRO-40, 2007.
    • (2007) MICRO-40
    • Mutlu, O.1    Moscibroda, T.2
  • 24
    • 52649119398
    • Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems
    • O. Mutlu and T. Moscibroda. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems. In ISCA-35, 2008.
    • (2008) ISCA-35
    • Mutlu, O.1    Moscibroda, T.2
  • 25
    • 84955506994
    • Runahead execution: An alternative to very large instruction windows for out-of-order processors
    • O. Mutlu, J. Stark, C. Wilkerson, and Y. Patt. Runahead execution: an alternative to very large instruction windows for out-of-order processors. In HPCA-9, 2003.
    • (2003) HPCA-9
    • Mutlu, O.1    Stark, J.2    Wilkerson, C.3    Patt, Y.4
  • 28
    • 21644454187
    • Pinpointing representative portions of large Intel Itanium programs with dynamic instrumentation
    • H. Patil, R. Cohn, M. Charney, R. Kapoor, A. Sun, and A. Karunanidhi. Pinpointing Representative Portions of Large Intel Itanium Programs with Dynamic Instrumentation. In MICRO-37, 2004.
    • (2004) MICRO-37
    • Patil, H.1    Cohn, R.2    Charney, M.3    Kapoor, R.4    Sun, A.5    Karunanidhi, A.6
  • 30
    • 34548042910
    • Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches
    • M. Qureshi and Y. Patt. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches. In MICRO-39, 2006.
    • (2006) MICRO-39
    • Qureshi, M.1    Patt, Y.2
  • 32
    • 27544447688
    • Load latency tolerance in dynamically scheduled processors
    • S. T. Srinivasan and A. R. Lebeck. Load latency tolerance in dynamically scheduled processors. In MICRO-31, 1998.
    • (1998) MICRO-31
    • Srinivasan, S.T.1    Lebeck, A.R.2
  • 33
    • 64949119281
    • Criticality-based optimizations for efficient load processing
    • S. Subramaniam, A. Bracy, H. Wang, and G. Loh. Criticality-based optimizations for efficient load processing. In HPCA-15, 2009.
    • (2009) HPCA-15
    • Subramaniam, S.1    Bracy, A.2    Wang, H.3    Loh, G.4
  • 36
    • 85034094146
    • Two-level adaptive training branch prediction
    • T. Y. Yeh and Y. N. Patt. Two-level adaptive training branch prediction. In MICRO-24, 1991.
    • (1991) MICRO-24
    • Yeh, T.Y.1    Patt, Y.N.2
  • 37
    • 0034850359
    • QoS provisioning in clusters: An investigation of router and NIC design
    • K. H. Yum, E. J. Kim, and C. Das. QoS provisioning in clusters: an investigation of router and NIC design. In ISCA-28, 2001.
    • (2001) ISCA-28
    • Yum, K.H.1    Kim, E.J.2    Das, C.3
  • 38
    • 85030153179
    • Virtual clock: A new traffic control algorithm for packet switching networks
    • L. Zhang. Virtual clock: a new traffic control algorithm for packet switching networks. SIGCOMM, 1990.
    • (1990) SIGCOMM
    • Zhang, L.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.