2011, Pages 71-82

Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication

Author keywords

[No Author keywords available]

Indexed keywords

CACHE COHERENCE; DATA COHERENCE; FAN-OUT; FLOW AGGREGATION; FULL-SYSTEM SIMULATION; IDEAL NETWORK; MULTICAST PACKET; MULTICORE ARCHITECTURES; ON CHIPS; ON-CHIP NETWORKS; PACKET DELAY; POWER PENALTY; RUNTIMES; SINGLE CYCLE; WIRE DELAYS;

EID: 84858790896     PISSN: 10724451     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2155620.2155630     Document Type: Conference Paper
Times cited: 80

References (40)
  • 1
    • EID: 70449661003
    • Intel Nehalem. http://www.realworldtech.com/page.cfm?ArticleID=RWT040208182719.
    • Intel Nehalem
  • 3
    • EID: 84858773861
    • SPLASH-2. http://www-flash.stanford.edu/apps/SPLASH/.
    • SPLASH-2
  • 5
    • EID: 70049105948
    • GARNET: A detailed on-chip network model inside a full-system simulator
    • N. Agarwal, T. Krishna, L.-S. Peh, and N. K. Jha. GARNET: A detailed on-chip network model inside a full-system simulator. In ISPASS, Apr. 2009.
    • (2009) ISPASS
    • Agarwal, N.1    Krishna, T.2    Peh, L.-S.3    Jha, N.K.4
  • 6
    • EID: 65349166228
    • In-network snoop ordering (INSO): Snoopy coherence on unordered interconnects
    • N. Agarwal, L.-S. Peh, and N. K. Jha. In-network snoop ordering (INSO): Snoopy coherence on unordered interconnects. In HPCA, Feb. 2009.
    • (2009) HPCA
    • Agarwal, N.1    Peh, L.-S.2    Jha, N.K.3
  • 7
    • EID: 33947715600
    • IPC considered harmful for multiprocessor workloads
    • A. R. Alameldeen and D. A. Wood. IPC considered harmful for multiprocessor workloads. IEEE Micro, 26(4):8-17, 2006.
    • (2006) IEEE Micro , vol.26 , Issue.4 , pp. 8-17
    • Alameldeen, A.R.1    Wood, D.A.2
  • 8
    • EID: 63549095070
    • The PARSEC benchmark suite: Characterization and architectural implications
    • C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In PACT, Oct. 2008.
    • (2008) PACT
    • Bienia, C.1    Kumar, S.2    Singh, J.P.3    Li, K.4
  • 9
    • EID: 0032647513
    • Multicast snooping: A new coherence method using a multicast address network
    • E. E. Bilir et al. Multicast snooping: A new coherence method using a multicast address network. In ISCA, 1999.
    • (1999) ISCA
    • Bilir, E.E.1
  • 10
    • EID: 84858775288
    • Evaluation of a multithreaded architecture for cellular computing
    • J. G. Castanos et al. Evaluation of a multithreaded architecture for cellular computing. In ISCA, 2002.
    • (2002) ISCA
    • Castanos, J.G.1
  • 11
    • EID: 34548238648
    • The AMD Opteron Northbridge Architecture
    • P. Conway and B. Hughes. The AMD Opteron Northbridge Architecture. IEEE Micro, 27:10-21, Mar. 2007.
    • (2007) IEEE Micro , vol.27 , pp. 10-21
    • Conway, P.1    Hughes, B.2
  • 12
    • EID: 77951200277
    • Cache hierarchy and memory subsystem of the AMD Opteron processor
    • P. Conway et al. Cache hierarchy and memory subsystem of the AMD Opteron processor. IEEE Micro, 30:16-29, 2010.
    • (2010) IEEE Micro , vol.30 , pp. 16-29
    • Conway, P.1
  • 14
    • EID: 30344488259
    • MapReduce: Simplified data processing on large clusters
    • J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. In OSDI, Dec. 2008.
    • (2008) OSDI
    • Dean, J.1    Ghemawat, S.2
  • 15
    • EID: 52649171528
    • Virtual circuit tree multicasting: A case for on-chip hardware multicast support
    • N. Enright Jerger, L.-S. Peh, and M. Lipasti. Virtual circuit tree multicasting: A case for on-chip hardware multicast support. In ISCA, Jun. 2008.
    • (2008) ISCA
    • Enright Jerger, N.1    Peh, L.-S.2    Lipasti, M.3
  • 16
    • EID: 64949116918
    • MRR: Enabling fully adaptive multicast routing for CMP interconnection networks
    • P. A. Fidalgo, V. Puente, and J.-Á. Gregorio. MRR: Enabling fully adaptive multicast routing for CMP interconnection networks. In HPCA, 2009.
    • (2009) HPCA
    • Fidalgo, P.A.1    Puente, V.2    Gregorio, J.-Á.3
  • 17
    • EID: 0000466264
    • Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip
    • M. Galles. Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip. In Hot Interconnects 4, Aug. 1996.
    • (1996) Hot Interconnects , vol.4
    • Galles, M.1
  • 18
    • EID: 21044437801
    • Overview of the Blue Gene/L system architecture
    • A. Gara et al. Overview of the Blue Gene/L system architecture. IBM J. Res. Dev., 49:195-212, Mar. 2005.
    • (2005) IBM J. Res. Dev. , vol.49 , pp. 195-212
    • Gara, A.1
  • 20
    • EID: 36849022584
    • A 5-GHz mesh interconnect for a teraflops processor
    • DOI 10.1109/MM.2007.4378783
    • Y. Hoskote et al. A 5-GHz mesh interconnect for a teraflops processor. IEEE Micro, 27(5):51-61, Sept. 2007. (Pubitemid 350218387)
    • (2007) IEEE Micro , vol.27 , Issue.5 , pp. 51-61
    • Hoskote, Y.1    Vangal, S.2    Singh, A.3    Borkar, N.4    Borkar, S.5
  • 21
    • EID: 70350060187
    • ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration
    • A. B. Kahng et al. ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration. DATE, Feb. 2009.
    • (2009) DATE
    • Kahng, A.B.1
  • 22
    • EID: 0042281592
    • The need for fast communication in hardware-based speculative chip multiprocessors
    • V. Krishnan and J. Torrellas. The need for fast communication in hardware-based speculative chip multiprocessors. Int. J. Parallel Program., 29:3-33, Feb. 2001.
    • (2001) Int. J. Parallel Program. , vol.29 , pp. 3-33
    • Krishnan, V.1    Torrellas, J.2
  • 23
    • EID: 52949114554
    • A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS
    • A. Kumar, P. Kundu, A. P. Singh, L.-S. Peh, and N. K. Jha. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS. In ICCD, Oct. 2007.
    • (2007) ICCD
    • Kumar, A.1    Kundu, P.2    Singh, A.P.3    Peh, L.-S.4    Jha, N.K.5
  • 25
    • EID: 35348858651
    • Express virtual channels: Towards the ideal interconnection fabric
    • A. Kumar et al. Express virtual channels: Towards the ideal interconnection fabric. In ISCA, Jun. 2007.
    • (2007) ISCA
    • Kumar, A.1
  • 26
    • EID: 78149271070
    • ATAC: A 1000-core cache-coherent processor with on-chip optical network
    • G. Kurian et al. ATAC: a 1000-core cache-coherent processor with on-chip optical network. In PACT, 2010.
    • (2010) PACT
    • Kurian, G.1
  • 27
    • EID: 0030685588
    • The SGI origin: A ccNUMA highly scalable server
    • J. Laudon and D. Lenoski. The SGI origin: a ccNUMA highly scalable server. In ISCA, Jun. 1997.
    • (1997) ISCA
    • Laudon, J.1    Lenoski, D.2
  • 28
    • EID: 0025429467
    • The directory-based cache coherence protocol for the DASH multiprocessor
    • D. Lenoski et al. The directory-based cache coherence protocol for the DASH multiprocessor. In ISCA, Jun. 1990.
    • (1990) ISCA
    • Lenoski, D.1
  • 29
    • EID: 0038346234
    • Token coherence: Decoupling performance and correctness
    • M. M. K. Martin, M. D. Hill, and D. A. Wood. Token coherence: Decoupling performance and correctness. In ISCA, Jun. 2003.
    • (2003) ISCA
    • Martin, M.M.K.1    Hill, M.D.2    Wood, D.A.3
  • 30
    • EID: 33748870886
    • Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset
    • M. M. K. Martin et al. Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset. CAN, Sep. 2005.
    • (2005) CAN
    • Martin, M.M.K.1
  • 31
    • EID: 84858775295
    • Prediction router: Yet another low latency on-chip router architecture
    • H. Matsutani et al. Prediction router: Yet another low latency on-chip router architecture. In MICRO, Feb. 2009.
    • (2009) MICRO
    • Matsutani, H.1
  • 32
    • EID: 0022200333
    • The IBM Research Parallel Processor Prototype (RP3): Introduction and architecture
    • G. F. Pfister et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and architecture. In ICPP, pages 764-771, 1985.
    • (1985) ICPP , pp. 764-771
    • Pfister, G.F.1
  • 33
    • EID: 66749116576
    • Token tenure: PATCHing token counting using directory-based cache coherence
    • A. Raghavan et al. Token tenure: PATCHing token counting using directory-based cache coherence. In MICRO, Nov. 2008.
    • (2008) MICRO
    • Raghavan, A.1
  • 34
    • EID: 66749138110
    • Efficient unicast and multicast support for CMPs
    • S. Rodrigo et al. Efficient unicast and multicast support for CMPs. In MICRO, Sep. 2008.
    • (2008) MICRO
    • Rodrigo, S.1
  • 35
    • EID: 49749088882
    • Multicast parallel pipeline router architecture for network-on-chip
    • A. F. Samman et al. Multicast parallel pipeline router architecture for network-on-chip. In DATE, 2008.
    • (2008) DATE
    • Samman, A.F.1
  • 36
    • EID: 47349125701
    • Uncorq: Unconstrained snoop request delivery in embedded-ring multiprocessors
    • K. Strauss et al. Uncorq: Unconstrained snoop request delivery in embedded-ring multiprocessors. In MICRO, 2007.
    • (2007) MICRO
    • Strauss, K.1
  • 37
    • EID: 70349826938
    • Recursive partitioning multicast: A bandwidth-efficient routing for networks-on-chip
    • L. Wang, Y. Jin, H. Kim, and E. J. Kim. Recursive partitioning multicast: A bandwidth-efficient routing for networks-on-chip. In NOCS, 2009.
    • (2009) NOCS
    • Wang, L.1    Jin, Y.2    Kim, H.3    Kim, E.J.4
  • 38
    • EID: 84862144932
    • Power-driven design of router microarchitectures in on-chip networks
    • H.-S. Wang et al. Power-driven design of router microarchitectures in on-chip networks. In MICRO, 2003.
    • (2003) MICRO
    • Wang, H.-S.1
  • 39
    • EID: 79951712762
    • ReMAP: A reconfigurable heterogeneous multicore architecture
    • M. A. Watkins et al. ReMAP: A reconfigurable heterogeneous multicore architecture. In MICRO, 2010.
    • (2010) MICRO
    • Watkins, M.A.1
  • 40
    • EID: 36849030305
    • On-chip interconnection architecture of the tile processor
    • D. Wentzlaff et al. On-chip interconnection architecture of the tile processor. IEEE Micro, 27(5):15-31, Sept. 2007.
    • (2007) IEEE Micro , vol.27 , Issue.5 , pp. 15-31
    • Wentzlaff, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.