SCOPUS Information Search Platform

Proceedings of the Annual International Symposium on Microarchitecture, MICRO

Volume , Issue , 2011, Pages 71-82

Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication

(4) Krishna, Tushar (a); Peh, Li Shiuan (a); Beckmann, Bradford M (b); Reinhardt, Steven K (b)

a MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

b AMD Research (United States)

Author keywords

[No Author keywords available]

Indexed keywords

CACHE COHERENCE; DATA COHERENCE; FAN-OUT; FLOW AGGREGATION; FULL-SYSTEM SIMULATION; IDEAL NETWORK; MULTICAST PACKET; MULTICORE ARCHITECTURES; ON CHIPS; ON-CHIP NETWORKS; PACKET DELAY; POWER PENALTY; RUNTIMES; SINGLE CYCLE; WIRE DELAYS;

COMMUNICATION; EMBEDDED SYSTEMS; SOFTWARE ARCHITECTURE;

ROUTERS;

EID: 84858790896 PISSN: 10724451 EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2155620.2155630 Document Type: Conference Paper

Times cited : (80)

References (40)

1
- 70449661003
- Intel Nehalem. http://www.realworldtech.com/page.cfm?ArticleID=RWT040208182719.
- Intel Nehalem

2
- 43449109107
- Simics Full-system Simulator. http://www.windriver.com/products/simics.
- Simics Full-system Simulator

3
- 84858773861
- SPLASH-2. http://www-flash.stanford.edu/apps/SPLASH/.
- SPLASH-2

4
- 84858762960
- Efficient and scalable barrier synchronization for many-core CMPs
- J. L. Abellán et al. Efficient and scalable barrier synchronization for many-core CMPs. In Proc. 7th ACM International Conference on Computing Frontiers, 2010.
- Proc. 7th ACM International Conference on Computing Frontiers, 2010
- Abellán, J.L.¹

5
- 70049105948
- GARNET: A detailed on-chip network model inside a full-system simulator
- Apr.
- N. Agarwal, T. Krishna, L.-S. Peh, and N. K. Jha. GARNET: A detailed on-chip network model inside a full-system simulator. In ISPASS, Apr. 2009.
- (2009) ISPASS
- Agarwal, N.¹ Krishna, T.² Peh, L.-S.³ Jha, N.K.⁴

6
- 65349166228
- In-network snoop ordering (INSO): Snoopy coherence on unordered interconnects
- Feb.
- N. Agarwal, L.-S. Peh, and N. K. Jha. In-network snoop ordering (INSO): Snoopy coherence on unordered interconnects. In HPCA, Feb. 2009.
- (2009) HPCA
- Agarwal, N.¹ Peh, L.-S.² Jha, N.K.³

7
- 33947715600
- IPC considered harmful for multiprocessor workloads
- A. R. Alameldeen and D. A. Wood. IPC considered harmful for multiprocessor workloads. IEEE Micro, 26(4):8-17, 2006.
- (2006) IEEE Micro , vol.26 , Issue.4 , pp. 8-17
- Alameldeen, A.R.¹ Wood, D.A.²

8
- 63549095070
- The PARSEC benchmark suite: Characterization and architectural implications
- Oct.
- C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In PACT, Oct. 2008.
- (2008) PACT
- Bienia, C.¹ Kumar, S.² Singh, J.P.³ Li, K.⁴

9
- 0032647513
- Multicast snooping: A new coherence method using a multicast address network
- E. E. Bilir et al. Multicast snooping: A new coherence method using a multicast address network. In ISCA, 1999.
- (1999) ISCA
- Bilir, E.E.¹

10
- 84858775288
- Evaluation of a multithreaded architecture for cellular computing
- J. G. Castanos et al. Evaluation of a multithreaded architecture for cellular computing. In ISCA, 2002.
- (2002) ISCA
- Castanos, J.G.¹

11
- 34548238648
- The AMD Opteron Northbridge Architecture
- Mar
- P. Conway and B. Hughes. The AMD Opteron Northbridge Architecture. IEEE Micro, 27:10-21, Mar. 2007.
- (2007) IEEE Micro , vol.27 , pp. 10-21
- Conway, P.¹ Hughes, B.²

12
- 77951200277
- Cache hierarchy and memory subsystem of the AMD Opteron processor
- P. Conway et al. Cache hierarchy and memory subsystem of the AMD Opteron processor. IEEE Micro, 30:16-29, 2010.
- (2010) IEEE Micro , vol.30 , pp. 16-29
- Conway, P.¹

13
- 4043097206
- Morgan Kaufmann Pub.
- W. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann Pub., 2003.
- (2003) Principles and Practices of Interconnection Networks
- Dally, W.¹ Towles, B.²

14
- 30344488259
- MapReduce: Simplified data processing on large clusters
- Dec.
- J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. In OSDI, Dec. 2008.
- (2008) OSDI
- Dean, J.¹ Ghemawat, S.²

15
- 52649171528
- Virtual circuit tree multicasting: A case for on-chip hardware multicast support
- Jun.
- N. Enright Jerger, L.-S. Peh, and M. Lipasti. Virtual circuit tree multicasting: A case for on-chip hardware multicast support. In ISCA, Jun. 2008.
- (2008) ISCA
- Enright Jerger, N.¹ Peh, L.-S.² Lipasti, M.³

16
- 64949116918
- MRR: Enabling fully adaptive multicast routing for CMP interconnection networks
- P. A. Fidalgo, V. Puente, and J.-Á. Gregorio. MRR: Enabling fully adaptive multicast routing for CMP interconnection networks. In HPCA, 2009.
- (2009) HPCA
- Fidalgo, P.A.¹ Puente, V.² Gregorio, J.-Á.³

17
- 0000466264
- Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip
- Aug.
- M. Galles. Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip. In Hot Interconnects 4, Aug. 1996.
- (1996) Hot Interconnects , vol.4
- Galles, M.¹

18
- 21044437801
- Overview of the Blue Gene/L system architecture
- Mar.
- A. Gara et al. Overview of the Blue Gene/L system architecture. IBM J. Res. Dev., 49:195-212, Mar. 2005.
- (2005) IBM J. Res. Dev. , vol.49 , pp. 195-212
- Gara, A.¹

19
- 0020705129
- The NYU Ultracomputer - designing an MIMD shared memory parallel computer
- A. Gottlieb et al. The NYU Ultracomputer - designing an MIMD shared memory parallel computer. IEEE Trans. on Computers, 32:175-189, 1983. (Pubitemid 13525125)
- (1983) IEEE Transactions on Computers , vol.C-32 , Issue.2 , pp. 175-189
- Gottlieb, A.¹ Grishman, R.² Kruskal, C.P.³ McAuliffe, K.P.⁴ Rudolph, L.⁵ Snir, M.⁶

20
- 36849022584
- A 5-GHz mesh interconnect for a teraflops processor
- DOI 10.1109/MM.2007.4378783
- Y. Hoskote et al. A 5-GHz mesh interconnect for a teraflops processor. IEEE Micro, 27(5):51-61, Sept. 2007. (Pubitemid 350218387)
- (2007) IEEE Micro , vol.27 , Issue.5 , pp. 51-61
- Hoskote, Y.¹ Vangal, S.² Singh, A.³ Borkar, N.⁴ Borkar, S.⁵

21
- 70350060187
- ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration
- Feb.
- A. B. Kahng et al. ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration. DATE, Feb. 2009.
- (2009) DATE
- Kahng, A.B.¹

22
- 0042281592
- The need for fast communication in hardware-based speculative chip multiprocessors
- Feb.
- V. Krishnan and J. Torrellas. The need for fast communication in hardware-based speculative chip multiprocessors. Int. J. Parallel Program., 29:3-33, Feb. 2001.
- (2001) Int. J. Parallel Program. , vol.29 , pp. 3-33
- Krishnan, V.¹ Torrellas, J.²

23
- 52949114554
- A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS
- Oct.
- A. Kumar, P. Kundu, A. P. Singh, L.-S. Peh, and N. K. Jha. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS. In ICCD, Oct. 2007.
- (2007) ICCD
- Kumar, A.¹ Kundu, P.² Singh, A.P.³ Peh, L.-S.⁴ Jha, N.K.⁵

24
- 66749104350
- Token flow control
- Nov.
- A. Kumar, L.-S. Peh, and N. K. Jha. Token flow control. In MICRO, Nov. 2008.
- (2008) MICRO
- Kumar, A.¹ Peh, L.-S.² Jha, N.K.³

25
- 35348858651
- Express virtual channels: Towards the ideal interconnection fabric
- Jun.
- A. Kumar et al. Express virtual channels: Towards the ideal interconnection fabric. In ISCA, Jun. 2007.
- (2007) ISCA
- Kumar, A.¹

26
- 78149271070
- ATAC: A 1000-core cache-coherent processor with on-chip optical network
- G. Kurian et al. ATAC: a 1000-core cache-coherent processor with on-chip optical network. In PACT, 2010.
- (2010) PACT
- Kurian, G.¹

27
- 0030685588
- The SGI origin: A ccNUMA highly scalable server
- Jun.
- J. Laudon and D. Lenoski. The SGI origin: a ccNUMA highly scalable server. In ISCA, Jun. 1997.
- (1997) ISCA
- Laudon, J.¹ Lenoski, D.²

28
- 0025429467
- The directory-based cache coherence protocol for the DASH multiprocessor
- Jun.
- D. Lenoski et al. The directory-based cache coherence protocol for the DASH multiprocessor. In ISCA, Jun. 1990.
- (1990) ISCA
- Lenoski, D.¹

29
- 0038346234
- Token coherence: Decoupling performance and correctness
- Jun.
- M. M. K. Martin, M. D. Hill, and D. A. Wood. Token coherence: Decoupling performance and correctness. In ISCA, Jun. 2003.
- (2003) ISCA
- Martin, M.M.K.¹ Hill, M.D.² Wood, D.A.³

30
- 33748870886
- Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset
- Sep.
- M. M. K. Martin et al. Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset. CAN, Sep. 2005.
- (2005) CAN
- Martin, M.M.K.¹

31
- 84858775295
- Prediction router: Yet another low latency on-chip router architecture
- Feb.
- H. Matsutani et al. Prediction router: Yet another low latency on-chip router architecture. In MICRO, Feb. 2009.
- (2009) MICRO
- Matsutani, H.¹

32
- 0022200333
- The IBM Research Parallel Processor Prototype (RP3): Introduction and architecture
- G. F. Pfister et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and architecture. In ICPP, pages 764-771, 1985.
- (1985) ICPP , pp. 764-771
- Pfister, G.F.¹

33
- 66749116576
- Token tenure: PATCHing token counting using directory-based cache coherence
- Nov.
- A. Raghavan et al. Token tenure: PATCHing token counting using directory-based cache coherence. In MICRO, Nov. 2008.
- (2008) MICRO
- Raghavan, A.¹

34
- 66749138110
- Efficient unicast and multicast support for CMPs
- Sep.
- S. Rodrigo et al. Efficient unicast and multicast support for CMPs. In MICRO, Sep. 2008.
- (2008) MICRO
- Rodrigo, S.¹

35
- 49749088882
- Multicast parallel pipeline router architecture for network-on-chip
- A. F. Samman et al. Multicast parallel pipeline router architecture for network-on-chip. In DATE, 2008.
- (2008) DATE
- Samman, A.F.¹

36
- 47349125701
- Uncorq: Unconstrained snoop request delivery in embedded-ring multiprocessors
- K. Strauss et al. Uncorq: Unconstrained snoop request delivery in embedded-ring multiprocessors. In MICRO, 2007.
- (2007) MICRO
- Strauss, K.¹

37
- 70349826938
- Recursive partitioning multicast: A bandwidth-efficient routing for networks-on-chip
- L. Wang, Y. Jin, H. Kim, and E. J. Kim. Recursive partitioning multicast: A bandwidth-efficient routing for networks-on-chip. In NOCS, 2009.
- (2009) NOCS
- Wang, L.¹ Jin, Y.² Kim, H.³ Kim, E.J.⁴

38
- 84862144932
- Power-driven design of router microarchitectures in on-chip networks
- H.-S. Wang et al. Power-driven design of router microarchitectures in on-chip networks. In MICRO, 2003.
- (2003) MICRO
- Wang, H.-S.¹

39
- 79951712762
- ReMAP: A reconfigurable heterogeneous multicore architecture
- M. A. Watkins et al. ReMAP: A reconfigurable heterogeneous multicore architecture. In MICRO, 2010.
- (2010) MICRO
- Watkins, M.A.¹

40
- 36849030305
- On-chip interconnection architecture of the tile processor
- Sept.
- D. Wentzlaff et al. On-chip interconnection architecture of the tile processor. IEEE Micro, 27(5):15-31, Sept. 2007.
- (2007) IEEE Micro , vol.27 , Issue.5 , pp. 15-31
- Wentzlaff, D.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.