-
1
-
-
70449661003
-
-
Intel Nehalem. http://www.realworldtech.com/page.cfm?ArticleID=RWT040208182719.
-
Intel Nehalem
-
-
-
3
-
-
84858773861
-
-
SPLASH-2. http://www-flash.stanford.edu/apps/SPLASH/.
-
SPLASH-2
-
-
-
5
-
-
70049105948
-
GARNET: A detailed on-chip network model inside a full-system simulator
-
Apr.
-
N. Agarwal, T. Krishna, L.-S. Peh, and N. K. Jha. GARNET: A detailed on-chip network model inside a full-system simulator. In ISPASS, Apr. 2009.
-
(2009)
ISPASS
-
-
Agarwal, N.1
Krishna, T.2
Peh, L.-S.3
Jha, N.K.4
-
6
-
-
65349166228
-
In-network snoop ordering (INSO): Snoopy coherence on unordered interconnects
-
Feb.
-
N. Agarwal, L.-S. Peh, and N. K. Jha. In-network snoop ordering (INSO): Snoopy coherence on unordered interconnects. In HPCA, Feb. 2009.
-
(2009)
HPCA
-
-
Agarwal, N.1
Peh, L.-S.2
Jha, N.K.3
-
7
-
-
33947715600
-
IPC considered harmful for multiprocessor workloads
-
A. R. Alameldeen and D. A. Wood. IPC considered harmful for multiprocessor workloads. IEEE Micro, 26(4):8-17, 2006.
-
(2006)
IEEE Micro
, vol.26
, Issue.4
, pp. 8-17
-
-
Alameldeen, A.R.1
Wood, D.A.2
-
8
-
-
63549095070
-
The PARSEC benchmark suite: Characterization and architectural implications
-
Oct.
-
C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In PACT, Oct. 2008.
-
(2008)
PACT
-
-
Bienia, C.1
Kumar, S.2
Singh, J.P.3
Li, K.4
-
9
-
-
0032647513
-
Multicast snooping: A new coherence method using a multicast address network
-
E. E. Bilir et al. Multicast snooping: A new coherence method using a multicast address network. In ISCA, 1999.
-
(1999)
ISCA
-
-
Bilir, E.E.1
-
10
-
-
84858775288
-
Evaluation of a multithreaded architecture for cellular computing
-
J. G. Castanos et al. Evaluation of a multithreaded architecture for cellular computing. In ISCA, 2002.
-
(2002)
ISCA
-
-
Castanos, J.G.1
-
11
-
-
34548238648
-
The AMD Opteron Northbridge Architecture
-
Mar.
-
P. Conway and B. Hughes. The AMD Opteron Northbridge Architecture. IEEE Micro, 27:10-21, Mar. 2007.
-
(2007)
IEEE Micro
, vol.27
, pp. 10-21
-
-
Conway, P.1
Hughes, B.2
-
12
-
-
77951200277
-
Cache hierarchy and memory subsystem of the AMD Opteron processor
-
P. Conway et al. Cache hierarchy and memory subsystem of the AMD Opteron processor. IEEE Micro, 30:16-29, 2010.
-
(2010)
IEEE Micro
, vol.30
, pp. 16-29
-
-
Conway, P.1
-
14
-
-
30344488259
-
MapReduce: Simplified data processing on large clusters
-
Dec.
-
J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In OSDI, Dec. 2004.
-
(2004)
OSDI
-
-
Dean, J.1
Ghemawat, S.2
-
15
-
-
52649171528
-
Virtual circuit tree multicasting: A case for on-chip hardware multicast support
-
Jun.
-
N. Enright Jerger, L.-S. Peh, and M. Lipasti. Virtual circuit tree multicasting: A case for on-chip hardware multicast support. In ISCA, Jun. 2008.
-
(2008)
ISCA
-
-
Enright Jerger, N.1
Peh, L.-S.2
Lipasti, M.3
-
16
-
-
64949116918
-
MRR: Enabling fully adaptive multicast routing for CMP interconnection networks
-
P. A. Fidalgo, V. Puente, and J.-Á. Gregorio. MRR: Enabling fully adaptive multicast routing for CMP interconnection networks. In HPCA, 2009.
-
(2009)
HPCA
-
-
Fidalgo, P.A.1
Puente, V.2
Gregorio, J.-Á.3
-
17
-
-
0000466264
-
Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip
-
Aug.
-
M. Galles. Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip. In Hot Interconnects 4, Aug. 1996.
-
(1996)
Hot Interconnects
, vol.4
-
-
Galles, M.1
-
18
-
-
21044437801
-
Overview of the Blue Gene/L system architecture
-
Mar.
-
A. Gara et al. Overview of the Blue Gene/L system architecture. IBM J. Res. Dev., 49:195-212, Mar. 2005.
-
(2005)
IBM J. Res. Dev.
, vol.49
, pp. 195-212
-
-
Gara, A.1
-
19
-
-
0020705129
-
The NYU Ultracomputer - designing an MIMD shared memory parallel computer
-
A. Gottlieb et al. The NYU Ultracomputer - designing an MIMD shared memory parallel computer. IEEE Trans. on Computers, 32:175-189, 1983. (Pubitemid 13525125)
-
(1983)
IEEE Transactions on Computers
, vol.C-32
, Issue.2
, pp. 175-189
-
-
Gottlieb, A.1
Grishman, R.2
Kruskal, C.P.3
McAuliffe, K.P.4
Rudolph, L.5
Snir, M.6
-
20
-
-
36849022584
-
A 5-GHz mesh interconnect for a teraflops processor
-
DOI 10.1109/MM.2007.4378783
-
Y. Hoskote et al. A 5-GHz mesh interconnect for a teraflops processor. IEEE Micro, 27(5):51-61, Sept. 2007. (Pubitemid 350218387)
-
(2007)
IEEE Micro
, vol.27
, Issue.5
, pp. 51-61
-
-
Hoskote, Y.1
Vangal, S.2
Singh, A.3
Borkar, N.4
Borkar, S.5
-
21
-
-
70350060187
-
ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration
-
Feb.
-
A. B. Kahng et al. ORION 2.0: A fast and accurate NoC power and area model for early-stage design space exploration. DATE, Feb. 2009.
-
(2009)
DATE
-
-
Kahng, A.B.1
-
22
-
-
0042281592
-
The need for fast communication in hardware-based speculative chip multiprocessors
-
Feb.
-
V. Krishnan and J. Torrellas. The need for fast communication in hardware-based speculative chip multiprocessors. Int. J. Parallel Program., 29:3-33, Feb. 2001.
-
(2001)
Int. J. Parallel Program.
, vol.29
, pp. 3-33
-
-
Krishnan, V.1
Torrellas, J.2
-
23
-
-
52949114554
-
A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS
-
Oct.
-
A. Kumar, P. Kundu, A. P. Singh, L.-S. Peh, and N. K. Jha. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS. In ICCD, Oct. 2007.
-
(2007)
ICCD
-
-
Kumar, A.1
Kundu, P.2
Singh, A.P.3
Peh, L.-S.4
Jha, N.K.5
-
25
-
-
35348858651
-
Express virtual channels: Towards the ideal interconnection fabric
-
Jun.
-
A. Kumar et al. Express virtual channels: Towards the ideal interconnection fabric. In ISCA, Jun. 2007.
-
(2007)
ISCA
-
-
Kumar, A.1
-
26
-
-
78149271070
-
ATAC: A 1000-core cache-coherent processor with on-chip optical network
-
G. Kurian et al. ATAC: a 1000-core cache-coherent processor with on-chip optical network. In PACT, 2010.
-
(2010)
PACT
-
-
Kurian, G.1
-
27
-
-
0030685588
-
The SGI origin: A ccNUMA highly scalable server
-
Jun.
-
J. Laudon and D. Lenoski. The SGI origin: a ccNUMA highly scalable server. In ISCA, Jun. 1997.
-
(1997)
ISCA
-
-
Laudon, J.1
Lenoski, D.2
-
28
-
-
0025429467
-
The directory-based cache coherence protocol for the DASH multiprocessor
-
Jun.
-
D. Lenoski et al. The directory-based cache coherence protocol for the DASH multiprocessor. In ISCA, Jun. 1990.
-
(1990)
ISCA
-
-
Lenoski, D.1
-
29
-
-
0038346234
-
Token coherence: Decoupling performance and correctness
-
Jun.
-
M. M. K. Martin, M. D. Hill, and D. A. Wood. Token coherence: Decoupling performance and correctness. In ISCA, Jun. 2003.
-
(2003)
ISCA
-
-
Martin, M.M.K.1
Hill, M.D.2
Wood, D.A.3
-
30
-
-
33748870886
-
Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset
-
Sep.
-
M. M. K. Martin et al. Multifacet's General Execution-driven Multiprocessor Simulator (GEMS) Toolset. CAN, Sep. 2005.
-
(2005)
CAN
-
-
Martin, M.M.K.1
-
31
-
-
84858775295
-
Prediction router: Yet another low latency on-chip router architecture
-
Feb.
-
H. Matsutani et al. Prediction router: Yet another low latency on-chip router architecture. In HPCA, Feb. 2009.
-
(2009)
HPCA
-
-
Matsutani, H.1
-
32
-
-
0022200333
-
The IBM Research Parallel Processor Prototype (RP3): Introduction and architecture
-
G. F. Pfister et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and architecture. In ICPP, pages 764-771, 1985.
-
(1985)
ICPP
, pp. 764-771
-
-
Pfister, G.F.1
-
33
-
-
66749116576
-
Token tenure: PATCHing token counting using directory-based cache coherence
-
Nov.
-
A. Raghavan et al. Token tenure: PATCHing token counting using directory-based cache coherence. In MICRO, Nov. 2008.
-
(2008)
MICRO
-
-
Raghavan, A.1
-
34
-
-
66749138110
-
Efficient unicast and multicast support for CMPs
-
Sep.
-
S. Rodrigo et al. Efficient unicast and multicast support for CMPs. In MICRO, Sep. 2008.
-
(2008)
MICRO
-
-
Rodrigo, S.1
-
35
-
-
49749088882
-
Multicast parallel pipeline router architecture for network-on-chip
-
A. F. Samman et al. Multicast parallel pipeline router architecture for network-on-chip. In DATE, 2008.
-
(2008)
DATE
-
-
Samman, A.F.1
-
36
-
-
47349125701
-
Uncorq: Unconstrained snoop request delivery in embedded-ring multiprocessors
-
K. Strauss et al. Uncorq: Unconstrained snoop request delivery in embedded-ring multiprocessors. In MICRO, 2007.
-
(2007)
MICRO
-
-
Strauss, K.1
-
37
-
-
70349826938
-
Recursive partitioning multicast: A bandwidth-efficient routing for networks-on-chip
-
L. Wang, Y. Jin, H. Kim, and E. J. Kim. Recursive partitioning multicast: A bandwidth-efficient routing for networks-on-chip. In NOCS, 2009.
-
(2009)
NOCS
-
-
Wang, L.1
Jin, Y.2
Kim, H.3
Kim, E.J.4
-
38
-
-
84862144932
-
Power-driven design of router microarchitectures in on-chip networks
-
H.-S. Wang et al. Power-driven design of router microarchitectures in on-chip networks. In MICRO, 2003.
-
(2003)
MICRO
-
-
Wang, H.-S.1
-
39
-
-
79951712762
-
ReMAP: A reconfigurable heterogeneous multicore architecture
-
M. A. Watkins et al. ReMAP: A reconfigurable heterogeneous multicore architecture. In MICRO, 2010.
-
(2010)
MICRO
-
-
Watkins, M.A.1
-
40
-
-
36849030305
-
On-chip interconnection architecture of the tile processor
-
Sept.
-
D. Wentzlaff et al. On-chip interconnection architecture of the tile processor. IEEE Micro, 27(5):15-31, Sept. 2007.
-
(2007)
IEEE Micro
, vol.27
, Issue.5
, pp. 15-31
-
-
Wentzlaff, D.1