-
1
-
-
84876572683
-
Owner prediction for accelerating cache-to-cache transfer misses in cc-NUMA multiprocessors
-
Nov
-
M. E. Acacio, J. González, J. M. García, and J. Duato, "Owner Prediction for Accelerating Cache-to-Cache Transfer Misses in cc-NUMA Multiprocessors", Proc. SC Conf. High Performance Networking and Computing, pp. 1-12, Nov. 2002.
-
(2002)
Proc. SC Conf. High Performance Networking and Computing
, pp. 1-12
-
-
Acacio, M.E.1
González, J.2
García, J.M.3
Duato, J.4
-
2
-
-
79959602320
-
The use of prediction for accelerating upgrade misses in cc-NUMA multiprocessors
-
Sept
-
M. E. Acacio, J. González, J. M. García, and J. Duato, "The Use of Prediction for Accelerating Upgrade Misses in cc-NUMA Multiprocessors", Proc. 11th Int'l Conf. Parallel Architectures and Compilation Techniques (PACT), pp. 155-164, Sept. 2002.
-
(2002)
Proc. 11th Int'l Conf. Parallel Architectures and Compilation Techniques (PACT)
, pp. 155-164
-
-
Acacio, M.E.1
González, J.2
García, J.M.3
Duato, J.4
-
3
-
-
65349166228
-
In-network snoop ordering (INSO): Snoopy coherence on unordered interconnects
-
Feb
-
N. Agarwal, L.-S. Peh, and N. K. Jha, "In-Network Snoop Ordering (INSO): Snoopy Coherence on Unordered Interconnects", Proc. 15th Int'l Conf. High-Performance Computer Architecture (HPCA), pp. 67-78, Feb. 2009.
-
(2009)
Proc. 15th Int'l Conf. High-Performance Computer Architecture (HPCA)
, pp. 67-78
-
-
Agarwal, N.1
Peh, L.-S.2
Jha, N.K.3
-
5
-
-
0033722744
-
Piranha: A scalable architecture based on single-chip multiprocessing
-
June
-
L. A. Barroso, K. Gharachorloo, R. McNamara, A. Nowatzyk, S. Qadeer, B. Sano, S. Smith, R. Stets, and B. Verghese, "Piranha: A Scalable Architecture Based on Single-Chip Multiprocessing", Proc. 27th Int'l Symp. Computer Architecture (ISCA), pp. 12-14, June 2000.
-
(2000)
Proc. 27th Int'l Symp. Computer Architecture (ISCA)
, pp. 12-14
-
-
Barroso, L.A.1
Gharachorloo, K.2
McNamara, R.3
Nowatzyk, A.4
Qadeer, S.5
Sano, B.6
Smith, S.7
Stets, R.8
Verghese, B.9
-
6
-
-
34548008288
-
ASR: Adaptive selective replication for CMP caches
-
Dec
-
B. M. Beckmann, M. R. Marty, and D. A. Wood, "ASR: Adaptive Selective Replication for CMP Caches", Proc. 39th Int'l Symp. Microarchitecture (MICRO), pp. 443-454, Dec. 2006.
-
(2006)
Proc. 39th Int'l Symp. Microarchitecture (MICRO)
, pp. 443-454
-
-
Beckmann, B.M.1
Marty, M.R.2
Wood, D.A.3
-
7
-
-
0006899468
-
Using hints to reduce the read miss penalty for flat COMA protocols
-
Jan
-
M. Björkman, F. Dahlgren, and P. Stenström, "Using Hints to Reduce the Read Miss Penalty for Flat COMA Protocols", Proc. 28th Int'l Conf. System Sciences, pp. 242-251, Jan. 1995.
-
(1995)
Proc. 28th Int'l Conf. System Sciences
, pp. 242-251
-
-
Björkman, M.1
Dahlgren, F.2
Stenström, P.3
-
8
-
-
0014814325
-
Space/time trade-offs in hash coding with allowable errors
-
July
-
B. H. Bloom, "Space/Time Trade-Offs in Hash Coding with Allowable Errors", Comm. ACM, vol. 13, pp. 422-426, July 1970.
-
(1970)
Comm. ACM
, vol.13
, pp. 422-426
-
-
Bloom, B.H.1
-
9
-
-
70350649055
-
High-performance embedded architecture and compilation roadmap
-
Jan
-
K. D. Bosschere, W. Luk, X. Martorell, N. Navarro, M. O'Boyle, D. Pnevmatikatos, A. Ramírez, P. Sainrat, A. Seznec, P. Stenström, and O. Temam, "High-Performance Embedded Architecture and Compilation Roadmap", Trans. High-Performance Embedded Architectures and Compilers (HiPEAC), vol. 1, pp. 5-29, Jan. 2007.
-
(2007)
Trans. High-Performance Embedded Architectures and Compilers (HiPEAC)
, vol.1
, pp. 5-29
-
-
Bosschere, K.D.1
Luk, W.2
Martorell, X.3
Navarro, N.4
O'Boyle, M.5
Pnevmatikatos, D.6
Ramírez, A.7
Sainrat, P.8
Seznec, A.9
Stenström, P.10
Temam, O.11
-
10
-
-
33644881624
-
Coarse-grain coherence tracking: Regionscout and region coherence arrays
-
Jan
-
J. F. Cantin, J. E. Smith, M. H. Lipasti, A. Moshovos, and B. Falsafi, "Coarse-Grain Coherence Tracking: Regionscout and Region Coherence Arrays", IEEE Micro, vol. 26, no. 1, pp. 70-79, Jan. 2006.
-
(2006)
IEEE Micro
, vol.26
, Issue.1
, pp. 70-79
-
-
Cantin, J.F.1
Smith, J.E.2
Lipasti, M.H.3
Moshovos, A.4
Falsafi, B.5
-
11
-
-
0018152817
-
A new solution to coherence problems in multicache systems
-
Dec
-
L. M. Censier and P. Feautrier, "A New Solution to Coherence Problems in Multicache Systems", IEEE Trans. Computers, vol. 27, no. 12, pp. 1112-1118, Dec. 1978.
-
(1978)
IEEE Trans. Computers
, vol.27
, Issue.12
, pp. 1112-1118
-
-
Censier, L.M.1
Feautrier, P.2
-
12
-
-
33845866604
-
Bulk disambiguation of speculative threads in multiprocessors
-
June
-
L. Ceze, J. Tuck, C. Cascaval, and J. Torrellas, "Bulk Disambiguation of Speculative Threads in Multiprocessors", Proc. 33rd Int'l Symp. Computer Architecture (ISCA), pp. 227-238, June 2006.
-
(2006)
Proc. 33rd Int'l Symp. Computer Architecture (ISCA)
, pp. 227-238
-
-
Ceze, L.1
Tuck, J.2
Cascaval, C.3
Torrellas, J.4
-
13
-
-
34547666967
-
An adaptive cache coherence protocol optimized for producer-consumer sharing
-
Feb
-
L. Cheng, J. B. Carter, and D. Dai, "An Adaptive Cache Coherence Protocol Optimized for Producer-Consumer Sharing", Proc. 13th Int'l Conf. High-Performance Computer Architecture (HPCA), pp. 328-339, Feb. 2007.
-
(2007)
Proc. 13th Int'l Conf. High-Performance Computer Architecture (HPCA)
, pp. 328-339
-
-
Cheng, L.1
Carter, J.B.2
Dai, D.3
-
14
-
-
33845889046
-
Interconnect-aware coherence protocols for chip multiprocessors
-
June
-
L. Cheng, N. Muralimanohar, K. Ramani, R. Balasubramonian, and J. B. Carter, "Interconnect-Aware Coherence Protocols for Chip Multiprocessors", Proc. 33rd Int'l Symp. Computer Architecture (ISCA), pp. 339-351, June 2006.
-
(2006)
Proc. 33rd Int'l Symp. Computer Architecture (ISCA)
, pp. 339-351
-
-
Cheng, L.1
Muralimanohar, N.2
Ramani, K.3
Balasubramonian, R.4
Carter, J.B.5
-
16
-
-
66749163103
-
Virtual tree coherence: Leveraging regions and in-network multicast tree for scalable cache coherence
-
Nov
-
N. D. Enright-Jerger, L.-S. Peh, and M. H. Lipasti, "Virtual Tree Coherence: Leveraging Regions and In-Network Multicast Tree for Scalable Cache Coherence", Proc. 41st Int'l Symp. Microarchitecture (MICRO), pp. 35-46, Nov. 2008.
-
(2008)
Proc. 41st Int'l Symp. Microarchitecture (MICRO)
, pp. 35-46
-
-
Enright-Jerger, N.D.1
Peh, L.-S.2
Lipasti, M.H.3
-
17
-
-
63549120761
-
Improving support for locality and fine-grain sharing in chip multiprocessors
-
Oct
-
H. Hossain, S. Dwarkadas, and M. C. Huang, "Improving Support for Locality and Fine-Grain Sharing in Chip Multiprocessors", Proc. 17th Int'l Conf. Parallel Architectures and Compilation Techniques (PACT), pp. 155-165, Oct. 2008.
-
(2008)
Proc. 17th Int'l Conf. Parallel Architectures and Compilation Techniques (PACT)
, pp. 155-165
-
-
Hossain, H.1
Dwarkadas, S.2
Huang, M.C.3
-
19
-
-
32844471317
-
A NUCA substrate for flexible CMP cache sharing
-
June
-
J. Huh, C. Kim, H. Shafi, L. Zhang, D. Burger, and S. W. Keckler, "A NUCA Substrate for Flexible CMP Cache Sharing", Proc. 19th Int'l Conf. Supercomputing (ICS), pp. 31-40, June 2005.
-
(2005)
Proc. 19th Int'l Conf. Supercomputing (ICS)
, pp. 31-40
-
-
Huh, J.1
Kim, C.2
Shafi, H.3
Zhang, L.4
Burger, D.5
Keckler, S.W.6
-
21
-
-
0036949388
-
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
-
Oct
-
C. Kim, D. Burger, and S. W. Keckler, "An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches", Proc. 10th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 211-222, Oct. 2002.
-
(2002)
Proc. 10th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS)
, pp. 211-222
-
-
Kim, C.1
Burger, D.2
Keckler, S.W.3
-
22
-
-
27544456315
-
Interconnections in multi-core architectures: Understanding mechanisms, overheads and scaling
-
June
-
R. Kumar, V. Zyuban, and D. M. Tullsen, "Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling", Proc. 32nd Int'l Symp. Computer Architecture (ISCA), pp. 408-419, June 2005.
-
(2005)
Proc. 32nd Int'l Symp. Computer Architecture (ISCA)
, pp. 408-419
-
-
Kumar, R.1
Zyuban, V.2
Tullsen, D.M.3
-
23
-
-
37549032725
-
IBM POWER6 microarchitecture
-
Nov
-
H. Q. Le, W. J. Starke, J. S. Fields, F. P. O'Connell, D. Q. Nguyen, B. J. Ronchetti, W. M. Sauer, E. M. Schwarz, and M. T. Vaden, "IBM POWER6 Microarchitecture", IBM J. Research and Development, vol. 51, no. 6, pp. 639-662, Nov. 2007.
-
(2007)
IBM J. Research and Development
, vol.51
, Issue.6
, pp. 639-662
-
-
Le, H.Q.1
Starke, W.J.2
Fields, J.S.3
O'Connell, F.P.4
Nguyen, D.Q.5
Ronchetti, B.J.6
Sauer, W.M.7
Schwarz, E.M.8
Vaden, M.T.9
-
24
-
-
33749052315
-
The ALPBench benchmark suite for complex multimedia applications
-
Oct
-
M.-L. Li, R. Sasanka, S. V. Adve, Y.-K. Chen, and E. Debes, "The ALPBench Benchmark Suite for Complex Multimedia Applications", Proc. Int'l Symp. Workload Characterization, pp. 34-45, Oct. 2005.
-
(2005)
Proc. Int'l Symp. Workload Characterization
, pp. 34-45
-
-
Li, M.-L.1
Sasanka, R.2
Adve, S.V.3
Chen, Y.-K.4
Debes, E.5
-
25
-
-
16244422171
-
Interconnect-power dissipation in a microprocessor
-
Feb
-
N. Magen, A. Kolodny, U. Weiser, and N. Shamir, "Interconnect-Power Dissipation in a Microprocessor", Proc. Int'l Workshop System Level Interconnect Prediction, pp. 7-13, Feb. 2004.
-
(2004)
Proc. Int'l Workshop System Level Interconnect Prediction
, pp. 7-13
-
-
Magen, N.1
Kolodny, A.2
Weiser, U.3
Shamir, N.4
-
26
-
-
0036469676
-
Simics: A full system simulation platform
-
Feb
-
P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner, "Simics: A Full System Simulation Platform", Computer, vol. 35, no. 2, pp. 50-58, Feb. 2002.
-
(2002)
Computer
, vol.35
, Issue.2
, pp. 50-58
-
-
Magnusson, P.S.1
Christensson, M.2
Eskilson, J.3
Forsgren, D.4
Hallberg, G.5
Hogberg, J.6
Larsson, F.7
Moestedt, A.8
Werner, B.9
-
27
-
-
1342341261
-
-
PhD thesis, Univ. of Wisconsin-Madison, Dec
-
M. M. Martin, "Token Coherence", PhD thesis, Univ. of Wisconsin-Madison, Dec. 2003.
-
(2003)
Token Coherence
-
-
Martin, M.M.1
-
28
-
-
0038684776
-
Using destination-set prediction to improve the latency/bandwidth tradeoff in shared-memory multiprocessors
-
June
-
M. M. Martin, P. J. Harper, D. J. Sorin, M. D. Hill, and D. A. Wood, "Using Destination-Set Prediction to Improve the Latency/Bandwidth Tradeoff in Shared-Memory Multiprocessors", Proc. 30th Int'l Symp. Computer Architecture (ISCA), pp. 206-217, June 2003.
-
(2003)
Proc. 30th Int'l Symp. Computer Architecture (ISCA)
, pp. 206-217
-
-
Martin, M.M.1
Harper, P.J.2
Sorin, D.J.3
Hill, M.D.4
Wood, D.A.5
-
29
-
-
0038346234
-
Token coherence: Decoupling performance and correctness
-
June
-
M. M. Martin, M. D. Hill, and D. A. Wood, "Token Coherence: Decoupling Performance and Correctness", Proc. 30th Int'l Symp. Computer Architecture (ISCA), pp. 182-193, June 2003.
-
(2003)
Proc. 30th Int'l Symp. Computer Architecture (ISCA)
, pp. 182-193
-
-
Martin, M.M.1
Hill, M.D.2
Wood, D.A.3
-
30
-
-
0034442640
-
Timestamp snooping: An approach for extending SMPs
-
Nov
-
M. M. Martin, D. J. Sorin, A. Ailamaki, A. R. Alameldeen, R. M. Dickson, C. J. Mauer, K. E. Moore, M. Plakal, M. D. Hill, and D. A. Wood, "Timestamp Snooping: An Approach for Extending SMPs", Proc. Ninth Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 25-36, Nov. 2000.
-
(2000)
Proc. Ninth Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS)
, pp. 25-36
-
-
Martin, M.M.1
Sorin, D.J.2
Ailamaki, A.3
Alameldeen, A.R.4
Dickson, R.M.5
Mauer, C.J.6
Moore, K.E.7
Plakal, M.8
Hill, M.D.9
Wood, D.A.10
-
31
-
-
33748870886
-
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
-
Sept
-
M. M. Martin, D. J. Sorin, B. M. Beckmann, M. R. Marty, M. Xu, A. R. Alameldeen, K. E. Moore, M. D. Hill, and D. A. Wood, "Multifacet's General Execution-Driven Multiprocessor Simulator (GEMS) Toolset", Computer Architecture News, vol. 33, no. 4, pp. 92-99, Sept. 2005.
-
(2005)
Computer Architecture News
, vol.33
, Issue.4
, pp. 92-99
-
-
Martin, M.M.1
Sorin, D.J.2
Beckmann, B.M.3
Marty, M.R.4
Xu, M.5
Alameldeen, A.R.6
Moore, K.E.7
Hill, M.D.8
Wood, D.A.9
-
32
-
-
28444472751
-
Improving multiple-CMP systems using token coherence
-
Feb
-
M. R. Marty, J. D. Bingham, M. D. Hill, A. J. Hu, M. M. Martin, and D. A. Wood, "Improving Multiple-CMP Systems Using Token Coherence", Proc. 11th Int'l Conf. High-Performance Computer Architecture (HPCA), pp. 328-339, Feb. 2005.
-
(2005)
Proc. 11th Int'l Conf. High-Performance Computer Architecture (HPCA)
, pp. 328-339
-
-
Marty, M.R.1
Bingham, J.D.2
Hill, M.D.3
Hu, A.J.4
Martin, M.M.5
Wood, D.A.6
-
33
-
-
0030259458
-
The case for a single-chip multiprocessor
-
Oct
-
K. Olukotun, B. A. Nayfeh, L. Hammond, K. G. Wilson, and K. Chang, "The Case for a Single-Chip Multiprocessor", Proc. Seventh Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 2-11, Oct. 1996.
-
(1996)
Proc. Seventh Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS)
, pp. 2-11
-
-
Olukotun, K.1
Nayfeh, B.A.2
Hammond, L.3
Wilson, K.G.4
Chang, K.5
-
34
-
-
0037643684
-
SICOSYS: An integrated framework for studying interconnection network in multiprocessor systems
-
Jan
-
V. Puente, J. A. Gregorio, and R. Beivide, "SICOSYS: An Integrated Framework for Studying Interconnection Network in Multiprocessor Systems", Proc. 10th Euromicro Workshop Parallel, Distributed and Network-Based Processing, pp. 15-22, Jan. 2002.
-
(2002)
Proc. 10th Euromicro Workshop Parallel, Distributed and Network-Based Processing
, pp. 15-22
-
-
Puente, V.1
Gregorio, J.A.2
Beivide, R.3
-
35
-
-
38349000549
-
Direct coherence: Bringing together performance and scalability in shared-memory multiprocessors
-
Dec
-
A. Ros, M. E. Acacio, and J. M. García, "Direct Coherence: Bringing Together Performance and Scalability in Shared-Memory Multiprocessors", Proc. 14th Int'l Conf. High Performance Computing (HiPC), pp. 147-160, Dec. 2007.
-
(2007)
Proc. 14th Int'l Conf. High Performance Computing (HiPC)
, pp. 147-160
-
-
Ros, A.1
Acacio, M.E.2
García, J.M.3
-
36
-
-
51049087285
-
DiCo-CMP: Efficient cache coherency in tiled CMP architectures
-
Apr
-
A. Ros, M. E. Acacio, and J. M. García, "DiCo-CMP: Efficient Cache Coherency in Tiled CMP Architectures", Proc. 22nd Int'l Symp. Parallel and Distributed Processing (IPDPS), pp. 1-11, Apr. 2008.
-
(2008)
Proc. 22nd Int'l Symp. Parallel and Distributed Processing (IPDPS)
, pp. 1-11
-
-
Ros, A.1
Acacio, M.E.2
García, J.M.3
-
37
-
-
62649104787
-
Scalable directory organization for tiled CMP architectures
-
July
-
A. Ros, M. E. Acacio, and J. M. García, "Scalable Directory Organization for Tiled CMP Architectures", Proc. Int'l Conf. Computer Design (CDES), pp. 112-118, July 2008.
-
(2008)
Proc. Int'l Conf. Computer Design (CDES)
, pp. 112-118
-
-
Ros, A.1
Acacio, M.E.2
García, J.M.3
-
38
-
-
51349168284
-
UltraSPARC T2: A highly-threaded, power-efficient, SPARC SoC
-
Nov
-
M. Shah, J. Barreh, J. Brooks, R. Golla, G. Grohoski, N. Gura, R. Hetherington, P. Jordan, M. Luttrell, C. Olson, B. Saha, D. Sheahan, L. Spracklen, and A. Wynn, "UltraSPARC T2: A Highly-Threaded, Power-Efficient, SPARC SoC", Proc. IEEE Asian Solid-State Circuits Conf., pp. 22-25, Nov. 2007.
-
(2007)
Proc. IEEE Asian Solid-State Circuits Conf.
, pp. 22-25
-
-
Shah, M.1
Barreh, J.2
Brooks, J.3
Golla, R.4
Grohoski, G.5
Gura, N.6
Hetherington, R.7
Jordan, P.8
Luttrell, M.9
Olson, C.10
Saha, B.11
Sheahan, D.12
Spracklen, L.13
Wynn, A.14
-
39
-
-
0027242953
-
An adaptive cache coherence protocol optimized for migratory sharing
-
May
-
P. Stenström, M. Brorsson, and L. Sandberg, "An Adaptive Cache Coherence Protocol Optimized for Migratory Sharing", Proc. 20th Int'l Symp. Computer Architecture (ISCA), pp. 109-118, May 1993.
-
(1993)
Proc. 20th Int'l Symp. Computer Architecture (ISCA)
, pp. 109-118
-
-
Stenström, P.1
Brorsson, M.2
Sandberg, L.3
-
40
-
-
84862144932
-
Power-driven design of router microarchitectures in on-chip networks
-
Dec
-
H. Wang, L.-S. Peh, and S. Malik, "Power-Driven Design of Router Microarchitectures in On-Chip Networks", Proc. 36th Int'l Symp. Microarchitecture (MICRO), pp. 105-111, Dec. 2003.
-
(2003)
Proc. 36th Int'l Symp. Microarchitecture (MICRO)
, pp. 105-111
-
-
Wang, H.1
Peh, L.-S.2
Malik, S.3
-
41
-
-
0029179077
-
The SPLASH-2 programs: Characterization and methodological considerations
-
June
-
S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta, "The SPLASH-2 Programs: Characterization and Methodological Considerations", Proc. 22nd Int'l Symp. Computer Architecture (ISCA), pp. 24-36, June 1995.
-
(1995)
Proc. 22nd Int'l Symp. Computer Architecture (ISCA)
, pp. 24-36
-
-
Woo, S.C.1
Ohara, M.2
Torrie, E.3
Singh, J.P.4
Gupta, A.5
-
42
-
-
34547683554
-
LogTM-SE: Decoupling hardware transactional memory from caches
-
DOI 10.1109/HPCA.2007.346204, 4147667, 2007 IEEE 13th Annual International Symposium on High Performance Computer Architecture, HPCA-13
-
L. Yen, J. Bobba, M. R. Marty, K. E. Moore, H. Volos, M. D. Hill, M. M. Swift, and D. A. Wood, "LogTM-SE: Decoupling Hardware Transactional Memory from Caches", Proc. 13th Int'l Conf. High-Performance Computer Architecture (HPCA), pp. 261-272, Feb. 2007.
-
(2007)
Proc. 13th Int'l Conf. High-Performance Computer Architecture (HPCA)
, pp. 261-272
-
-
Yen, L.1
Bobba, J.2
Marty, M.R.3
Moore, K.E.4
Volos, H.5
Hill, M.D.6
Swift, M.M.7
Wood, D.A.8
-
43
-
-
27544495466
-
Victim replication: Maximizing capacity while hiding wire delay in tiled chip multiprocessors
-
June
-
M. Zhang and K. Asanovic, "Victim Replication: Maximizing Capacity While Hiding Wire Delay in Tiled Chip Multiprocessors", Proc. 32nd Int'l Symp. Computer Architecture (ISCA), pp. 336-345, June 2005.
-
(2005)
Proc. 32nd Int'l Symp. Computer Architecture (ISCA)
, pp. 336-345
-
-
Zhang, M.1
Asanovic, K.2