-
1
-
-
33746683732
-
Maximizing CMP throughput with mediocre cores
-
J. D. Davis, J. Laudon, and K. Olukotun, "Maximizing CMP throughput with mediocre cores," in PACT, 2005, pp. 51-62.
-
(2005)
PACT
, pp. 51-62
-
-
Davis, J.D.1
Laudon, J.2
Olukotun, K.3
-
2
-
-
20344374162
-
Niagara: A 32-way multithreaded Sparc processor
-
P. Kongetira, K. Aingaran, and K. Olukotun, "Niagara: a 32-way multithreaded Sparc processor," IEEE Micro, vol.25, no.2, pp. 21-29, 2005.
-
(2005)
IEEE Micro
, vol.25
, Issue.2
, pp. 21-29
-
-
Kongetira, P.1
Aingaran, K.2
Olukotun, K.3
-
3
-
-
49249086142
-
Larrabee: A many-core x86 architecture for visual computing
-
L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan, "Larrabee: a many-core x86 architecture for visual computing," ACM TOG, vol.27, no.3, pp. 1-15, 2008.
-
(2008)
ACM TOG
, vol.27
, Issue.3
, pp. 1-15
-
-
Seiler, L.1
Carmean, D.2
Sprangle, E.3
Forsyth, T.4
Abrash, M.5
Dubey, P.6
Junkins, S.7
Lake, A.8
Sugerman, J.9
Cavin, R.10
Espasa, R.11
Grochowski, E.12
Juan, T.13
Hanrahan, P.14
-
4
-
-
0028201665
-
Tradeoffs in two-level on-chip caching
-
Apr.
-
N. P. Jouppi and S. J. E. Wilton, "Tradeoffs in two-level on-chip caching," in ISCA, Apr. 1994, pp. 34-45.
-
(1994)
ISCA
, pp. 34-45
-
-
Jouppi, N.P.1
Wilton, S.J.E.2
-
5
-
-
70449695861
-
Non-inclusion property in multi-level caches revisited
-
M. Zahran, K. Albayraktaroglu, and M. Franklin, "Non-inclusion property in multi-level caches revisited," in Int'l J. Computers and their Applications, no.2, 2007, pp. 99-108.
-
(2007)
Int'l J. Computers and Their Applications
, Issue.2
, pp. 99-108
-
-
Zahran, M.1
Albayraktaroglu, K.2
Franklin, M.3
-
6
-
-
34547282756
-
Reducing verification complexity of a multicore coherence protocol using assume/guarantee
-
X. Chen, Y. Yang, G. Gopalakrishnan, and C.-T. Chou, "Reducing verification complexity of a multicore coherence protocol using assume/guarantee," in FMCAD, 2006, pp. 81-88.
-
(2006)
FMCAD
, pp. 81-88
-
-
Chen, X.1
Yang, Y.2
Gopalakrishnan, G.3
Chou, C.-T.4
-
7
-
-
70449131864
-
Corey: An operating system for many cores
-
December
-
S. B. Wickizer, H. Chen, R. Chen, Y. Mao, F. Kaashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang, "Corey: An operating system for many cores," in OSDI, December 2008.
-
(2008)
OSDI
-
-
Wickizer, S.B.1
Chen, H.2
Chen, R.3
Mao, Y.4
Kaashoek, F.5
Morris, R.6
Pesterev, A.7
Stein, L.8
Wu, M.9
Dai, Y.10
Zhang, Y.11
Zhang, Z.12
-
9
-
-
34247273005
-
Scalable locality-conscious multithreaded memory allocation
-
S. Schneider, C. D. Antonopoulos, and D. S. Nikolopoulos, "Scalable locality-conscious multithreaded memory allocation," in ISMM, 2006, pp. 84-94.
-
(2006)
ISMM
, pp. 84-94
-
-
Schneider, S.1
Antonopoulos, C.D.2
Nikolopoulos, D.S.3
-
10
-
-
17544362263
-
Hoard: A scalable memory allocator for multithreaded applications
-
E. D. Berger, K. S. McKinley, R. D. Blumofe, and P. R. Wilson, "Hoard: a scalable memory allocator for multithreaded applications," SIGPLAN Not, vol.35, no.11, pp. 117-128, 2000.
-
(2000)
SIGPLAN Not
, vol.35
, Issue.11
, pp. 117-128
-
-
Berger, E.D.1
McKinley, K.S.2
Blumofe, R.D.3
Wilson, P.R.4
-
11
-
-
84949769332
-
A new memory monitoring scheme for memory-aware scheduling and partitioning
-
G. E. Suh, S. Devadas, and L. Rudolph, "A new memory monitoring scheme for memory-aware scheduling and partitioning," in HPCA, 2002, p. 117.
-
(2002)
HPCA
, pp. 117
-
-
Suh, G.E.1
Devadas, S.2
Rudolph, L.3
-
12
-
-
0000444590
-
Evaluating the performance of cache-affinity scheduling in shared-memory multiprocessors
-
J. Torrellas, A. Tucker, and A. Gupta, "Evaluating the performance of cache-affinity scheduling in shared-memory multiprocessors," JPDC, vol.24, no.2, pp. 139-151, 1995.
-
(1995)
JPDC
, vol.24
, Issue.2
, pp. 139-151
-
-
Torrellas, J.1
Tucker, A.2
Gupta, A.3
-
13
-
-
0028754497
-
Affinity scheduling of unbalanced workloads
-
S. Subramaniam and D. L. Eager, "Affinity scheduling of unbalanced workloads," in SC, 1994, pp. 214-226.
-
(1994)
SC
, pp. 214-226
-
-
Subramaniam, S.1
Eager, D.L.2
-
14
-
-
14844328033
-
On the effectiveness of address-space randomization
-
H. Shacham, E.-J. Goh, N. Modadugu, B. Pfaff, and D. Boneh, "On the effectiveness of address-space randomization," in CCS, 2004, pp. 298-307.
-
(2004)
CCS
, pp. 298-307
-
-
Shacham, H.1
Goh, E.-J.2
Modadugu, N.3
Pfaff, B.4
Boneh, D.5
-
15
-
-
35348920021
-
Adaptive insertion policies for high performance caching
-
DOI 10.1145/1250662.1250709, ISCA'07: 34th Annual International Symposium on Computer Architecture, Conference Proceedings
-
M. K. Qureshi, A. Jaleel, Y. N. Patt, S. C. Steely, and J. Emer, "Adaptive insertion policies for high performance caching," in ISCA, 2007, pp. 381-391. (Pubitemid 47582119)
-
(2007)
Proceedings - International Symposium on Computer Architecture
, pp. 381-391
-
-
Qureshi, M.K.1
Jaleel, A.2
Patt, Y.N.3
Steely Jr., S.C.4
Emer, J.5
-
16
-
-
0034592592
-
Region-based caching: An energy-delay efficient memory architecture for embedded processors
-
H. S. Lee and G. S. Tyson, "Region-based caching: an energy-delay efficient memory architecture for embedded processors," in CASES, 2000, pp. 120-127.
-
(2000)
CASES
, pp. 120-127
-
-
Lee, H.S.1
Tyson, G.S.2
-
17
-
-
34548316872
-
A novel technique to use scratch-pad memory for stack management
-
DOI 10.1109/DATE.2007.364509, 4212019, Proceedings - 2007 Design, Automation and Test in Europe Conference and Exhibition, DATE 2007
-
S. Park, H.-W. Park, and S. Ha, "A novel technique to use scratch-pad memory for stack management," in DATE, 2007, pp. 1478-1483. (Pubitemid 47334172)
-
(2007)
Proceedings - Design, Automation and Test in Europe, DATE
, pp. 1478-1483
-
-
Park, S.1
Park, H.-W.2
Ha, S.3
-
18
-
-
23044524059
-
On-chip vs. off-chip memory: The data partitioning problem in embedded processor-based systems
-
July
-
P. R. Panda, N. D. Dutt, and A. Nicolau, "On-chip vs. off-chip memory: The data partitioning problem in embedded processor-based systems," ACM TODAES, vol.5, no.3, pp. 682-704, July 2000.
-
(2000)
ACM TODAES
, vol.5
, Issue.3
, pp. 682-704
-
-
Panda, P.R.1
Dutt, N.D.2
Nicolau, A.3
-
19
-
-
77951001644
-
A localizing directory coherence protocol
-
C. McCurdy and C. Fischer, "A localizing directory coherence protocol," in WMPI, 2004, pp. 23-29.
-
(2004)
WMPI
, pp. 23-29
-
-
McCurdy, C.1
Fischer, C.2
-
20
-
-
42549168687
-
Exploring the cache design space for large scale CMPs
-
L. Hsu, R. Iyer, S. Makineni, S. Reinhardt, and D. Newell, "Exploring the cache design space for large scale CMPs," dasCMP, vol.33, no.4, pp. 24-33, 2005.
-
(2005)
DasCMP
, vol.33
, Issue.4
, pp. 24-33
-
-
Hsu, L.1
Iyer, R.2
Makineni, S.3
Reinhardt, S.4
Newell, D.5
-
21
-
-
33845903561
-
Cooperative caching for Chip Multiprocessors
-
J. Chang and G. S. Sohi, "Cooperative caching for Chip Multiprocessors," in ISCA, 2006, pp. 264-276.
-
(2006)
ISCA
, pp. 264-276
-
-
Chang, J.1
Sohi, G.S.2
-
22
-
-
27544495466
-
Victim replication: Maximizing capacity while hiding wire delay in tiled Chip Multiprocessors
-
M. Zhang and K. Asanovic, "Victim replication: Maximizing capacity while hiding wire delay in tiled Chip Multiprocessors," in ISCA, 2005, pp. 336-345.
-
(2005)
ISCA
, pp. 336-345
-
-
Zhang, M.1
Asanovic, K.2
-
23
-
-
77950982560
-
Victim migration: Dynamically adapting between private and shared CMP caches
-
- "Victim migration: Dynamically adapting between private and shared CMP caches," in MIT Technical Report MIT-CSAIL-TR-2005-064, MIT-LCS-TR-1006, 2005.
-
(2005)
MIT Technical Report MIT-CSAIL-TR-2005-064, MIT-LCS-TR-1006
-
-
-
24
-
-
33846535493
-
The M5 simulator: Modeling networked systems
-
N. L. Binkert, R. G. Dreslinski, L. R. Hsu, K. T. Lim, A. G. Saidi, and S. K. Reinhardt, "The M5 simulator: Modeling networked systems," IEEE Micro, vol.26, no.4, 2006.
-
(2006)
IEEE Micro
, vol.26
, Issue.4
-
-
Binkert, N.L.1
Dreslinski, R.G.2
Hsu, L.R.3
Lim, K.T.4
Saidi, A.G.5
Reinhardt, S.K.6
-
26
-
-
34547664408
-
Cacti 4.0
-
D. Tarjan, S. Thoziyoor, and N. P. Jouppi, "Cacti 4.0," HP Laboratories Palo Alto, Tech. Rep. HPL-2006-2086, 2006.
-
(2006)
HP Laboratories Palo Alto, Tech. Rep. HPL-2006-2086
-
-
Tarjan, D.1
Thoziyoor, S.2
Jouppi, N.P.3
-
27
-
-
33845423872
-
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
-
C. Kim, D. Burger, and S. W. Keckler, "An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches," ASPLOS, vol.36, no.5, 2002.
-
(2002)
ASPLOS
, vol.36
, Issue.5
-
-
Kim, C.1
Burger, D.2
Keckler, S.W.3
-
28
-
-
36849004429
-
Bringing NoCs to 65 nm
-
A. Pullini, F. Angiolini, S. Murali, D. Atienza, G. D. Micheli, and L. Benini, "Bringing NoCs to 65 nm," IEEE Micro, vol.27, no.5, 2007.
-
(2007)
IEEE Micro
, vol.27
, Issue.5
-
-
Pullini, A.1
Angiolini, F.2
Murali, S.3
Atienza, D.4
Micheli, G.D.5
Benini, L.6
-
29
-
-
0002255264
-
SPLASH: Stanford parallel applications for shared memory
-
Mar.
-
J. P. Singh, W.-D. Weber, and A. Gupta, "SPLASH: Stanford parallel applications for shared memory," ISCA, vol.20, no.1, pp. 5-44, Mar. 1995.
-
(1995)
ISCA
, vol.20
, Issue.1
, pp. 5-44
-
-
Singh, J.P.1
Weber, W.-D.2
Gupta, A.3
-
30
-
-
51449118065
-
A performance study of general purpose applications on graphics processors using CUDA
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, and K. Skadron, "A performance study of general purpose applications on graphics processors using CUDA," JPDC, 2008.
-
(2008)
JPDC
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Skadron, K.6
-
31
-
-
47349098275
-
Minebench: A benchmark suite for data mining workloads
-
Oct.
-
R. Narayanan, B. Ozisikyilmaz, J. Zambreno, G. Memik, and A. Choudhary, "Minebench: A benchmark suite for data mining workloads," WC, pp. 182-188, Oct. 2006.
-
(2006)
WC
, pp. 182-188
-
-
Narayanan, R.1
Ozisikyilmaz, B.2
Zambreno, J.3
Memik, G.4
Choudhary, A.5