-
1
-
-
84900342836
-
SPEComp: A new benchmark suite for measuring parallel computer performance
-
July
-
V. Aslot, M. J. Domeika, R. Eigenmann, G. Gaertner, W. B. Jones, and B. Parady. SPEComp: A new benchmark suite for measuring parallel computer performance. In Proceedings of the International Workshop on OpenMP Applications and Tools, pages 1-10, July 2001.
-
(2001)
Proceedings of the International Workshop on OpenMP Applications and Tools
, pp. 1-10
-
-
Aslot, V.1
Domeika, M.J.2
Eigenmann, R.3
Gaertner, G.4
Jones, W.B.5
Parady, B.6
-
3
-
-
31844454881
-
Design and experience: Using the Intel Itanium-2 processor
-
Y. Choi, A. Knies, G. Vedaraman, J. Williamson, and I. Esmer. Design and experience: Using the Intel Itanium-2 processor. In Proceedings of the EPIC2 Workshop, 2002.
-
(2002)
Proceedings of the EPIC2 Workshop
-
-
Choi, Y.1
Knies, A.2
Vedaraman, G.3
Williamson, J.4
Esmer, I.5
-
4
-
-
84965078406
-
Fixed and adaptive sequential prefetching in shared-memory multiprocessors
-
St. Charles, IL
-
F. Dahlgren, M. Dubois, and P. Stenstrom. Fixed and adaptive sequential prefetching in shared-memory multiprocessors. In Proceedings of the International Conference on Parallel Processing, pages 56-63, St. Charles, IL, 1993.
-
(1993)
Proceedings of the International Conference on Parallel Processing
, pp. 56-63
-
-
Dahlgren, F.1
Dubois, M.2
Stenstrom, P.3
-
5
-
-
0002026205
-
An overview of the Intel IA-64 compiler
-
C. Dulong, R. Krishnaiyer, D. Kulkarni, D. Lavery, W. Li, J. Ng, and D. Sehr. An overview of the Intel IA-64 compiler. Intel Technology Journal, 1999.
-
(1999)
Intel Technology Journal
-
-
Dulong, C.1
Krishnaiyer, R.2
Kulkarni, D.3
Lavery, D.4
Li, W.5
Ng, J.6
Sehr, D.7
-
6
-
-
2342510413
-
An integrated hardware/software data prefetching scheme for shared-memory multiprocessors
-
E. H. Gornish and A. Veidenbaum. An integrated hardware/software data prefetching scheme for shared-memory multiprocessors. International Journal of Parallel Programming, 27(1):35-70, 1999.
-
(1999)
International Journal of Parallel Programming
, vol.27
, Issue.1
, pp. 35-70
-
-
Gornish, E.H.1
Veidenbaum, A.2
-
8
-
-
0032630821
-
UltraSPARC-III: Designing third-generation 64-bit performance
-
T. Horel and G. Lauterbach. UltraSPARC-III: Designing third-generation 64-bit performance. IEEE Micro, 19(3):73-85, 1999.
-
(1999)
IEEE Micro
, vol.19
, Issue.3
, pp. 73-85
-
-
Horel, T.1
Lauterbach, G.2
-
9
-
-
42549168687
-
Exploring the cache design space for large scale CMPs
-
L. Hsu, R. Iyer, S. Makineni, S. Reinhardt, and D. Newell. Exploring the cache design space for large scale CMPs. SIGARCH Computer Architecture News, 33(4):24-33, 2005.
-
(2005)
SIGARCH Computer Architecture News
, vol.33
, Issue.4
, pp. 24-33
-
-
Hsu, L.1
Iyer, R.2
Makineni, S.3
Reinhardt, S.4
Newell, D.5
-
12
-
-
20344374162
-
Niagara: A 32-way multithreaded SPARC processor
-
P. Kongetira, K. Aingaran, and K. Olukotun. Niagara: A 32-way multithreaded SPARC processor. IEEE Micro, 25(2):21-29, 2005.
-
(2005)
IEEE Micro
, vol.25
, Issue.2
, pp. 21-29
-
-
Kongetira, P.1
Aingaran, K.2
Olukotun, K.3
-
13
-
-
67650020024
-
The performance of runtime data cache prefetching in a dynamic optimization system
-
J. Lu, H. Chen, R. Fu, W.-C. Hsu, B. Othmer, P.-C. Yew, and D.-Y. Chen. The performance of runtime data cache prefetching in a dynamic optimization system. In Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003.
-
(2003)
Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture
-
-
Lu, J.1
Chen, H.2
Fu, R.3
Hsu, W.-C.4
Othmer, B.5
Yew, P.-C.6
Chen, D.-Y.7
-
14
-
-
33749382556
-
Dynamic helper threaded prefetching on the Sun UltraSPARC CMP processor
-
J. Lu, A. Das, W.-C. Hsu, K. Nguyen, and S. G. Abraham. Dynamic helper threaded prefetching on the Sun UltraSPARC CMP processor. In Proceedings of the 38th International Symposium on Microarchitecture, 2005.
-
(2005)
Proceedings of the 38th International Symposium on Microarchitecture
-
-
Lu, J.1
Das, A.2
Hsu, W.-C.3
Nguyen, K.4
Abraham, S.G.5
-
15
-
-
0034839064
-
Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors
-
C.-K. Luk. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors. In Proceedings of the 28th Annual International Symposium on Computer Architecture, pages 40-51, 2001.
-
(2001)
Proceedings of the 28th Annual International Symposium on Computer Architecture
, pp. 40-51
-
-
Luk, C.-K.1
-
17
-
-
0042366306
-
Architectural and compiler support for effective instruction prefetching: A cooperative approach
-
C.-K. Luk and T. C. Mowry. Architectural and compiler support for effective instruction prefetching: a cooperative approach. ACM Transactions on Computer Systems, 19(1):71-109, 2001.
-
(2001)
ACM Transactions on Computer Systems
, vol.19
, Issue.1
, pp. 71-109
-
-
Luk, C.-K.1
Mowry, T.C.2
-
18
-
-
0036469676
-
Simics: A full system simulation platform
-
P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner. Simics: A full system simulation platform. Computer, 35(2):50-58, 2002.
-
(2002)
Computer
, vol.35
, Issue.2
, pp. 50-58
-
-
Magnusson, P.S.1
Christensson, M.2
Eskilson, J.3
Forsgren, D.4
Hallberg, G.5
Hogberg, J.6
Larsson, F.7
Moestedt, A.8
Werner, B.9
-
19
-
-
0031988272
-
Tolerating latency in multiprocessors through compiler-inserted prefetching
-
T. C. Mowry. Tolerating latency in multiprocessors through compiler-inserted prefetching. ACM Transactions on Computer Systems, 16(1):55-92, 1998.
-
(1998)
ACM Transactions on Computer Systems
, vol.16
, Issue.1
, pp. 55-92
-
-
Mowry, T.C.1
-
21
-
-
33847108092
-
Coterminous locality and coterminous group data prefetching on chip multiprocessors
-
X. Shi, Z. Yang, J. Peir, L. Peng, Y.-K. Chen, Lee, and Liang. Coterminous locality and coterminous group data prefetching on chip multiprocessors. In Proceedings of the 20th International Parallel and Distributed Processing Symposium, 2006.
-
(2006)
Proceedings of the 20th International Parallel and Distributed Processing Symposium
-
-
Shi, X.1
Yang, Z.2
Peir, J.3
Peng, L.4
Chen, Y.-K.5
Lee6
Liang7
-
22
-
-
25844437046
-
Power5 system microarchitecture
-
B. Sinharoy, R. N. Kalla, J. M. Tendler, R. J. Eickemeyer, and J. B. Joyner. Power5 system microarchitecture. IBM J. Res. Dev., 49(4/5):505-521, 2005.
-
(2005)
IBM J. Res. Dev
, vol.49
, Issue.4-5
, pp. 505-521
-
-
Sinharoy, B.1
Kalla, R.N.2
Tendler, J.M.3
Eickemeyer, R.J.4
Joyner, J.B.5
-
23
-
-
0020177251
-
Cache memories
-
A. J. Smith. Cache memories. ACM Comput. Surv., 14(3):473-530, 1982.
-
(1982)
ACM Comput. Surv
, vol.14
, Issue.3
, pp. 473-530
-
-
Smith, A.J.1
-
25
-
-
1342323887
-
A prefetch taxonomy
-
V. Srinivasan, E. S. Davidson, and G. S. Tyson. A prefetch taxonomy. IEEE Transactions on Computers, 53(2):126-140, 2004.
-
(2004)
IEEE Transactions on Computers
, vol.53
, Issue.2
, pp. 126-140
-
-
Srinivasan, V.1
Davidson, E.S.2
Tyson, G.S.3
-
28
-
-
0038345683
-
Guided region prefetching: A cooperative hardware/software approach
-
Z. Wang, D. Burger, K. S. McKinley, S. K. Reinhardt, and C. C. Weems. Guided region prefetching: a cooperative hardware/software approach. In Proceedings of the 30th Annual International Symposium on Computer Architecture, pages 388-398, 2003.
-
(2003)
Proceedings of the 30th Annual International Symposium on Computer Architecture
, pp. 388-398
-
-
Wang, Z.1
Burger, D.2
McKinley, K.S.3
Reinhardt, S.K.4
Weems, C.C.5