-
3
-
-
0033717865
-
Clock rate versus IPC: The end of the road for conventional microarchitectures
-
Vancouver, Canada, June
-
V. Agarwal, M. Hrishikesh, S. W. Keckler, and D. Burger. Clock rate versus IPC: The end of the road for conventional microarchitectures. In Proceedings of the 27th International Symposium on Computer Architecture, pages 248-259, Vancouver, Canada, June 2000.
-
(2000)
Proceedings of the 27th International Symposium on Computer Architecture
, pp. 248-259
-
-
Agarwal, V.1
Hrishikesh, M.2
Keckler, S.W.3
Burger, D.4
-
4
-
-
84858895265
-
Micro-30 SimpleScalar tutorial
-
T. Austin and D. Burger. Micro-30 SimpleScalar tutorial. Technical report, http://www.cs.wisc.edu/mscalar/ss/tutorial.html, 1997.
-
(1997)
Technical Report
-
-
Austin, T.1
Burger, D.2
-
5
-
-
0026267802
-
An effective on-chip preloading scheme to reduce data access penalty
-
Albuquerque, NM, Nov.
-
J.-L. Baer and T.-F. Chen. An effective on-chip preloading scheme to reduce data access penalty. In Proceedings of Supercomputing '91, pages 176-186, Albuquerque, NM, Nov. 1991.
-
(1991)
Proceedings of Supercomputing '91
, pp. 176-186
-
-
Baer, J.-L.1
Chen, T.-F.2
-
7
-
-
0003003638
-
A study of replacement algorithms for a virtual-storage computer
-
L. A. Belady. A study of replacement algorithms for a virtual-storage computer. IBM Systems Journal, 5(2):79-101, 1966.
-
(1966)
IBM Systems Journal
, vol.5
, Issue.2
, pp. 79-101
-
-
Belady, L.A.1
-
10
-
-
0029666646
-
Memory bandwidth limitations of future microprocessors
-
Philadelphia, PA, May
-
D. Burger, A. Kägi, and J. R. Goodman. Memory bandwidth limitations of future microprocessors. In Proceedings of the 23rd International Symposium on Computer Architecture, pages 78-89, Philadelphia, PA, May 1996.
-
(1996)
Proceedings of the 23rd International Symposium on Computer Architecture
, pp. 78-89
-
-
Burger, D.1
Kägi, A.2
Goodman, J.R.3
-
12
-
-
0032761638
-
Impulse: Building a smarter memory controller
-
Orlando, FL, Jan.
-
J. Carter, W. Hsieh, L. Stoller, M. Swanson, L. Zhang, E. Brunvand, A. Davis, C.-C. Kuo, R. Kuramkote, M. Parker, L. Schaelicke, and T. Tateyama. Impulse: Building a smarter memory controller. In Fifth International Symposium on High Performance Computer Architecture, Orlando, FL, Jan. 1999.
-
(1999)
Fifth International Symposium on High Performance Computer Architecture
-
-
Carter, J.1
Hsieh, W.2
Stoller, L.3
Swanson, M.4
Zhang, L.5
Brunvand, E.6
Davis, A.7
Kuo, C.-C.8
Kuramkote, R.9
Parker, M.10
Schaelicke, L.11
Tateyama, T.12
-
13
-
-
0032123777
-
The IA-64 architecture at work
-
July
-
C. Dulong. The IA-64 architecture at work. IEEE Computer, 31(7):24-32, July 1998.
-
(1998)
IEEE Computer
, vol.31
, Issue.7
, pp. 24-32
-
-
Dulong, C.1
-
14
-
-
0031611719
-
Precise miss analysis for program transformations with caches of arbitrary associativity
-
San Jose, CA, Oct.
-
S. Ghosh, M. Martonosi, and S. Malik. Precise miss analysis for program transformations with caches of arbitrary associativity. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 228-239, San Jose, CA, Oct. 1998.
-
(1998)
Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems
, pp. 228-239
-
-
Ghosh, S.1
Martonosi, M.2
Malik, S.3
-
15
-
-
84976790479
-
Practical dependence testing
-
Toronto, Canada, June
-
G. Goff, K. Kennedy, and C. Tseng. Practical dependence testing. In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, pages 15-29, Toronto, Canada, June 1991.
-
(1991)
Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation
, pp. 15-29
-
-
Goff, G.1
Kennedy, K.2
Tseng, C.3
-
17
-
-
0026186967
-
An implementation of interprocedural bounded regular section analysis
-
July
-
P. Havlak and K. Kennedy. An implementation of interprocedural bounded regular section analysis. IEEE Transactions on Parallel and Distributed Systems, 2(3):350-360, July 1991.
-
(1991)
IEEE Transactions on Parallel and Distributed Systems
, vol.2
, Issue.3
, pp. 350-360
-
-
Havlak, P.1
Kennedy, K.2
-
18
-
-
0003278283
-
The microarchitecture of the Pentium 4 processor
-
Q1 2001
-
G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel. The microarchitecture of the Pentium 4 processor. Intel Technology Journal, (Q1 2001), 2001.
-
(2001)
Intel Technology Journal
-
-
Hinton, G.1
Sager, D.2
Upton, M.3
Boggs, D.4
Carmean, D.5
Kyker, A.6
Roussel, P.7
-
19
-
-
0031364102
-
Run-time spatial locality detection and optimization
-
Research Triangle Park, NC, Dec.
-
T. L. Johnson, M. C. Merten, and W. W. Hwu. Run-time spatial locality detection and optimization. In Proceedings of the 30th International Symposium on Microarchitecture, pages 57-64, Research Triangle Park, NC, Dec. 1997.
-
(1997)
Proceedings of the 30th International Symposium on Microarchitecture
, pp. 57-64
-
-
Johnson, T.L.1
Merten, M.C.2
Hwu, W.W.3
-
20
-
-
0025429331
-
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers
-
Seattle, WA, June
-
N. P. Jouppi. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In Proceedings of the 17th International Symposium on Computer Architecture, pages 364-373, Seattle, WA, June 1990.
-
(1990)
Proceedings of the 17th International Symposium on Computer Architecture
, pp. 364-373
-
-
Jouppi, N.P.1
-
21
-
-
0037722074
-
A matrix-based approach to the global locality optimization problem
-
Paris, France, Oct.
-
M. Kandemir, A. Choudhary, J. Ramanujam, and P. Banerjee. A matrix-based approach to the global locality optimization problem. In The 1998 International Conference on Parallel Architectures and Compilation Techniques, pages 306-313, Paris, France, Oct. 1998.
-
(1998)
The 1998 International Conference on Parallel Architectures and Compilation Techniques
, pp. 306-313
-
-
Kandemir, M.1
Choudhary, A.2
Ramanujam, J.3
Banerjee, P.4
-
22
-
-
0024668838
-
Inexpensive implementations of set-associativity
-
Jerusalem, Israel, June
-
R. E. Kessler, R. Jooss, A. Lebeck, and M. D. Hill. Inexpensive implementations of set-associativity. In Proceedings of the 16th International Symposium on Computer Architecture, pages 131-139, Jerusalem, Israel, June 1989.
-
(1989)
Proceedings of the 16th International Symposium on Computer Architecture
, pp. 131-139
-
-
Kessler, R.E.1
Jooss, R.2
Lebeck, A.3
Hill, M.D.4
-
23
-
-
28044437184
-
The Alpha 21264 microprocessor architecture
-
Nov.
-
R. E. Kessler, E. McLellan, and D. Webb. The Alpha 21264 microprocessor architecture. Technical report, http://www.compaq.com/AlphaServer/download/ev6chip.pdf, Nov. 1999.
-
(1999)
Technical Report
-
-
Kessler, R.E.1
McLellan, E.2
Webb, D.3
-
24
-
-
0036949388
-
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
-
Oct.
-
C. Kim, D. Burger, and S. W. Keckler. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches. In Proceedings of the 10th Symposium on Architectural Support for Programming Languages and Operating Systems, pages 211-222, Oct. 2002.
-
(2002)
Proceedings of the 10th Symposium on Architectural Support for Programming Languages and Operating Systems
, pp. 211-222
-
-
Kim, C.1
Burger, D.2
Keckler, S.W.3
-
25
-
-
0034851536
-
Dead-block prediction and dead-block correlating prefetchers
-
Goteborg, Sweden, June
-
A. Lai, C. Fide, and B. Falsafi. Dead-block prediction and dead-block correlating prefetchers. In Proceedings of the 28th International Symposium on Computer Architecture, pages 144-154, Goteborg, Sweden, June 2001.
-
(2001)
Proceedings of the 28th International Symposium on Computer Architecture
, pp. 144-154
-
-
Lai, A.1
Fide, C.2
Falsafi, B.3
-
26
-
-
0034592592
-
Region-based caching: An energy-delay efficient memory architecture for embedded processors
-
San Jose, California, Nov.
-
H. H. Lee and G. S. Tyson. Region-based caching: An energy-delay efficient memory architecture for embedded processors. In Proceedings of PACM (CASES'00), San Jose, California, Nov. 2000.
-
(2000)
Proceedings of PACM (CASES'00)
-
-
Lee, H.H.1
Tyson, G.S.2
-
27
-
-
0034818343
-
Reducing DRAM latencies with an integrated memory hierarchy design
-
Monterrey, Mexico, Jan.
-
W. Lin, S. K. Reinhardt, and D. Burger. Reducing DRAM latencies with an integrated memory hierarchy design. In Seventh International Symposium on High Performance Computer Architecture, pages 301-312, Monterrey, Mexico, Jan. 2001.
-
(2001)
Seventh International Symposium on High Performance Computer Architecture
, pp. 301-312
-
-
Lin, W.1
Reinhardt, S.K.2
Burger, D.3
-
28
-
-
0032121748
-
Smarter memory: Improving bandwidth for streamed references
-
July
-
S. A. McKee, R. H. Klenke, K. L. Wright, W. A. Wulf, M. H. Salinas, J. H. Aylor, and A. P. Batson. Smarter memory: Improving bandwidth for streamed references. IEEE Computer, 31(7):54-63, July 1998.
-
(1998)
IEEE Computer
, vol.31
, Issue.7
, pp. 54-63
-
-
McKee, S.A.1
Klenke, R.H.2
Wright, K.L.3
Wulf, W.A.4
Salinas, M.H.5
Aylor, J.H.6
Batson, A.P.7
-
29
-
-
33646844157
-
The scale compiler
-
K. S. McKinley, J. Burrill, D. Burger, B. Cahoon, J. Gibson, J. E. B. Moss, A. Smith, Z. Wang, and C. Weems. The scale compiler. Technical report, 2005. http://ali-www.cs.umass.edu/~scale/.
-
(2005)
Technical Report
-
-
McKinley, K.S.1
Burrill, J.2
Burger, D.3
Cahoon, B.4
Gibson, J.5
Moss, J.E.B.6
Smith, A.7
Wang, Z.8
Weems, C.9
-
30
-
-
0030190854
-
Improving data locality with loop transformations
-
July
-
K. S. McKinley, S. Carr, and C. Tseng. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems, 18(4):424-453, July 1996.
-
(1996)
ACM Transactions on Programming Languages and Systems
, vol.18
, Issue.4
, pp. 424-453
-
-
McKinley, K.S.1
Carr, S.2
Tseng, C.3
-
31
-
-
0003665539
-
Quantifying loop nest locality using SPEC'95 and the Perfect benchmarks
-
Nov.
-
K. S. McKinley and O. Temam. Quantifying loop nest locality using SPEC'95 and the Perfect benchmarks. ACM Transactions on Computer Systems, 17(4):288-336, Nov. 1999.
-
(1999)
ACM Transactions on Computer Systems
, vol.17
, Issue.4
, pp. 288-336
-
-
McKinley, K.S.1
Temam, O.2
-
32
-
-
0026918402
-
Design and evaluation of a compiler algorithm for prefetching
-
Boston, MA, Oct.
-
T. Mowry, M. S. Lam, and A. Gupta. Design and evaluation of a compiler algorithm for prefetching. In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 62-73, Boston, MA, Oct. 1992.
-
(1992)
Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems
, pp. 62-73
-
-
Mowry, T.1
Lam, M.S.2
Gupta, A.3
-
36
-
-
84976676720
-
A practical algorithm for exact array dependence analysis
-
Aug.
-
W. Pugh. A practical algorithm for exact array dependence analysis. Communications of the ACM, 35(8):102-114, Aug. 1992.
-
(1992)
Communications of the ACM
, vol.35
, Issue.8
, pp. 102-114
-
-
Pugh, W.1
-
37
-
-
0031630011
-
Utilizing reuse information in data cache management
-
Melbourne, Australia, July
-
J. A. Rivers, E. S. Tam, G. S. Tyson, E. S. Davidson, and M. Farrens. Utilizing reuse information in data cache management. In Proceedings of the 1997 ACM International Conference on Supercomputing, pages 449-456, Melbourne, Australia, July 1998.
-
(1998)
Proceedings of the 1997 ACM International Conference on Supercomputing
, pp. 449-456
-
-
Rivers, J.A.1
Tam, E.S.2
Tyson, G.S.3
Davidson, E.S.4
Farrens, M.5
-
38
-
-
0032688350
-
EELRU: Simple and effective adaptive page replacement
-
Atlanta, GA, May
-
Y. Smaragdakis, S. Kaplan, and P. Wilson. EELRU: Simple and effective adaptive page replacement. In Proceedings of the ACM SIGMETRICS Conference on Measurement & Modeling Computer Systems, pages 122-133, Atlanta, GA, May 1999.
-
(1999)
Proceedings of the ACM SIGMETRICS Conference on Measurement & Modeling Computer Systems
, pp. 122-133
-
-
Smaragdakis, Y.1
Kaplan, S.2
Wilson, P.3
-
40
-
-
0033075110
-
An algorithm for optimally exploiting spatial and temporal locality in upper memory levels
-
Feb.
-
O. Temam. An algorithm for optimally exploiting spatial and temporal locality in upper memory levels. IEEE Transactions on Computers, 48(2):150-158, Feb. 1999.
-
(1999)
IEEE Transactions on Computers
, vol.48
, Issue.2
, pp. 150-158
-
-
Temam, O.1
-
41
-
-
33744454737
-
-
University of Maryland. The Omega Library, 1996. http://www.cs.umd.edu/projects/omega/.
-
(1996)
The Omega Library
-
-
-
43
-
-
0038345683
-
Guided region prefetching: A cooperative hardware/software approach
-
San Diego, California, June
-
Z. Wang, D. Burger, S. K. Reinhardt, K. S. McKinley, and C. C. Weems. Guided region prefetching: A cooperative hardware/software approach. In Proceedings of the 30th International Symposium on Computer Architecture, pages 388-398, San Diego, California, June 2003.
-
(2003)
Proceedings of the 30th International Symposium on Computer Architecture
, pp. 388-398
-
-
Wang, Z.1
Burger, D.2
Reinhardt, S.K.3
McKinley, K.S.4
Weems, C.C.5
-
44
-
-
14944380022
-
Using the compiler to improve cache replacement decisions
-
Charlottesville, Virginia, Sept.
-
Z. Wang, K. S. McKinley, A. L. Rosenberg, and C. C. Weems. Using the compiler to improve cache replacement decisions. In The 2002 International Conference on Parallel Architectures and Compilation Techniques, pages 199-208, Charlottesville, Virginia, Sept. 2002.
-
(2002)
The 2002 International Conference on Parallel Architectures and Compilation Techniques
, pp. 199-208
-
-
Wang, Z.1
McKinley, K.S.2
Rosenberg, A.L.3
Weems, C.C.4
-
46
-
-
14944366408
-
Compiler-assisted cache replacement: Problem formulation and performance evaluation
-
College Station, Texas
-
H. Yang, R. Govindarajan, G. R. Gao, and Z. Hu. Compiler-assisted cache replacement: Problem formulation and performance evaluation. In Proceedings of the Sixteenth Workshop on Languages and Compilers for Parallel Computing, pages 131-139, College Station, Texas, 2003.
-
(2003)
Proceedings of the Sixteenth Workshop on Languages and Compilers for Parallel Computing
, pp. 131-139
-
-
Yang, H.1
Govindarajan, R.2
Gao, G.R.3
Hu, Z.4