-
1
-
-
70450285523
-
Achieving predictable performance through better memory controller placement in many-core CMPs
-
D. Abts, N. D. Enright Jerger, J. Kim, D. Gibson, and M. H. Lipasti. Achieving predictable performance through better memory controller placement in many-core CMPs. In ISCA-36, 2009.
-
(2009)
ISCA-36
-
-
Abts, D.1
Enright Jerger, N.D.2
Kim, J.3
Gibson, D.4
Lipasti, M.H.5
-
4
-
-
33745956039
-
Framework for instruction-level tracing and analysis of programs
-
S. Bhansali, W.-K. Chen, S. de Jong, A. Edwards, R. Murray, M. Drinić, D. Mihočka, and J. Chau. Framework for instruction-level tracing and analysis of programs. In VEE, 2006.
-
(2006)
VEE
-
-
Bhansali, S.1
Chen, W.-K.2
De Jong, S.3
Edwards, A.4
Murray, R.5
Drinić, M.6
Mihočka, D.7
Chau, J.8
-
5
-
-
44549088807
-
Scheduling in practice
-
March
-
E. W. Biersack, B. Schroeder, and G. Urvoy-Keller. Scheduling in practice. Performance Evaluation Review, Special Issue on "New Perspectives in Scheduling", 34(4), March 2007.
-
(2007)
Performance Evaluation Review, Special Issue on "New Perspectives in Scheduling"
, vol.34
, Issue.4
-
-
Biersack, E.W.1
Schroeder, B.2
Urvoy-Keller, G.3
-
6
-
-
0036504582
-
Intel 870: A building block for cost-effective, scalable servers
-
F. Briggs, M. Cekleov, K. Creta, M. Khare, S. Kulick, A. Kumar, L. P. Looi, C. Natarajan, S. Radhakrishnan, and L. Rankin. Intel 870: A building block for cost-effective, scalable servers. IEEE Micro, 22(2):36-47, 2002.
-
(2002)
IEEE Micro
, vol.22
, Issue.2
, pp. 36-47
-
-
Briggs, F.1
Cekleov, M.2
Creta, K.3
Khare, M.4
Kulick, S.5
Kumar, A.6
Looi, L.P.7
Natarajan, C.8
Radhakrishnan, S.9
Rankin, L.10
-
7
-
-
77952561300
-
Micro-architecture techniques in the Intel E8870 scalable memory controller
-
F. Briggs, S. Chittor, and K. Cheng. Micro-architecture techniques in the Intel E8870 scalable memory controller. In WMPI-3, 2004.
-
(2004)
WMPI-3
-
-
Briggs, F.1
Chittor, S.2
Cheng, K.3
-
8
-
-
34548238648
-
The AMD Opteron northbridge architecture
-
P. Conway and B. Hughes. The AMD Opteron northbridge architecture. IEEE Micro, 27(2):10-21, 2007.
-
(2007)
IEEE Micro
, vol.27
, Issue.2
, pp. 10-21
-
-
Conway, P.1
Hughes, B.2
-
9
-
-
0031383380
-
Self-similarity in World Wide Web traffic: Evidence and possible causes
-
M. E. Crovella and A. Bestavros. Self-similarity in World Wide Web traffic: Evidence and possible causes. IEEE/ACM TON, 5(6):835-846, 1997.
-
(1997)
IEEE/ACM TON
, vol.5
, Issue.6
, pp. 835-846
-
-
Crovella, M.E.1
Bestavros, A.2
-
10
-
-
0001939946
-
Heavy-tailed probability distributions in the world wide web
-
chapter 1, Chapman & Hall, New York
-
M. E. Crovella, M. S. Taqqu, and A. Bestavros. Heavy-tailed probability distributions in the world wide web. In A Practical Guide to Heavy Tails, chapter 1, pages 1-23. Chapman & Hall, New York, 1998.
-
(1998)
A Practical Guide to Heavy Tails
, pp. 1-23
-
-
Crovella, M.E.1
Taqqu, M.S.2
Bestavros, A.3
-
11
-
-
0024889726
-
Analysis and simulation of a fair queueing algorithm
-
A. Demers, S. Keshav, and S. Shenker. Analysis and simulation of a fair queueing algorithm. In SIGCOMM, 1989.
-
(1989)
SIGCOMM
-
-
Demers, A.1
Keshav, S.2
Shenker, S.3
-
13
-
-
47249094055
-
System-level performance metrics for multiprogram workloads
-
S. Eyerman and L. Eeckhout. System-level performance metrics for multiprogram workloads. IEEE Micro, 28(3):42-53, 2008.
-
(2008)
IEEE Micro
, vol.28
, Issue.3
, pp. 42-53
-
-
Eyerman, S.1
Eeckhout, L.2
-
14
-
-
0022242041
-
XOR-Schemes: A flexible data organization in parallel memories
-
J. M. Frailong, W. Jalby, and J. Lenfant. XOR-Schemes: A flexible data organization in parallel memories. In ICPP, 1985.
-
(1985)
ICPP
-
-
Frailong, J.M.1
Jalby, W.2
Lenfant, J.3
-
15
-
-
0037885374
-
Task assignment with unknown duration
-
March
-
M. Harchol-Balter. Task assignment with unknown duration. J. ACM, 49(2):260-288, March 2002.
-
(2002)
J. ACM
, vol.49
, Issue.2
, pp. 260-288
-
-
Harchol-Balter, M.1
-
16
-
-
85085698525
-
Exploiting process lifetime distributions for dynamic load balancing
-
M. Harchol-Balter and A. Downey. Exploiting process lifetime distributions for dynamic load balancing. In SIGMETRICS, 1996.
-
(1996)
SIGMETRICS
-
-
Harchol-Balter, M.1
Downey, A.2
-
17
-
-
0032785291
-
Access order and effective bandwidth for streams on a direct rambus memory
-
S. I. Hong, S. A. McKee, M. H. Salinas, R. H. Klenke, J. H. Aylor, and W. A. Wulf. Access order and effective bandwidth for streams on a direct rambus memory. In HPCA-5, 1999.
-
(1999)
HPCA-5
-
-
Hong, S.I.1
McKee, S.A.2
Salinas, M.H.3
Klenke, R.H.4
Aylor, J.H.5
Wulf, W.A.6
-
18
-
-
21644455082
-
Adaptive history-based memory schedulers
-
I. Hur and C. Lin. Adaptive history-based memory schedulers. In MICRO-37, 2004.
-
(2004)
MICRO-37
-
-
Hur, I.1
Lin, C.2
-
19
-
-
57749175984
-
A comprehensive approach to DRAM power management
-
I. Hur and C. Lin. A comprehensive approach to DRAM power management. In HPCA-14, 2008.
-
(2008)
HPCA-14
-
-
Hur, I.1
Lin, C.2
-
20
-
-
84888279461
-
-
IBM. PowerXCell 8i Processor. http://www.ibm.com/technology/resources/technology-cell-pdf-PowerXCell-PB-7May2008-pub.pdf.
-
PowerXCell 8i Processor
-
-
-
21
-
-
84872067428
-
-
Intel. Intel Core i7 Processor. http://www.intel.com/products/processor/corei7/specifications.htm.
-
Intel Core i7 Processor
-
-
-
22
-
-
52649148744
-
Self-optimizing memory controllers: A reinforcement learning approach
-
E. Ipek, O. Mutlu, J. F. Martínez, and R. Caruana. Self-optimizing memory controllers: A reinforcement learning approach. In ISCA-35, 2008.
-
(2008)
ISCA-35
-
-
Ipek, E.1
Mutlu, O.2
Martínez, J.F.3
Caruana, R.4
-
23
-
-
0003487810
-
-
September
-
G. Irlam. Unix file size survey - 1993. Available at http://www.base.com/gordoni/ufs93.html, September 1994.
-
(1994)
Unix File Size Survey - 1993
-
-
Irlam, G.1
-
25
-
-
4644294548
-
A day in the life of a data cache miss
-
T. Karkhanis and J. E. Smith. A day in the life of a data cache miss. In WMPI-2, 2002.
-
(2002)
WMPI-2
-
-
Karkhanis, T.1
Smith, J.E.2
-
28
-
-
31944440969
-
Pin: Building customized program analysis tools with dynamic instrumentation
-
C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood. Pin: Building customized program analysis tools with dynamic instrumentation. In PLDI, 2005.
-
(2005)
PLDI
-
-
Luk, C.-K.1
Cohn, R.2
Muth, R.3
Patil, H.4
Klauser, A.5
Lowney, G.6
Wallace, S.7
Reddi, V.J.8
Hazelwood, K.9
-
29
-
-
84962144701
-
Balancing thoughput and fairness in SMT processors
-
K. Luo, J. Gummaraju, and M. Franklin. Balancing thoughput and fairness in SMT processors. In ISPASS, 2001.
-
(2001)
ISPASS
-
-
Luo, K.1
Gummaraju, J.2
Franklin, M.3
-
30
-
-
0034314462
-
Dynamic access ordering for streamed computations
-
Nov.
-
S. A. McKee, W. A. Wulf, J. H. Aylor, M. H. Salinas, R. H. Klenke, S. I. Hong, and D. A. B. Weikle. Dynamic access ordering for streamed computations. IEEE TC, 49(11):1255-1271, Nov. 2000.
-
(2000)
IEEE TC
, vol.49
, Issue.11
, pp. 1255-1271
-
-
McKee, S.A.1
Wulf, W.A.2
Aylor, J.H.3
Salinas, M.H.4
Klenke, R.H.5
Hong, S.I.6
Weikle, D.A.B.7
-
32
-
-
52649128991
-
Memory performance attacks: Denial of memory service in multi-core systems
-
T. Moscibroda and O. Mutlu. Memory performance attacks: Denial of memory service in multi-core systems. In USENIX Security, 2007.
-
(2007)
USENIX Security
-
-
Moscibroda, T.1
Mutlu, O.2
-
33
-
-
57549112769
-
Distributed order scheduling and its application to multi-core DRAM controllers
-
T. Moscibroda and O. Mutlu. Distributed order scheduling and its application to multi-core DRAM controllers. In PODC, 2008.
-
(2008)
PODC
-
-
Moscibroda, T.1
Mutlu, O.2
-
34
-
-
33644903196
-
Efficient runahead execution: Power-efficient memory latency tolerance
-
O. Mutlu, H. Kim, and Y. N. Patt. Efficient runahead execution: Power-efficient memory latency tolerance. IEEE Micro, 26(1):10-20, 2006.
-
(2006)
IEEE Micro
, vol.26
, Issue.1
, pp. 10-20
-
-
Mutlu, O.1
Kim, H.2
Patt, Y.N.3
-
35
-
-
47349122373
-
Stall-time fair memory access scheduling for chip multiprocessors
-
O. Mutlu and T. Moscibroda. Stall-time fair memory access scheduling for chip multiprocessors. In MICRO-40, 2007.
-
(2007)
MICRO-40
-
-
Mutlu, O.1
Moscibroda, T.2
-
36
-
-
52649119398
-
Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems
-
O. Mutlu and T. Moscibroda. Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems. In ISCA-35, 2008.
-
(2008)
ISCA-35
-
-
Mutlu, O.1
Moscibroda, T.2
-
37
-
-
47349089021
-
A study of performance impact of memory controller features in multi-processor server environment
-
C. Natarajan, B. Christenson, and F. Briggs. A study of performance impact of memory controller features in multi-processor server environment. In WMPI-3, 2004.
-
(2004)
WMPI-3
-
-
Natarajan, C.1
Christenson, B.2
Briggs, F.3
-
39
-
-
21644454187
-
Pinpointing representative portions of large Intel Itanium programs with dynamic instrumentation
-
H. Patil, R. Cohn, M. Charney, R. Kapoor, A. Sun, and A. Karunanidhi. Pinpointing representative portions of large Intel Itanium programs with dynamic instrumentation. In MICRO-37, 2004.
-
(2004)
MICRO-37
-
-
Patil, H.1
Cohn, R.2
Charney, M.3
Kapoor, R.4
Sun, A.5
Karunanidhi, A.6
-
40
-
-
0029323403
-
Wide-area traffic: The failure of Poisson modeling
-
June
-
V. Paxson and S. Floyd. Wide-area traffic: The failure of Poisson modeling. IEEE/ACM TON, pages 226-244, June 1995.
-
(1995)
IEEE/ACM TON
, pp. 226-244
-
-
Paxson, V.1
Floyd, S.2
-
41
-
-
47849130815
-
Effective management of DRAM bandwidth in multicore processors
-
N. Rafique, W.-T. Lim, and M. Thottethodi. Effective management of DRAM bandwidth in multicore processors. In PACT-16, 2007.
-
(2007)
PACT-16
-
-
Rafique, N.1
Lim, W.-T.2
Thottethodi, M.3
-
42
-
-
8344231178
-
Analysis of LAS scheduling for job size distributions with high variance
-
I. A. Rai, G. Urvoy-Keller, and E. W. Biersack. Analysis of LAS scheduling for job size distributions with high variance. In SIGMETRICS, 2003.
-
(2003)
SIGMETRICS
-
-
Rai, I.A.1
Urvoy-Keller, G.2
Biersack, E.W.3
-
43
-
-
0026156613
-
Pseudo-randomly interleaved memory
-
B. R. Rau. Pseudo-randomly interleaved memory. In ISCA-18, 1991.
-
(1991)
ISCA-18
-
-
Rau, B.R.1
-
44
-
-
84971109320
-
Scheduling multiclass single server queueing systems to stochastically maximize the number of successful departures
-
R. Righter and J. Shanthikumar. Scheduling multiclass single server queueing systems to stochastically maximize the number of successful departures. Probability in the Engineering and Information Sciences, 3:967-978, 1989.
-
(1989)
Probability in the Engineering and Information Sciences
, vol.3
, pp. 967-978
-
-
Righter, R.1
Shanthikumar, J.2
-
45
-
-
21644486223
-
Memory controller optimizations for web servers
-
S. Rixner. Memory controller optimizations for web servers. In MICRO-37, 2004.
-
(2004)
MICRO-37
-
-
Rixner, S.1
-
46
-
-
0033691565
-
Memory access scheduling
-
S. Rixner, W. J. Dally, U. J. Kapasi, P. Mattson, and J. D. Owens. Memory access scheduling. In ISCA-27, 2000.
-
(2000)
ISCA-27
-
-
Rixner, S.1
Dally, W.J.2
Kapasi, U.J.3
Mattson, P.4
Owens, J.D.5
-
47
-
-
0000891048
-
A proof of the optimality of the shortest remaining processing time discipline
-
L. E. Schrage. A proof of the optimality of the shortest remaining processing time discipline. Operations Research, 16:678-690, 1968.
-
(1968)
Operations Research
, vol.16
, pp. 678-690
-
-
Schrage, L.E.1
-
48
-
-
32844475712
-
Evaluation of task assignment policies for supercomputing servers: The case for load unbalancing and fairness
-
April
-
B. Schroeder and M. Harchol-Balter. Evaluation of task assignment policies for supercomputing servers: The case for load unbalancing and fairness. Cluster Computing: The Journal of Networks, Software Tools, and Applications, 7(2):151-161, April 2004.
-
(2004)
Cluster Computing: The Journal of Networks, Software Tools, and Applications
, vol.7
, Issue.2
, pp. 151-161
-
-
Schroeder, B.1
Harchol-Balter, M.2
-
49
-
-
0002357993
-
Load-sensitive routing of long-lived IP flows
-
A. Shaikh, J. Rexford, and K. G. Shin. Load-sensitive routing of long-lived IP flows. In SIGCOMM, 1999.
-
(1999)
SIGCOMM
-
-
Shaikh, A.1
Rexford, J.2
Shin, K.G.3
-
50
-
-
34547692955
-
A burst scheduling access reordering mechanism
-
J. Shao and B. T. Davis. A burst scheduling access reordering mechanism. In HPCA-13, 2007.
-
(2007)
HPCA-13
-
-
Shao, J.1
Davis, B.T.2
-
51
-
-
0034443570
-
Symbiotic jobscheduling for a simultaneous multithreading processor
-
A. Snavely and D. M. Tullsen. Symbiotic jobscheduling for a simultaneous multithreading processor. In ASPLOS-IX, 2000.
-
(2000)
ASPLOS-IX
-
-
Snavely, A.1
Tullsen, D.M.2
-
53
-
-
0026865523
-
Increasing the number of strides for conflict-free vector access
-
M. Valero, T. Lang, J. M. Llabería, M. Peiron, E. Ayguadé, and J. J. Navarro. Increasing the number of strides for conflict-free vector access. In ISCA-19, 1992.
-
(1992)
ISCA-19
-
-
Valero, M.1
Lang, T.2
Llabería, J.M.3
Peiron, M.4
Ayguadé, E.5
Navarro, J.J.6
-
54
-
-
36849030305
-
On-chip interconnection architecture of the tile processor
-
D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C.-C. Miao, J. F. Brown III, and A. Agarwal. On-chip interconnection architecture of the tile processor. IEEE Micro, 27(5):15-31, 2007.
-
(2007)
IEEE Micro
, vol.27
, Issue.5
, pp. 15-31
-
-
Wentzlaff, D.1
Griffin, P.2
Hoffmann, H.3
Bao, L.4
Edwards, B.5
Ramey, C.6
Mattina, M.7
Miao, C.-C.8
Brown III, J.F.9
Agarwal, A.10
-
55
-
-
0033688639
-
Hardware-only stream prefetching and dynamic access ordering
-
C. Zhang and S. A. McKee. Hardware-only stream prefetching and dynamic access ordering. In ICS, 2000.
-
(2000)
ICS
-
-
Zhang, C.1
McKee, S.A.2
-
56
-
-
0035510702
-
The impulse memory controller
-
Nov.
-
L. Zhang, Z. Fang, M. Parker, B. K. Mathew, L. Schaelicke, J. B. Carter, W. C. Hsieh, and S. A. McKee. The impulse memory controller. IEEE TC, 50(11):1117-1132, Nov. 2001.
-
(2001)
IEEE TC
, vol.50
, Issue.11
, pp. 1117-1132
-
-
Zhang, L.1
Fang, Z.2
Parker, M.3
Mathew, B.K.4
Schaelicke, L.5
Carter, J.B.6
Hsieh, W.C.7
McKee, S.A.8
-
57
-
-
47349110818
-
A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality
-
Z. Zhang, Z. Zhu, and X. Zhang. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality. In MICRO-33, 2000.
-
(2000)
MICRO-33
-
-
Zhang, Z.1
Zhu, Z.2
Zhang, X.3
-
58
-
-
28444470842
-
A performance comparison of DRAM memory system optimizations for SMT processors
-
Z. Zhu and Z. Zhang. A performance comparison of DRAM memory system optimizations for SMT processors. In HPCA-11, 2005.
-
(2005)
HPCA-11
-
-
Zhu, Z.1
Zhang, Z.2
-
59
-
-
52649113530
-
Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order
-
U.S. Patent Number 5,630,096, May
-
W. K. Zuravleff and T. Robinson. Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order. U.S. Patent Number 5,630,096, May 1997.
-
(1997)
-
-
Zuravleff, W.K.1
Robinson, T.2