-
1
-
-
85015899515
-
The price of performance
-
L. A. Barroso, "The price of performance," Queue, vol.3, no.7, pp. 48-53, 2005.
-
(2005)
Queue
, vol.3
, Issue.7
, pp. 48-53
-
-
Barroso, L.A.1
-
2
-
-
45449096678
-
Parallel tiled QR factorization for multicore architectures
-
A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, "Parallel Tiled QR Factorization for Multicore Architectures," Lecture Notes in Computer Science, vol.4967, p. 639, 2008.
-
(2008)
Lecture Notes in Computer Science
, vol.4967
, pp. 639
-
-
Buttari, A.1
Langou, J.2
Kurzak, J.3
Dongarra, J.4
-
3
-
-
0028549474
-
Improving the ratio of memory operations to floating-point operations in loops
-
S. Carr and K. Kennedy, "Improving the ratio of memory operations to floating-point operations in loops," ACM Trans. Program. Lang. Syst., vol.16, no.6, pp. 1768-1810, 1994.
-
(1994)
ACM Trans. Program. Lang. Syst.
, vol.16
, Issue.6
, pp. 1768-1810
-
-
Carr, S.1
Kennedy, K.2
-
4
-
-
2142702913
-
Memory-access-aware data structure transformations for embedded software with dynamic data accesses
-
March
-
E. Daylight, D. Atienza, A. Vandecappelle, F. Catthoor, and J. Mendias, "Memory-access-aware data structure transformations for embedded software with dynamic data accesses," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol.12, no.3, pp. 269-280, March 2004.
-
(2004)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
, vol.12
, Issue.3
, pp. 269-280
-
-
Daylight, E.1
Atienza, D.2
Vandecappelle, A.3
Catthoor, F.4
Mendias, J.5
-
5
-
-
84981274540
-
Improving effective bandwidth through compiler enhancement of global cache reuse
-
C. Ding and K. Kennedy, "Improving effective bandwidth through compiler enhancement of global cache reuse," Parallel and Distributed Processing Symposium, International, vol.1, p. 10038b, 2001.
-
(2001)
Parallel and Distributed Processing Symposium, International
, vol.1
-
-
Ding, C.1
Kennedy, K.2
-
6
-
-
47349127017
-
The impact of multicore on computational science software
-
J. Dongarra, D. Gannon, G. Fox, and K. Kennedy, "The impact of multicore on computational science software," CTWatch Quarterly, vol.3, pp. 3-10, 2007.
-
(2007)
CTWatch Quarterly
, vol.3
, pp. 3-10
-
-
Dongarra, J.1
Gannon, D.2
Fox, G.3
Kennedy, K.4
-
7
-
-
34548803454
-
Using PAPI for hardware performance monitoring on Linux systems
-
J. Dongarra, K. London, S. Moore, P. Mucci, and D. Terpstra, "Using PAPI for hardware performance monitoring on Linux systems," in Conference on Linux Clusters: The HPC Revolution, 2001.
-
(2001)
Conference on Linux Clusters: The HPC Revolution
-
-
Dongarra, J.1
London, K.2
Moore, S.3
Mucci, P.4
Terpstra, D.5
-
8
-
-
20344401552
-
Chip makers turn to multicore processors
-
D. Greer, "Chip makers turn to multicore processors," IEEE Computer, vol.38, no.5, pp. 11-13, 2005.
-
(2005)
IEEE Computer
, vol.38
, Issue.5
, pp. 11-13
-
-
Greer, D.1
-
9
-
-
77957817744
-
Streamware: Programming general-purpose multicore processors using streams
-
New York, NY, USA: ACM
-
J. Gummaraju, J. Coburn, Y. Turner, and M. Rosenblum, "Streamware: programming general-purpose multicore processors using streams," in ASPLOS XIII: Proceedings of the 13th international conference on Architectural support for programming languages and operating systems. New York, NY, USA: ACM, 2008, pp. 297-307.
-
(2008)
ASPLOS XIII: Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems
, pp. 297-307
-
-
Gummaraju, J.1
Coburn, J.2
Turner, Y.3
Rosenblum, M.4
-
10
-
-
0035187053
-
Exploring the design space of future CMPs
-
J. Huh, D. Burger, and S. Keckler, "Exploring the design space of future CMPs," Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques, pp. 199-210, 2001.
-
(2001)
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
, pp. 199-210
-
-
Huh, J.1
Burger, D.2
Keckler, S.3
-
12
-
-
50249115185
-
Data locality enhancement for CMPs
-
Piscataway, NJ, USA: IEEE Press
-
M. Kandemir, "Data locality enhancement for CMPs," in ICCAD '07: Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design. Piscataway, NJ, USA: IEEE Press, 2007, pp. 155-159.
-
(2007)
ICCAD '07: Proceedings of the 2007 IEEE/ACM International Conference on Computer-aided Design
, pp. 155-159
-
-
Kandemir, M.1
-
13
-
-
3142679577
-
A data locality optimizing algorithm
-
M. S. Lam and M. E. Wolf, "A data locality optimizing algorithm," SIGPLAN Not., vol.39, no.4, pp. 442-459, 2004.
-
(2004)
SIGPLAN Not.
, vol.39
, Issue.4
, pp. 442-459
-
-
Lam, M.S.1
Wolf, M.E.2
-
14
-
-
33750050005
-
Performance/watt: The new server focus
-
J. Laudon, "Performance/watt: the new server focus," SIGARCH Comput. Archit. News, vol.33, no.4, pp. 5-13, 2005.
-
(2005)
SIGARCH Comput. Archit. News
, vol.33
, Issue.4
, pp. 5-13
-
-
Laudon, J.1
-
15
-
-
27944449256
-
Locality-conscious workload assignment for array-based computations in MPSoC architectures
-
New York, NY, USA: ACM
-
F. Li and M. Kandemir, "Locality-conscious workload assignment for array-based computations in MPSoC architectures," in DAC '05: Proceedings of the 42nd annual conference on Design automation. New York, NY, USA: ACM, 2005, pp. 95-100.
-
(2005)
DAC '05: Proceedings of the 42nd Annual Conference on Design Automation
, pp. 95-100
-
-
Li, F.1
Kandemir, M.2
-
16
-
-
2342468635
-
Organizing the last line of defense before hitting the memory wall for CMPs
-
IEEE Computer Society, Washington, DC, USA
-
C. Liu, A. Sivasubramaniam, and M. Kandemir, "Organizing the Last Line of Defense before Hitting the Memory Wall for CMPs," in Proceedings of the 10th International Symposium on High Performance Computer Architecture. IEEE Computer Society, Washington, DC, USA, 2004, p. 176.
-
(2004)
Proceedings of the 10th International Symposium on High Performance Computer Architecture
, pp. 176
-
-
Liu, C.1
Sivasubramaniam, A.2
Kandemir, M.3
-
17
-
-
0030259458
-
The case for a single-chip multiprocessor
-
K. Olukotun, B. A. Nayfeh, L. Hammond, K. Wilson, and K. Chang, "The case for a single-chip multiprocessor," SIGPLAN Not., vol.31, no.9, pp. 2-11, 1996.
-
(1996)
SIGPLAN Not.
, vol.31
, Issue.9
, pp. 2-11
-
-
Olukotun, K.1
Nayfeh, B.A.2
Hammond, L.3
Wilson, K.4
Chang, K.5
-
18
-
-
25844437046
-
Power5 system microarchitecture
-
B. Sinharoy, R. N. Kalla, J. M. Tendler, R. J. Eickemeyer, and J. B. Joyner, "Power5 system microarchitecture," IBM Journal of Research and Development, vol.49, no.4-5, pp. 505-522, 2005.
-
(2005)
IBM Journal of Research and Development
, vol.49
, Issue.4-5
, pp. 505-522
-
-
Sinharoy, B.1
Kalla, R.N.2
Tendler, J.M.3
Eickemeyer, R.J.4
Joyner, J.B.5
-
19
-
-
35048834531
-
Bus-invert coding for low-power I/O
-
M. Stan and W. Burleson, "Bus-invert coding for low-power I/O," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol.3, no.1, pp. 49-58, 1995.
-
(1995)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
, vol.3
, Issue.1
, pp. 49-58
-
-
Stan, M.1
Burleson, W.2
-
21
-
-
33745423635
-
An accurate cost model for guiding data locality transformations
-
X. Vera, J. Abella, J. Llosa, and A. González, "An accurate cost model for guiding data locality transformations," ACM Trans. Program. Lang. Syst., vol.27, no.5, pp. 946-987, 2005.
-
(2005)
ACM Trans. Program. Lang. Syst.
, vol.27
, Issue.5
, pp. 946-987
-
-
Vera, X.1
Abella, J.2
Llosa, J.3
González, A.4
-
22
-
-
35348861182
-
DRAMsim: A memory system simulator
-
D. Wang, B. Ganesh, N. Tuaycharoen, K. Baynes, A. Jaleel, and B. Jacob, "DRAMsim: a memory system simulator," SIGARCH Comput. Archit. News, vol.33, no.4, pp. 100-107, 2005.
-
(2005)
SIGARCH Comput. Archit. News
, vol.33
, Issue.4
, pp. 100-107
-
-
Wang, D.1
Ganesh, B.2
Tuaycharoen, N.3
Baynes, K.4
Jaleel, A.5
Jacob, B.6
-
23
-
-
0029700806
-
Power exploration for data dominated video applications
-
S. Wuytack, F. Catthoor, L. Nachtergaele, and H. De Man, "Power exploration for data dominated video applications," Low Power Electronics and Design, 1996., International Symposium on, pp. 359-364, 1996.
-
International Symposium on Low Power Electronics and Design 1996
, vol.1996
, pp. 359-364
-
-
Wuytack, S.1
Catthoor, F.2
Nachtergaele, L.3
De Man, H.4
-
24
-
-
47249123399
-
CacheScouts: Fine-grain monitoring of shared caches in CMP platforms
-
Washington DC, USA: IEEE Computer Society
-
L. Zhao, R. Iyer, R. Illikkal, J. Moses, S. Makineni, and D. Newell, "CacheScouts: Fine-grain monitoring of shared caches in CMP platforms," in PACT '07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007). Washington, DC, USA: IEEE Computer Society, 2007, pp. 339-352.
-
(2007)
PACT '07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007)
, pp. 339-352
-
-
Zhao, L.1
Iyer, R.2
Illikkal, R.3
Moses, J.4
Makineni, S.5
Newell, D.6