-
1
-
-
34548083281
-
The free lunch is over - A fundamental turn toward concurrency in software
-
Sutter H. The free lunch is over - a fundamental turn toward concurrency in software. Dr. Dobb's Journal 2005; 30 (3): 202-210.
-
(2005)
Dr. Dobb's Journal
, vol.30
, Issue.3
, pp. 202-210
-
-
Sutter, H.1
-
3
-
-
80052325923
-
-
[Oct 5]
-
AMD Inc. The AMD Fusion Family of APUs. http://sites.amd.com/us/fusion/apu/Pages/fusion.aspx/ [Oct 5, 2011].
-
(2011)
The AMD Fusion Family of APUs.
-
-
-
4
-
-
33646015987
-
Synergistic processing in Cell's multicore architecture
-
Gschwind M, Hofstee HP, Flachs B, Hopkins M, Watanabe Y, Yamazaki T. Synergistic processing in Cell's multicore architecture. IEEE Micro 2006; 26 (2): 10-24.
-
(2006)
IEEE Micro
, vol.26
, Issue.2
, pp. 10-24
-
-
Gschwind, M.1
Hofstee, H.P.2
Flachs, B.3
Hopkins, M.4
Watanabe, Y.5
Yamazaki, T.6
-
5
-
-
84859721294
-
-
[Oct 5]
-
Khronos OpenCL working group. OpenCL 1.0 Standard. http://www.khronos.org/opencl/ [Oct 5, 2011].
-
(2011)
OpenCL 1.0 Standard
-
-
-
6
-
-
2042458649
-
A survey of processors with explicit multithreading
-
DOI 10.1145/641865.641867
-
Ungerer T, Robič B, Šilc J. A survey of processors with explicit multithreading. ACM Computing Surveys 2003; 35 (1): 29-63. (Pubitemid 44159292)
-
(2003)
ACM Computing Surveys
, vol.35
, Issue.1
, pp. 29-63
-
-
Ungerer, T.1
Robic, B.2
Silc, J.3
-
7
-
-
62349092536
-
Scalable programming models for massively multicore processors
-
McCool MD. Scalable programming models for massively multicore processors. Proceedings of the IEEE, 2008; 816-831.
-
(2008)
Proceedings of the IEEE
, pp. 816-831
-
-
McCool, M.D.1
-
8
-
-
56749158843
-
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
-
Williams S, Oliker L, Vuduc R, Shalf J, Yelick K, Demmel J. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In SC '07 Proceedings of the 2007 ACM/IEEE Conference on Supercomputing. ACM: New York, 2007; 1-12.
-
(2007)
SC '07 Proceedings of the 2007 ACM/IEEE Conference on Supercomputing
, pp. 1-12
-
-
Williams, S.1
Oliker, L.2
Vuduc, R.3
Shalf, J.4
Yelick, K.5
Demmel, J.6
-
9
-
-
70350771127
-
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
-
Datta K, Murphy M, Volkov V, Williams S, Carter J, Oliker L, Patterson D, Shalf J, Yelick K. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In SC '08 Proceedings of the 2008 ACM/IEEE Conference on Supercomputing. ACM: New York, 2008; 1-12.
-
(2008)
SC '08 Proceedings of the 2008 ACM/IEEE Conference on Supercomputing
, pp. 1-12
-
-
Datta, K.1
Murphy, M.2
Volkov, V.3
Williams, S.4
Carter, J.5
Oliker, L.6
Patterson, D.7
Shalf, J.8
Yelick, K.9
-
10
-
-
51049106193
-
Lattice Boltzmann simulation optimization on leading multicore platforms
-
Williams S, Carter J, Oliker L, Shalf J, Yelick K. Lattice Boltzmann simulation optimization on leading multicore platforms. Proceedings of International Parallel and Distributed Processing Symposium, 2008.
-
(2008)
Proceedings of International Parallel and Distributed Processing Symposium
-
-
Williams, S.1
Carter, J.2
Oliker, L.3
Shalf, J.4
Yelick, K.5
-
11
-
-
84885784745
-
Evaluating multi-core platforms for HPC data-intensive kernels
-
Van Amesfoort A, Varbanescu A, Sips H, Van Nieuwpoort R. Evaluating multi-core platforms for HPC data-intensive kernels. In CF '09 Proceedings of the 6th ACM Conference on Computing Frontiers. ACM: New York, 2009; 207-216.
-
(2009)
CF '09 Proceedings of the 6th ACM Conference on Computing Frontiers
, pp. 207-216
-
-
Van Amesfoort, A.1
Varbanescu, A.2
Sips, H.3
Van Nieuwpoort, R.4
-
12
-
-
36049051263
-
The new landscape of parallel computer architecture
-
Shalf J. The new landscape of parallel computer architecture. Journal of Physics: Conference Series 2007; 78: 012066.
-
(2007)
Journal of Physics: Conference Series
, vol.78
, pp. 012066
-
-
Shalf, J.1
-
13
-
-
35648995516
-
The landscape of parallel computing research: A view from Berkeley
-
[Oct 5, 2011]
-
Asanovic K, et al. The landscape of parallel computing research: a view from Berkeley. EECS Technical Report, 2006. http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html [Oct 5, 2011].
-
(2006)
EECS Technical Report
-
-
Asanovic, K.1
-
15
-
-
84859702835
-
-
[Oct 5]
-
Xilinx, Inc. Virtex-6 FPGA Family. http://www.xilinx.com/products/virtex6/ [Oct 5, 2011].
-
(2011)
Virtex-6 FPGA Family
-
-
-
16
-
-
84859700527
-
-
[Oct 5]
-
Altera Corp. Stratix series high-end FPGAs. http://www.altera.com/products/devices/stratix-fpgas/about/stx-about.html [Oct 5, 2011].
-
(2011)
Stratix Series High-end FPGAs
-
-
-
17
-
-
84879491134
-
-
[Oct 5]
-
Intel Single-Chip Cloud Computer. http://download.intel.com/pressroom/pdf/rockcreek/SCC-Announcement-JustinRattner.pdf [Oct 5, 2011].
-
(2011)
Intel Single-Chip Cloud Computer
-
-
-
21
-
-
79959456077
-
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
-
Baskaran MM, Bondhugula U, Krishnamoorthy S, Ramanujam J, Rountev A, Sadayappan P. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories. PPoPP '08 Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008; 1-10.
-
(2008)
PPoPP '08 Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 1-10
-
-
Baskaran, M.M.1
Bondhugula, U.2
Krishnamoorthy, S.3
Ramanujam, J.4
Rountev, A.5
Sadayappan, P.6
-
22
-
-
4243648129
-
Little's law and high performance computing
-
Bailey DH. Little's law and high performance computing. RNR Technical Report, 1997.
-
(1997)
RNR Technical Report
-
-
Bailey, D.H.1
-
23
-
-
67650056991
-
LU, QR and Cholesky factorizations using vector capabilities of GPUs
-
[Oct 5]
-
Volkov V, Demmel J. LU, QR and Cholesky factorizations using vector capabilities of GPUs. EECS Technical Report, 2008. http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-49.html [Oct 5, 2011].
-
(2008)
EECS Technical Report
-
-
Volkov, V.1
Demmel, J.2
-
24
-
-
84859721293
-
-
[Oct 5]
-
Intel Corp. Intel many integrated core. http://newsroom.intel.com/servlet/JiveServlet/download/2152-4-5220/ISC-Intel-MIC-factsheet.pdf [Oct 5, 2011].
-
(2011)
Intel Many Integrated Core
-
-
-
27
-
-
0002806690
-
OpenMP: An industry-standard API for shared-memory programming
-
Dagum L, Menon R. OpenMP: an industry-standard API for shared-memory programming. IEEE Computational Science and Engineering 1998; 5 (1): 46-55.
-
(1998)
IEEE Computational Science and Engineering
, vol.5
, Issue.1
, pp. 46-55
-
-
Dagum, L.1
Menon, R.2
-
28
-
-
84859703334
-
-
(version 2.2). [Oct 5]
-
MPI: a message-passing interface standard (version 2.2). http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf [Oct 5, 2011].
-
(2011)
MPI: A Message-passing Interface Standard
-
-
-
29
-
-
0343644429
-
High Performance Fortran - History, overview and current developments
-
[Oct 5, 2011]
-
Richardson H. High Performance Fortran - history, overview and current developments. TMC-261, 1996. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.8497 [Oct 5, 2011].
-
(1996)
TMC-261
-
-
Richardson, H.1
-
30
-
-
84864154621
-
Programming in the partitioned global address space model
-
[Oct 5, 2011].
-
Carlson B, El-Ghazawi T, Numrich R, Yelick K. Programming in the partitioned global address space model. Supercomputing 2003. [Oct 5, 2011].
-
Supercomputing 2003
-
-
Carlson, B.1
El-Ghazawi, T.2
Numrich, R.3
Yelick, K.4
-
31
-
-
84859716318
-
-
[Oct 5]
-
Unified Parallel C. http://upc.gwu.edu/ [Oct 5, 2011].
-
(2011)
-
-
-
32
-
-
84859716175
-
-
[Oct 5]
-
Coarray Fortran. http://caf.rice.edu/ [Oct 5, 2011].
-
(2011)
-
-
-
35
-
-
73449104291
-
Programming multiprocessors with explicitly managed memory hierarchies
-
Schneider S, Yeom JS, Nikolopoulos DS. Programming multiprocessors with explicitly managed memory hierarchies. IEEE Computer 2009; 42: 28-34.
-
(2009)
IEEE Computer
, vol.42
, pp. 28-34
-
-
Schneider, S.1
Yeom, J.S.2
Nikolopoulos, D.S.3
-
37
-
-
84859723297
-
-
[Oct 5]
-
Intel Corp. Intel Parallel Building Blocks (PBB). http://software.intel.com/en-us/articles/intel-parallel-buildingblocks/ [Oct 5, 2011].
-
(2011)
Intel Parallel Building Blocks (PBB)
-
-
-
38
-
-
84859719676
-
-
[Oct 5]
-
Intel Corp. Intel Array Building Blocks (ArBB). http://software.intel.com/en-us/articles/intel-array-building-blocks/ [Oct 5, 2011].
-
(2011)
Intel Array Building Blocks (ArBB)
-
-
-
39
-
-
67650085808
-
EXOCHI: Architecture and programming environment for a heterogeneous multi-core multithreaded system
-
DOI 10.1145/1250734.1250753, PLDI'07: Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation
-
Wang P, Collins J, Chinya G, Jiang H, Tian X, Girkar M, Yang N, Lueh GY, Wang H. EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system. SIGPLAN Not 2007; 42 (6): 156-166. (Pubitemid 47630684)
-
(2007)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
, pp. 156-166
-
-
Wang, P.H.1
Collins, J.D.2
Chinya, G.N.3
Jiang, H.4
Tian, X.5
Girkar, M.6
Yang, N.Y.7
Lueh, G.-Y.8
Wang, H.9
-
40
-
-
77957759721
-
Merge: A programming model for heterogeneous multi-core systems
-
DOI 10.1145/1346281.1346318, ASPLOS XIII - Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems
-
Linderman M, Collins J, Wang H, Meng T. Merge: a programming model for heterogeneous multi-core systems. ASPLOS XIII, 2008; 287-296. (Pubitemid 351585414)
-
(2008)
International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
, pp. 287-296
-
-
Linderman, M.D.1
Collins, J.D.2
Wang, H.3
Meng, T.H.4
-
43
-
-
63449095902
-
A light-weight approach to dynamical run-time linking supporting heterogenous, parallel, and reconfigurable architectures
-
Buchty R, Kramer D, Kicherer M, Karl W. A light-weight approach to dynamical run-time linking supporting heterogenous, parallel, and reconfigurable architectures. In Architecture of Computing Systems, vol. 5467, Lecture Notes in Computer Science, 2009; 60-71.
-
(2009)
Architecture of Computing Systems
, vol.5467
, pp. 60-71
-
-
Buchty, R.1
Kramer, D.2
Kicherer, M.3
Karl, W.4
-
44
-
-
84859699631
-
An embrace-and-extend approach to managing the complexity of future heterogeneous systems
-
Buchty R, Kicherer M, Kramer D, Karl W. An embrace-and-extend approach to managing the complexity of future heterogeneous systems. In SAMOS IX, vol. 5657, Lecture Notes in Computer Science, Springer: Berlin/Heidelberg, 2009; 226-235.
-
(2009)
SAMOS IX
, vol.5657
, pp. 226-235
-
-
Buchty, R.1
Kicherer, M.2
Kramer, D.3
Karl, W.4
-
45
-
-
84859700526
-
Delivering guidance information in heterogeneous systems
-
Hannover, Germany, February;. VDE, ISBN 978-3-8007-3222-7
-
Nowak F, Kicherer M, Buchty R, Karl W. Delivering guidance information in heterogeneous systems. ARCS 2010 Workshop Proceedings, Hannover, Germany, February 2010; 279-284. VDE, ISBN 978-3-8007-3222-7.
-
(2010)
ARCS 2010 Workshop Proceedings
, pp. 279-284
-
-
Nowak, F.1
Kicherer, M.2
Buchty, R.3
Karl, W.4
-
46
-
-
79952974311
-
Extending a light-weight runtime system by dynamic instrumentation for performance evaluation
-
Hannover, Germany, February;. VDE, ISBN 978-38007-3322-7
-
Kicherer M, Nowak F, Buchty R, Karl W. Extending a light-weight runtime system by dynamic instrumentation for performance evaluation. ARCS 2010 Workshop Proceedings, Hannover, Germany, February 2010; 95-101. VDE, ISBN 978-38007-3322-7.
-
(2010)
ARCS 2010 Workshop Proceedings
, pp. 95-101
-
-
Kicherer, M.1
Nowak, F.2
Buchty, R.3
Karl, W.4
-
48
-
-
74049146136
-
Minimizing communication in sparse matrix solvers
-
Mohiyuddin M, Hoemmen M, Demmel J, Yelick K. Minimizing communication in sparse matrix solvers. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, ACM: New York, 2009; 1-11.
-
(2009)
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
, pp. 1-11
-
-
Mohiyuddin, M.1
Hoemmen, M.2
Demmel, J.3
Yelick, K.4
-
51
-
-
78649815411
-
A multi-platform linear algebra toolbox for finite element solvers on heterogeneous clusters
-
to appear
-
Heuveline V, Subramanian C, Lukarski D, Weiss JP. A multi-platform linear algebra toolbox for finite element solvers on heterogeneous clusters. PPAAC'10, IEEE Cluster Workshops, 2010; to appear.
-
(2010)
PPAAC'10, IEEE Cluster Workshops
-
-
Heuveline, V.1
Subramanian, C.2
Lukarski, D.3
Weiss, J.P.4
-
52
-
-
84859716317
-
-
EMCL Preprint 2011-08, [Oct 5, 2011]
-
Heuveline V, Lukarski D, Weiss JP. Enhanced parallel ILU(p)-based preconditioners for multi-core CPUs and GPUs - the power(q)-pattern method, 2011. EMCL Preprint 2011-08, http://www.emcl.kit.edu/preprints/emcl-preprint-2011-08.pdf [Oct 5, 2011].
-
(2011)
Enhanced Parallel ILU(p)-based Preconditioners for Multi-core CPUs and GPUs - The Power(q)-pattern Method
-
-
Heuveline, V.1
Lukarski, D.2
Weiss, J.P.3