-
1
-
-
85062050911
-
-
New York: ACM Press;
-
Ananthanarayanan R, Esser SK, Simon HD, Modha DS Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis (SC '09). New York: ACM Press ; 2009: 1-63.
-
(2009)
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis (SC '09)
, pp. 1-63
-
-
Ananthanarayanan, R.1
Esser, S.K.2
Simon, H.D.3
Modha, D.S.4
-
2
-
-
60649098999
-
3D seismic imaging through reverse-time migration on homogeneous and heterogeneous multi-core processors
-
Araya-Polo M, Rubio F, De R, Hanzich M, María J. 3D seismic imaging through reverse-time migration on homogeneous and heterogeneous multi-core processors. Scientific Programming. 2009 ; 17: 185-198
-
(2009)
Scientific Programming
, vol.17
, pp. 185-198
-
-
Araya-Polo, M.1
Rubio, F.2
De, R.3
Hanzich, M.4
María, J.5
-
4
-
-
48749141209
-
Adaptive mesh refinement for hyperbolic partial differential equations
-
Berger MJ, Oliger J. Adaptive mesh refinement for hyperbolic partial differential equations. Journal of Computational Physics. 1984 ; 53: 484-512
-
(1984)
Journal of Computational Physics
, vol.53
, pp. 484-512
-
-
Berger, M.J.1
Oliger, J.2
-
7
-
-
0031268141
-
Using integer linear programming for instruction scheduling and register allocation in multi-issue processors - 1
-
Chang C, Chen C, King C. Using integer linear programming for instruction scheduling and register allocation in multi-issue processors - 1. Computers and Mathematics with Applications. 1997 ; 34 (9). 1-14
-
(1997)
Computers and Mathematics with Applications
, vol.34
, Issue.9
, pp. 1-14
-
-
Chang, C.1
Chen, C.2
King, C.3
-
8
-
-
80051670105
-
Automatic code generation and tuning for stencil kernels on modern shared memory architectures
-
Christen M, Schenk O, Burkhart H. Automatic code generation and tuning for stencil kernels on modern shared memory architectures. Computer Science - Research and Development. 2011 ; 26: 205-210
-
(2011)
Computer Science - Research and Development
, vol.26
, pp. 205-210
-
-
Christen, M.1
Schenk, O.2
Burkhart, H.3
-
9
-
-
84877247710
-
-
Piscataway, NJ: IEEE Press;
-
Christen M, Schenk O, Neufeld E, Messmer P, Burkhart H 2009 IEEE International Symposium on Parallel and Distributed Processing. Piscataway, NJ: IEEE Press ; 2009: 1-10.
-
(2009)
2009 IEEE International Symposium on Parallel and Distributed Processing
, pp. 1-10
-
-
Christen, M.1
Schenk, O.2
Neufeld, E.3
Messmer, P.4
Burkhart, H.5
-
10
-
-
77953972043
-
-
PhD thesis, EECS Department, University of California, Berkeley, CA
-
Datta K (2009) Auto-tuning Stencil Codes for Cache-Based Multicore Platforms. PhD thesis, EECS Department, University of California, Berkeley, CA. http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-177.html.
-
(2009)
Auto-tuning Stencil Codes for Cache-Based Multicore Platforms
-
-
Datta, K.1
-
14
-
-
4544335844
-
Vectorization for SIMD architectures with alignment constraints
-
Eichenberger A, Wu P, O'Brien K. Vectorization for SIMD architectures with alignment constraints. ACM SIGPLAN Notices. 2004 ; 39 (6). 82-93
-
(2004)
ACM SIGPLAN Notices
, vol.39
, Issue.6
, pp. 82-93
-
-
Eichenberger, A.1
Wu, P.2
O'Brien, K.3
-
15
-
-
64349099995
-
The Green500 List: Encouraging sustainable supercomputing
-
Feng W, Cameron K. The Green500 List: Encouraging Sustainable Supercomputing. Computer. 2007 ;: 50-55
-
(2007)
Computer
, pp. 50-55
-
-
Feng, W.1
Cameron, K.2
-
19
-
-
79953274591
-
-
New York: Springer;
-
Henretty T, Stock K, Pouchet L, Franchetti F, Ramanujam J, Sadayappan P Compiler Construction. New York: Springer ; 2011: 225-245.
-
(2011)
Compiler Construction
, pp. 225-245
-
-
Henretty, T.1
Stock, K.2
Pouchet, L.3
Franchetti, F.4
Ramanujam, J.5
Sadayappan, P.6
-
20
-
-
40749160036
-
Overview of the IBM Blue Gene/P project
-
Overview of the IBM Blue Gene/P project. IBM Journal of Research and Development. 2008 ; 52 (1/2). 199
-
(2008)
IBM Journal of Research and Development
, vol.52
, Issue.1-2
, pp. 199
-
-
-
21
-
-
79551702774
-
-
Piscataway, NJ: IEEE Press;
-
Kamil S, Chan C, Oliker L, Shalf J, Williams S 2010 IEEE International Symposium on Parallel and Distributed Processing (IPDPS). Piscataway, NJ: IEEE Press ; 2010: 1-12.
-
(2010)
2010 IEEE International Symposium on Parallel and Distributed Processing (IPDPS)
, pp. 1-12
-
-
Kamil, S.1
Chan, C.2
Oliker, L.3
Shalf, J.4
Williams, S.5
-
22
-
-
34547500808
-
Implicit and explicit optimizations for stencil computations
-
Kamil S, Datta K, Williams S, Oliker L, Shalf J, Yelick K. Implicit and explicit optimizations for stencil computations. Proceedings of the 2006 workshop on Memory system performance and correctness - MSPC '06. 2006 ;: 51
-
(2006)
Proceedings of the 2006 Workshop on Memory System Performance and Correctness - MSPC '06
, pp. 51
-
-
Kamil, S.1
Datta, K.2
Williams, S.3
Oliker, L.4
Shalf, J.5
Yelick, K.6
-
23
-
-
84958661690
-
Impact of modern memory subsystems on cache optimizations for stencil computations
-
Kamil S, Husbands P, Oliker L, Shalf J, Yelick K. Impact of modern memory subsystems on cache optimizations for stencil computations. Memory System Performance. 2005 ;: 36-43
-
(2005)
Memory System Performance
, pp. 36-43
-
-
Kamil, S.1
Husbands, P.2
Oliker, L.3
Shalf, J.4
Yelick, K.5
-
24
-
-
79551674713
-
Exaflop/s: The why and the how
-
Keyes D. Exaflop/s: The why and the how. Comptes Rendus Mécanique. 2011 ; 339 (2-3). 70-77
-
(2011)
Comptes Rendus Mécanique
, vol.339
, Issue.2-3
, pp. 70-77
-
-
Keyes, D.1
-
26
-
-
35448944792
-
Effective automatic parallelization of stencil computations
-
Krishnamoorthy S, Baskaran M, Bondhugula U, Ramanujam J, Rountev A, Sadayappan P. Effective automatic parallelization of stencil computations. ACM Sigplan Notices. 2007 ; 42 (6). 235
-
(2007)
ACM Sigplan Notices
, vol.42
, Issue.6
, pp. 235
-
-
Krishnamoorthy, S.1
Baskaran, M.2
Bondhugula, U.3
Ramanujam, J.4
Rountev, A.5
Sadayappan, P.6
-
28
-
-
44849137198
-
NVIDIA Tesla: A unified graphics and computing architecture
-
Lindholm E, Nickolls J, Oberman S, Montrym J. NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro. 2008 ; 28 (2). 39-55
-
(2008)
IEEE Micro
, vol.28
, Issue.2
, pp. 39-55
-
-
Lindholm, E.1
Nickolls, J.2
Oberman, S.3
Montrym, J.4
-
29
-
-
50649115040
-
CorePy: High-productivity Cell/BE programming
-
Mueller C and Martin B (2007) CorePy: high-productivity Cell/BE programming. Applications for the Cell/BE, http://sti.cc.gatech.edu/Slides/Mueller-070619.pdf.
-
(2007)
Applications for the Cell/BE
-
-
Mueller, C.1
Martin, B.2
-
30
-
-
79957475280
-
Intel's array building blocks: A retargetable, dynamic compiler and embedded language
-
Newburn C, So B, Liu Z, et al. (2011) Intel's Array Building Blocks: A Retargetable, Dynamic Compiler and Embedded Language. Proceedings of Code Generation and Optimization, http://software.intel.com/en-us/blogs/wordpress/wp-content/uploads/2011/03/ArBB-CGO2011-distr.pdf.
-
(2011)
Proceedings of Code Generation and Optimization
-
-
Newburn, C.1
So, B.2
Liu, Z.3
-
31
-
-
78650806116
-
3.5-D blocking optimization for stencil computations on modern CPUs and GPUs
-
Nguyen A, Satish N, Chhugani J, Kim C, Dubey P. 3.5-D blocking optimization for stencil computations on modern CPUs and GPUs. Proceedings of SuperComputing. 2010 ;: 1-13
-
(2010)
Proceedings of SuperComputing
, pp. 1-13
-
-
Nguyen, A.1
Satish, N.2
Chhugani, J.3
Kim, C.4
Dubey, P.5
-
33
-
-
31344457004
-
Overview of the architecture, circuit design, and physical implementation of a first-generation cell processor
-
Pham DC, Aipperspach T, Boerstler D, et al. Overview of the architecture, circuit design, and physical implementation of a first-generation cell processor. IEEE Journal of Solid-State Circuits. 2006 ; 41: 179-196
-
(2006)
IEEE Journal of Solid-State Circuits
, vol.41
, pp. 179-196
-
-
Pham, D.C.1
Aipperspach, T.2
Boerstler, D.3
-
37
-
-
0037383334
-
High-order finite difference and finite volume WENO schemes and discontinuous Galerkin methods for CFD
-
Shu C. High-order finite difference and finite volume WENO schemes and discontinuous Galerkin methods for CFD. International Journal of Computational Fluid Dynamics. 2003 ; 17: 107-118
-
(2003)
International Journal of Computational Fluid Dynamics
, vol.17
, pp. 107-118
-
-
Shu, C.1
-
38
-
-
35449003235
-
-
New York: ACM Press;
-
Solar-Lezama A, Arnold G, Tancau L, Bodik R, Saraswat V, Seshia S Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation. New York: ACM Press ; 2007: 167-178.
-
(2007)
Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 167-178
-
-
Solar-Lezama, A.1
Arnold, G.2
Tancau, L.3
Bodik, R.4
Saraswat, V.5
Seshia, S.6
-
40
-
-
79959673844
-
-
New York: ACM Press;
-
Tang Y, Chowdhury RA, Kuszmaul BC, Luk C-K, Leiserson CE Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '11). New York: ACM Press ; 2011: 117.
-
(2011)
Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '11)
, pp. 117
-
-
Tang, Y.1
Chowdhury, R.A.2
Kuszmaul, B.C.3
Luk, C.-K.4
Leiserson, C.E.5
-
42
-
-
70449657442
-
Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization
-
Wellein G, Hager G, Zeiser T, Wittmann M, Fehske H. Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization. 2009 33rd Annual IEEE International Computer Software and Applications Conference. 2009 ;: 579-586
-
(2009)
2009 33rd Annual IEEE International Computer Software and Applications Conference
, pp. 579-586
-
-
Wellein, G.1
Hager, G.2
Zeiser, T.3
Wittmann, M.4
Fehske, H.5
-
43
-
-
0034448098
-
Optimal instruction scheduling using integer programming
-
Wilken K. Optimal instruction scheduling using integer programming. ACM SIGPLAN Notices. 2000
-
(2000)
ACM SIGPLAN Notices
-
-
Wilken, K.1
-
44
-
-
51049106193
-
Lattice Boltzmann simulation optimization on leading multicore platforms
-
Williams S, Carter J, Oliker L, Shalf J, Yelick K. Lattice Boltzmann simulation optimization on leading multicore platforms. 2008 IEEE International Symposium on Parallel and Distributed Processing. 2008 ;: 1-14
-
(2008)
2008 IEEE International Symposium on Parallel and Distributed Processing
, pp. 1-14
-
-
Williams, S.1
Carter, J.2
Oliker, L.3
Shalf, J.4
Yelick, K.5
-
45
-
-
78650871519
-
Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters
-
Wittmann M, Hager G, Treibig J, Wellein G. Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters. 2010 ; 20: 359-376
-
(2010)
Parallel Processing Letters
, vol.20
, pp. 359-376
-
-
Wittmann, M.1
Hager, G.2
Treibig, J.3
Wellein, G.4