-
2
-
-
70350771127
-
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
-
Datta K., Murphy M., Volkov V., Williams S., Carter J., Oliker L., Patterson D., Shalf J., Yelick K. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. ACM/IEEE (Ed.): Proceedings of the ACM/IEEE SC 2008 Conference Supercomputing Conference'08 2008.
-
(2008)
ACM/IEEE (Ed.): Proceedings of the ACM/IEEE SC 2008 Conference Supercomputing Conference'08
-
-
Datta, K.1
Murphy, M.2
Volkov, V.3
Williams, S.4
Carter, J.5
Oliker, L.6
Patterson, D.7
Shalf, J.8
Yelick, K.9
-
4
-
-
59749100826
-
Optimization and performance modeling of stencil computations on modern microprocessors
-
Datta K., Kamil S., Williams S., Oliker L., Shalf J., Yelick K. Optimization and performance modeling of stencil computations on modern microprocessors. SIAM Rev. 2009, 51(1):129-159.
-
(2009)
SIAM Rev.
, vol.51
, Issue.1
, pp. 129-159
-
-
Datta, K.1
Kamil, S.2
Williams, S.3
Oliker, L.4
Shalf, J.5
Yelick, K.6
-
5
-
-
70450077422
-
Parallel data-locality aware stencil computations on modern micro-architectures
-
Christen M., Schenk O., Messmer P., Neufeld E., Burkhart H. Parallel data-locality aware stencil computations on modern micro-architectures. Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 25-29 2009.
-
(2009)
Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 25-29
-
-
Christen, M.1
Schenk, O.2
Messmer, P.3
Neufeld, E.4
Burkhart, H.5
-
6
-
-
79958773085
-
-
Efficiency Improvements of Iterative Numerical Algorithms on Modern Architectures. Ph.D. Thesis, July, URN: urn:nbn:de:bvb:29-opus-14036.
-
J. Treibig, Efficiency Improvements of Iterative Numerical Algorithms on Modern Architectures. Ph.D. Thesis, July 2009, URN: urn:nbn:de:bvb:29-opus-14036.
-
(2009)
-
-
Treibig, J.1
-
7
-
-
0033350255
-
Cache-oblivious algorithms
-
Frigo M., Leiserson C.E., Prokop H., Ramachandran S. Cache-oblivious algorithms. 40th Annual Symposium on Foundations of Computer Science, FOCS 99, October 17-18 1999.
-
(1999)
40th Annual Symposium on Foundations of Computer Science, FOCS 99, October 17-18
-
-
Frigo, M.1
Leiserson, C.E.2
Prokop, H.3
Ramachandran, S.4
-
8
-
-
56349170328
-
Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method
-
Zeiser T., Wellein G., Nitsure A., Iglberger K., Rüde U., Hager G. Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method. Prog. CFD 2008, 8(1-4):179-188.
-
(2008)
Prog. CFD
, vol.8
, Issue.1-4
, pp. 179-188
-
-
Zeiser, T.1
Wellein, G.2
Nitsure, A.3
Iglberger, K.4
Rüde, U.5
Hager, G.6
-
9
-
-
70449657442
-
Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization
-
Wellein G., Hager G., Zeiser T., Wittmann M., Fehske H. Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization. Proc. COMPSAC 2009 2009, 10.1109/COMPSAC.1.2009.82.
-
(2009)
Proc. COMPSAC 2009
-
-
Wellein, G.1
Hager, G.2
Zeiser, T.3
Wittmann, M.4
Fehske, H.5
-
11
-
-
79958765147
-
-
STREAM: Sustainable Memory Bandwidth in High Performance Computers.
-
J.D. McCalpin, STREAM: Sustainable Memory Bandwidth in High Performance Computers. http://www.cs.virginia.edu/stream.
-
-
-
McCalpin, J.D.1
-
13
-
-
78649844813
-
-
LIKWID: a lightweight performance-oriented tool suite for x86 multicore environments, PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego CA, September 13, arXiv:1004.4431, in press. doi:10.1109/ICPPW.2010.38
-
J. Treibig, G. Hager, G. Wellein, LIKWID: a lightweight performance-oriented tool suite for x86 multicore environments, PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego CA, September 13, 2010. arXiv:1004.4431, in press. doi:10.1109/ICPPW.2010.38.
-
-
-
Treibig, J.1
Hager, G.2
Wellein, G.3
-
14
-
-
78650871519
-
Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters
-
Wittmann M., Hager G., Treibig J., Wellein G. Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters 2010, 20(4):359-376.
-
(2010)
Parallel Processing Letters
, vol.20
, Issue.4
, pp. 359-376
-
-
Wittmann, M.1
Hager, G.2
Treibig, J.3
Wellein, G.4