SCOPUS Information Search Platform

Volume 2, Issue 2, 2011, Pages 130-137

Efficient multicore-aware parallelization strategies for iterative stencil computations

Author keywords

Multicore; Simultaneous multi threading; Spatial blocking; Stencil computations; Temporal blocking; Wavefront parallelization

Indexed keywords

MULTI CORE; SIMULTANEOUS MULTI-THREADING; SPATIAL BLOCKING; STENCIL COMPUTATIONS; TEMPORAL BLOCKING; WAVEFRONT PARALLELIZATION;

ALGORITHMS; SOFTWARE ARCHITECTURE;

OPTIMIZATION;

EID: 79958773431 PISSN: 18777503 EISSN: None Source Type: Journal
DOI: 10.1016/j.jocs.2011.01.010 Document Type: Article

Times cited : (33)

References (14)

2
- 70350771127
- Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
- Datta K., Murphy M., Volkov V., Williams S., Carter J., Oliker L., Patterson D., Shalf J., Yelick K. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. ACM/IEEE (Ed.): Proceedings of the ACM/IEEE SC 2008 Conference Supercomputing Conference'08 2008.
- (2008) ACM/IEEE (Ed.): Proceedings of the ACM/IEEE SC 2008 Conference Supercomputing Conference'08
- Datta, K.¹ Murphy, M.² Volkov, V.³ Williams, S.⁴ Carter, J.⁵ Oliker, L.⁶ Patterson, D.⁷ Shalf, J.⁸ Yelick, K.⁹

3
- 32444450812
- SCS Publishing House, Germany, ISBN 3-936150-39-7
- Kowarschik M. Data Locality Optimizations for Iterative Numerical Algorithms and Cellular Automata on Hierarchical Memory Architectures. Ph.D. Thesis, July 2004, SCS Publishing House, Germany, ISBN 3-936150-39-7.
- (2004) Data Locality Optimizations for Iterative Numerical Algorithms and Cellular Automata on Hierarchical Memory Architectures. Ph.D. Thesis, July 2004
- Kowarschik, M.¹

4
- 59749100826
- Optimization and performance modeling of stencil computations on modern microprocessors
- Datta K., Kamil S., Williams S., Oliker L., Shalf J., Yelick K. Optimization and performance modeling of stencil computations on modern microprocessors. SIAM Rev. 2009, 51(1):129-159.
- (2009) SIAM Rev. , vol.51 , Issue.1 , pp. 129-159
- Datta, K.¹ Kamil, S.² Williams, S.³ Oliker, L.⁴ Shalf, J.⁵ Yelick, K.⁶

5
- 70450077422
- Parallel data-locality aware stencil computations on modern micro-architectures
- Christen M., Schenk O., Messmer P., Neufeld E., Burkhart H. Parallel data-locality aware stencil computations on modern micro-architectures. Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 25-29 2009.
- (2009) Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 25-29
- Christen, M.¹ Schenk, O.² Messmer, P.³ Neufeld, E.⁴ Burkhart, H.⁵

6
- 79958773085
- Efficiency Improvements of Iterative Numerical Algorithms on Modern Architectures. Ph.D. Thesis, July, URN: urn:nbn:de:bvb:29-opus-14036.
- J. Treibig, Efficiency Improvements of Iterative Numerical Algorithms on Modern Architectures. Ph.D. Thesis, July 2009, URN: urn:nbn:de:bvb:29-opus-14036.
- (2009)
- Treibig, J.¹

7
- 0033350255
- Cache-oblivious algorithms
- Frigo M., Leiserson C.E., Prokop H., Ramachandran S. Cache-oblivious algorithms. 40th Annual Symposium on Foundations of Computer Science, FOCS 99, October 17-18 1999.
- (1999) 40th Annual Symposium on Foundations of Computer Science, FOCS 99, October 17-18
- Frigo, M.¹ Leiserson, C.E.² Prokop, H.³ Ramachandran, S.⁴

8
- 56349170328
- Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method
- Zeiser T., Wellein G., Nitsure A., Iglberger K., Rüde U., Hager G. Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method. Prog. CFD 2008, 8(1-4):179-188.
- (2008) Prog. CFD , vol.8 , Issue.1-4 , pp. 179-188
- Zeiser, T.¹ Wellein, G.² Nitsure, A.³ Iglberger, K.⁴ Rüde, U.⁵ Hager, G.⁶

9
- 70449657442
- Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization
- Wellein G., Hager G., Zeiser T., Wittmann M., Fehske H. Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization. Proc. COMPSAC 2009 2009, 10.1109/COMPSAC.1.2009.82.
- (2009) Proc. COMPSAC 2009
- Wellein, G.¹ Hager, G.² Zeiser, T.³ Wittmann, M.⁴ Fehske, H.⁵

11
- 79958765147
- STREAM: Sustainable Memory Bandwidth in High Performance Computers.
- J.D. McCalpin, STREAM: Sustainable Memory Bandwidth in High Performance Computers. http://www.cs.virginia.edu/stream.
- McCalpin, J.D.¹

12
- 79958768049
- Introducing a performance model for bandwidth limited loop kernels
- Treibig J., Hager G. Introducing a performance model for bandwidth limited loop kernels. Workshop on Memory Issues on Multi- and Manycore Platforms, PPAM 2009.
- (2009) Workshop on Memory Issues on Multi- and Manycore Platforms, PPAM
- Treibig, J.¹ Hager, G.²

13
- 78649844813
- LIKWID: a lightweight performance-oriented tool suite for x86 multicore environments, PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego CA, September 13, arXiv:1004.4431, in press. doi:10.1109/ICPPW.2010.38
- J. Treibig, G. Hager, G. Wellein, LIKWID: a lightweight performance-oriented tool suite for x86 multicore environments, PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego CA, September 13, 2010. arXiv:1004.4431, in press. doi:10.1109/ICPPW.2010.38.
- Treibig, J.¹ Hager, G.² Wellein, G.³

14
- 78650871519
- Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters
- Wittmann M., Hager G., Treibig J., Wellein G. Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters 2010, 20(4):359-376.
- (2010) Parallel Processing Letters , vol.20 , Issue.4 , pp. 359-376
- Wittmann, M.¹ Hager, G.² Treibig, J.³ Wellein, G.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.