SCOPUS Information Search Platform

Proceedings of the International Conference on Supercomputing

Volume , Issue , 2010, Pages 49-59

Cache oblivious parallelograms in iterative stencil computations

Authors (4): Strzodka, Robert (a); Shaheen, Mohammed (a); Pajak, Dawid (b); Seidel, Hans Peter (a)

a MAX PLANCK INSTITUTE FOR INFORMATICS (Germany)

b WEST POMERANIAN UNIVERSITY OF TECHNOLOGY (Poland)

Author keywords

cache oblivious; memory bound; memory wall; parallelism and locality; stencil; temporal blocking; time skewing

Indexed keywords

3D-SPATIAL DOMAIN; BLOCKING TIME; CACHE-OBLIVIOUS; DATA LOCALITY; DOUBLE PRECISION; EXECUTION TIME; ITERATION SPACES; MEMORY WALL; ON-CHIP CACHE; OPTIMIZERS; PARALLELIZATIONS; PERFORMANCE BENEFITS; STENCIL COMPUTATIONS; SYSTEM BANDWIDTH; TILING STRUCTURES; VECTORIZATION; WORK-LOAD DISTRIBUTION;

INTELLIGENT CONTROL;

CACHE MEMORY;

EID: 77954709215 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1810085.1810096 Document Type: Conference Paper

Times cited : (49)

References (18)

1
- 70449662808
- Technical report, Carnegie Mellon University
- G. E. Blelloch, P. B. Gibbons, and H. V. Simhadri. Low depth cache-oblivious algorithms. Technical report, Carnegie Mellon University, 2009.
- (2009) Low Depth Cache-oblivious Algorithms
- Blelloch, G.E.¹ Gibbons, P.B.² Simhadri, H.V.³

2
- 67650079888
- A practical automatic polyhedral parallelizer and locality optimizer
- U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral parallelizer and locality optimizer. SIGPLAN Not., 43(6):101-113, 2008.
- (2008) SIGPLAN Not. , vol.43 , Issue.6 , pp. 101-113
- Bondhugula, U.¹ Hartono, A.² Ramanujam, J.³ Sadayappan, P.⁴

3
- 32844463802
- Cache oblivious stencil computations
- ACM
- M. Frigo and V. Strumpen. Cache oblivious stencil computations. In ICS '05: Proceedings of the 19th annual international conference on Supercomputing, pages 361-366. ACM, 2005.
- (2005) ICS '05: Proceedings of the 19th Annual International Conference on Supercomputing , pp. 361-366
- Frigo, M.¹ Strumpen, V.²

4
- 33749564381
- The cache complexity of multithreaded cache oblivious algorithms
- New York, NY, USA, ACM
- M. Frigo and V. Strumpen. The cache complexity of multithreaded cache oblivious algorithms. In SPAA '06: Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, pages 271-280, New York, NY, USA, 2006. ACM.
- (2006) SPAA '06: Proceedings of the Eighteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures , pp. 271-280
- Frigo, M.¹ Strumpen, V.²

5
- 4243166952
- Tight bounds on cache use for stencil operations on rectangular grids
- M. A. Frumkin and R. F. Van der Wijngaart. Tight bounds on cache use for stencil operations on rectangular grids. Journal of ACM, 49(3):434-453, 2002.
- (2002) Journal of ACM , vol.49 , Issue.3 , pp. 434-453
- Frumkin, M.A.¹ Van Der Wijngaart, R.F.²

6
- 70449702074
- Parametric multi-level tiling of imperfectly nested loops
- A. Hartono, M. M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, J. Ramanujam, and P. Sadayappan. Parametric multi-level tiling of imperfectly nested loops. In Proceedings of the 23rd International Conference on Supercomputing, pages 147-157, 2009.
- (2009) Proceedings of the 23rd International Conference on Supercomputing , pp. 147-157
- Hartono, A.¹ Baskaran, M.M.² Bastoul, C.³ Cohen, A.⁴ Krishnamoorthy, S.⁵ Norris, B.⁶ Ramanujam, J.⁷ Sadayappan, P.⁸

7
- 77954016628
- HiTLoG: Hierarchical tiled loop generator. http://www.cs.colostate.edu/MMAlpha/tiling/.
- HiTLoG: Hierarchical Tiled Loop Generator

8
- 77954022347
- An auto-tuning framework for parallel multicore stencil computations
- S. Kamil, C. Chan, L. Oliker, J. Shalf, and S. Williams. An auto-tuning framework for parallel multicore stencil computations. In International Parallel & Distributed Processing Symposium (IPDPS), 2010.
- International Parallel & Distributed Processing Symposium (IPDPS), 2010
- Kamil, S.¹ Chan, C.² Oliker, L.³ Shalf, J.⁴ Williams, S.⁵

9
- 34547500808
- Implicit and explicit optimizations for stencil computations
- ACM
- S. Kamil, K. Datta, S. Williams, L. Oliker, J. Shalf, and K. Yelick. Implicit and explicit optimizations for stencil computations. In MSPC '06: Proceedings of the 2006 workshop on Memory system performance and correctness, pages 51-60. ACM, 2006.
- (2006) MSPC '06: Proceedings of the 2006 Workshop on Memory System Performance and Correctness , pp. 51-60
- Kamil, S.¹ Datta, K.² Williams, S.³ Oliker, L.⁴ Shalf, J.⁵ Yelick, K.⁶

10
- 56749175334
- Multi-level tiling: M for the price of one
- D. Kim, L. Renganarayanan, D. Rostron, S. V. Rajopadhye, and M. M. Strout. Multi-level tiling: M for the price of one. In Proceedings of the ACM/IEEE Conference on Supercomputing, page 51, 2007.
- (2007) Proceedings of the ACM/IEEE Conference on Supercomputing , pp. 51
- Kim, D.¹ Renganarayanan, L.² Rostron, D.³ Rajopadhye, S.V.⁴ Strout, M.M.⁵

11
- 35448944792
- Effective automatic parallelization of stencil computations
- S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan. Effective automatic parallelization of stencil computations. SIGPLAN Not., 42(6):235-244, 2007.
- (2007) SIGPLAN Not. , vol.42 , Issue.6 , pp. 235-244
- Krishnamoorthy, S.¹ Baskaran, M.² Bondhugula, U.³ Ramanujam, J.⁴ Rountev, A.⁵ Sadayappan, P.⁶

12
- 77954731998
- Technical report, University of Delaware, Mar.
- D. Orozco and G. Gao. Mapping the FDTD application to many-core chip architectures. Technical report, University of Delaware, Mar. 2009.
- (2009) Mapping the FDTD Application to Many-core Chip Architectures
- Orozco, D.¹ Gao, G.²

13
- 84877715579
- PluTo: A polyhedral automatic parallelizer and locality optimizer for multicores. http://sourceforge.net/projects/pluto-compiler/.
- PluTo: A Polyhedral Automatic Parallelizer and Locality Optimizer for Multicores

14
- 77953982883
- PrimeTile: A parametric multi-level tiler for imperfect loop nests. http://primetile.sourceforge.net.
- PrimeTile: A Parametric Multi-level Tiler for Imperfect Loop Nests

15
- 0032635362
- New tiling techniques to improve cache temporal locality
- Y. Song and Z. Li. New tiling techniques to improve cache temporal locality. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, 1999.
- Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, 1999
- Song, Y.¹ Li, Z.²

16
- 67449120328
- Technical report, IBM Research
- V. Strumpen and M. Frigo. Software engineering aspects of cache oblivious stencil computations. Technical report, IBM Research, 2006.
- (2006) Software Engineering Aspects of Cache Oblivious Stencil Computations
- Strumpen, V.¹ Frigo, M.²

17
- 0024935630
- More iteration space tiling
- M. Wolf. More iteration space tiling. In Proceedings of Supercomputing '89, 1989.
- Proceedings of Supercomputing '89, 1989
- Wolf, M.¹

18
- 0033905336
- Using time skewing to eliminate idle time due to memory bandwidth and network limitations
- D. Wonnacott. Using time skewing to eliminate idle time due to memory bandwidth and network limitations. In Proceedings of International Parallel and Distributed Processing Symposium, 2000.
- Proceedings of International Parallel and Distributed Processing Symposium, 2000
- Wonnacott, D.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.