SCOPUS Information Retrieval Platform

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Volume , Issue , 2012, Pages

PATUS for convenient high-performance stencils: Evaluation in earthquake simulations

(3) Christen, Matthias (a); Schenk, Olaf (a); Cui, Yifeng (b)

a UNIVERSITY OF LUGANO (Switzerland)

b UNIVERSITY OF CALIFORNIA (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPLEX SIMULATION; DISCRETIZATIONS; EARTHQUAKE SIMULATION; MANY-CORE PROCESSORS; PARALLEL EFFICIENCY; PROGRAMMER PRODUCTIVITY; SEISMIC APPLICATION; STENCIL COMPUTATIONS;

EARTHQUAKES; NETWORK COMPONENTS; PRODUCTIVITY; SPECIFICATION LANGUAGES;

THREE DIMENSIONAL COMPUTER GRAPHICS;

EID: 84877717516 PISSN: 21674329 EISSN: 21674337 Source Type: Conference Proceeding
DOI: 10.1109/SC.2012.95 Document Type: Conference Paper

Times cited : (35)

References (28)

1
- 78650818575
- Scalable Earthquake Simulation on Petascale Supercomputers
- Y. Cui, K. Olsen, T. Jordan, K. Lee, J. Zhou, P. Small, D. Roten, G. Ely, D. Panda, A. Chourasia, J. Levesque, S. Day, and P. Maechling, "Scalable Earthquake Simulation on Petascale Supercomputers," in Proc. ACM/IEEE Int'l Conference for High Performance Computing, Networking, Storage and Analysis (SC 2010), 2010, pp. 1-20.
- Proc. ACM/IEEE Int'l Conference for High Performance Computing, Networking, Storage and Analysis (SC 2010), 2010 , pp. 1-20
- Cui, Y.¹ Olsen, K.² Jordan, T.³ Lee, K.⁴ Zhou, J.⁵ Small, P.⁶ Roten, D.⁷ Ely, G.⁸ Panda, D.⁹ Chourasia, A.¹⁰ Levesque, J.¹¹ Day, S.¹² Maechling, P.¹³

2
- 80053238973
- PATUS: A Code Generation and Autotuning Framework For Parallel Iterative Stencil Computations on Modern Microarchitectures
- M. Christen, O. Schenk, and H. Burkhart, "PATUS: A Code Generation and Autotuning Framework For Parallel Iterative Stencil Computations on Modern Microarchitectures," in Proc. IEEE Int'l Parallel & Distributed Processing Symposium (IPDPS 2011), 2011, pp. 1-12.
- Proc. IEEE Int'l Parallel & Distributed Processing Symposium (IPDPS 2011), 2011 , pp. 1-12
- Christen, M.¹ Schenk, O.² Burkhart, H.³

3
- 84877695627
- Ph.D. dissertation, University of Basel, Switzerland
- M. Christen, "Generating and Auto-Tuning Parallel Stencil Codes," Ph.D. dissertation, University of Basel, Switzerland, 2011.
- (2011) Generating and Auto-Tuning Parallel Stencil Codes
- Christen, M.¹

4
- 0003616175
- Ph.D. dissertation, Univ. of Utah
- K. Olsen, "Simulation of Three-Dimensional Wave Propagation in the Salt Lake Basin," Ph.D. dissertation, Univ. of Utah, 1994.
- (1994) Simulation of Three-Dimensional Wave Propagation in the Salt Lake Basin
- Olsen, K.¹

5
- 0042885467
- On the Implementation of Perfectly Matched Layers in a 3D Fourth-Order Velocity-Stress Finite-Difference Scheme
- C. Marcinkovich and K. Olsen, "On the Implementation of Perfectly Matched Layers in a 3D Fourth-Order Velocity-Stress Finite-Difference Scheme," J. Geophys. Res., vol. 108 (B5), 2003.
- (2003) J. Geophys. Res. , vol.108 , Issue.B5
- Marcinkovich, C.¹ Olsen, K.²

6
- 70450077422
- Parallel Data-Locality Aware Stencil Computations on Modern Micro-Architectures
- M. Christen, O. Schenk, E. Neufeld, P. Messmer, and H. Burkhart, "Parallel Data-Locality Aware Stencil Computations on Modern Micro-Architectures," in Proc. IEEE Int'l Parallel & Distributed Processing Symposium (IPDPS 2009), May 2009, pp. 1-10.
- Proc. IEEE Int'l Parallel & Distributed Processing Symposium (IPDPS 2009), May 2009 , pp. 1-10
- Christen, M.¹ Schenk, O.² Neufeld, E.³ Messmer, P.⁴ Burkhart, H.⁵

7
- 33749564381
- The Cache Complexity of Multithreaded Cache Oblivious Algorithms
- M. Frigo and V. Strumpen, "The Cache Complexity of Multithreaded Cache Oblivious Algorithms," in Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2006), 2006, pp. 271-280.
- Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2006), 2006 , pp. 271-280
- Frigo, M.¹ Strumpen, V.²

8
- 77954709215
- Cache oblivious parallelograms in iterative stencil computations
- R. Strzodka, M. Shaheen, D. Pajak, and H. Seidel, "Cache oblivious parallelograms in iterative stencil computations," in Proc. ACM Int'l Conference on Supercomputing (ICS 2010), 2010, pp. 49-59.
- Proc. ACM Int'l Conference on Supercomputing (ICS 2010), 2010 , pp. 49-59
- Strzodka, R.¹ Shaheen, M.² Pajak, D.³ Seidel, H.⁴

9
- 79959673844
- The Pochoir Stencil Compiler
- Y. Tang, R. Chowdhury, B. Kuszmaul, C. Luk, and C. Leiserson, "The Pochoir Stencil Compiler," in Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2011), 2011, pp. 117-128.
- Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2011), 2011 , pp. 117-128
- Tang, Y.¹ Chowdhury, R.² Kuszmaul, B.³ Luk, C.⁴ Leiserson, C.⁵

10
- 79958773431
- Efficient Multicore-Aware Parallelization Strategies for Iterative Stencil Computations
- J. Treibig, G. Wellein, and G. Hager, "Efficient Multicore-Aware Parallelization Strategies for Iterative Stencil Computations," J. Comp. Sci., vol. 2, no. 2, pp. 130-137, 2011.
- (2011) J. Comp. Sci. , vol.2 , Issue.2 , pp. 130-137
- Treibig, J.¹ Wellein, G.² Hager, G.³

11
- 84863436006
- Time skewing for parallel computers
- Springer-Verlag
- D. Wonnacott, "Time skewing for parallel computers," in Proc. Int'l Workshop on Compilers for Parallel Computing (CPC 1999). Springer-Verlag, 1999, pp. 477-480.
- (1999) Proc. Int'l Workshop on Compilers for Parallel Computing (CPC 1999) , pp. 477-480
- Wonnacott, D.¹

12
- 79959601133
- Mint: Realizing CUDA Performance in 3D Stencil Methods with Annotated C
- D. Unat, X. Cai, and S. Baden, "Mint: Realizing CUDA Performance in 3D Stencil Methods with Annotated C," in Proc. ACM Int'l Conference on Supercomputing (ICS 2011), 2011, pp. 214-224.
- Proc. ACM Int'l Conference on Supercomputing (ICS 2011), 2011 , pp. 214-224
- Unat, D.¹ Cai, X.² Baden, S.³

13
- 83155190224
- Physis: An Implicitly Parallel Programming Model for Stencil Computations on Large-Scale GPU-Accelerated Supercomputers
- IEEE Computer Society
- N. Maruyama, T. Nomura, K. Sato, and S. Matsuoka, "Physis: An Implicitly Parallel Programming Model for Stencil Computations on Large-Scale GPU-Accelerated Supercomputers," in Proc. ACM/IEEE Int'l Conference for High Performance Computing, Networking, Storage and Analysis (SC 2011). IEEE Computer Society, 2011.
- (2011) Proc. ACM/IEEE Int'l Conference for High Performance Computing, Networking, Storage and Analysis (SC 2011)
- Maruyama, N.¹ Nomura, T.² Sato, K.³ Matsuoka, S.⁴

14
- 79551491518
- A Performance Study for Iterative Stencil Loops on GPUs with Ghost Zone Optimizations
- February
- J. Meng and K. Skadron, "A Performance Study for Iterative Stencil Loops on GPUs with Ghost Zone Optimizations," Int. J. Parallel Prog., vol. 39, pp. 115-142, February 2011.
- (2011) Int. J. Parallel Prog. , vol.39 , pp. 115-142
- Meng, J.¹ Skadron, K.²

15
- 84861635761
- A Hybrid Circular Queue Method for Iterative Stencil Computations on GPUs
- Y. Yang, H. Cui, X. Feng, and J. Xue, "A Hybrid Circular Queue Method for Iterative Stencil Computations on GPUs," J. Comp. Sci. Tech., vol. 27, pp. 57-74, 2012.
- (2012) J. Comp. Sci. Tech. , vol.27 , pp. 57-74
- Yang, Y.¹ Cui, H.² Feng, X.³ Xue, J.⁴

16
- 24644456455
- Automatic tiling of iterative stencil loops
- DOI 10.1145/1034774.1034777
- Z. Li and Y. Song, "Automatic Tiling of Iterative Stencil Loops," ACM Trans. Program. Lang. Syst., vol. 26, no. 6, pp. 975-1028, 2004. (Pubitemid 41270296)
- (2004) ACM Transactions on Programming Languages and Systems , vol.26 , Issue.6 , pp. 975-1028
- Li, Z.¹ Song, Y.²

17
- 77953972043
- Ph.D. dissertation, Univ. of California, Berkeley
- K. Datta, "Auto-tuning Stencil Codes for Cache-Based Multicore Platforms," Ph.D. dissertation, Univ. of California, Berkeley, 2009.
- (2009) Auto-tuning Stencil Codes for Cache-Based Multicore Platforms
- Datta, K.¹

18
- 77954022347
- An Auto-tuning Framework For Parallel Multicore Stencil Computations
- S. Kamil, C. Chan, L. Oliker, J. Shalf, and S. Williams, "An Auto-tuning Framework For Parallel Multicore Stencil Computations," in Proc. IEEE Int'l Parallel & Distributed Processing Symposium (IPDPS 2010), April 2010.
- Proc. IEEE Int'l Parallel & Distributed Processing Symposium (IPDPS 2010), April 2010
- Kamil, S.¹ Chan, C.² Oliker, L.³ Shalf, J.⁴ Williams, S.⁵

19
- 78650806116
- 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs
- IEEE Computer Society
- A. Nguyen, N. Satish, J. Chhugani, C. Kim, and P. Dubey, "3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs," in Proc. ACM/IEEE Int'l Conference for High Performance Computing, Networking, Storage and Analysis (SC 2010). IEEE Computer Society, 2010.
- (2010) Proc. ACM/IEEE Int'l Conference for High Performance Computing, Networking, Storage and Analysis (SC 2010)
- Nguyen, A.¹ Satish, N.² Chhugani, J.³ Kim, C.⁴ Dubey, P.⁵

20
- 84877698136
- Optimizing the Performance of Streaming Numerical Kernels on the IBM Blue Gene/P PowerPC 450 Processor
- T. Malas, A. Ahmadia, J. Brown, J. Gunnels, and D. Keyes, "Optimizing the Performance of Streaming Numerical Kernels on the IBM Blue Gene/P PowerPC 450 Processor," IJHPCA, 2012.
- (2012) IJHPCA
- Malas, T.¹ Ahmadia, A.² Brown, J.³ Gunnels, J.⁴ Keyes, D.⁵

21
- 35448985754
- Parameterized Tiled Loops for Free
- L. Renganarayanan, D. Kim, S. Rajopadhye, and M. Strout, "Parameterized Tiled Loops for Free," in Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2007), 2007, pp. 405-414.
- Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2007), 2007 , pp. 405-414
- Renganarayanan, L.¹ Kim, D.² Rajopadhye, S.³ Strout, M.⁴

22
- 57349139452
- A Practical Automatic Polyhedral Parallelizer and Locality Optimizer
- U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, "A Practical Automatic Polyhedral Parallelizer and Locality Optimizer," in Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2008), 2008.
- Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2008), 2008
- Bondhugula, U.¹ Hartono, A.² Ramanujam, J.³ Sadayappan, P.⁴

23
- 77954412565
- Loop Transformation Recipes for Code Generation and Auto-Tuning
- Languages and Compilers for Parallel Computing, ser. G. Gao, L. Pollock, J. Cavazos, and X. Li, Eds., Springer Berlin / Heidelberg
- M. Hall, J. Chame, C. Chen, J. Shin, G. Rudy, and M. Khan, "Loop Transformation Recipes for Code Generation and Auto-Tuning," in Languages and Compilers for Parallel Computing, ser. Lecture Notes in Computer Science, G. Gao, L. Pollock, J. Cavazos, and X. Li, Eds., vol. 5898. Springer Berlin / Heidelberg, 2010, pp. 50-64.
- (2010) Lecture Notes in Computer Science , vol.5898 , pp. 50-64
- Hall, M.¹ Chame, J.² Chen, C.³ Shin, J.⁴ Rudy, G.⁵ Khan, M.⁶

24
- 84863015363
- A Heterogeneous Parallel Framework for Domain-Specific Languages
- K. Brown, A. Sujeeth, H. Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun, "A Heterogeneous Parallel Framework for Domain-Specific Languages," in Proc. Int'l Conference on Parallel Architectures and Compilation Techniques (PACT 2011), 2011, pp. 89-100.
- Proc. Int'l Conference on Parallel Architectures and Compilation Techniques (PACT 2011), 2011 , pp. 89-100
- Brown, K.¹ Sujeeth, A.² Lee, H.³ Rompf, T.⁴ Chafi, H.⁵ Odersky, M.⁶ Olukotun, K.⁷

25
- 84955498575
- Cetus: A Source-to-Source Compiler Infrastructure for Multicores
- H. Bae, L. Bachega, C. Dave, S. Lee, S. Lee, S. Min, R. Eigenmann, and S. Midkiff, "Cetus: A Source-to-Source Compiler Infrastructure for Multicores," in Proc. Int'l Workshop on Compilers for Parallel Computing (CPC 2009), 2009.
- Proc. Int'l Workshop on Compilers for Parallel Computing (CPC 2009), 2009
- Bae, H.¹ Bachega, L.² Dave, C.³ Lee, S.⁴ Lee, S.⁵ Min, S.⁶ Eigenmann, R.⁷ Midkiff, S.⁸

26
- 38149089799
- accessed April 2012
- H. Mössenböck, M. Löberbauer, and A. Wöß, "The Compiler Generator Coco/R," http://www.ssw.uni-linz.ac.at/coco, accessed April 2012.
- The Compiler Generator Coco/R
- Mössenböck, H.¹ Löberbauer, M.² Wöß, A.³

27
- 34547503691
- Department of Computer Science, Rutgers University, Tech. Rep. DCS-TR-379
- J. McCalpin and D. Wonnacott, "Time skewing: A value-based approach to optimizing for memory locality," Department of Computer Science, Rutgers University, Tech. Rep. DCS-TR-379, 1998.
- (1998) Time Skewing: A Value-based Approach to Optimizing for Memory Locality
- McCalpin, J.¹ Wonnacott, D.²

28
- 78649844813
- LIKWID: A Lightweight Performance-oriented Tool Suite for x86 Multicore Environments
- J. Treibig, G. Hager, and G. Wellein, "LIKWID: A Lightweight Performance-oriented Tool Suite for x86 Multicore Environments," in Proc. First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI2010), San Diego CA, 2010.
- Proc. First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI2010), San Diego CA, 2010
- Treibig, J.¹ Hager, G.² Wellein, G.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.