2011, Pages 117-128

The Pochoir stencil compiler

Author keywords

C++; cache-oblivious algorithm; Cilk; compiler; embedded domain-specific language; multicore; parallel computation; stencil computation; trapezoidal decomposition

Indexed keywords

C++; CACHE-OBLIVIOUS ALGORITHMS; CILK; COMPILER; DOMAIN SPECIFIC LANGUAGES; MULTI CORE; PARALLEL COMPUTATION; STENCIL COMPUTATIONS; TRAPEZOIDAL DECOMPOSITION

EID: 79959673844     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1989493.1989508     Document Type: Conference Paper
Times cited: 297

References (41)
  • 1
    • 0041940818
    • Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots
    • T. Akutsu. Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots. Discrete Applied Mathematics, 104:45-62, 2000.
    • (2000) Discrete Applied Mathematics , vol.104 , pp. 45-62
    • Akutsu, T.1
  • 2
    • 0000148916
    • Salinity-driven thermocline transients in a wind- and thermohaline-forced isopycnic coordinate model of the North Atlantic
    • R. Bleck, C. Rooth, D. Hu, and L. T. Smith. Salinity-driven thermocline transients in a wind- and thermohaline-forced isopycnic coordinate model of the North Atlantic. Journal of Physical Oceanography, 22(12):1486-1505, 1992.
    • (1992) Journal of Physical Oceanography , vol.22 , Issue.12 , pp. 1486-1505
    • Bleck, R.1    Rooth, C.2    Hu, D.3    Smith, L.T.4
  • 4
    • 84976690230
    • Fortran at ten Gigaflops: The Connection Machine convolution compiler
    • Toronto, Ontario, Canada, June 26-28
    • M. Bromley, S. Heller, T. McNerney, and G. L. Steele Jr. Fortran at ten Gigaflops: The Connection Machine convolution compiler. In PLDI, pages 145-156, Toronto, Ontario, Canada, June 26-28 1991.
    • (1991) PLDI , pp. 145-156
    • Bromley, M.1    Heller, S.2    McNerney, T.3    Steele Jr., G.L.4
  • 5
    • 67650259451
    • C++ Standards Committee. ISO/IEC Document Number N3242=11-0012
    • C++ Standards Committee. Working draft, standard for programming language C++. Available from http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf, 2011. ISO/IEC Document Number N3242=11-0012.
    • (2011) Working Draft, Standard for Programming Language C++
  • 6
    • 77955467765
    • Cache-oblivious dynamic programming for bioinformatics
    • July-Sept.
    • R. A. Chowdhury, H.-S. Le, and V. Ramachandran. Cache-oblivious dynamic programming for bioinformatics. TCBB, 7(3):495-510, July-Sept. 2010.
    • (2010) TCBB , vol.7 , Issue.3 , pp. 495-510
    • Chowdhury, R.A.1    Le, H.-S.2    Ramachandran, V.3
  • 9
    • 70350771127
    • Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
    • Austin, TX, Nov. 15-18
    • K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter, L. Oliker, D. Patterson, J. Shalf, and K. Yelick. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In SC, pages 4:1-4:12, Austin, TX, Nov. 15-18 2008.
    • (2008) SC
    • Datta, K.1    Murphy, M.2    Volkov, V.3    Williams, S.4    Carter, J.5    Oliker, L.6    Patterson, D.7    Shalf, J.8    Yelick, K.9
  • 10
    • 0001813087
    • Domain-specific languages: An annotated bibliography
    • June
    • A. van Deursen, P. Klint, and J. Visser. Domain-specific languages: An annotated bibliography. SIGPLAN Not., 35(6):26-36, June 2000.
    • (2000) SIGPLAN Not. , vol.35 , Issue.6 , pp. 26-36
    • Van Deursen, A.1    Klint, P.2    Visser, J.3
  • 11
    • 70350630432
    • A multilevel parallelization framework for high-order stencil computations
    • Delft, The Netherlands, Aug. 25-28
    • H. Dursun, K.-i. Nomura, L. Peng, R. Seymour, W. Wang, R. K. Kalia, A. Nakano, and P. Vashishta. A multilevel parallelization framework for high-order stencil computations. In Euro-Par, pages 642-653, Delft, The Netherlands, Aug. 25-28 2009.
    • (2009) Euro-Par , pp. 642-653
    • Dursun, H.1    Nomura, K.-I.2    Peng, L.3    Seymour, R.4    Wang, W.5    Kalia, R.K.6    Nakano, A.7    Vashishta, P.8
  • 15
    • 0033350255
    • Cache-oblivious algorithms
    • New York, NY, Oct. 17-19
    • M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In FOCS, pages 285-297, New York, NY, Oct. 17-19 1999.
    • (1999) FOCS , pp. 285-297
    • Frigo, M.1    Leiserson, C.E.2    Prokop, H.3    Ramachandran, S.4
  • 16
    • 32844463802
    • Cache oblivious stencil computations
    • Cambridge, MA, June 20-22
    • M. Frigo and V. Strumpen. Cache oblivious stencil computations. In ICS, pages 361-366, Cambridge, MA, June 20-22 2005.
    • (2005) ICS , pp. 361-366
    • Frigo, M.1    Strumpen, V.2
  • 17
    • 67449111382
    • The cache complexity of multithreaded cache oblivious algorithms
    • M. Frigo and V. Strumpen. The cache complexity of multithreaded cache oblivious algorithms. Theory of Computing Systems, 45(2):203-233, 2009.
    • (2009) Theory of Computing Systems , vol.45 , Issue.2 , pp. 203-233
    • Frigo, M.1    Strumpen, V.2
  • 18
    • 0000870032
    • Mathematical Games
    • M. Gardner. Mathematical Games. Scientific American, 223(4):120-123, 1970.
    • (1970) Scientific American , vol.223 , Issue.4 , pp. 120-123
    • Gardner, M.1
  • 19
    • 0020484488
    • An improved algorithm for matching biological sequences
    • O. Gotoh. An improved algorithm for matching biological sequences. Journal of Molecular Biology, 162:705-708, 1982.
    • (1982) Journal of Molecular Biology , vol.162 , pp. 705-708
    • Gotoh, O.1
  • 20
    • 77954942121
    • The Cilkview scalability analyzer
    • Santorini, Greece, June 13-15
    • Y. He, C. E. Leiserson, and W. M. Leiserson. The Cilkview scalability analyzer. In SPAA, pages 145-156, Santorini, Greece, June 13-15 2010.
    • (2010) SPAA , pp. 145-156
    • He, Y.1    Leiserson, C.E.2    Leiserson, W.M.3
  • 21
    • 0001082611
    • Building domain-specific embedded languages
    • December
    • P. Hudak. Building domain-specific embedded languages. ACM Computing Surveys, 28(4), December 1996.
    • (1996) ACM Computing Surveys , vol.28 , Issue.4
    • Hudak, P.1
  • 22
    • 79959681659
    • Intel software autotuning tool. http://software.intel.com/en-us/articles/intel-software-autotuning-tool/, 2010.
    • (2010) Intel Software Autotuning Tool
  • 23
    • 79959678281
    • Document Number: 324396-001US
    • Intel Corporation. Intel Cilk Plus Language Specification, 2010. Document Number: 324396-001US. Available from http://software.intel.com/sites/products/cilk-plus/cilk-plus-language-specification.pdf.
    • (2010) Intel Cilk Plus Language Specification
  • 25
    • 77954022347
    • An auto-tuning framework for parallel multicore stencil computations
    • S. Kamil, C. Chan, L. Oliker, J. Shalf, and S. Williams. An auto-tuning framework for parallel multicore stencil computations. In IPDPS, pages 1-12, 2010.
    • (2010) IPDPS , pp. 1-12
    • Kamil, S.1    Chan, C.2    Oliker, L.3    Shalf, J.4    Williams, S.5
  • 26
    • 34547500808
    • Implicit and explicit optimizations for stencil computations
    • San Jose, CA
    • S. Kamil, K. Datta, S. Williams, L. Oliker, J. Shalf, and K. Yelick. Implicit and explicit optimizations for stencil computations. In MSPC, pages 51-60, San Jose, CA, 2006.
    • (2006) MSPC , pp. 51-60
    • Kamil, S.1    Datta, K.2    Williams, S.3    Oliker, L.4    Shalf, J.5    Yelick, K.6
  • 27
    • 84958661690
    • Impact of modern memory subsystems on cache optimizations for stencil computations
    • Chicago, IL, June 12
    • S. Kamil, P. Husbands, L. Oliker, J. Shalf, and K. Yelick. Impact of modern memory subsystems on cache optimizations for stencil computations. In MSP, pages 36-43, Chicago, IL, June 12 2005.
    • (2005) MSP , pp. 36-43
    • Kamil, S.1    Husbands, P.2    Oliker, L.3    Shalf, J.4    Yelick, K.5
  • 29
    • 79959651189
    • https://perf.wiki.kernel.org/index.php/Main_Page
  • 30
    • 0000331979
    • Lattice Boltzmann method for 3-D flows with curved boundary
    • R. Mei, W. Shyy, D. Yu, and L. Luo. Lattice Boltzmann method for 3-D flows with curved boundary. J. of Comput. Phys, 161(2):680-699, 2000.
    • (2000) J. of Comput. Phys , vol.161 , Issue.2 , pp. 680-699
    • Mei, R.1    Shyy, W.2    Yu, D.3    Luo, L.4
  • 31
    • 33745167684
    • When and how to develop domain-specific languages
    • December
    • M. Mernik, J. Heering, and A. M. Sloane. When and how to develop domain-specific languages. ACM Computing Surveys, 37:316-344, December 2005.
    • (2005) ACM Computing Surveys , vol.37 , pp. 316-344
    • Mernik, M.1    Heering, J.2    Sloane, A.M.3
  • 32
    • 67650671606
    • 3D finite difference computation on GPUs using CUDA
    • Washington, DC, Mar. 8
    • P. Micikevicius. 3D finite difference computation on GPUs using CUDA. In GPGPU, pages 79-84, Washington, DC, Mar. 8 2009.
    • (2009) GPGPU , pp. 79-84
    • Micikevicius, P.1
  • 33
    • 0028714453
    • Multiresolution molecular dynamics algorithm for realistic materials modeling on parallel computers
    • A. Nakano, R. Kalia, and P. Vashishta. Multiresolution molecular dynamics algorithm for realistic materials modeling on parallel computers. Computer Physics Communications, 83(2-3):197-214, 1994.
    • (1994) Computer Physics Communications , vol.83 , Issue.2-3 , pp. 197-214
    • Nakano, A.1    Kalia, R.2    Vashishta, P.3
  • 38
    • 0008198155
    • Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June
    • H. Prokop. Cache-oblivious algorithms. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June 1999.
    • (1999) Cache-oblivious Algorithms
    • Prokop, H.1
  • 39
    • 35448947231
    • Compiling stencils in High Performance Fortran
    • San Jose, CA, Nov. 16-20 ACM
    • G. Roth, J. Mellor-Crummey, K. Kennedy, and R. G. Brickner. Compiling stencils in High Performance Fortran. In SC, pages 1-20, San Jose, CA, Nov. 16-20 1997. ACM.
    • (1997) SC , pp. 1-20
    • Roth, G.1    Mellor-Crummey, J.2    Kennedy, K.3    Brickner, R.G.4
  • 41
    • 51049106193
    • Lattice Boltzmann simulation optimization on leading multicore platforms
    • Miami, FL, Apr.
    • S. Williams, J. Carter, L. Oliker, J. Shalf, and K. Yelick. Lattice Boltzmann simulation optimization on leading multicore platforms. In IPDPS, pages 1-14, Miami, FL, Apr. 2008.
    • (2008) IPDPS , pp. 1-14
    • Williams, S.1    Carter, J.2    Oliker, L.3    Shalf, J.4    Yelick, K.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.