



Volume , Issue , 2011, Pages

Extracting ultra-scale lattice Boltzmann performance via hierarchical and distributed auto-tuning

Author keywords

Auto tuning; Bluegene; Hybrid programming models; Lattice Boltzmann; OpenMP; SIMD

Indexed keywords

AUTOTUNING; BLUEGENE; HYBRID PROGRAMMING MODEL; LATTICE BOLTZMANN; OPENMP; SIMD;

EID: 83155188480     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2063384.2063458     Document Type: Conference Paper
Times cited: 34

References (39)
  • 1
    • 26344468007
    • A model for collisional processes in gases I: Small amplitude processes in charged and neutral one-component systems
    • P. Bhatnagar, E. Gross, and M. Krook. A model for collisional processes in gases I: small amplitude processes in charged and neutral one-component systems. Phys. Rev., 94:511, 1954.
    • (1954) Phys. Rev. , vol.94 , pp. 511
    • Bhatnagar, P.1    Gross, E.2    Krook, M.3
  • 3
    • 84899683182
    • Magnetohydrodynamic turbulence simulations on the earth simulator using the lattice Boltzmann method
    • Seattle, WA
    • J. Carter, M. Soe, L. Oliker, Y. Tsuda, G. Vahala, L. Vahala, and A. Macnab. Magnetohydrodynamic turbulence simulations on the earth simulator using the lattice Boltzmann method. In SC05, Seattle, WA, 2005.
    • (2005) SC05
    • Carter, J.1    Soe, M.2    Oliker, L.3    Tsuda, Y.4    Vahala, G.5    Vahala, L.6    Macnab, A.7
  • 6
    • 59749100826
    • Optimization and performance modeling of stencil computations on modern microprocessors
    • K. Datta, S. Kamil, S. Williams, L. Oliker, J. Shalf, and K. A. Yelick. Optimization and performance modeling of stencil computations on modern microprocessors. SIAM Review, 51(1):129-159, 2009.
    • (2009) SIAM Review , vol.51 , Issue.1 , pp. 129-159
    • Datta, K.1    Kamil, S.2    Williams, S.3    Oliker, L.4    Shalf, J.5    Yelick, K.A.6
  • 9
    • 0037054259
    • Lattice kinetic schemes for magnetohydrodynamics
    • P. Dellar. Lattice kinetic schemes for magnetohydrodynamics. J. Comput. Phys., 79, 2002.
    • (2002) J. Comput. Phys. , vol.79
    • Dellar, P.1
  • 13
    • 84958661690
    • Impact of modern memory subsystems on cache optimizations for stencil computations
    • ACM
    • S. Kamil, P. Husbands, L. Oliker, J. Shalf, and K. Yelick. Impact of modern memory subsystems on cache optimizations for stencil computations. In Memory System Performance, pages 36-43. ACM, 2005.
    • (2005) Memory System Performance , pp. 36-43
    • Kamil, S.1    Husbands, P.2    Oliker, L.3    Shalf, J.4    Yelick, K.5
  • 16
    • 0000979764
    • Lattice Boltzmann magnetohydrodynamics
    • June
    • D. Martinez, S. Chen, and W. Matthaeus. Lattice Boltzmann magnetohydrodynamics. Physics of Plasmas, 1:1850-1867, June 1994.
    • (1994) Physics of Plasmas , vol.1 , pp. 1850-1867
    • Martinez, D.1    Chen, S.2    Matthaeus, W.3
  • 21
    • 42749090414
    • Progress in lattice Boltzmann methods for magnetohydrodynamic flows relevant to fusion applications
    • M. Pattison, K. Premnath, N. Morley, and M. Abdou. Progress in lattice Boltzmann methods for magnetohydrodynamic flows relevant to fusion applications. Fusion Eng. Des., 83:557-572, 2008.
    • (2008) Fusion Eng. Des. , vol.83 , pp. 557-572
    • Pattison, M.1    Premnath, K.2    Morley, N.3    Abdou, M.4
  • 22
    • 1242352441
    • Optimization and profiling of the cache performance of parallel lattice Boltzmann codes
    • T. Pohl, M. Kowarschik, J. Wilke, K. Iglberger, and U. Rüde. Optimization and profiling of the cache performance of parallel lattice Boltzmann codes. Parallel Processing Letters, 13(4):549, 2003.
    • (2003) Parallel Processing Letters , vol.13 , Issue.4 , pp. 549
    • Pohl, T.1    Kowarschik, M.2    Wilke, J.3    Iglberger, K.4    Rüde, U.5
  • 29
    • 33646809359
    • On the single processor performance of simple lattice Boltzmann kernels
    • Nov. ISSN 0045-7930
    • G. Wellein, T. Zeiser, G. Hager, and S. Donath. On the single processor performance of simple lattice Boltzmann kernels. Computers & Fluids, 35(8-9):910-919, Nov. 2006. ISSN 0045-7930.
    • (2006) Computers & Fluids , vol.35 , Issue.8-9 , pp. 910-919
    • Wellein, G.1    Zeiser, T.2    Hager, G.3    Donath, S.4
  • 30
    • 0343462141
    • Automated empirical optimizations of software and the ATLAS project
    • DOI 10.1016/S0167-8191(00)00087-9
    • R. C. Whaley, A. Petitet, and J. Dongarra. Automated empirical optimization of software and the ATLAS project. Parallel Computing, 27(1-2):3-35, 2001. (Pubitemid 32264775)
    • (2001) Parallel Computing , vol.27 , Issue.1-2 , pp. 3-35
    • Clint Whaley, R.1    Petitet, A.2    Dongarra, J.J.3
  • 31
  • 36
    • 67650797544
    • Roofline: An insightful visual performance model for floating-point programs and multicore architectures
    • April
    • S. Williams, A. Waterman, and D. Patterson. Roofline: An insightful visual performance model for floating-point programs and multicore architectures. Communications of the ACM, April 2009.
    • (2009) Communications of the ACM
    • Williams, S.1    Waterman, A.2    Patterson, D.3
  • 37
    • 0000331979
    • Lattice Boltzmann method for 3D flows with curved boundary
    • D. Yu, R. Mei, W. Shyy, and L. Luo. Lattice Boltzmann method for 3D flows with curved boundary. Journal of Comp. Physics, 161:680-699, 2000.
    • (2000) Journal of Comp. Physics , vol.161 , pp. 680-699
    • Yu, D.1    Mei, R.2    Shyy, W.3    Luo, L.4
  • 38
    • 73849092882
    • Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems
    • T. Zeiser, G. Hager, and G. Wellein. Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems. Parallel Processing Letters, 19(4):491-511, 2009.
    • (2009) Parallel Processing Letters , vol.19 , Issue.4 , pp. 491-511
    • Zeiser, T.1    Hager, G.2    Wellein, G.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.