



Volume 28, Issue 7, 2016, Pages 2295-2315

Chip-level and multi-node analysis of energy-optimized lattice Boltzmann CFD simulations

Author keywords

ECM performance model; energy optimization; lattice Boltzmann method

Indexed keywords

CLOCKS; COMPUTER AIDED SOFTWARE ENGINEERING; ENERGY UTILIZATION; KINETIC THEORY; MESSAGE PASSING;

EID: 84929440437     PISSN: 15320626     EISSN: 15320634     Source Type: Journal    
DOI: 10.1002/cpe.3489     Document Type: Conference Paper
Times cited: 36

References (35)
  • 1
    • 33846243532
    • Parallelization strategies and efficiency of CFD computations in complex geometries using lattice Boltzmann methods on high performance computers
    • Breuer M, Durst F, Zenger C (eds.), Lecture Notes in Computational Science and Engineering. Springer: Berlin, Heidelberg
    • Schulz M, Krafczyk M, Tölke J, Rank E. Parallelization strategies and efficiency of CFD computations in complex geometries using lattice Boltzmann methods on high performance computers. In High Performance Scientific and Engineering Computing Proceedings of the 3rd International FORTWIHR Conference on HPSEC, Erlangen, March 12-14, 2001, vol. 21, Breuer M, Durst F, Zenger C (eds.), Lecture Notes in Computational Science and Engineering. Springer: Berlin, Heidelberg, 2002; 115-122. DOI: 10.1016/j.jcp.2008.01.013.
    • (2002) High Performance Scientific and Engineering Computing Proceedings of the 3rd International FORTWIHR Conference on HPSEC, Erlangen, March 12-14, 2001 , vol.21 , pp. 115-122
    • Schulz, M.1    Krafczyk, M.2    Tölke, J.3    Rank, E.4
  • 2
    • 1642342275
    • A high-performance lattice Boltzmann implementation to model flow in porous media
    • Pan C, Prins JF, Miller CT. A high-performance lattice Boltzmann implementation to model flow in porous media. Computer Physics Communications 2004; 158 (2): 89-105.
    • (2004) Computer Physics Communications , vol.158 , Issue.2 , pp. 89-105
    • Pan, C.1    Prins, J.F.2    Miller, C.T.3
  • 3
    • 27244459147
    • Domain-decomposition method for parallel lattice Boltzmann simulation of incompressible flow in porous media
    • Wang J, Zhang X, Bengough AG, Crawford JW. Domain-decomposition method for parallel lattice Boltzmann simulation of incompressible flow in porous media. Physical Review E 2005; 72 (1): 016706.
    • (2005) Physical Review E , vol.72 , Issue.1 , pp. 016706
    • Wang, J.1    Zhang, X.2    Bengough, A.G.3    Crawford, J.W.4
  • 4
    • 33646809359
    • On the single processor performance of simple lattice Boltzmann kernels
    • Wellein G, Zeiser T, Donath S, Hager G. On the single processor performance of simple lattice Boltzmann kernels. Computers & Fluids 2006; 35: 910-919.
    • (2006) Computers & Fluids , vol.35 , pp. 910-919
    • Wellein, G.1    Zeiser, T.2    Donath, S.3    Hager, G.4
  • 8
    • 70350719194
    • On improving the performance of large parallel lattice Boltzmann flow simulations in heterogeneous porous media
    • Vidal D, Roy R, Bertrand F. On improving the performance of large parallel lattice Boltzmann flow simulations in heterogeneous porous media. Computers & Fluids 2010; 39 (2): 324-337.
    • (2010) Computers & Fluids , vol.39 , Issue.2 , pp. 324-337
    • Vidal, D.1    Roy, R.2    Bertrand, F.3
  • 9
    • 84896617265
    • A fully distributed CFD framework for massively parallel systems
    • Stuttgart, Germany. [Accessed on 26 March 2015]
    • Zudrop J, Klimach H, Hasert M, Masilamani K, Roller S. A fully distributed CFD framework for massively parallel systems. Cray Users Group Conference 2011, Stuttgart, Germany. (Available from: https://cug.org/proceedings/attendee-program-cug2012/includes/files/pap136.pdf) [Accessed on 26 March 2015], 2012.
    • (2012) Cray Users Group Conference 2011
    • Zudrop, J.1    Klimach, H.2    Hasert, M.3    Masilamani, K.4    Roller, S.5
  • 11
    • 65949107549
    • Roofline: An insightful visual performance model for multicore architectures
    • Williams S, Waterman A, Patterson D. Roofline: an insightful visual performance model for multicore architectures. Communications of the ACM 2009; 52 (4): 65-76.
    • (2009) Communications of the ACM , vol.52 , Issue.4 , pp. 65-76
    • Williams, S.1    Waterman, A.2    Patterson, D.3
  • 15
    • 84962933449
    • Energy-to-solution: A today's metric for tomorrow's concerns
    • Mons, Belgium, November 9, [Accessed on 26 March 2015]
    • Keller V. Energy-to-solution: A today's metric for tomorrow's concerns. Talk at the Symposium on Future Generations of Processors and Systems (FGPS'175), Mons, Belgium, November 9, 2012. (Available from: http://www.ig.fpms.ac.be/sites/default/files/FGPS175-Keller.pdf) [Accessed on 26 March 2015].
    • (2012) Talk at the Symposium on Future Generations of Processors and Systems (FGPS'175)
    • Keller, V.1
  • 16
    • 84883192796
    • Energy to solution: A new mission for parallel computing
    • Wolf F, Mohr B, Mey D (eds.), Lecture Notes in Computer Science. Springer: Berlin Heidelberg
    • Bode A. Energy to solution: A new mission for parallel computing. In Euro-Par 2013 Parallel Processing, vol. 8097, Wolf F, Mohr B, Mey D (eds.), Lecture Notes in Computer Science. Springer: Berlin Heidelberg, 2013; 1-2. (Available from: http://dx.doi.org/10.1007/978-3-642-40047-6-1).
    • (2013) Euro-Par 2013 Parallel Processing , vol.8097 , pp. 1-2
    • Bode, A.1
  • 21
    • 85086888962
    • Memory performance at reduced CPU clock speeds: An analysis of current x86-64 processors
    • USENIX Association, Berkeley, CA, USA, [Accessed on 26 March 2015]
    • Schöne R, Hackenberg D, Molka D. Memory performance at reduced CPU clock speeds: an analysis of current x86-64 processors. Proceedings of the 2012 USENIX Conference on Power-Aware Computing and Systems, HotPower'12, USENIX Association, Berkeley, CA, USA, 2012. (Available from: https://www.usenix.org/system/files/conference/hotpower12/hotpower12-final5.pdf) [Accessed on 26 March 2015].
    • (2012) Proceedings of the 2012 USENIX Conference on Power-Aware Computing and Systems, HotPower'12
    • Schöne, R.1    Hackenberg, D.2    Molka, D.3
  • 23
    • 73849092882
    • Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems
    • Zeiser T, Hager G, Wellein G. Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems. Parallel Processing Letters 2009; 19 (4): 491-511.
    • (2009) Parallel Processing Letters , vol.19 , Issue.4 , pp. 491-511
    • Zeiser, T.1    Hager, G.2    Wellein, G.3
  • 24
    • 40549109949
    • Two-relaxation-time lattice Boltzmann scheme: About parametrization, velocity, pressure and mixed boundary conditions
    • Ginzburg I, Verhaeghe F, d'Humieres D. Two-relaxation-time lattice Boltzmann scheme: about parametrization, velocity, pressure and mixed boundary conditions. Communications in Computational Physics 2008; 3 (2): 427-428.
    • (2008) Communications in Computational Physics , vol.3 , Issue.2 , pp. 427-428
    • Ginzburg, I.1    Verhaeghe, F.2    D'Humieres, D.3
  • 25
    • 84884480524
    • [Accessed on 26 March 2015]
    • SuperMUC petascale system. (Available from: http://www.lrz.de/services/compute/supermuc) [Accessed on 26 March 2015].
    • SuperMUC Petascale System
  • 28
    • 78650871519
    • Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters
    • Wittmann M, Hager G, Treibig J, Wellein G. Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters 2010; 20 (4): 359-376.
    • (2010) Parallel Processing Letters , vol.20 , Issue.4 , pp. 359-376
    • Wittmann, M.1    Hager, G.2    Treibig, J.3    Wellein, G.4
  • 30
    • 85050288436
    • Intel. June. [Accessed on 26 March 2015]
    • Intel. Intel Architecture Code Analyzer, June 2012. (Available from: http://software.intel.com/en-us/articles/intel-architecture-code-analyzer/) [Accessed on 26 March 2015].
    • (2012) Intel Architecture Code Analyzer
  • 31
    • 67650784628
    • Feedback-driven threading: Power-efficient and high-performance execution of multi-threaded workloads on CMPs
    • Suleman MA, Qureshi MK, Patt YN. Feedback-driven threading: power-efficient and high-performance execution of multi-threaded workloads on CMPs. SIGARCH Computer Architecture News 2008; 36 (1): 277-286.
    • (2008) SIGARCH Computer Architecture News , vol.36 , Issue.1 , pp. 277-286
    • Suleman, M.A.1    Qureshi, M.K.2    Patt, Y.N.3
  • 32
    • 34447569672
    • Intel Corp. [Accessed on 26 March 2015]
    • Intel Corp. Intel 64 and IA-32 Architectures Software Developer's Manual, 2013. (Available from: http://download.intel.com/products/processor/manual/325384.pdf) [Accessed on 26 March 2015].
    • (2013) Intel 64 and IA-32 Architectures Software Developer's Manual
  • 33
    • 84859729360
    • Power-management architecture of the Intel microarchitecture code-named Sandy Bridge
    • Rotem E, Naveh A, Ananthakrishnan A, Rajwan D, Weissmann E. Power-management architecture of the Intel microarchitecture code-named Sandy Bridge. IEEE Micro 2012; 32: 20-27.
    • (2012) IEEE Micro , vol.32 , pp. 20-27
    • Rotem, E.1    Naveh, A.2    Ananthakrishnan, A.3    Rajwan, D.4    Weissmann, E.5
  • 34
    • 84962953842
    • LRZ, Private Communication
    • Huber H. LRZ, Private Communication.
    • Huber, H.1
  • 35
    • 84962964547
    • Intel MPI benchmarks Intel Corp. [Accessed on 26 March 2015]
    • Intel Corp. Intel MPI benchmarks. (Available from: http://software.intel.com/en-us/articles/intel-mpi-benchmarks) [Accessed on 26 March 2015].


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.