



Volume 71, Issue 1, 2013, Pages 80-97

Finite element assembly strategies on multi-core and many-core architectures

Author keywords

FEM; GPU; Many core; Multi core

Indexed keywords

ADVECTION-DIFFUSION; DISCONTINUOUS GALERKIN METHODS; FINITE ELEMENT; GPU; HIGH-LEVEL STRUCTURE; HIGH-PERFORMANCE ARCHITECTURE; LOSS OF PERFORMANCE; MANY-CORE; MANY-CORE ARCHITECTURE; MATRIX APPROACH; MEMORY ACCESS; MULTI CORE; MULTICORE ARCHITECTURES; NUMERICAL INVESTIGATIONS; PERFORMANCE POTENTIALS; REDUNDANT DATA; SPECTRAL ELEMENT METHOD

EID: 84871030508     PISSN: 02712091     EISSN: 10970363     Source Type: Journal    
DOI: 10.1002/fld.3648     Document Type: Article
Times cited: 95

References (37)
  • 1
    • 84872401479
    • GPU Acceleration of equations assembly in finite elements method - preliminary results. SAAHPC : Symposium on Application Accelerators in HPC
    • Filipovic J, Peterlik I, Fousek J. GPU Acceleration of equations assembly in finite elements method - preliminary results. SAAHPC : Symposium on Application Accelerators in HPC, 2009.
    • (2009)
    • Filipovic, J.1    Peterlik, I.2    Fousek, J.3
  • 2
    • 64449087473
    • Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA
    • Komatitsch D, Michéa D, Erlebacher G. Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA. Journal of Parallel and Distributed Computing 2009; 69(5):451-460.
    • (2009) Journal of Parallel and Distributed Computing , vol.69 , Issue.5 , pp. 451-460
    • Komatitsch, D.1    Michéa, D.2    Erlebacher, G.3
  • 3
    • 77952958084
    • Modeling the propagation of elastic waves using spectral elements on a cluster of 192 GPUs
    • DOI: 10.1007/s00450-010-0109-1.
    • Komatitsch D, Göddeke D, Erlebacher G, Michéa D. Modeling the propagation of elastic waves using spectral elements on a cluster of 192 GPUs. Computer Science Research and Development 2010; 25(1-2):75-82. DOI: 10.1007/s00450-010-0109-1.
    • (2010) Computer Science Research and Development , vol.25 , Issue.1-2 , pp. 75-82
    • Komatitsch, D.1    Göddeke, D.2    Erlebacher, G.3    Michéa, D.4
  • 4
    • 78650291125
    • 3D finite element numerical integration on GPUs
    • DOI: 10.1016/j.procs.2010.04.121. ICCS 2010.
    • Maciol P, Plaszewski P, Banas K. 3D finite element numerical integration on GPUs. Procedia Computer Science 2010; 1(1):1087-1094, DOI: 10.1016/j.procs.2010.04.121. ICCS 2010.
    • (2010) Procedia Computer Science , vol.1 , Issue.1 , pp. 1087-1094
    • Maciol, P.1    Plaszewski, P.2    Banas, K.3
  • 7
    • 79952241422
    • From h to p efficiently: Strategy selection for operator evaluation on hexahedral and tetrahedral elements
    • DOI: 10.1016/j.compfluid.2010.08.012. URL
    • Cantwell C, Sherwin S, Kirby R, Kelly P. From h to p efficiently: Strategy selection for operator evaluation on hexahedral and tetrahedral elements. Computers & Fluids 2011; 43:23-28. DOI: 10.1016/j.compfluid.2010.08.012. URL http://www.sciencedirect.com/science/article/B6V26-50V208D-1/2/5aa26c0ff5f8a0791ccdeb0651c17f61.
    • (2011) Computers & Fluids , vol.43 , pp. 23-28
    • Cantwell, C.1    Sherwin, S.2    Kirby, R.3    Kelly, P.4
  • 8
    • 79960938930
    • From h to p efficiently: selecting the optimal spectral/hp discretisation in three dimensions
    • DOI: 10.1051/mmnp/20116304.
    • Cantwell CD, Sherwin SJ, Kirby RM, Kelly PHJ. From h to p efficiently: selecting the optimal spectral/hp discretisation in three dimensions. Mathematical Modelling of Natural Phenomena 2011; 6:84-96. DOI: 10.1051/mmnp/20116304.
    • (2011) Mathematical Modelling of Natural Phenomena , vol.6 , pp. 84-96
    • Cantwell, C.D.1    Sherwin, S.J.2    Kirby, R.M.3    Kelly, P.H.J.4
  • 9
    • 77952425955
    • From h to p efficiently: implementing finite and spectral/hp element methods to achieve optimal performance for low- and high-order discretisations
    • DOI:
    • Vos PEJ, Sherwin SJ, Kirby RM. From h to p efficiently: implementing finite and spectral/hp element methods to achieve optimal performance for low- and high-order discretisations. Journal of Computational Physics 2010; 229(13):5161-5181. DOI: http://dx.doi.org/10.1016/j.jcp.2010.03.031.
    • (2010) Journal of Computational Physics , vol.229 , Issue.13 , pp. 5161-5181
    • Vos, P.E.J.1    Sherwin, S.J.2    Kirby, R.M.3
  • 11
    • 84872399218
    • NVidia. NVidia CUDA programming guide version 2.3.1, August URL:
    • NVidia. NVidia CUDA programming guide version 2.3.1, August 2009. URL: http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/NVIDIA_CUDA_Programming_Guide_2.3.pdf.
    • (2009)
  • 12
    • 84872397138
    • OpenCL 1.0 working specification. URL
    • KhronosGroup T. OpenCL 1.0 working specification. 2008. URL http://www.khronos.org/registry/cl/specs/opencl-1.0.29.pdf.
    • (2008)
    • KhronosGroup, T.1
  • 13
    • 84872418003
    • NVidia. Fermi white paper, Retrieved 19 May 2011.
    • NVidia. Fermi white paper, 2010. http://www.nvidia.com/content/PDF/fermi/_white/_papers/NVIDIAFermiComputeArchitectureWhitepaper.pdf, Retrieved 19 May 2011.
    • (2010)
  • 14
    • 84872385959
    • AMD: Unleashing the Power of Parallel Compute. SIGGRAPH Asia presentation
    • Devices AM. AMD: Unleashing the Power of Parallel Compute. SIGGRAPH Asia presentation, 2009. http://sa09.idav.ucdavis.edu/docs/SA09_AMD_IHV.pdf.
    • (2009)
    • Devices, A.M.1
  • 15
    • 84872411268
    • Decomposition-fusion scheme for medium-grained problems on GPU, Unpublished manuscript.
    • Filipovic J, Fousek J, Matyska L, Peterlik I. Decomposition-fusion scheme for medium-grained problems on GPU, 2010. Unpublished manuscript.
    • (2010)
    • Filipovic, J.1    Fousek, J.2    Matyska, L.3    Peterlik, I.4
  • 16
    • 80051658396
    • Unsteady CFD computations using vertex-centered finite volumes for unstructured grids on graphics processing units
    • DOI: 10.1002/fld.2352.
    • Asouti VG, Trompoukis XS, Kampolis IC, Giannakoglou KC. Unsteady CFD computations using vertex-centered finite volumes for unstructured grids on graphics processing units. International Journal for Numerical Methods in Fluids 2011; 67(2):232-246. DOI: 10.1002/fld.2352.
    • (2011) International Journal for Numerical Methods in Fluids , vol.67 , Issue.2 , pp. 232-246
    • Asouti, V.G.1    Trompoukis, X.S.2    Kampolis, I.C.3    Giannakoglou, K.C.4
  • 19
    • 79955372794
    • Running unstructured grid-based CFD solvers on modern graphics hardware
    • DOI: 10.1002/fld.2254. URL
    • Corrigan A, Camelli FF, Löhner R, Wallin J. Running unstructured grid-based CFD solvers on modern graphics hardware. International Journal for Numerical Methods in Fluids 2011; 66(2):221-229. DOI: 10.1002/fld.2254. URL http://dx.doi.org/10.1002/fld.2254.
    • (2011) International Journal for Numerical Methods in Fluids , vol.66 , Issue.2 , pp. 221-229
    • Corrigan, A.1    Camelli, F.F.2    Löhner, R.3    Wallin, J.4
  • 20
    • 33749341848
    • A compiler for variational forms
    • Kirby RC, Logg A. A compiler for variational forms. ACM Transactions on Mathematical Software 2006; 32(3):417-444. URL http://home.simula.no/~logg/pub/papers/KirbyLogg2006a.pdf.
    • (2006) ACM Transactions on Mathematical Software , vol.32 , Issue.3 , pp. 417-444
    • Kirby, R.C.1    Logg, A.2
  • 21
    • 84872370098
    • An AEcute framework for accelerating unstructured mesh based computation using the CUDA programming model. Master's Thesis, Imperial College London
    • Sharif Z. An AEcute framework for accelerating unstructured mesh based computation using the CUDA programming model. Master's Thesis, Imperial College London, 2010.
    • (2010)
    • Sharif, Z.1
  • 22
    • 84872403319
    • A framework for parallel unstructured grid applications on GPUs. Plenary talk at the SIAM conference on parallel processing for scientific computing (PP10), Seattle, February
    • Giles M. A framework for parallel unstructured grid applications on GPUs. Plenary talk at the SIAM conference on parallel processing for scientific computing (PP10), Seattle, February 2010.
    • (2010)
    • Giles, M.1
  • 24
    • 84859938922
    • Semi-automatic porting of a large-scale CFD code to multi-graphics processing unit clusters
    • DOI: 10.1002/fld.2664.
    • Corrigan A, Löhner R. Semi-automatic porting of a large-scale CFD code to multi-graphics processing unit clusters. International Journal for Numerical Methods in Fluids 2011. DOI: 10.1002/fld.2664.
    • (2011) International Journal for Numerical Methods in Fluids
    • Corrigan, A.1    Löhner, R.2
  • 27
    • 84872411795
    • Applied modelling and computation group, Department of Earth Science and Engineering, South Kensington Campus, Imperial College London, London, SW7 2AZ, UK. Fluidity manual. Version 4.0-release edn. November Available at
    • Applied modelling and computation group, Department of Earth Science and Engineering, South Kensington Campus, Imperial College London, London, SW7 2AZ, UK. Fluidity manual. Version 4.0-release edn. November 2010. Available at http://hdl.handle.net/10044/1/7086.
    • (2010)
  • 28
    • 84872393132
    • About ICOM, Retrieved 30 Sep.
    • Piggott MD. About ICOM, 2010. http://amcg.ese.ic.ac.uk/index.php?title=ICOM, Retrieved 30 Sep.
    • (2010)
    • Piggott, M.D.1
  • 29
    • 84872383153
    • Accelerating unstructured mesh computational fluid dynamics using the NVidia Tesla GPU Architecture. ISO Report, Imperial College London
    • Markall G, Kelly PHJ. Accelerating unstructured mesh computational fluid dynamics using the NVidia Tesla GPU Architecture. ISO Report, Imperial College London, 2009.
    • (2009)
    • Markall, G.1    Kelly, P.H.J.2
  • 31
    • 67650149648
    • GMSH: A 3-D finite element mesh generator with built-in pre- and post-processing facilities
    • Geuzaine C, Remacle JF. GMSH: A 3-D finite element mesh generator with built-in pre- and post-processing facilities. International Journal for Numerical Methods in Engineering 2009; 79(11):1309-1331.
    • (2009) International Journal for Numerical Methods in Engineering , vol.79 , Issue.11 , pp. 1309-1331
    • Geuzaine, C.1    Remacle, J.F.2
  • 32
    • 78650283968
    • Towards generating optimised finite element solvers for GPUs from high-level specifications
    • DOI: 10.1016/j.procs.2010.04.203. ICCS 2010.
    • Markall GR, Ham DA, Kelly PHJ. Towards generating optimised finite element solvers for GPUs from high-level specifications. Procedia Computer Science 2010; 1(1):1809-1817. DOI: 10.1016/j.procs.2010.04.203. ICCS 2010.
    • (2010) Procedia Computer Science , vol.1 , Issue.1 , pp. 1809-1817
    • Markall, G.R.1    Ham, D.A.2    Kelly, P.H.J.3
  • 33
    • 84872379632
    • NVidia. Visual profiler user guide.
    • NVidia. Visual profiler user guide.
  • 34
    • 84872389904
    • AMD. AMD Radeon 5870 Specifications. Retrieved 30 Sep 2010.
    • AMD. AMD Radeon 5870 Specifications. http://www.amd.com/us/products/desktop/graphics/ati-radeon-hd-5000/hd-5870/Pages/ati-radeon-hd-5870-overview.aspx/#2, Retrieved 30 Sep 2010.
  • 35
    • 84872389354
    • NVidia. NVidia Geforce GTX480 specifications. Retrieved 30 Sep 2010.
    • NVidia. NVidia Geforce GTX480 specifications. http://www.nvidia.com/object/product_geforce_gtx_480_us.html, Retrieved 30 Sep 2010.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.