Volume 94, Issue 2, 2013, Pages 204-220

Generation of large finite-element matrices on multiple graphics processors

Author keywords

Fermi; Finite element method; Matrix generation; Multicore CPUs; Multiple GPUs; Parallel computing

Indexed keywords

FERMI; GENERATION PROCESS; GRAPHICS ACCELERATORS; GRAPHICS PROCESSING UNITS; GRAPHICS PROCESSOR; LARGE SPARSE LINEAR SYSTEMS; MULTI-CORE CPUS; MULTIPLE GPUS;

EID: 84875625015     PISSN: 00295981     EISSN: 10970207     Source Type: Journal    
DOI: 10.1002/nme.4452     Document Type: Article
Times cited: 46

References (39)
  • 2
    • 54249162842
    • Large calculation of the flow over a hypersonic vehicle using a GPU
    • DOI: 10.1016/j.jcp.2008.08.023
    • Elsen E, LeGresley P, Darve E. Large calculation of the flow over a hypersonic vehicle using a GPU. Journal of Computational Physics 2008; 227(24):10148-10161. DOI: 10.1016/j.jcp.2008.08.023.
    • (2008) Journal of Computational Physics , vol.227 , Issue.24 , pp. 10148-10161
    • Elsen, E.1    LeGresley, P.2    Darve, E.3
  • 3
    • 84857501567
    • OpenCL-based implementation of an unstructured edge-based finite element convection-diffusion solver on graphics hardware
    • DOI: 10.1002/nme.3302
    • Mossaiby F, Rossi R, Dadvand P, Idelsohn S. OpenCL-based implementation of an unstructured edge-based finite element convection-diffusion solver on graphics hardware. International Journal for Numerical Methods in Engineering 2011; 89:1635-1651. DOI: 10.1002/nme.3302.
    • (2011) International Journal for Numerical Methods in Engineering , vol.89 , pp. 1635-1651
    • Mossaiby, F.1    Rossi, R.2    Dadvand, P.3    Idelsohn, S.4
  • 5
    • 78650491825
    • Molecular dynamics simulations of aqueous ions at the liquid-vapor interface accelerated using graphics processors
    • DOI: 10.1002/jcc.21578
    • Brad A, Bauer BA, Davis JE, Taufer M, Patel S. Molecular dynamics simulations of aqueous ions at the liquid-vapor interface accelerated using graphics processors. Journal of Computational Chemistry 2011; 32(3):375-385. DOI: 10.1002/jcc.21578.
    • (2011) Journal of Computational Chemistry , vol.32 , Issue.3 , pp. 375-385
    • Brad, A.1    Bauer, B.A.2    Davis, J.E.3    Taufer, M.4    Patel, S.5
  • 6
    • 80052037036
    • Structural, dynamic, and electrostatic properties of fully hydrated DMPC bilayers from molecular dynamics simulations accelerated with graphical processing units (GPUs)
    • DOI: 10.1002/jcc.21871
    • Ganesan N, Bauer BA, Lucas TR, Patel S, Taufer M. Structural, dynamic, and electrostatic properties of fully hydrated DMPC bilayers from molecular dynamics simulations accelerated with graphical processing units (GPUs). Journal of Computational Chemistry 2011; 32(14):2958-2973. DOI: 10.1002/jcc.21871.
    • (2011) Journal of Computational Chemistry , vol.32 , Issue.14 , pp. 2958-2973
    • Ganesan, N.1    Bauer, B.A.2    Lucas, T.R.3    Patel, S.4    Taufer, M.5
  • 7
    • 71049183906
    • GPU acceleration of linear systems for computational electromagnetic simulations, Antennas and Propagation Society International Symposium, 1-5 June 2009. APSURSI '09. IEEE, Charleston SC, USA
    • Inman MJ, Elsherbeni AZ. GPU acceleration of linear systems for computational electromagnetic simulations, Antennas and Propagation Society International Symposium, 1-5 June 2009. APSURSI '09. IEEE, Charleston SC, USA; 1-4.
    • Inman, M.J.1    Elsherbeni, A.Z.2
  • 9
    • 79952749352
    • GPU accelerated lattice Boltzmann model for shallow water flow and mass transport
    • DOI: 10.1002/nme.3066
    • Tubbs KR, Tsai FT-C. GPU accelerated lattice Boltzmann model for shallow water flow and mass transport. International Journal for Numerical Methods in Engineering 2011; 86(3):316-334. DOI: 10.1002/nme.3066.
    • (2011) International Journal for Numerical Methods in Engineering , vol.86 , Issue.3 , pp. 316-334
    • Tubbs, K.R.1    Tsai, F.-C.2
  • 10
    • 70449084910
    • GPU-accelerated boundary element method for Helmholtz' equation in three dimensions
    • DOI: 10.1002/nme.2661
    • Takahashi T, Hamada T. GPU-accelerated boundary element method for Helmholtz' equation in three dimensions. International Journal for Numerical Methods in Engineering 2009; 80(10):1295-1321. DOI: 10.1002/nme.2661.
    • (2009) International Journal for Numerical Methods in Engineering , vol.80 , Issue.10 , pp. 1295-1321
    • Takahashi, T.1    Hamada, T.2
  • 11
    • 84655161873
    • Optimizing the multipole-to-local operator in the fast multipole method for graphical processing units
    • DOI: 10.1002/nme.3240
    • Takahashi T, Cecka C, Fong W, Darve E. Optimizing the multipole-to-local operator in the fast multipole method for graphical processing units. International Journal for Numerical Methods in Engineering 2012; 89(1):105-133. DOI: 10.1002/nme.3240.
    • (2012) International Journal for Numerical Methods in Engineering , vol.89 , Issue.1 , pp. 105-133
    • Takahashi, T.1    Cecka, C.2    Fong, W.3    Darve, E.4
  • 12
    • 61449164992
    • How to render FDTD computations more effective using a graphics accelerator
    • DOI: 10.1109/TMAG.2009.2012614
    • Sypek P, Dziekonski A, Mrozowski M. How to render FDTD computations more effective using a graphics accelerator. IEEE Transactions on Magnetics 2009; 45(3):1324-1327. DOI: 10.1109/TMAG.2009.2012614.
    • (2009) IEEE Transactions on Magnetics , vol.45 , Issue.3 , pp. 1324-1327
    • Sypek, P.1    Dziekonski, A.2    Mrozowski, M.3
  • 15
    • 79960133510
    • Tuning a Hybrid GPU-CPU V-Cycle multilevel preconditioner for solving large real and complex systems of FEM equations
    • DOI: 10.1109/LAWP.2011.2159769
    • Dziekonski A, Lamecki A, Mrozowski M. Tuning a Hybrid GPU-CPU V-Cycle multilevel preconditioner for solving large real and complex systems of FEM equations. IEEE Antennas and Wireless Propagation Letters 2011; 10:619-622. DOI: 10.1109/LAWP.2011.2159769.
    • (2011) IEEE Antennas and Wireless Propagation Letters , vol.10 , pp. 619-622
    • Dziekonski, A.1    Lamecki, A.2    Mrozowski, M.3
  • 16
    • 78651340345
    • GPU Acceleration of multilevel solvers for analysis of microwave components with finite element method
    • DOI: 10.1109/LMWC.2010.2089974
    • Dziekonski A, Lamecki A, Mrozowski M. GPU Acceleration of multilevel solvers for analysis of microwave components with finite element method. IEEE Microwave and Wireless Components Letters 2011; 21(1):1-3. DOI: 10.1109/LMWC.2010.2089974.
    • (2011) IEEE Microwave and Wireless Components Letters , vol.21 , Issue.1 , pp. 1-3
    • Dziekonski, A.1    Lamecki, A.2    Mrozowski, M.3
  • 17
    • 69949091119
    • Nodal discontinuous Galerkin methods on graphics processors
    • DOI: 10.1016/j.jcp.2009.06.041
    • Klockner A, Warburton T, Bridge J, Hesthaven JS. Nodal discontinuous Galerkin methods on graphics processors. Journal of Computational Physics 2009; 228(21):7863-7882. DOI: 10.1016/j.jcp.2009.06.041.
    • (2009) Journal of Computational Physics , vol.228 , Issue.21 , pp. 7863-7882
    • Klockner, A.1    Warburton, T.2    Bridge, J.3    Hesthaven, J.S.4
  • 22
    • 0036760579
    • A highly effective preconditioner for solving the finite element-boundary integral matrix equation of 3-D scattering
    • DOI: 10.1109/TAP.2002.801377
    • Liu J, Jin J-M. A highly effective preconditioner for solving the finite element-boundary integral matrix equation of 3-D scattering. IEEE Transactions on Antennas and Propagation 2002; 50(9):1212-1221. DOI: 10.1109/TAP.2002.801377.
    • (2002) IEEE Transactions on Antennas and Propagation , vol.50 , Issue.9 , pp. 1212-1221
    • Liu, J.1    Jin, J.-M.2
  • 23
    • 79958091044
    • A memory efficient and fast sparse matrix vector product on a GPU
    • DOI: 10.2528/PIER11031607
    • Dziekonski A, Lamecki A, Mrozowski M. A memory efficient and fast sparse matrix vector product on a GPU. Progress In Electromagnetics Research 2011; 116:49-63. DOI: 10.2528/PIER11031607.
    • (2011) Progress In Electromagnetics Research , vol.116 , pp. 49-63
    • Dziekonski, A.1    Lamecki, A.2    Mrozowski, M.3
  • 24
    • 77954837026
    • Finite-element sparse matrix vector multiplication on graphic processing units
    • DOI: 10.1109/TMAG.2010.2043511
    • Dehnavi MM, Fernandez DM, Giannacopoulos D. Finite-element sparse matrix vector multiplication on graphic processing units. IEEE Transactions on Magnetics 2010; 46(8):2982-2985. DOI: 10.1109/TMAG.2010.2043511.
    • (2010) IEEE Transactions on Magnetics , vol.46 , Issue.8 , pp. 2982-2985
    • Dehnavi, M.M.1    Fernandez, D.M.2    Giannacopoulos, D.3
  • 25
  • 26
    • 79551549387
    • Higher order FEM numerical integration on GPUs with OpenCL. Proceedings of the 2010 International Multiconference on Computer Science and Information Technology (IMCSIT), Wisla, Poland, 18-20 October
    • Plaszewski P, Banas K, Maciol P. Higher order FEM numerical integration on GPUs with OpenCL. Proceedings of the 2010 International Multiconference on Computer Science and Information Technology (IMCSIT), Wisla, Poland, 18-20 October 2010; 337-342.
    • (2010) , pp. 337-342
    • Plaszewski, P.1    Banas, K.2    Maciol, P.3
  • 28
    • 84882460891
    • Application of assembly of finite element methods on graphics processors for real-time elastodynamics. GPU Computing Gems Emerald Edition, chapter 16. Elsevier Inc.: Burlington, July 2011
    • Cecka C, Lew AJ, Darve E. Application of assembly of finite element methods on graphics processors for real-time elastodynamics. GPU Computing Gems Emerald Edition, chapter 16. Elsevier Inc.: Burlington, July 2011; 187-205.
    • Cecka, C.1    Lew, A.J.2    Darve, E.3
  • 30
    • 84875595743
    • Whitepaper-NVIDIA's next generation CUDA compute architecture Fermi
    • Whitepaper-NVIDIA's next generation CUDA compute architecture Fermi. Available from: http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf. [Accessed date: 2012].
  • 32
    • 84875606147
    • Website of the Intel Math Kernel Library (Intel MKL)
    • Website of the Intel Math Kernel Library (Intel MKL). Available from: http://software.intel.com/en-us/intel-mkl. [Accessed date: 2012].
  • 34
    • 84875630049
    • NVIDIA Corporation, CUDA API Reference Manual
    • NVIDIA Corporation, CUDA API Reference Manual, 2011. [Accessed date: 2011].
    • (2011)
  • 35
    • 84875609993
    • Website of the UMFPACK (unsymmetric multifrontal sparse LU factorization package)
    • Website of the UMFPACK (unsymmetric multifrontal sparse LU factorization package). Available from: http://www.cise.ufl.edu/research/sparse/umfpack. [Accessed date: 2012].
  • 36
    • 84875622105
    • Whitepaper-NVIDIA's next generation CUDA compute architecture Kepler GK110
    • Whitepaper-NVIDIA's next generation CUDA compute architecture Kepler GK110. Available from: http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf. [Accessed date: 2012].
  • 37
    • 61349184327
    • A set of symmetric quadrature rules on triangles and tetrahedra
    • Zhang L, Cui T, Liu H. A set of symmetric quadrature rules on triangles and tetrahedra. Journal of Computational Mathematics 2009; 27(1):89-96.
    • (2009) Journal of Computational Mathematics , vol.27 , Issue.1 , pp. 89-96
    • Zhang, L.1    Cui, T.2    Liu, H.3
  • 38
    • 31044438837
    • A new set of H(curl)-conforming hierarchical basis functions for tetrahedral meshes
    • DOI: 10.1109/TMTT.2005.860295
    • Ingelstrom P. A new set of H(curl)-conforming hierarchical basis functions for tetrahedral meshes. IEEE Transactions on Microwave Theory and Techniques 2006; 54(1):106-114. DOI: 10.1109/TMTT.2005.860295.
    • (2006) IEEE Transactions on Microwave Theory and Techniques , vol.54 , Issue.1 , pp. 106-114
    • Ingelstrom, P.1
  • 39
    • 84875613830
    • Website of the Thrust library
    • Website of the Thrust library. Available from: http://code.google.com/p/thrust/. [Accessed date: 2012].


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.