Volume 4, Issue 1, 2008, Pages 36-55

Using GPUs to improve multigrid solver performance on a cluster

Author keywords

Domain decomposition; Finite element calculations; Floating point co-processors; GPUs; Graphics processing units; Mixed precision; Multigrid solvers; Parallel scientific computing

Indexed keywords

APPLICATION PROGRAMS; COMPUTER GRAPHICS; DIGITAL ARITHMETIC; DOMAIN DECOMPOSITION METHODS; FINITE ELEMENT METHOD; GRAPHICS PROCESSING UNIT; ITERATIVE METHODS; PROGRAM PROCESSORS

EID: 49249120833     PISSN: 17427185     EISSN: 17427193     Source Type: Journal    
DOI: 10.1504/IJCSE.2008.021111     Document Type: Article
Times cited : (81)

References (70)
  • 1
    • 56349133479
    • AGEIA Technologies Inc
    • AGEIA Technologies Inc. (2006) AGEIA PhysX Processor, http://ageia.com/physx/index.html
    • (2006) AGEIA PhysX Processor
  • 2
    • 56349115680
    • AMD Inc
    • AMD Inc. (2006) Torrenza Technology, http://enterprise.amd.com/us-en/AMD-Business/Technology-Home/Torrenza.aspx
    • (2006) Torrenza Technology
  • 5
    • 77953998137
    • Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
    • Bolz, J., Farmer, I., Grinspun, E. and Schröder, P. (2003) 'Sparse matrix solvers on the GPU: Conjugate gradients and multigrid', ACM Transactions on Graphics (TOG), Vol. 22, No. 3, pp.917-924.
    • (2003) ACM Transactions on Graphics (TOG) , vol.22 , Issue.3 , pp. 917-924
    • Bolz, J.1    Farmer, I.2    Grinspun, E.3    Schröder, P.4
  • 6
    • 0141942838
    • Reconfigurable computing systems
    • Bondalapati, K. and Prasanna, V.K. (2002) 'Reconfigurable computing systems', Proceedings of the IEEE, Vol. 90, No. 7, pp.1201-1217.
    • (2002) Proceedings of the IEEE , vol.90 , Issue.7 , pp. 1201-1217
    • Bondalapati, K.1    Prasanna, V.K.2
  • 8
    • 56349120184
    • ClearSpeed Technology, Inc
    • ClearSpeed Technology, Inc. (2006) ClearSpeed Advance Accelerator Boards, www.clearspeed.com/products/cs_advance/
    • (2006) ClearSpeed Advance Accelerator Boards
  • 9
    • 1342318696
    • A Science-Based Case for Large-Scale Simulation
    • Technical report, DOE Office of Science
    • Colella, P., Dunning, T.H., Gropp, W.D. and Keyes, D.E. (2003) A Science-Based Case for Large-Scale Simulation, Technical report, DOE Office of Science, http://www.pnl.gov/scales
    • (2003)
    • Colella, P.1    Dunning, T.H.2    Gropp, W.D.3    Keyes, D.E.4
  • 10
    • 0000227930
    • Reconfigurable computing: A survey of systems and software
    • Compton, K. and Hauck, S. (2002) 'Reconfigurable computing: A survey of systems and software', ACM Computing Surveys, Vol. 34, No. 2, pp.171-210.
    • (2002) ACM Computing Surveys , vol.34 , Issue.2 , pp. 171-210
    • Compton, K.1    Hauck, S.2
  • 11
    • 56349172474
    • Cray Inc
    • Cray Inc. (2006) Cray XD1 Supercomputer, www.cray.com/products/xd1
    • (2006) Cray XD1 Supercomputer
  • 13
    • 2942655475
    • A column pre-ordering strategy for the unsymmetric-pattern multifrontal method
    • Davis, T.A. (2004) 'A column pre-ordering strategy for the unsymmetric-pattern multifrontal method', ACM Transactions on Mathematical Software, Vol. 30, No. 2, pp.165-195.
    • (2004) ACM Transactions on Mathematical Software , vol.30 , Issue.2 , pp. 165-195
    • Davis, T.A.1
  • 14
    • 56349113584
    • Very large scale spatial computing
    • DeHon, A. (2002) 'Very large scale spatial computing', Lecture Notes in Computer Science, Proceedings of the Third International Conference on Unconventional Models of Computation, Vol. 2509, pp.27-36.
    • (2002) Lecture Notes in Computer Science , vol.2509 , pp. 27-36
    • DeHon, A.1
  • 16
    • 26444516623
    • Fixed and adaptive cache aware algorithms for multigrid methods
    • Dick, E., Riemslagh, K. and Vierendeels, J. (Eds.), Springer, Berlin
    • Douglas, C.C., Hu, J., Karl, W., Kowarschik, M., Rüde, U. and Weiß, C. (2000a) 'Fixed and adaptive cache aware algorithms for multigrid methods', in Dick, E., Riemslagh, K. and Vierendeels, J. (Eds.): Multigrid Methods VI, Springer, Berlin, Vol. 14, pp.87-93.
    • (2000) Multigrid Methods VI , vol.14 , pp. 87-93
    • Douglas, C.C.1    Hu, J.2    Karl, W.3    Kowarschik, M.4    Rüde, U.5    Weiß, C.6
  • 18
    • 56349083539
    • DRC Computer Corporation
    • DRC Computer Corporation (2006) DRC Reconfigurable Processor Units, http://www.drccomputer.com/drc/modules.html
    • (2006) DRC Reconfigurable Processor Units
  • 23
    • 56349154312
    • GPGPU Coding Tutorials
    • Technical Report, University of Dortmund, Institute of Applied Mathematics and Numerics
    • Göddeke, D. (2006) GPGPU Coding Tutorials, Technical Report, University of Dortmund, Institute of Applied Mathematics and Numerics, http://www.mathematik.uni-dortmund.de/goeddeke/gpgpu/
    • (2006)
    • Göddeke, D.1
  • 24
    • 33947588604
    • Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations
    • Göddeke, D., Strzodka, R. and Turek, S. (2007) 'Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations', International Journal of Parallel, Emergent and Distributed Systems, Vol. 22, No. 4, pp.221-256.
    • (2007) International Journal of Parallel, Emergent and Distributed Systems , vol.22 , Issue.4 , pp. 221-256
    • Göddeke, D.1    Strzodka, R.2    Turek, S.3
  • 25
    • 11144277251
    • A multigrid solver for boundary value problems using programmable graphics hardware
    • Goodnight, N., Woolley, C., Lewin, G., Luebke, D. and Humphreys, G. (2003) 'A multigrid solver for boundary value problems using programmable graphics hardware', Graphics Hardware 2003, pp.102-111.
    • (2003) Graphics Hardware 2003 , pp. 102-111
    • Goodnight, N.1    Woolley, C.2    Lewin, G.3    Luebke, D.4    Humphreys, G.5
  • 34
    • 56349158990
    • Technical Report, FZ Jülich, Zentralinstitut für Angewandte Mathematik
    • Hoßfeld, F. (2001) Perspektiven für Supercomputer-Architekturen, Technical Report, FZ Jülich, Zentralinstitut für Angewandte Mathematik.
    • (2001) Perspektiven für Supercomputer-Architekturen
    • Hoßfeld, F.1
  • 37
    • 14044257293
    • Terascale implicit methods for partial differential equations
    • Keyes, D.E. (2002) 'Terascale implicit methods for partial differential equations', Contemporary Mathematics, Vol. 306, pp.29-84.
    • (2002) Contemporary Mathematics , vol.306 , pp. 29-84
    • Keyes, D.E.1
  • 41
    • 0010828753
    • Cache-aware multigrid methods for solving Poisson's equation in two dimensions
    • Kowarschik, M., Weiß, C., Karl, W. and Rüde, U. (2000) 'Cache-aware multigrid methods for solving Poisson's equation in two dimensions', Computing, Vol. 64, pp.381-399.
    • (2000) Computing , vol.64 , pp. 381-399
    • Kowarschik, M.1    Weiß, C.2    Karl, W.3    Rüde, U.4
  • 42
    • 0242533310
    • Linear algebra operators for GPU implementation of numerical algorithms
    • Krüger, J. and Westermann, R. (2003) 'Linear algebra operators for GPU implementation of numerical algorithms', ACM Transactions on Graphics (TOG), Vol. 22, No. 3, pp.908-916.
    • (2003) ACM Transactions on Graphics (TOG) , vol.22 , Issue.3 , pp. 908-916
    • Krüger, J.1    Westermann, R.2
  • 43
    • 34548206782
    • Tools and techniques for performance - exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)
    • Langou, J., Langou, J., Luszczek, P., Kurzak, J., Buttari, A. and Dongarra, J. (2006) 'Tools and techniques for performance - exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)', SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, p.113.
    • (2006) SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing , pp. 113
    • Langou, J.1    Langou, J.2    Luszczek, P.3    Kurzak, J.4    Buttari, A.5    Dongarra, J.6
  • 45
    • 0012065017
    • Handbook series linear algebra: Iterative refinement of the solution of a positive definite system of equations
    • Martin, R., Peters, G. and Wilkinson, J. (1966) 'Handbook series linear algebra: Iterative refinement of the solution of a positive definite system of equations', Numerische Mathematik, Vol. 8, pp.203-216.
    • (1966) Numerische Mathematik , vol.8 , pp. 203-216
    • Martin, R.1    Peters, G.2    Wilkinson, J.3
  • 55
    • 56349149338
    • A hardware redundancy and recovery mechanism for reliable scientific computation on graphics processors
    • Aila, T. and Segal, M. (Eds.)
    • Sheaffer, J.W., Luebke, D.P. and Skadron, K. (2007) 'A hardware redundancy and recovery mechanism for reliable scientific computation on graphics processors', in Aila, T. and Segal, M. (Eds.): Graphics Hardware 2007, pp.55-64.
    • (2007) Graphics Hardware 2007 , pp. 55-64
    • Sheaffer, J.W.1    Luebke, D.P.2    Skadron, K.3
  • 57
    • 56349103911
    • Stanford University Graphics Lab
    • Stanford University Graphics Lab (2006) GPUbench - How Much Does Your GPU Bench?, http://graphics.stanford.edu/projects/gpubench/results
    • (2006) GPUbench - How Much Does Your GPU Bench?
  • 61
    • 0038345686
    • A performance analysis of PIM, stream processing, and tiled processing on memory-intensive signal processing kernels
    • Suh, J., Kim, E.-G., Crago, S.P., Srinivasan, L. and French, M.C. (2003) 'A performance analysis of PIM, stream processing, and tiled processing on memory-intensive signal processing kernels', in DeGroot, D. (Ed.): ISCA '03: Proceedings of the 30th Annual International Symposium on Computer Architecture, Computer Architecture News, pp.410-421.
  • 65
    • 26444596160
    • Hardware-oriented numerics and concepts for PDE software
    • Turek, S., Becker, C. and Kilian, S. (2003) 'Hardware-oriented numerics and concepts for PDE software', Future Generation Computer Systems, Vol. 22, Nos. 1-2, pp.217-238.
    • (2003) Future Generation Computer Systems , vol.22 , Issue.1-2 , pp. 217-238
    • Turek, S.1    Becker, C.2    Kilian, S.3
  • 66
    • 0343462141
    • Automated empirical optimisation of software and the ATLAS project
    • Whaley, R.C., Petitet, A. and Dongarra, J.J. (2001) 'Automated empirical optimisation of software and the ATLAS project', Parallel Computing, Vol. 27, Nos. 1-2, pp.3-35.
    • (2001) Parallel Computing , vol.27 , Issue.1-2 , pp. 3-35
    • Whaley, R.C.1    Petitet, A.2    Dongarra, J.J.3
  • 70
    • 27844475071
    • Genaue Lösung linearer Gleichungssysteme
    • Zielke, G. and Drygalla, V. (2003) 'Genaue Lösung linearer Gleichungssysteme', GAMM-Mitteilungen, Vol. 2, No. 1, pp.7-107.
    • (2003) GAMM-Mitteilungen , vol.2 , Issue.1 , pp. 7-107
    • Zielke, G.1    Drygalla, V.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.