SCOPUS Information Retrieval Platform

International Journal of Computational Science and Engineering

Volume 4, Issue 1, 2008, Pages 36-55

Using GPUs to improve multigrid solver performance on a cluster

(7) Göddeke, Dominik a; Strzodka, Robert b; Mohd Yusof, Jamaludin c; McCormick, Patrick c; Wobker, Hilmar a; Becker, Christian a; Turek, Stefan a

a TU DORTMUND UNIVERSITY (Germany)

b MAX PLANCK INSTITUTE FOR INFORMATICS (Germany)

c LOS ALAMOS NATIONAL LABORATORY (United States)

Author keywords

Domain decomposition; Finite element calculations; Floating point co-processors; GPUs; Graphics processing units; Mixed precision; Multigrid solvers; Parallel scientific computing

Indexed keywords

APPLICATION PROGRAMS; COMPUTER GRAPHICS; DIGITAL ARITHMETIC; DOMAIN DECOMPOSITION METHODS; FINITE ELEMENT METHOD; GRAPHICS PROCESSING UNIT; ITERATIVE METHODS; PROGRAM PROCESSORS;

FLOATING-POINT CO-PROCESSORS; GPUS; MIXED PRECISION; MULTIGRID SOLVER; PARALLEL SCIENTIFIC COMPUTING;

COMPUTER HARDWARE;

EID: 49249120833 PISSN: 17427185 EISSN: 17427193 Source Type: Journal
DOI: 10.1504/IJCSE.2008.021111 Document Type: Article

Times cited: (81)

References (70)

1
- 56349133479
- AGEIA Technologies Inc
- AGEIA Technologies Inc. (2006) AGEIA PhysX Processor, http://ageia.com/physx/index.html
- (2006) AGEIA PhysX Processor

2
- 56349115680
- AMD Inc
- AMD Inc. (2006) Torrenza Technology, http://enterprise.amd.com/us-en/AMD-Business/Technology-Home/Torrenza.aspx
- (2006) Torrenza Technology

3
- 56349165471
- PhD Thesis, Universität Dortmund, Fachbereich Mathematik
- Becker, C. (2007) Strategien und Methoden zur Ausnutzung der High-Performance-Computing-Ressourcen moderner Rechnerarchitekturen für Finite Element Simulationen und ihre Realisierung in FEAST (Finite Element Analysis & Solution Tools), PhD Thesis, Universität Dortmund, Fachbereich Mathematik.
- (2007) Strategien und Methoden zur Ausnutzung der High-Performance-Computing-Ressourcen moderner Rechnerarchitekturen für Finite Element Simulationen und ihre Realisierung in FEAST (Finite Element Analysis & Solution Tools)
- Becker, C.¹

4
- 56349109318
- Becker, C., Kilian, S. and Turek, S. (2007) FEAST - Finite Element Analysis and Solution Tools, http://www.feast.uni-dortmund.de
- (2007) FEAST - Finite Element Analysis and Solution Tools
- Becker, C.¹ Kilian, S.² Turek, S.³

5
- 77953998137
- Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
- Bolz, J., Farmer, I., Grinspun, E. and Schröder, P. (2003) 'Sparse matrix solvers on the GPU: Conjugate gradients and multigrid', ACM Transactions on Graphics (TOG), Vol. 22, No. 3, pp.917-924.
- (2003) ACM Transactions on Graphics (TOG) , vol.22 , Issue.3 , pp. 917-924
- Bolz, J.¹ Farmer, I.² Grinspun, E.³ Schröder, P.⁴

6
- 0141942838
- Reconfigurable computing systems
- Bondalapati, K. and Prasanna, V.K. (2002) 'Reconfigurable computing systems', Proceedings of the IEEE, Vol. 90, No. 7, pp.1201-1217.
- (2002) Proceedings of the IEEE , vol.90 , Issue.7 , pp. 1201-1217
- Bondalapati, K.¹ Prasanna, V.K.²

7
- 10644248153
- Brook for GPUs: Stream computing on graphics hardware
- Buck, I., Foley, T., Horn, D., Sugerman, J., Fatahalian, K., Houston, M. and Hanrahan, P. (2004) 'Brook for GPUs: Stream computing on graphics hardware', ACM Transactions on Graphics (TOG), Vol. 23, No. 3, pp.777-786.
- (2004) ACM Transactions on Graphics (TOG) , vol.23 , Issue.3 , pp. 777-786
- Buck, I.¹ Foley, T.² Horn, D.³ Sugerman, J.⁴ Fatahalian, K.⁵ Houston, M.⁶ Hanrahan, P.⁷

8
- 56349120184
- ClearSpeed Technology, Inc
- ClearSpeed Technology, Inc. (2006) ClearSpeed Advance Accelerator Boards, www.clearspeed.com/products/cs_advance/
- (2006) ClearSpeed Advance Accelerator Boards

9
- 1342318696
- A Science-Based Case for Large-Scale Simulation
- Technical report, DOE Office of Science
- Colella, P., Dunning, T.H., Gropp, W.D. and Keyes, D.E. (2003) A Science-Based Case for Large-Scale Simulation, Technical report, DOE Office of Science, http://www.pnl.gov/scales
- (2003)
- Colella, P.¹ Dunning, T.H.² Gropp, W.D.³ Keyes, D.E.⁴

10
- 0000227930
- Reconfigurable computing: A survey of systems and software
- Compton, K. and Hauck, S. (2002) 'Reconfigurable computing: A survey of systems and software', ACM Computing Surveys, Vol. 34, No. 2, pp.171-210.
- (2002) ACM Computing Surveys , vol.34 , Issue.2 , pp. 171-210
- Compton, K.¹ Hauck, S.²

11
- 56349172474
- Cray Inc
- Cray Inc. (2006) Cray XD1 Supercomputer, www.cray.com/products/xd1
- (2006) Cray XD1 Supercomputer

12
- 84877083867
- Merrimac: Supercomputing with streams
- Dally, W.J., Hanrahan, P., Erez, M., Knight, T.J., Labonté, F., Ahn, J.-H., Jayasena, N., Kapasi, U.J., Das, A., Gummaraju, J. and Buck, I. (2003) 'Merrimac: Supercomputing with streams', SC '03: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, p.35.
- (2003) SC '03: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing , pp. 35
- Dally, W.J.¹ Hanrahan, P.² Erez, M.³ Knight, T.J.⁴ Labonté, F.⁵ Ahn, J.-H.⁶ Jayasena, N.⁷ Kapasi, U.J.⁸ Das, A.⁹ Gummaraju, J.¹⁰ Buck, I.¹¹

13
- 2942655475
- A column pre-ordering strategy for the unsymmetric-pattern multifrontal method
- Davis, T.A. (2004) 'A column pre-ordering strategy for the unsymmetric-pattern multifrontal method', ACM Transactions on Mathematical Software, Vol. 30, No. 2, pp.165-195.
- (2004) ACM Transactions on Mathematical Software , vol.30 , Issue.2 , pp. 165-195
- Davis, T.A.¹

14
- 56349113584
- Very large scale spatial computing
- DeHon, A. (2002) 'Very large scale spatial computing', Lecture Notes in Computer Science, Proceedings of the Third International Conference on Unconventional Models of Computation, Vol. 2509, pp.27-36.
- (2002) Lecture Notes in Computer Science , vol.2509 , pp. 27-36
- DeHon, A.¹

15
- 33746084108
- Error bounds from extra-precise iterative refinement
- Demmel, J., Hida, Y., Kahan, W., Li, X.S., Mukherjee, S. and Riedy, E.J. (2006) 'Error bounds from extra-precise iterative refinement', ACM Transactions on Mathematical Software, Vol. 32, No. 2, pp.325-351.
- (2006) ACM Transactions on Mathematical Software , vol.32 , Issue.2 , pp. 325-351
- Demmel, J.¹ Hida, Y.² Kahan, W.³ Li, X.S.⁴ Mukherjee, S.⁵ Riedy, E.J.⁶

16
- 26444516623
- Fixed and adaptive cache aware algorithms for multigrid methods
- Dick, E, Riemslagh, K. and Vierendeels, J, Eds, Springer, Berlin
- Douglas, C.C., Hu, J., Karl, W., Kowarschik, M., Rüde, U. and Weiß, C. (2000a) 'Fixed and adaptive cache aware algorithms for multigrid methods', in Dick, E., Riemslagh, K. and Vierendeels, J. (Eds.): Multigrid Methods VI, Springer, Berlin, Vol. 14, pp.87-93.
- (2000) Multigrid Methods VI , vol.14 , pp. 87-93
- Douglas, C.C.¹ Hu, J.² Karl, W.³ Kowarschik, M.⁴ Rüde, U.⁵ Weiß, C.⁶

17
- 0002349926
- Cache optimisation for structured and unstructured grid multigrid
- Douglas, C.C., Hu, J., Kowarschik, M., Rüde, U. and Weiß, C. (2000b) 'Cache optimisation for structured and unstructured grid multigrid', Electronic Transactions on Numerical Analysis, Vol. 10, pp.21-40.
- (2000) Electronic Transactions on Numerical Analysis , vol.10 , pp. 21-40
- Douglas, C.C.¹ Hu, J.² Kowarschik, M.³ Rüde, U.⁴ Weiß, C.⁵

18
- 56349083539
- DRC Computer Corporation
- DRC Computer Corporation (2006) DRC Reconfigurable Processor Units http://www.drccomputer.com/drc/modules.html
- (2006) DRC Reconfigurable Processor Units

19
- 23944462603
- GPU cluster for high performance computing
- Fan, Z., Qiu, F., Kaufman, A. and Yoakum-Stover, S. (2004) 'GPU cluster for high performance computing', SC '04: Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, p.47.
- (2004) SC '04: Proceedings of the 2004 ACM/IEEE Conference on Supercomputing , pp. 47
- Fan, Z.¹ Qiu, F.² Kaufman, A.³ Yoakum-Stover, S.⁴

20
- 10044273311
- Using multiple graphics cards as a general purpose parallel computer: Applications to computer vision
- Fung, J. and Mann, S. (2004) 'Using multiple graphics cards as a general purpose parallel computer: Applications to computer vision', Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), Vol. 1, pp.805-808.
- (2004) Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004) , vol.1 , pp. 805-808
- Fung, J.¹ Mann, S.²

21
- 0037478423
- Sun Microsystems Inc
- Garg, R.P. and Sharapov, I. (2001) Techniques for Optimising Applications: High Performance Computing, Sun Microsystems Inc.
- (2001) Techniques for Optimising Applications: High Performance Computing
- Garg, R.P.¹ Sharapov, I.²

22
- 1542330052
- Exploiting fast hardware floating point in high precision computation
- Geddes, K.O. and Zheng, W.W. (2003) 'Exploiting fast hardware floating point in high precision computation', ISSAC '03: Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, pp.111-118.
- (2003) ISSAC '03: Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation , pp. 111-118
- Geddes, K.O.¹ Zheng, W.W.²

23
- 56349154312
- GPGPU Coding Tutorials
- Technical Report, University of Dortmund, Institute of Applied Mathematics and Numerics
- Göddeke, D. (2006) GPGPU Coding Tutorials, Technical Report, University of Dortmund, Institute of Applied Mathematics and Numerics, http://www.mathematik.uni-dortmund.de/goeddeke/gpgpu/
- (2006)
- Göddeke, D.¹

24
- 33947588604
- Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations
- Göddeke, D., Strzodka, R. and Turek, S. (2007) 'Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations', International Journal of Parallel, Emergent and Distributed Systems, Vol. 22, No. 4, pp.221-256.
- (2007) International Journal of Parallel, Emergent and Distributed Systems , vol.22 , Issue.4 , pp. 221-256
- Göddeke, D.¹ Strzodka, R.² Turek, S.³

25
- 11144277251
- A multigrid solver for boundary value problems using programmable graphics hardware
- Goodnight, N., Woolley, C., Lewin, G., Luebke, D. and Humphreys, G. (2003) 'A multigrid solver for boundary value problems using programmable graphics hardware', Graphics Hardware 2003, pp.102-111.
- (2003) Graphics Hardware 2003 , pp. 102-111
- Goodnight, N.¹ Woolley, C.² Lewin, G.³ Luebke, D.⁴ Humphreys, G.⁵

26
- 0038303815
- Interactive visibility culling in complex environments using occlusion-switches
- Govindaraju, N.K., Sud, A., Yoon, S.-E. and Manocha, D. (2003) 'Interactive visibility culling in complex environments using occlusion-switches', SI3D '03: Proceedings of the 2003 symposium on Interactive 3D Graphics, pp.103-112.
- (2003) SI3D '03: Proceedings of the 2003 symposium on Interactive 3D Graphics , pp. 103-112
- Govindaraju, N.K.¹ Sud, A.² Yoon, S.-E.³ Manocha, D.⁴

27
- 56349106795
- GPGPU
- GPGPU (2007) General-Purpose Computation Using Graphics Hardware, http://www.gpgpu.org
- (2007) General-Purpose Computation Using Graphics Hardware

28
- 0035276480
- High performance parallel implicit CFD
- Gropp, W.D., Kaushik, D.K., Keyes, D.E. and Smith, B.F. (2001) 'High performance parallel implicit CFD', Parallel Computing, Vol. 27, pp.337-362.
- (2001) Parallel Computing , vol.27 , pp. 337-362
- Gropp, W.D.¹ Kaushik, D.K.² Keyes, D.E.³ Smith, B.F.⁴

29
- 2442575888
- A quantitative analysis of the speedup factors of FPGAs over processors
- Guo, Z., Najjar, W., Vahid, F. and Vissers, K. (2004) 'A quantitative analysis of the speedup factors of FPGAs over processors', FPGA '04: Proceedings of the 2004 ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays, pp.162-170.
- (2004) FPGA '04: Proceedings of the 2004 ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays , pp. 162-170
- Guo, Z.¹ Najjar, W.² Vahid, F.³ Vissers, K.⁴

30
- 33750711838
- Mapping computational concepts to GPUs
- Pharr, M, Ed, Addison-Wesley, Boston, MA, chapter 31, pp
- Harris, M. (2005) 'Mapping computational concepts to GPUs', in Pharr, M. (Ed.): GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation, Addison-Wesley, Boston, MA, chapter 31, pp.493-508.
- (2005) GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation , pp. 493-508
- Harris, M.¹

31
- 56149125037
- Optimising data movement rates for parallel processing applications on graphics processors
- Harrison, O. and Waldron, J. (2007) 'Optimising data movement rates for parallel processing applications on graphics processors', Proceedings of the 25th International Conference Parallel and Distributed Computing and Networks (PDCN 2007), pp.251-256.
- (2007) Proceedings of the 25th International Conference Parallel and Distributed Computing and Networks (PDCN 2007) , pp. 251-256
- Harrison, O.¹ Waldron, J.²

32
- 84893641728
- A decade of reconfigurable computing: A visionary retrospective
- Hartenstein, R. (2001) 'A decade of reconfigurable computing: A visionary retrospective', Design, Automation and Test in Europe 2001, Proceedings, pp.642-649.
- (2001) Design, Automation and Test in Europe 2001, Proceedings , pp. 642-649
- Hartenstein, R.¹

33
- 4444328068
- Data-stream-based computing: Models and architectural resources
- Hartenstein, R. (2003) 'Data-stream-based computing: Models and architectural resources', International Conference on Microelectronics, Devices and Materials (MIDEM 2003), pp.228-235.
- (2003) International Conference on Microelectronics, Devices and Materials (MIDEM 2003) , pp. 228-235
- Hartenstein, R.¹

34
- 56349158990
- Technical Report, FZ Jülich, Zentralinstitut für Angewandte Mathematik
- Hoßfeld, F. (2001) Perspektiven für Supercomputer-Architekturen, Technical Report, FZ Jülich - Zentralinstitut für Angewandte Mathematik.
- (2001) Perspektiven für Supercomputer-Architekturen
- Hoßfeld, F.¹

35
- 14344259756
- IEC , 2nd ed, IEC, Geneva
- IEC (2000) Letter Symbols to be Used in Electrical Technology - Part 2: Telecommunications and Electronics, 2nd ed., IEC, Geneva.
- (2000) Letter Symbols to be Used in Electrical Technology - Part 2: Telecommunications and Electronics

36
- 56349100944
- Intel, Inc
- Intel, Inc. (2006) Geneseo: PCI Express Technology Advancement, http://www.intel.com/technology/pciexpress/devnet/innovation.htm
- (2006) Geneseo: PCI Express Technology Advancement

37
- 14044257293
- Terascale implicit methods for partial differential equations
- Keyes, D.E. (2002) 'Terascale implicit methods for partial differential equations', Contemporary Mathematics, Vol. 306, pp.29-84.
- (2002) Contemporary Mathematics , vol.306 , pp. 29-84
- Keyes, D.E.¹

38
- 84955473128
- Exploring the VLSI scalability of stream processors
- Khailany, B., Dally, W.J., Rixner, S., Kapasi, U.J., Owens, J.D. and Towles, B. (2003) 'Exploring the VLSI scalability of stream processors', Proceedings of the Ninth Symposium on High Performance Computer Architecture, pp.153-164.
- (2003) Proceedings of the Ninth Symposium on High Performance Computer Architecture , pp. 153-164
- Khailany, B.¹ Dally, W.J.² Rixner, S.³ Kapasi, U.J.⁴ Owens, J.D.⁵ Towles, B.⁶

39
- 34250364609
- PhD Thesis, Universität Dortmund, Fachbereich Mathematik
- Kilian, S. (2001) ScaRC: Ein verallgemeinertes Gebietszerlegungs-/Mehrgitterkonzept auf Parallelrechnern, PhD Thesis, Universität Dortmund, Fachbereich Mathematik.
- (2001) ScaRC: Ein verallgemeinertes Gebietszerlegungs-/Mehrgitterkonzept auf Parallelrechnern
- Kilian, S.¹

40
- 84880375785
- Kornhuber, R, Periaux, J, Widlund, O.B, Hoppe, R, Pironneau, O. and Xu, J, Eds, Springer
- Kornhuber, R., Periaux, J., Widlund, O.B., Hoppe, R., Pironneau, O. and Xu, J. (Eds.) (2005) Domain Decomposition Methods in Science and Engineering, Lecture Notes in Computational Science and Engineering, Vol. 40, Springer.
- (2005) Domain Decomposition Methods in Science and Engineering, Lecture Notes in Computational Science and Engineering , vol.40

41
- 0010828753
- Cache-aware multigrid methods for solving Poisson's equation in two dimensions
- Kowarschik, M., Weiß, C., Karl, W. and Rüde, U. (2000) 'Cache-aware multigrid methods for solving Poisson's equation in two dimensions', Computing, Vol. 64, pp.381-399.
- (2000) Computing , vol.64 , pp. 381-399
- Kowarschik, M.¹ Weiß, C.² Karl, W.³ Rüde, U.⁴

42
- 0242533310
- Linear algebra operators for GPU implementation of numerical algorithms
- Krüger, J. and Westermann, R. (2003) 'Linear algebra operators for GPU implementation of numerical algorithms', ACM Transactions on Graphics (TOG), Vol. 22, No. 3, pp.908-916.
- (2003) ACM Transactions on Graphics (TOG) , vol.22 , Issue.3 , pp. 908-916
- Krüger, J.¹ Westermann, R.²

43
- 34548206782
- Tools and techniques for performance - exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)
- Langou, J., Langou, J., Luszczek, P., Kurzak, J., Buttari, A. and Dongarra, J. (2006) 'Tools and techniques for performance - exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)', SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, p.113.
- (2006) SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing , pp. 113
- Langou, J.¹ Langou, J.² Luszczek, P.³ Kurzak, J.⁴ Buttari, A.⁵ Dongarra, J.⁶

44
- 19044370033
- Design, implementation and testing of extended and mixed precision BLAS
- Li, X.S., Demmel, J.W., Bailey, D.H., Henry, G., Hida, Y., Iskandar, J., Kahan, W., Kang, S.Y., Kapur, A., Martin, M.C., Thompson, B.J., Tung, T. and Yoo, D.J. (2002) 'Design, implementation and testing of extended and mixed precision BLAS', ACM Transactions on Mathematical Software, Vol. 28, No. 2, pp.152-205.
- (2002) ACM Transactions on Mathematical Software , vol.28 , Issue.2 , pp. 152-205
- Li, X.S.¹ Demmel, J.W.² Bailey, D.H.³ Henry, G.⁴ Hida, Y.⁵ Iskandar, J.⁶ Kahan, W.⁷ Kang, S.Y.⁸ Kapur, A.⁹ Martin, M.C.¹⁰ Thompson, B.J.¹¹ Tung, T.¹² Yoo, D.J.¹³

45
- 0012065017
- Handbook series linear algebra: Iterative refinement of the solution of a positive definite system of equations
- Martin, R., Peters, G. and Wilkinson, J. (1966) 'Handbook series linear algebra: Iterative refinement of the solution of a positive definite system of equations', Numerische Mathematik, Vol. 8, pp.203-216.
- (1966) Numerische Mathematik , vol.8 , pp. 203-216
- Martin, R.¹ Peters, G.² Wilkinson, J.³

46
- 32844469834
- Meuer, H., Strohmaier, E., Dongarra, J.J. and Simon, H.D. (2006) Top500 Supercomputer Sites, http://www.top500.org/
- (2006) Top500 Supercomputer Sites
- Meuer, H.¹ Strohmaier, E.² Dongarra, J.J.³ Simon, H.D.⁴

47
- 42149168865
- NVIDIA Corporation (2007) NVIDIA CUDA Compute Unified Device Architecture Programming Guide, http://developer.nvidia.com/cuda
- (2007) NVIDIA CUDA Compute Unified Device Architecture Programming Guide

48
- 33947588048
- A survey of general-purpose computation on graphics hardware
- Owens, J.D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A.E. and Purcell, T.J. (2007) 'A survey of general-purpose computation on graphics hardware', Computer Graphics Forum, Vol. 26, No. 1, pp.80-113.
- (2007) Computer Graphics Forum , vol.26 , Issue.1 , pp. 80-113
- Owens, J.D.¹ Luebke, D.² Govindaraju, N.³ Harris, M.⁴ Krüger, J.⁵ Lefohn, A.E.⁶ Purcell, T.J.⁷

49
- 77951558943
- A performance-oriented data parallel virtual machine for GPUs
- Peercy, M., Segal, M. and Gerstmann, D. (2006) 'A performance-oriented data parallel virtual machine for GPUs', ACM SIGGRAPH 2006 Conference Abstracts and Applications, p.184.
- (2006) ACM SIGGRAPH 2006 Conference Abstracts and Applications , pp. 184
- Peercy, M.¹ Segal, M.² Gerstmann, D.³

50
- 27344435504
- The design and implementation of a first-generation CELL processor
- Digest of Technical Papers
- Pham, D., Asano, S., Bolliger, M., Day, M.N., Hofstee, H.P., Johns, C., Kahle, J., Kameyama, A., Keaty, J., Masubuchi, Y., Riley, M., Shippy, D., Stasiak, D., Suzuoki, M., Wang, M., Warnock, J., Weitzel, S., Wendel, D., Yamazaki, T. and Yazawa, K. (2005) 'The design and implementation of a first-generation CELL processor', Solid-State Circuits Conference 2005, Digest of Technical Papers, Vol. 1, pp.184-592
- (2005) Solid-State Circuits Conference , vol.1 , pp. 184-592
- Pham, D.¹ Asano, S.² Bolliger, M.³ Day, M.N.⁴ Hofstee, H.P.⁵ Johns, C.⁶ Kahle, J.⁷ Kameyama, A.⁸ Keaty, J.⁹ Masubuchi, Y.¹⁰ Riley, M.¹¹ Shippy, D.¹² Stasiak, D.¹³ Suzuoki, M.¹⁴ Wang, M.¹⁵ Warnock, J.¹⁶ Weitzel, S.¹⁷ Wendel, D.¹⁸ Yamazaki, T.¹⁹ Yazawa, K.²⁰

51
- 0005288644
- Technological trends and their impact on the future of supercomputers
- Bungartz, H.-J, Durst, F. and Zenger, C, Eds
- Rüde, U. (1999) 'Technological trends and their impact on the future of supercomputers', in Bungartz, H.-J., Durst, F. and Zenger, C. (Eds.): High Performance Scientific and Engineering Computing, Lecture notes in Computational Science and Engineering, Vol. 8, pp.459-471.
- (1999) High Performance Scientific and Engineering Computing, Lecture notes in Computational Science and Engineering , vol.8 , pp. 459-471
- Rüde, U.¹

52
- 84880477362
- Graphics processor units: New prospects for parallel computing
- Bruaset, A.M. and Tveito, A, Eds, Springer
- Rumpf, M. and Strzodka, R. (2005) 'Graphics processor units: New prospects for parallel computing', in Bruaset, A.M. and Tveito, A. (Eds.): Numerical Solution of Partial Differential Equations on Parallel Computers, Lecture Notes in Computational Science and Engineering, Springer, Vol. 51, pp.89-134.
- (2005) Numerical Solution of Partial Differential Equations on Parallel Computers, Lecture Notes in Computational Science and Engineering , vol.51 , pp. 89-134
- Rumpf, M.¹ Strzodka, R.²

53
- 0037669851
- Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
- Sankaralingam, K., Nagarajan, R., Liu, H., Kim, C., Huh, J., Burger, D., Keckler, S.W. and Moore, C.R. (2003) 'Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture', ACM SIGARCH Computer Architecture News, Vol. 31, No. 2, pp.422-433.
- (2003) ACM SIGARCH Computer Architecture News , vol.31 , Issue.2 , pp. 422-433
- Sankaralingam, K.¹ Nagarajan, R.² Liu, H.³ Kim, C.⁴ Huh, J.⁵ Burger, D.⁶ Keckler, S.W.⁷ Moore, C.R.⁸

54
- 33845237994
- SEMATECH (2006) International Technology Roadmap for Semiconductors (ITRS), http://www.sematech.org/corporate/annual
- (2006) International Technology Roadmap for Semiconductors (ITRS)

55
- 56349149338
- A hardware redundancy and recovery mechanism for reliable scientific computation on graphics processors
- Aila, T. and Segal, M, Eds
- Sheaffer, J.W., Luebke, D.P. and Skadron, K. (2007) 'A hardware redundancy and recovery mechanism for reliable scientific computation on graphics processors', in Aila, T. and Segal, M. (Eds.): Graphics Hardware 2007, pp.55-64.
- (2007) Graphics Hardware 2007 , pp. 55-64
- Sheaffer, J.W.¹ Luebke, D.P.² Skadron, K.³

56
- 0003419897
- Cambridge University Press, Cambridge, UK
- Smith, B.F., Bjorstad, P.E. and Gropp, W.D. (1996) Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations, Cambridge University Press, Cambridge, UK.
- (1996) Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations
- Smith, B.F.¹ Bjorstad, P.E.² Gropp, W.D.³

57
- 56349103911
- Stanford University Graphics Lab
- Stanford University Graphics Lab (2006) GPUbench - How Much Does Your GPU Bench? http://graphics.stanford.edu/projects/gpubench/results
- (2006) GPUbench - How Much Does Your GPU Bench

58
- 33947600335
- PhD Thesis, University of Duisburg-Essen
- Strzodka, R. (2004) Hardware Efficient PDE Solvers in Quantised Image Processing, PhD Thesis, University of Duisburg-Essen.
- (2004) Hardware Efficient PDE Solvers in Quantised Image Processing
- Strzodka, R.¹

59
- 25844449063
- Scientific computation for simulations on programmable graphics hardware
- Strzodka, R., Doggett, M. and Kolb, A. (2005) 'Scientific computation for simulations on programmable graphics hardware', Simulation Modelling Practice and Theory, Special Issue: Programmable Graphics Hardware, Vol. 13, No. 8, pp.667-680.
- (2005) Simulation Modelling Practice and Theory, Special Issue: Programmable Graphics Hardware , vol.13 , Issue.8 , pp. 667-680
- Strzodka, R.¹ Doggett, M.² Kolb, A.³

60
- 20344389963
- Fast image registration in DX9 graphics hardware
- Strzodka, R., Droske, M. and Rumpf, M. (2003) 'Fast image registration in DX9 graphics hardware', Journal of Medical Informatics and Technologies, Vol. 6, pp.43-49.
- (2003) Journal of Medical Informatics and Technologies , vol.6 , pp. 43-49
- Strzodka, R.¹ Droske, M.² Rumpf, M.³

61
- 0038345686
- Suh, J., Kim, E.-G., Crago, S.P., Srinivasan, L. and French, M.C. (2003) 'A performance analysis of PIM, stream processing, and tiled processing on memory-intensive signal processing kernels', in DeGroot, D. (Ed.): ISCA '03: Proceedings of the 30th Annual International Symposium on Computer Architecture, Computer Architecture News, pp.410-421.

62
- 0036505033
- The raw microprocessor: A computational fabric for software circuits and general purpose programs
- Taylor, M.B., Kim, J.S., Miller, J., Wentzlaff, D., Ghodrat, F., Greenwald, B., Hoffmann, H., Johnson, P., Lee, J.-W., Lee, W., Ma, A., Saraf, A., Seneski, M., Shnidman, N., Strumpen, V., Frank, M., Amarasinghe, S.P. and Agarwal, A. (2002) 'The raw microprocessor: A computational fabric for software circuits and general purpose programs', IEEE Micro, Vol. 22, No. 2, pp.25-35.
- (2002) IEEE Micro , vol.22 , Issue.2 , pp. 25-35
- Taylor, M.B.¹ Kim, J.S.² Miller, J.³ Wentzlaff, D.⁴ Ghodrat, F.⁵ Greenwald, B.⁶ Hoffmann, H.⁷ Johnson, P.⁸ Lee, J.-W.⁹ Lee, W.¹⁰ Ma, A.¹¹ Saraf, A.¹² Seneski, M.¹³ Shnidman, N.¹⁴ Strumpen, V.¹⁵ Frank, M.¹⁶ Amarasinghe, S.P.¹⁷ Agarwal, A.¹⁸

63
- 14044267617
- Springer
- Toselli, A. and Widlund, O.B. (2004) Domain Decomposition Methods - Algorithms and Theory, Springer Series in Computational Mathematics, Vol. 34, Springer.
- (2004) Domain Decomposition Methods - Algorithms and Theory, Springer Series in Computational Mathematics , vol.34
- Toselli, A.¹ Widlund, O.B.²

64
- 0003529440
- Springer, Berlin
- Turek, S. (1999) Efficient Solvers for Incompressible Flow Problems: An Algorithmic and Computational Approach, Springer, Berlin.
- (1999) Efficient Solvers for Incompressible Flow Problems: An Algorithmic and Computational Approach
- Turek, S.¹

65
- 26444596160
- Hardware-oriented numerics and concepts for PDE software
- Turek, S., Becker, C. and Kilian, S. (2003) 'Hardware-oriented numerics and concepts for PDE software', Future Generation Computer Systems, Vol. 22, Nos. 1-2, pp.217-238.
- (2003) Future Generation Computer Systems , vol.22 , Issue.1-2 , pp. 217-238
- Turek, S.¹ Becker, C.² Kilian, S.³

66
- 0343462141
- Automated empirical optimisation of software and the ATLAS project
- Whaley, R.C., Petitet, A. and Dongarra, J.J. (2001) 'Automated empirical optimisation of software and the ATLAS project', Parallel Computing, Vol. 27, Nos. 1-2, pp.3-35.
- (2001) Parallel Computing , vol.27 , Issue.1-2 , pp. 3-35
- Whaley, R.C.¹ Petitet, A.² Dongarra, J.J.³

67
- 17044371311
- The memory gap (keynote)
- Wilkes, M. (2000) 'The memory gap (keynote)', Solving the Memory Wall Problem Workshop, http://www.ece.neu.edu/conf/wall2k/wilkes1.pdf
- (2000) Solving the Memory Wall Problem Workshop
- Wilkes, M.¹

68
- 34247349114
- The potential of the Cell processor for scientific computing
- Williams, S., Shalf, J., Oliker, L., Kamil, S., Husbands, P. and Yelick, K. (2006) 'The potential of the Cell processor for scientific computing', CF '06: Proceedings of the ACM International Conference on Computing Frontiers, pp.9-20.
- (2006) CF '06: Proceedings of the ACM International Conference on Computing Frontiers , pp. 9-20
- Williams, S.¹ Shalf, J.² Oliker, L.³ Kamil, S.⁴ Husbands, P.⁵ Yelick, K.⁶

69
- 40949114835
- XtremeData Inc
- XtremeData Inc. (2006) The XD1000 FPGA Coprocessor Module for Socket 940, http://www.xtremedatainc.com/Products.html
- (2006) The XD1000 FPGA Coprocessor Module for Socket 940

70
- 27844475071
- Genaue Lösung linearer Gleichungssysteme
- Zielke, G. and Drygalla, V. (2003) 'Genaue Lösung linearer Gleichungssysteme', GAMM-Mitteilungen, Vol. 2, No. 1, pp.7-107.
- (2003) GAMM-Mitteilungen , vol.2 , Issue.1 , pp. 7-107
- Zielke, G.¹ Drygalla, V.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.