SCOPUS Information Search Platform

Journal of Supercomputing

Volume 63, Issue 2, 2013, Pages 443-466

GPU-accelerated preconditioned iterative linear solvers

(2) Li, Ruipeng ᵃ; Saad, Yousef ᵃ

ᵃ University of Minnesota (United States)

Author keywords

GPU computing; Preconditioned iterative methods; Sparse matrix computations

Indexed keywords

GPU COMPUTING; ITERATIVE LINEAR SOLVER; MATRIX-VECTOR PRODUCTS; PRECONDITIONED GMRES; PRECONDITIONED ITERATIVE METHODS; PRECONDITIONING METHOD; PRECONDITIONING TECHNIQUES; SPARSE MATRIX COMPUTATIONS;

FACTORIZATION; MATLAB; MATRIX ALGEBRA; PROGRAM PROCESSORS;

ITERATIVE METHODS;

EID: 84877617833 PISSN: 09208542 EISSN: 15730484 Source Type: Journal
DOI: 10.1007/s11227-012-0825-3 Document Type: Article

Times cited: (208)

References (33)

1
- 34547281000
- The kill rule for multicore
- New York, NY, USA ACM New York 10.1145/1278480.1278668
- Agarwal A, Levy M (2007) The kill rule for multicore. In: DAC'07: proceedings of the 44th annual design automation conference, New York, NY, USA. ACM, New York, pp 750-753
- (2007) DAC'07: Proceedings of the 44th Annual Design Automation Conference , pp. 750-753
- Agarwal, A.¹ Levy, M.²

2
- 77953997924
- Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
- 012037 10.1088/1742-6596/180/1/012037
- Agullo E, Demmel J, Dongarra J, Hadri B, Kurzak J, Langou J, Ltaief H, Luszczek P, Tomov S (2009) Numerical linear algebra on emerging architectures: the PLASMA and MAGMA projects. J Phys Conf Ser 180(1):012037
- (2009) J Phys Conf Ser , vol.180 , Issue.1
- Agullo, E.¹ Demmel, J.² Dongarra, J.³ Hadri, B.⁴ Kurzak, J.⁵ Langou, J.⁶ Ltaief, H.⁷ Luszczek, P.⁸ Tomov, S.⁹

3
- 77952662514
- A parallel preconditioned conjugate gradient solver for the Poisson problem on a Multi-GPU platform
- Washington, DC, USA IEEE Comput. Soc. Los Alamitos
- Ament M, Knittel G, Weiskopf D, Strasser W (2010) A parallel preconditioned conjugate gradient solver for the Poisson problem on a Multi-GPU platform. In: PDP'10: proceedings of the 2010 18th euromicro conference on parallel, distributed and network-based processing, Washington, DC, USA. IEEE Comput. Soc., Los Alamitos, pp 583-592
- (2010) PDP'10: Proceedings of the 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing , pp. 583-592
- Ament, M.¹ Knittel, G.² Weiskopf, D.³ Strasser, W.⁴

4
- 74049163483
- Tech report, IBM Research
- Baskaran MM, Bordawekar R (2008) Optimizing sparse matrix-vector multiplication on GPUs. Tech report, IBM Research
- (2008) Optimizing Sparse Matrix-vector Multiplication on GPUs
- Baskaran, M.M.¹ Bordawekar, R.²

5
- 74049143158
- Implementing sparse matrix-vector multiplication on throughput-oriented processors
- New York, NY, USA ACM New York 10.1145/1654059.1654078
- Bell N, Garland M (2009) Implementing sparse matrix-vector multiplication on throughput-oriented processors. In: SC'09: proceedings of the conference on high performance computing networking, storage and analysis, New York, NY, USA. ACM, New York, pp 1-11
- (2009) SC'09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 1-11
- Bell, N.¹ Garland, M.²

6
- 80051888188
- Version 0.1.0
- Bell N, Garland M (2010) Cusp: generic parallel algorithms for sparse matrix and graph computations. Version 0.1.0
- (2010) Cusp: Generic Parallel Algorithms for Sparse Matrix and Graph Computations
- Bell, N.¹ Garland, M.²

7
- 0242533311
- Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
- 10.1145/882262.882364
- Bolz J, Farmer I, Grinspun E, Schröder P (2003) Sparse matrix solvers on the GPU: conjugate gradients and multigrid. ACM Trans Graph 22(3):917-924
- (2003) ACM Trans Graph , vol.22 , Issue.3 , pp. 917-924
- Bolz, J.¹ Farmer, I.² Grinspun, E.³ Schröder, P.⁴

8
- 77957679421
- Model-driven autotuning of sparse matrix-vector multiply on GPUs
- 10.1145/1837853.1693471
- Choi JW, Singh A, Vuduc RW (2010) Model-driven autotuning of sparse matrix-vector multiply on GPUs. ACM SIGPLAN Not 45:115-126
- (2010) ACM SIGPLAN Not , vol.45 , pp. 115-126
- Choi, J.W.¹ Singh, A.² Vuduc, R.W.³

9
- 0003699170
- Blaisdell Waltham 0111.06003
- Davis PJ (1963) Interpolation and approximation. Blaisdell, Waltham
- (1963) Interpolation and Approximation
- Davis, P.J.¹

10
- 0012453312
- Davis TA (1994) University of Florida sparse matrix collection, NA Digest
- (1994) University of Florida Sparse Matrix Collection, NA Digest
- Davis, T.A.¹

11
- 8644220273
- Tech report umsi-2001-32, Minnesota Supercomputer Institute, University of Minnesota, Minneapolis, MN
- Erhel J, Guyomarc'h F, Saad Y (2001) Least-squares polynomial filters for ill-conditioned linear systems. Tech report umsi-2001-32, Minnesota Supercomputer Institute, University of Minnesota, Minneapolis, MN
- (2001) Least-squares Polynomial Filters for Ill-conditioned Linear Systems
- Erhel, J.¹ Guyomarc'h, F.² Saad, Y.³

12
- 0024630993
- The evolution of the minimum degree ordering algorithm
- 986480 0671.65024 10.1137/1031001
- George A, Liu JWH (1989) The evolution of the minimum degree ordering algorithm. SIAM Rev 31(1):1-19
- (1989) SIAM Rev , vol.31 , Issue.1 , pp. 1-19
- George, A.¹ Liu, J.W.H.²

13
- 84871526672
- Georgescu S, Okuda H (2007) Conjugate gradients on graphic hardware: performance & feasibility
- (2007) Conjugate Gradients on Graphic Hardware: Performance & Feasibility
- Georgescu, S.¹ Okuda, H.²

14
- 84871518824
- Master's thesis, Delft Institute of Applied Mathematics, Delft University of Technology, 2628 BL, Delft, The Netherlands
- Gupta R (2009) A GPU implementation of a bubbly flow solver. Master's thesis, Delft Institute of Applied Mathematics, Delft University of Technology, 2628 BL, Delft, The Netherlands
- (2009) A GPU Implementation of A Bubbly Flow Solver
- Gupta, R.¹

15
- 0003734628
- Tech report, University of Minnesota, Department of Computer Science/Army HPC Research Center
- Karypis G, Kumar V (1998) Metis - a software package for partitioning unstructured graphs, partitioning meshes, and computing fill-reducing orderings of sparse matrices, version 4.0. Tech report, University of Minnesota, Department of Computer Science/Army HPC Research Center
- (1998) Metis - A Software Package for Partitioning Unstructured Graphs, Partitioning Meshes, and Computing Fill-reducing Orderings of Sparse Matrices, Version 4.0
- Karypis, G.¹ Kumar, V.²

16
- 0000094594
- An iteration method for the solution of the eigenvalue problem of linear differential and integral operators
- 42791 10.6028/jres.045.026
- Lanczos C (1950) An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J Res Natl Bur Stand 45:255-282
- (1950) J Res Natl Bur Stand , vol.45 , pp. 255-282
- Lanczos, C.¹

17
- 70350356359
- Implementing blocked sparse matrix-vector multiplication on NVIDIA GPUs
- K. Bertels N. Dimopoulos C. Silvano S. Wong (eds) Lecture notes in computer science 5657 Springer Berlin 10.1007/978-3-642-03138-0_32
- Monakov A, Avetisyan A (2009) Implementing blocked sparse matrix-vector multiplication on NVIDIA GPUs. In: Bertels K, Dimopoulos N, Silvano C, Wong S (eds) Embedded computer systems: architectures, modeling, and simulation. Lecture notes in computer science, vol 5657. Springer, Berlin, pp 289-297
- (2009) Embedded Computer Systems: Architectures, Modeling, and Simulation , pp. 289-297
- Monakov, A.¹ Avetisyan, A.²

18
- 77949577730
- Automatically tuning sparse matrix-vector multiplication for GPU architectures
- Y. Patt P. Foglia E. Duesterwald P. Faraboschi X. Martorell (eds) Lecture notes in computer science 5952 Springer Berlin 10.1007/978-3-642-11515-8_10
- Monakov A, Lokhmotov A, Avetisyan A (2010) Automatically tuning sparse matrix-vector multiplication for GPU architectures. In: Patt Y, Foglia P, Duesterwald E, Faraboschi P, Martorell X (eds) High performance embedded architectures and compilers. Lecture notes in computer science, vol 5952. Springer, Berlin, pp 111-125
- (2010) High Performance Embedded Architectures and Compilers , pp. 111-125
- Monakov, A.¹ Lokhmotov, A.² Avetisyan, A.³

19
- 84894310144
- NVIDIA
- NVIDIA (2012) CUBLAS library user guide 4.2
- (2012) CUBLAS Library User Guide 4.2

20
- 84877689944
- NVIDIA
- NVIDIA (2012) CUDA CUSPARSE Library
- (2012) CUDA CUSPARSE Library

21
- 35948991669
- NVIDIA
- NVIDIA (2012) NVIDIA CUDA C programming guide 4.2
- (2012) NVIDIA CUDA C Programming Guide 4.2

22
- 84886723333
- CoRR abs/1012.2270
- Oberhuber T, Suzuki A, Vacata J (2010) New row-grouped CSR format for storing the sparse matrices on GPU with implementation in CUDA. CoRR abs/1012.2270
- (2010) New Row-grouped CSR Format for Storing the Sparse Matrices on GPU with Implementation in CUDA
- Oberhuber, T.¹ Suzuki, A.² Vacata, J.³

23
- 0012189842
- Regular incomplete factorizations of real positive definite matrices
- 683212 0502.65018 10.1016/0024-3795(82)90101-X
- Robert Y (1982) Regular incomplete factorizations of real positive definite matrices. Linear Algebra Appl 48:105-117
- (1982) Linear Algebra Appl , vol.48 , pp. 105-117
- Robert, Y.¹

24
- 0003550735
- Tech report RIACS-90-20, Research Institute for Advanced Computer Science, NASA Ames Research Center, Moffett Field, CA
- Saad Y (1990) SPARSKIT: A basic tool kit for sparse matrix computations. Tech report RIACS-90-20, Research Institute for Advanced Computer Science, NASA Ames Research Center, Moffett Field, CA
- (1990) SPARSKIT: A Basic Tool Kit for Sparse Matrix Computations
- Saad, Y.¹

25
- 84985408358
- ILUT: A dual threshold incomplete ILU factorization
- 1306700 0838.65026 10.1002/nla.1680010405
- Saad Y (1994) ILUT: a dual threshold incomplete ILU factorization. Numer Linear Algebra Appl 1:387-402
- (1994) Numer Linear Algebra Appl , vol.1 , pp. 387-402
- Saad, Y.¹

26
- 1842829625
- 2 SIAM Philadelphia 1031.65046 10.1137/1.9780898718003
- Saad Y (2003) Iterative methods for sparse linear systems, 2nd edn. SIAM, Philadelphia
- (2003) Iterative Methods for Sparse Linear Systems
- Saad, Y.¹

27
- 78651284120
- ACM New York
- Sengupta S, Harris M, Zhang Y, Owens JD (2007) Scan primitives for GPU computing. Graphics hardware 2007. ACM, New York, pp 97-106
- (2007) Scan Primitives for GPU Computing. Graphics Hardware 2007 , pp. 97-106
- Sengupta, S.¹ Harris, M.² Zhang, Y.³ Owens, J.D.⁴

28
- 85087192496
- High performance manycore solvers for reservoir simulation
- Sudan H, Klie H, Li R, Saad Y (2010) High performance manycore solvers for reservoir simulation. In: 12th European conference on the mathematics of oil recovery
- (2010) 12th European Conference on the Mathematics of Oil Recovery
- Sudan, H.¹ Klie, H.² Li, R.³ Saad, Y.⁴

29
- 77949613205
- Tech report, Department of Computer Architecture and Electronics, University of Almeria
- Vázquez F, Garzon EM, Martinez JA, Fernandez JJ (2009) The sparse matrix vector product on GPUs. Tech report, Department of Computer Architecture and Electronics, University of Almeria
- (2009) The Sparse Matrix Vector Product on GPUs
- Vázquez, F.¹ Garzon, E.M.² Martinez, J.A.³ Fernandez, J.J.⁴

30
- 67650056991
- Tech report, Computer Science Division University of California at Berkeley
- Volkov V, Demmel J (2008) LU, QR and Cholesky factorizations using vector capabilities of GPUs. Tech report, Computer Science Division University of California at Berkeley
- (2008) LU, QR and Cholesky Factorizations Using Vector Capabilities of GPUs
- Volkov, V.¹ Demmel, J.²

31
- 68849108613
- Solving sparse linear systems on NVIDIA Tesla GPUs
- Springer Berlin
- Wang M, Klie H, Parashar M, Sudan H (2009) Solving sparse linear systems on NVIDIA Tesla GPUs. In: ICCS'09: proceedings of the 9th international conference on computational science. Springer, Berlin, pp 864-873
- (2009) ICCS'09: Proceedings of the 9th International Conference on Computational Science , pp. 864-873
- Wang, M.¹ Klie, H.² Parashar, M.³ Sudan, H.⁴

32
- 85017244495
- CRC Press Boca Raton 10.1201/b10376-8 Chap 5
- Williams S, Bell N, Choi JW, Garland M, Oliker L, Vuduc R (2010) Scientific computing with multicore and accelerators. CRC Press, Boca Raton, pp 83-109. Chap 5
- (2010) Scientific Computing with Multicore and Accelerators , pp. 83-109
- Williams, S.¹ Bell, N.² Choi, J.W.³ Garland, M.⁴ Oliker, L.⁵ Vuduc, R.⁶

33
- 33846337142
- Parallel self-consistent-field calculations via Chebyshev-filtered subspace acceleration
- 066704 10.1103/PhysRevE.74.066704
- Zhou Y, Saad Y, Tiago ML, Chelikowsky JR (2006) Parallel self-consistent-field calculations via Chebyshev-filtered subspace acceleration. Phys Rev E 74:066704
- (2006) Phys Rev E , vol.74
- Zhou, Y.¹ Saad, Y.² Tiago, M.L.³ Chelikowsky, J.R.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.