SCOPUS Information Search Platform

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Volume , Issue , 2012, Pages

A scalable, numerically stable, high-performance tridiagonal solver using GPUs

(4) Chang, Li Wen (a); Stratton, John A. (a); Kim, Hee Seok (a); Hwu, Wen Mei W. (a)

(a) UNIVERSITY OF ILLINOIS AT URBANA CHAMPAIGN (United States)

Author keywords

GPGPU; GPU Computing; SPIKE; Tridiagonal Solver

Indexed keywords

GPGPU; GPU COMPUTING; HIGH-THROUGHPUT DATA; MEMORY EFFICIENCY; OPTIMIZATION STRATEGY; SPIKE; STABLE SOLUTIONS; TRI-DIAGONAL SOLVER;

ALGORITHMS; DIGITAL STORAGE; MATLAB; MATRIX ALGEBRA; METADATA;

PROGRAM PROCESSORS;

EID: 84877702106 PISSN: 21674329 EISSN: 21674337 Source Type: Conference Proceeding
DOI: 10.1109/SC.2012.12 Document Type: Conference Paper

Times cited : (54)

References (38)

1
- 70449768671
- tech. rep.
- A. Lefohn and J. Owens (UC Davis), "Interactive depth of field using simulated diffusion," tech. rep., 2006.
- (2006) Interactive Depth of Field Using Simulated Diffusion
- Lefohn, A.¹ Owens, J.²

2
- 78651284120
- Scan primitives for GPU computing
- Aug.
- S. Sengupta, M. Harris, Y. Zhang, and J. D. Owens, "Scan primitives for GPU computing," in Graphics Hardware 2007, pp. 97-106, Aug. 2007.
- (2007) Graphics Hardware 2007 , pp. 97-106
- Sengupta, S.¹ Harris, M.² Zhang, Y.³ Owens, J.D.⁴

3
- 80155186310
- Tridiagonal solvers on the GPU and applications to fluid simulation
- N. Sakharnykh, "Tridiagonal solvers on the GPU and applications to fluid simulation," NVIDIA GPU Technology Conference, September 2009.
- NVIDIA GPU Technology Conference, September 2009
- Sakharnykh, N.¹

4
- 80155169335
- Efficient tridiagonal solvers for ADI methods and fluid simulation
- N. Sakharnykh, "Efficient tridiagonal solvers for ADI methods and fluid simulation," NVIDIA GPU Technology Conference, September 2010.
- NVIDIA GPU Technology Conference, September 2010
- Sakharnykh, N.¹

5
- 84932220767
- A fast direct solution of Poisson's equation using Fourier analysis
- January
- R. W. Hockney, "A fast direct solution of Poisson's equation using Fourier analysis," J. ACM, vol. 12, pp. 95-113, January 1965.
- (1965) J. ACM , vol.12 , pp. 95-113
- Hockney, R.W.¹

6
- 0003470761
- Philadelphia: SIAM
- A. Greenbaum, Iterative Methods for Solving Linear Systems. Philadelphia: SIAM, 1997.
- (1997) Iterative Methods for Solving Linear Systems
- Greenbaum, A.¹

7
- 80051640515
- Parallel implementation of multi-dimensional ensemble empirical mode decomposition
- L.-W. Chang, M.-T. Lo, N. Anssari, K.-H. Hsu, N. Huang, and W.-M. Hwu, "Parallel implementation of multi-dimensional ensemble empirical mode decomposition," International Conference on Acoustics, Speech, and Signal Processing, May 2011.
- International Conference on Acoustics, Speech, and Signal Processing, May 2011
- Chang, L.-W.¹ Lo, M.-T.² Anssari, N.³ Hsu, K.-H.⁴ Huang, N.⁵ Hwu, W.-M.⁶

8
- 78649807974
- Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid
- D. Göddeke and R. Strzodka, "Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid," IEEE Transactions on Parallel and Distributed Systems, vol. 22, pp. 22-32, 2011.
- (2011) IEEE Transactions on Parallel and Distributed Systems , vol.22 , pp. 22-32
- Göddeke, D.¹ Strzodka, R.²

9
- 80053259579
- An auto-tuned method for solving large tridiagonal systems on the GPU
- A. Davidson, Y. Zhang, and J. D. Owens, "An auto-tuned method for solving large tridiagonal systems on the GPU," in Proceedings of the 25th IEEE International Parallel and Distributed Processing Symposium, May 2011.
- Proceedings of the 25th IEEE International Parallel and Distributed Processing Symposium, May 2011
- Davidson, A.¹ Zhang, Y.² Owens, J.D.³

10
- 80155140292
- A scalable tridiagonal solver for GPUs
- Sept.
- H.-S. Kim, S. Wu, L.-W. Chang, and W.-M. Hwu, "A scalable tridiagonal solver for GPUs," in Parallel Processing (ICPP), 2011 International Conference on, pp. 444-453, Sept. 2011.
- (2011) Parallel Processing (ICPP), 2011 International Conference on , pp. 444-453
- Kim, H.-S.¹ Wu, S.² Chang, L.-W.³ Hwu, W.-M.⁴

11
- 0004039746
- Bristol: Hilger
- R. W. Hockney and C. R. Jesshope, Parallel Computers: Architecture, Programming and Algorithms. Bristol: Hilger, 1981.
- (1981) Parallel Computers: Architecture, Programming and Algorithms
- Hockney, R.W.¹ Jesshope, C.R.²

12
- 0003801686
- McGraw-Hill Higher Education, 3rd ed.
- S. D. Conte and C. de Boor, Elementary Numerical Analysis: An Algorithmic Approach. McGraw-Hill Higher Education, 3rd ed., 1980.
- (1980) Elementary Numerical Analysis: An Algorithmic Approach
- Conte, S.D.¹ de Boor, C.²

13
- 84877689944
- NVIDIA Corporation, Jan.
- NVIDIA Corporation, CUDA CUSPARSE Library, Jan. 2012.
- (2012) CUDA CUSPARSE Library

14
- 84867265648
- Register packing for cyclic reduction: A case study
- A. Davidson and J. D. Owens, "Register packing for cyclic reduction: A case study," in Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, Mar. 2011.
- Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, Mar. 2011
- Davidson, A.¹ Owens, J.D.²

15
- 80053275518
- Feb.
- D. Egloff, "High performance finite difference PDE solvers on GPUs." http://download.quantalea.net/fdm-gpu.pdf, Feb. 2010.
- (2010) High Performance Finite Difference PDE Solvers on GPUs
- Egloff, D.¹

16
- 84875973859
- Pricing financial derivatives with high performance finite difference solvers on GPUs
- in press
- D. Egloff, "Pricing financial derivatives with high performance finite difference solvers on GPUs," in GPU Computing Gems, in press.
- GPU Computing Gems
- Egloff, D.¹

17
- 77957607987
- Fast tridiagonal solvers on the GPU
- ACM
- Y. Zhang, J. Cohen, and J. D. Owens, "Fast tridiagonal solvers on the GPU," in Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '10, (New York, NY, USA), pp. 127-136, ACM, 2010.
- (2010) Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '10, (New York, NY, USA) , pp. 127-136
- Zhang, Y.¹ Cohen, J.² Owens, J.D.³

18
- 84910052222
- A hybrid method for solving tridiagonal systems on the GPU
- Morgan Kaufmann, Aug.
- Y. Zhang, J. Cohen, A. A. Davidson, and J. D. Owens, "A hybrid method for solving tridiagonal systems on the GPU," in GPU Computing Gems, vol. 2, Morgan Kaufmann, Aug. 2011.
- (2011) GPU Computing Gems , vol.2
- Zhang, Y.¹ Cohen, J.² Davidson, A.A.³ Owens, J.D.⁴

19
- 84976729385
- An efficient parallel algorithm for the solution of a tridiagonal linear system of equations
- January
- H. S. Stone, "An efficient parallel algorithm for the solution of a tridiagonal linear system of equations," J. ACM, vol. 20, pp. 27-38, January 1973.
- (1973) J. ACM , vol.20 , pp. 27-38
- Stone, H.S.¹

20
- 31044454001
- A parallel hybrid banded system solver: The SPIKE algorithm
- DOI 10.1016/j.parco.2005.07.005, PII S0167819105001353, Parallel Matrix Algorithms and Applications (PMAA '04)
- E. Polizzi and A. H. Sameh, "A parallel hybrid banded system solver: The SPIKE algorithm," Parallel Computing, vol. 32, no. 2, pp. 177-194, 2006. (Pubitemid 43120209)
- (2006) Parallel Computing , vol.32 , Issue.2 , pp. 177-194
- Polizzi, E.¹ Sameh, A.H.²

21
- 33750298089
- SPIKE: A parallel environment for solving banded linear systems
- DOI 10.1016/j.compfluid.2005.07.005, PII S0045793005001325, Challenges and Advances in Flow Simulation and Modeling
- E. Polizzi and A. Sameh, "SPIKE: A parallel environment for solving banded linear systems," Computers and Fluids, vol. 36, no. 1, pp. 113-120, 2007. (Pubitemid 44615421)
- (2007) Computers and Fluids , vol.36 , Issue.1 , pp. 113-120
- Polizzi, E.¹ Sameh, A.²

22
- 78649902840
- Generalized diagonal pivoting methods for tridiagonal systems without interchanges
- J. B. Erway, R. F. Marcia, and J. Tyson, "Generalized diagonal pivoting methods for tridiagonal systems without interchanges," IAENG International Journal of Applied Mathematics, vol. 40, no. 4, pp. 269-275, 2010.
- (2010) IAENG International Journal of Applied Mathematics , vol.40 , Issue.4 , pp. 269-275
- Erway, J.B.¹ Marcia, R.F.² Tyson, J.³

23
- 78149251414
- Data layout transformation exploiting memory-level parallelism in structured grid manycore applications
- ACM
- I.-J. Sung, J. A. Stratton, and W.-M. W. Hwu, "Data layout transformation exploiting memory-level parallelism in structured grid manycore applications," in Proceedings of the 19th international conference on Parallel architectures and compilation techniques, PACT '10, (New York, NY, USA), pp. 513-522, ACM, 2010.
- (2010) Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT '10, (New York, NY, USA) , pp. 513-522
- Sung, I.-J.¹ Stratton, J.A.² Hwu, W.-M.W.³

24
- 84870691946
- DL: A data layout transformation system for heterogeneous computing
- (In Press), May
- I.-J. Sung, G. D. Liu, and W.-M. W. Hwu, "DL: A data layout transformation system for heterogeneous computing," in IEEE Conference on Innovative Parallel Computing (In Press), May 2012.
- (2012) IEEE Conference on Innovative Parallel Computing
- Sung, I.-J.¹ Liu, G.D.² Hwu, W.-M.W.³

25
- 84870714904
- MPI Forum, "Message Passing Interface (MPI) Forum Home Page." http://www.mpi-forum.org/.
- Message Passing Interface (MPI) Forum Home Page

26
- 78650639324
- OpenMP, "The OpenMP API specification for parallel programming." http://openmp.org.
- The OpenMP API Specification for Parallel Programming

27
- 84872201157
- Intel, "Math Kernel Library." http://developer.intel.com/software/products/mkl/.
- Math Kernel Library

28
- 84877702147
- Intel, "Intel adaptive spike-based solver." http://software.intel.com/en-us/articles/intel-adaptive-spike-based-solver/.
- Intel Adaptive Spike-based Solver

29
- 84875379930
- version 7.10.0 (R2010a). Natick, Massachusetts: The MathWorks Inc.
- MATLAB, version 7.10.0 (R2010a). Natick, Massachusetts: The MathWorks Inc., 2010.
- (2010) MATLAB

30
- 82955212653
- NVIDIA Corporation, Nov.
- NVIDIA Corporation, CUDA Programming Guide 4.1, Nov. 2011.
- (2011) CUDA Programming Guide 4.1

31
- 84877689879
- K. Group
- K. Group, "OpenCL." http://www.khronos.org/opencl/.
- OpenCL

32
- 77952251540
- An asymmetric distributed shared memory model for heterogeneous parallel systems
- ACM
- I. Gelado, J. E. Stone, J. Cabezas, S. Patel, N. Navarro, and W.-M. W. Hwu, "An asymmetric distributed shared memory model for heterogeneous parallel systems," in Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems, ASPLOS '10, (New York, NY, USA), pp. 347-358, ACM, 2010.
- (2010) Proceedings of the Fifteenth Edition of ASPLOS on Architectural Support for Programming Languages and Operating Systems, ASPLOS '10, (New York, NY, USA) , pp. 347-358
- Gelado, I.¹ Stone, J.E.² Cabezas, J.³ Patel, S.⁴ Navarro, N.⁵ Hwu, W.-M.W.⁶

33
- 84966228742
- Some stable methods for calculating inertia and solving symmetric linear systems
- January
- J. R. Bunch and L. Kaufman, "Some stable methods for calculating inertia and solving symmetric linear systems," Math. Comp., vol. 31, pp. 163-179, January 1977.
- (1977) Math. Comp. , vol.31 , pp. 163-179
- Bunch, J.R.¹ Kaufman, L.²

34
- 84867431833
- NVIDIA Corporation, Jan.
- NVIDIA Corporation, Compute Visual Profiler, Jan. 2012.
- (2012) Compute Visual Profiler

35
- 72049106942
- GPU clusters for high-performance computing
- V. V. Kindratenko, J. J. Enos, G. Shi, M. T. Showerman, G. W. Arnold, J. E. Stone, J. C. Phillips, and W.-M. Hwu, "GPU clusters for high-performance computing," 2009 IEEE International Conference on Cluster Computing and Workshops, pp. 1-8, 2009.
- (2009) 2009 IEEE International Conference on Cluster Computing and Workshops , pp. 1-8
- Kindratenko, V.V.¹ Enos, J.J.² Shi, G.³ Showerman, M.T.⁴ Arnold, G.W.⁵ Stone, J.E.⁶ Phillips, J.C.⁷ Hwu, W.-M.⁸

36
- 84870698295
- National Center for Supercomputing Applications at the University of Illinois, "Dell nvidia linux cluster forge." http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/DellNVIDIACluster/.
- Dell Nvidia Linux Cluster Forge

37
- 0032395502
- Reliable computation of the condition number of a tridiagonal matrix in O(n) time
- July
- I. S. Dhillon, "Reliable computation of the condition number of a tridiagonal matrix in O(n) time," SIAM J. Matrix Anal. Appl., vol. 19, pp. 776-796, July 1998.
- (1998) SIAM J. Matrix Anal. Appl. , vol.19 , pp. 776-796
- Dhillon, I.S.¹

38
- 33746281706
- Computing the condition number of tridiagonal and diagonal-plus-semiseparable matrices in linear time
- July
- G. I. Hargreaves, "Computing the condition number of tridiagonal and diagonal-plus-semiseparable matrices in linear time," SIAM J. Matrix Anal. Appl., vol. 27, pp. 801-820, July 2005.
- (2005) SIAM J. Matrix Anal. Appl. , vol.27 , pp. 801-820
- Hargreaves, G.I.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.