



Volume 45, Issue 5, 2010, Pages 127-136

Fast tridiagonal solvers on the GPU

Author keywords

GPGPU; Performance Optimization; Tridiagonal Linear System

Indexed keywords

CYCLIC REDUCTION; GPGPU; GPU PROGRAMS; LINEAR ALGORITHMS; MEMORY ACCESS; MULTITHREADED; PERFORMANCE OPTIMIZATIONS; RECURSIVE DOUBLING; TRI-DIAGONAL SOLVER; TRIDIAGONAL;

EID: 77957607987     PISSN: 15232867     EISSN: None     Source Type: Journal    
DOI: 10.1145/1837853.1693472     Document Type: Conference Paper
Times cited : (128)

References (33)
  • 2
    • 77957599342
    • NVIDIA CUDA compute unified device architecture, programming guide, Version 2.0
    • NVIDIA CUDA compute unified device architecture, programming guide, 2009. Version 2.0.
    • (2009)
  • 7
    • 10044249026
    • An overlapped two-way method for solving tridiagonal linear systems in a bsp computer
    • J.-J. Climent, C. Perea, L. Tortosa, and A. Zamora. An overlapped two-way method for solving tridiagonal linear systems in a bsp computer. Applied Mathematics and Computation, 161(2):475-500, 2005.
    • (2005) Applied Mathematics and Computation , vol.161 , Issue.2 , pp. 475-500
    • Climent, J.-J.1    Perea, C.2    Tortosa, L.3    Zamora, A.4
  • 9
    • 0042458763
    • An analysis of the recursive doubling algorithm
    • P. Dubois and G. Rodrigue. An analysis of the recursive doubling algorithm. In D. Kuck, D. Lawrie, and A. Sameh, editors, High Speed Computer and Algorithm Organization, pages 299-305. Academic Press, New York, NY, 1977.
    • (1977) High Speed Computer and Algorithm Organization , pp. 299-305
    • Dubois, P.1    Rodrigue, G.2
  • 10
    • 0040098094
    • A recursive doubling algorithm for solution of tridiagonal systems on hypercube multiprocessors
    • Ö. Eǧecioǧlu, C. K. Koc, and A. J. Laub. A recursive doubling algorithm for solution of tridiagonal systems on hypercube multiprocessors. Journal of Computational and Applied Mathematics, 27:95-108, 1989.
    • (1989) Journal of Computational and Applied Mathematics , vol.27 , pp. 95-108
    • Eǧecioǧlu, Ö.1    Koc, C.K.2    Laub, A.J.3
  • 13
    • 2342522154
    • Evaluation of vertical coordinate and vertical mixing algorithms in the HYbrid-Coordinate Ocean Model (HYCOM)
    • G. R. Halliwell. Evaluation of vertical coordinate and vertical mixing algorithms in the HYbrid-Coordinate Ocean Model (HYCOM). Ocean Modelling, 7:285-322, 2004.
    • (2004) Ocean Modelling , vol.7 , pp. 285-322
    • Halliwell, G.R.1
  • 15
    • 0000490624
    • Optimizing tridiagonal solvers for alternating direction methods on boolean cube multiprocessors
    • C. T. Ho and S. L. Johnsson. Optimizing tridiagonal solvers for alternating direction methods on boolean cube multiprocessors. SIAM Journal of Scientific and Statistical Computing, 11(3):563-592, 1990.
    • (1990) SIAM Journal of Scientific and Statistical Computing , vol.11 , Issue.3 , pp. 563-592
    • Ho, C.T.1    Johnsson, S.L.2
  • 16
    • 84932220767
    • A fast direct solution of Poisson's equation using Fourier analysis
    • R. W. Hockney. A fast direct solution of Poisson's equation using Fourier analysis. Journal of the ACM, 12(1):95-113, Jan. 1965.
    • (1965) Journal of the ACM , vol.12 , Issue.1 , pp. 95-113
    • Hockney, R.W.1
  • 18
    • 0032024603
    • Two-way BSP algorithm for tridiagonal systems
    • Y. Huang and W. F. McColl. Two-way BSP algorithm for tridiagonal systems. Future Generation Computer Systems, 13:337-347, Mar. 1998.
    • (1998) Future Generation Computer Systems , vol.13 , pp. 337-347
    • Huang, Y.1    McColl, W.F.2
  • 19
    • 70449768671
    • Interactive depth of field using simulated diffusion
    • M. Kass, A. Lefohn, and J. D. Owens. Interactive depth of field using simulated diffusion. Technical Report 06-101, Pixar Animation Studios, Jan. 2006.
    • (2006) Technical Report , pp. 06-101
    • Kass, M.1    Lefohn, A.2    Owens, J.D.3
  • 21
    • 84976719982
    • The solution of tridiagonal linear systems on the CDC STAR-100 computer
    • J. J. Lambiotte and R. G. Voigt. The solution of tridiagonal linear systems on the CDC STAR-100 computer. ACM Trans. Math. Software, 1(4):308-329, 1975.
    • (1975) ACM Trans. Math. Software , vol.1 , Issue.4 , pp. 308-329
    • Lambiotte, J.J.1    Voigt, R.G.2
  • 22
    • 0026170724
    • A method to parallelize tridiagonal solvers
    • S. M. Müller and D. Sheerer. A method to parallelize tridiagonal solvers. Parallel Computing, 17:181-188, 1991.
    • (1991) Parallel Computing , vol.17 , pp. 181-188
    • Müller, S.M.1    Sheerer, D.2
  • 27
    • 84976729385
    • An efficient parallel algorithm for the solution of a tridiagonal linear system of equations
    • H. S. Stone. An efficient parallel algorithm for the solution of a tridiagonal linear system of equations. Journal of the ACM, 20(1):27-38, Jan. 1973.
    • (1973) Journal of the ACM , vol.20 , Issue.1 , pp. 27-38
    • Stone, H.S.1
  • 28
    • 0026825865
    • Efficient tridiagonal solvers on multicomputers
    • X.-H. Sun, H. Zhang, and L. M. Ni. Efficient tridiagonal solvers on multicomputers. IEEE Transactions on Computers, C-41(3):286-296, Mar. 1992.
    • (1992) IEEE Transactions on Computers , Issue.3 , pp. 286-296
    • Sun, X.-H.1    Zhang, H.2    Ni, L.M.3
  • 29
    • 1342282168
    • A parallel two-level hybrid method for tridiagonal systems and its application to fast Poisson solvers
    • X.-H. Sun and W. Zhang. A parallel two-level hybrid method for tridiagonal systems and its application to fast Poisson solvers. IEEE Transactions on Parallel and Distributed Systems, PDS-15(2):97-106, Feb. 2004.
    • (2004) IEEE Transactions on Parallel and Distributed Systems , Issue.2 , pp. 97-106
    • Sun, X.-H.1    Zhang, W.2
  • 31
    • 84856252478
    • Using GPUs to accelerate the bisection algorithm for finding eigenvalues of symmetric tridiagonal matrices
    • V. Volkov and J. W. Demmel. Using GPUs to accelerate the bisection algorithm for finding eigenvalues of symmetric tridiagonal matrices. LAPACK Working Note 197, Department of Computer Science, University of Tennessee, Knoxville, Jan. 2008.
    • (2008) LAPACK Working Note 197, Department of Computer Science
    • Volkov, V.1    Demmel, J.W.2
  • 32
    • 0019575493
    • A parallel method for tridiagonal equations
    • H. H. Wang. A parallel method for tridiagonal equations. ACM Trans. Math. Software, 7:170-183, 1981.
    • (1981) ACM Trans. Math. Software , vol.7 , pp. 170-183
    • Wang, H.H.1
  • 33
    • 65949107549
    • Roofline: An insightful visual performance model for multicore architectures
    • S. Williams, A. Waterman, and D. Patterson. Roofline: An insightful visual performance model for multicore architectures. Commun. ACM, 52(4):65-76, 2009.
    • (2009) Commun. ACM , vol.52 , Issue.4 , pp. 65-76
    • Williams, S.1    Waterman, A.2    Patterson, D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.