



Volume 68, Issue 10, 2008, Pages 1360-1369

Algorithmic performance studies on graphics processing units

Author keywords

Graphics processing units; Matrix decomposition; Nonlinear optimization; Parallel processing; Sparse direct solvers

Indexed keywords

APPLICATIONS; COLOR IMAGE PROCESSING; CONSTRAINED OPTIMIZATION; CONSTRAINT THEORY; DIGITAL ARITHMETIC; IMAGE CODING; INTEGRATION; OPTIMIZATION; PARTIAL DIFFERENTIAL EQUATIONS; RHENIUM COMPOUNDS;

EID: 51449090534     PISSN: 07437315     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.jpdc.2008.05.008     Document Type: Article
Times cited: 55

References (32)
  • 1
    • 51449106544
    • AMD Stream Processor: http://ati.amd.com/products/streamprocessor/index.html
  • 3
    • 51449116136
    • R.G. Belleman, J. Bedorf, S.P. Zwart, High performance direct gravitational N-body simulations on graphics processing units-II: An implementation in CUDA, New Astronomy, 2007 (in press)
  • 5
    • 51449112186
    • T. Davis, University of Florida Sparse Matrix Collection, University of Florida, Gainesville. http://www.cise.ufl.edu/davis/sparse/
  • 8
    • 33845468997
    • N. Galoppo, N.K. Govindaraju, M. Henson, D. Manocha, LU-GPU: Efficient algorithms for solving dense linear systems on graphics hardware, in: ACM/IEEE SC 2005 Conference, SC'05, 2005, p. 3
  • 9
    • 33947588604
    • Göddeke D., Strzodka R., and Turek S. Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations. International Journal of Parallel, Emergent and Distributed Systems 22 (4) (2007) 221-256
  • 10
    • 34547483006
    • Gould N.I.M., Hu Y., and Scott J.A. A numerical evaluation of sparse direct solvers for the solution of large sparse, symmetric linear systems of equations. ACM Transactions on Mathematical Software (TOMS) 33 (2) (2007)
  • 11
    • 51449122534
    • GPUBench: A benchmark suite designed to analyze the performance of programmable graphics processors: http://graphics.stanford.edu/projects/gpubench
  • 12
    • 51449092409
    • Jin Hyuk Jung, Dianne P. O'Leary, Implementing an interior point method for linear programs on a CPU-GPU system, in: Electronic Transactions on Numerical Analysis, ETNA, 2008 (in press)
  • 13
    • 51449122079
    • Intel Math Kernel Library 9.1 - Sparse Solvers: http://www.intel.com/cd/software/products/asmo-na/eng/266853.htm
  • 14
    • 0030304771
    • Ito K., and Kunisch K. Augmented Lagrangian-SQP methods for nonlinear optimal control problems of tracking type. SIAM Journal on Optimization 6 (1996) 96-125
  • 15
    • 51449112632
    • Andreas Kolb, Nicolas Cuntz, Dynamic particle coupling for GPU-based fluid simulation, in: Proceedings 18th Symposium on Simulation Techniques, 2005, pp. 722-727
  • 16
    • 0242533310
    • Krüger J., and Westermann R. Linear algebra operators for GPU implementation of numerical algorithms. ACM Transactions on Graphics (TOG) 22 (3) (2003) 908-916
  • 17
    • 51449087128
    • J. Kurzak, J. Dongarra, Implementation of the mixed-precision high performance LINPACK Benchmark on the CELL Processor. Technical report UT-CS-06-580, LAPACK Working Note 177, University of Tennessee Computer Science, September 2006
  • 18
    • 51449108826
    • Michael McCool, Data-parallel programming on the Cell BE and the GPU using the RapidMind development platform, GSPx Multicore Applications Conference, Santa Clara, Oct. 31 to Nov. 2, 2006
  • 19
    • 51449122313
    • Microsoft's DirectX - A collection of application programming interfaces for handling tasks related to multimedia, especially game programming and video, on Microsoft platforms: http://www.microsoft.com/windows/directx
  • 20
    • 0000018801
    • Ng E.G., and Peyton B.W. Block sparse Cholesky algorithms on advanced uniprocessor computers. SIAM Journal on Scientific Computing 14 (1993) 1034-1056
  • 21
    • 51449084257
    • NVIDIA CUDA Homepage: http://developer.nvidia.com/object/cuda.html
  • 22
    • 51449118648
    • OpenGL - The Industry's Foundation for High Performance Graphics: http://www.opengl.org
  • 24
    • 0001102965
    • Rothberg E. Performance of panel and block approaches to sparse Cholesky factorization on the iPSC/860 and Paragon multicomputers. SIAM Journal on Scientific Computing 17 (3) (1996) 699-711
  • 25
    • 51449118213
    • E. Rothberg, A. Gupta, Techniques for improving the performance of sparse matrix factorization on multiprocessor workstations, in: Supercomputing '90. ACM-IEEE, 1990
  • 26
    • 1642370513
    • Schenk O., and Gärtner K. Solving unsymmetric sparse systems of linear equations with PARDISO. Journal of Future Generation Computer Systems 20 (3) (2004) 475-487
  • 27
    • 33846230007
    • Schenk O., and Gärtner K. On fast factorization pivoting methods for symmetric indefinite systems. Electronic Transactions on Numerical Analysis 23 (1) (2006) 158-179
  • 28
    • 78651284120
    • Shubhabrata Sengupta, Mark Harris, Yao Zhang, John D. Owens, Scan primitives for GPU computing, in: Proceedings Graphics Hardware 2007, August 2007, pp. 97-106
  • 29
    • 51449100390
    • The Cell project at IBM Research: http://www.research.ibm.com/cell/
  • 30
    • 29144523061
    • Wächter A., and Biegler L.T. On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming. Mathematical Programming 106 (1) (2006) 25-57
  • 31
    • 51449090359
    • Samuel W. Williams, Leonid Oliker, Richard Vuduc, Katherine Yelick, James Demmel, John Shalf, Optimization of sparse matrix-vector multiplication on emerging multicore platforms, in: Proceedings of Supercomputing '07, Nov. 2007
  • 32
    • 51449094605
    • Yasuda K. Two-electron integral evaluation on the graphics processor unit. Journal of Computational Chemistry (2007)


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.