



Volume 68, Issue 10, 2008, Pages 1370-1380

A performance study of general-purpose applications on graphics processors using CUDA

Author keywords

CUDA; GPGPU; GPU; Graphics processors; Heterogeneous computing organizations; Manycore; Multicore; OpenMP; Parallel programming

Indexed keywords

DATA STORAGE EQUIPMENT; GENERAL PURPOSE COMPUTERS; PROGRAMMING THEORY;

EID: 51449118065     PISSN: 07437315     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.jpdc.2008.05.014     Document Type: Article
Times cited: 488

References (31)
  • 1
    • 51449120100
    • K. Asanovic, R. Bodik, B.C. Catanzaro, J.J. Gebis, P. Husbands, K. Keutzer, D.A. Patterson, W.L. Plishker, J. Shalf, S.W. Williams, K.A. Yelick, The landscape of parallel computing research: A view from Berkeley, Tech. Rep. UCB/EECS-2006-183, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Dec 2006
  • 2
    • 0025627052
    • T. Blank, The MasPar MP-1 architecture, in: Compcon Spring '90. Intellectual Leverage. Digest of Papers. Thirty-Fifth IEEE Computer Society International Conference, 1990, pp. 20-24
  • 3
    • 51449114717
    • M. Boyer, K. Skadron, W. Weimer, Automated dynamic analysis of CUDA programs, in: Third Workshop on Software Tools for MultiCore Systems, 2008
  • 5
    • 51449120786
    • M. Damrudi, K.J. Aval, Parallel sorting on ILLIAC array processor, in: Proc. of the 7th WSEAS International Conference on Systems Theory and Scientific Computation, 2007
  • 6
    • 51449110631
    • T. Goodale, G. Allen, G. Lanfermann, J. Massó, T. Radke, E. Seidel, J. Shalf, The Cactus framework and toolkit: Design and applications, in: Proc. of Vector and Parallel Processing, 2002
  • 7
    • 3142739595
    • N.K. Govindaraju, B. Lloyd, W. Wang, M. Lin, D. Manocha, Fast computation of database operations using graphics processors, in: Proc. of the ACM SIGMOD International Conference on Management of Data, 2004
  • 8
    • 78651284090
    • M.J. Harris, W.V. Baxter, T. Scheuermann, A. Lastra, Simulation of cloud dynamics on graphics hardware, in: Proc. of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, 2003
  • 10
    • 34547288235
    • W.W. Hwu, S. Ryoo, S.-Z. Ueng, J.H. Kelm, I. Gelado, S.S. Stone, R.E. Kidd, S.S. Baghsorkhi, A. Mahesri, S.C. Tsao, N. Navarro, S.S. Lumetta, M.I. Frank, S.J. Patel, Implicitly parallel programming models for thousand-core microprocessors, in: Proc. of the 44th ACM/IEEE Design Automation Conference, 2007
  • 11
    • 51449096476
    • KDD cup 1999 data, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
  • 12
    • 51449105265
    • J. Kessenich, D. Baldwin, R. Rost, The OpenGL shading language, http://www.opengl.org/documentation/glsl
  • 13
    • 0242533310
    • J. Krüger, R. Westermann, Linear algebra operators for GPU implementation of numerical algorithms, ACM Transactions on Graphics 22 (3) (2003) 908-916
  • 14
    • 44849137198
    • E. Lindholm, J. Nickolls, S. Oberman, J. Montrym, NVIDIA Tesla: A unified graphics and computing architecture, IEEE Micro 28 (2) (2008) 39-55
  • 16
    • 51449115294
    • Microsoft, DirectX 10, http://www.gamesforwindows.com/en-US/AboutGFW/Pages/DirectX10.aspx
  • 17
    • 78651550268
    • J. Nickolls, I. Buck, M. Garland, K. Skadron, Scalable parallel programming with CUDA, ACM Queue 6 (2) (2008) 40-53
  • 18
    • 51449118877
    • NVIDIA, CUDA CUFFT library, http://developer.download.nvidia.com/compute/cuda/1_1/CUFFT_Library_1.1.pdf
  • 19
    • 51449100200
    • NVIDIA, CUDA programming guide 1.1, http://developer.download.nvidia.com/compute/cuda/1_1/NVIDIA_CUDA_Programming_Guide_1.1.pdf
  • 20
    • 51449085385
    • NVIDIA, Monte-carlo option pricing, http://developer.download.nvidia.com/compute/cuda/sdk/website/projects/MonteCarlo/doc/MonteCarlo.pdf
  • 21
    • 44849094749
    • L. Nyland, M. Harris, J. Prins, Fast N-Body simulation with CUDA, in: GPU Gems, vol. 3, Addison Wesley, 2007, pp. 677-795
  • 22
    • 51449093792
    • J. Pisharath, Y. Liu, W. Liao, A. Choudhary, G. Memik, J. Parhi, NU-MineBench 2.0, Tech. Rep. CUCIS-2005-08-01, Department of Electrical and Computer Engineering, Northwestern University, Aug 2005
  • 23
    • 53749106683
    • C.I. Rodrigues, D.J. Hardy, J.E. Stone, K. Schulten, W.-M.W. Hwu, GPU acceleration of cutoff pair potentials for molecular modeling applications, in: Proc. of the 2008 Conference on Computing Frontiers, 2008
  • 24
    • 38849131252
    • M. Schatz, C. Trapnell, A. Delcher, A. Varshney, High-throughput sequence alignment using Graphics Processing Units, BMC Bioinformatics 8 (1) (2007) 474
  • 25
    • 78651284120
    • S. Sengupta, M. Harris, Y. Zhang, J.D. Owens, Scan primitives for GPU computing, in: Proc. of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, 2007
  • 26
    • 51449113906
    • S.S. Stratton, J.A. Stone, W.W. Hwu, M-CUDA: An efficient implementation of CUDA kernels on multi-cores, Tech. Rep. IMPACT-08-01, Center for Reliable and High-Performance Computing, University of Illinois at Urbana-Champaign, March 2008
  • 27
    • 33947595619
    • D. Tarditi, S. Puri, J. Oglesby, Accelerator: using data parallelism to program GPUs for general-purpose uses, in: Proc. of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006
  • 28
    • 51449124336
    • F.M. Thiesing, U. Middelberg, O. Vornberger, Parallel back-propagation for sales prediction on transputer systems, in: Proc. of the World Transputer Congress, 1995
  • 29
    • 51449115705
    • G. Vahala, J. Yepez, L. Vahala, M. Soe, J. Carter, 3D entropic Lattice Boltzmann simulations of 3D Navier-Stokes turbulence, in: Proc. of the 47th Annual Meeting of the APS Division of Plasma Physics, 2005
  • 30
    • 38149007462
    • T. Yamanouchi, AES encryption and decryption on the GPU, in: GPU Gems, vol. 3, Addison Wesley, 2007, pp. 785-803
  • 31
    • 0036870577
    • Y. Yu, S. Acton, Speckle reducing anisotropic diffusion, IEEE Transactions on Image Processing 11 (11) (2002) 1260-1270


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.