Volume , Issue , 2009, Pages

Performance analysis of memory transfers and GEMM subroutines on NVIDIA Tesla GPU cluster

Author keywords

CUBLAS; CUDA; GPU cluster; Math Kernel Library; NetPIPE; Performance; Tesla

Indexed keywords

APPLICATION ACCELERATOR; BASIC LINEAR ALGEBRA SUBROUTINES; COMMODITY CLUSTERS; COMPUTATIONAL CHEMISTRY; DOUBLE PRECISION; EFFICIENT IMPLEMENTATION; GRAPHICAL PROCESSING UNITS; HIGH PERFORMANCE COMPUTING SYSTEMS; KERNEL LIBRARIES; MATRIX; OVERALL EFFICIENCY; PERFORMANCE ANALYSIS; PRICE RATIO; SCIENTIFIC APPLICATIONS

EID: 72049102909     PISSN: 15525244     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/CLUSTR.2009.5289124     Document Type: Conference Paper
Times cited : (11)

References (22)
  • 4
    • 78651550268
    • Scalable parallel programming with CUDA
    • J. Nickolls, I. Buck, M. Garland, and K. Skadron, "Scalable parallel programming with CUDA," Queue, vol.6, no.2, pp. 40-53, 2008.
    • (2008) Queue , vol.6 , Issue.2 , pp. 40-53
    • Nickolls, J.1    Buck, I.2    Garland, M.3    Skadron, K.4
  • 6
    • 33947588048
    • A survey of general-purpose computation on graphics hardware
    • J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Krüger, A. E. Lefohn, and T. J. Purcell, "A survey of general-purpose computation on graphics hardware," Computer Graphics Forum, vol.26, no.1, pp. 80-113, 2007. [Online]. Available: http://www.blackwellsynergy.com/doi/pdf/10.1111/j.1467-8659.2007.01012.x
    • (2007) Computer Graphics Forum , vol.26 , Issue.1 , pp. 80-113
    • Owens, J.D.1    Luebke, D.2    Govindaraju, N.3    Harris, M.4    Krüger, J.5    Lefohn, A.E.6    Purcell, T.J.7
  • 8
    • 72049110101
    • http://www.khronos.org/opencl/
  • 10
    • 33846818766
    • Examining the viability of FPGA supercomputing
    • S. Craven and P. Athanas, "Examining the viability of FPGA supercomputing," EURASIP J. Embedded Syst., vol.2007, no.1, pp. 13-13, 2007.
    • (2007) EURASIP J. Embedded Syst. , vol.2007 , Issue.1 , pp. 13-13
    • Craven, S.1    Athanas, P.2
  • 11
    • 44849137198
    • NVIDIA Tesla: A unified graphics and computing architecture
    • E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A unified graphics and computing architecture," IEEE Micro, vol.28, no.2, pp. 39-55, 2008.
    • (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
    • Lindholm, E.1    Nickolls, J.2    Oberman, S.3    Montrym, J.4
  • 18
    • 72049109841
    • Intel, Thread Affinity Interface. [Online]. Available: http://software.intel.com/en-us/intel-compilers/
    • Thread Affinity Interface
  • 22
    • 70350771131
    • Benchmarking GPUs to tune dense linear algebra
    • V. Volkov and J. Demmel, "Benchmarking GPUs to tune dense linear algebra," in SC, 2008, p. 31.
    • (2008) SC , pp. 31
    • Volkov, V.1    Demmel, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.