



Volume , Issue , 2011, Pages

Fast implementation of DGEMM on Fermi GPU

Author keywords

CUDA; GPU; High performance computing; Matrix matrix multiplication

Indexed keywords

CUDA; FAST IMPLEMENTATION; GPU; HIGH PERFORMANCE COMPUTING; INSTRUCTION SCHEDULING; MACHINE LANGUAGES; MATRIX-MATRIX MULTIPLICATION; MEMORY HIERARCHY; MEMORY OPERATIONS; MICRO ARCHITECTURES; OPTIMAL ALGORITHM; OPTIMIZATION STRATEGY; PEAK PERFORMANCE; PERFORMANCE MODELING; SHARED MEMORIES; SOFTWARE PIPELINING;

EID: 83155160943     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2063384.2063431     Document Type: Conference Paper
Times cited: 102

References (17)
  • 5
    • 44249094647
    • Anatomy of high-performance matrix multiplication
    • K. Goto and R. A. van de Geijn. Anatomy of high-performance matrix multiplication. ACM Trans. Math. Softw., 34:12:1-12:25, May 2008.
    • (2008) ACM Trans. Math. Softw.
    • Goto, K.1    Geijn, R.A.V.D.2
  • 8
    • 81555213505
    • A fast GEMM implementation on the Cypress GPU
    • N. Nakasato. A fast GEMM implementation on the Cypress GPU. SIGMETRICS Perform. Eval. Rev., 38:50-55, March 2011.
    • (2011) Sigmetrics Perform. Eval. Rev. , vol.38 , pp. 50-55
    • Nakasato, N.1
  • 10
    • 84886934561
    • NVIDIA. CUDA Community Showcase. http://www.nvidia.com/object/cudaappsflashnew.html.
    • CUDA Community Showcase


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.