



Volume , Issue , 2013, Pages 461-466

Register and thread structure optimization for GPUs

Author keywords

[No Author keywords available]

Indexed keywords

DESIGN SPACE EXPLORATION; EFFICIENT DESIGN SPACE EXPLORATIONS; HIGH PERFORMANCE COMPUTING; IMPLEMENTATION PLATFORMS; OPTIMIZATION TECHNIQUES; PARALLEL PROGRAMMING MODEL; SINGLE-THREAD PERFORMANCE; STRUCTURE OPTIMIZATION;

EID: 84877777934     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ASPDAC.2013.6509639     Document Type: Conference Paper
Times cited : (5)

References (14)
  • 1
    • 85081782108
    • NVIDIA. NVIDIA CUDA Programming Guide, Version 3.2.
  • 2
    • 85081781987
    • NVIDIA. Occupancy Calculator
    • NVIDIA. Occupancy Calculator. http://developer.nvidia.com/object/cuda-3-2-toolkit-rc.html.
  • 3
    • 85081779909
    • OpenCL
    • OpenCL. http://www.khronos.org/opencl.
  • 4
    • 77749337497
    • An adaptive performance modeling tool for GPU architectures
    • S. S. Baghsorkhi et al. An adaptive performance modeling tool for GPU architectures. In PPoPP, 2010.
    • (2010) PPoPP
    • Baghsorkhi, S.S.
  • 5
    • 70349169075
    • Analyzing CUDA workloads using a detailed GPU simulator
    • A. Bakhoda et al. Analyzing CUDA workloads using a detailed GPU simulator. In ISPASS, 2009.
    • (2009) ISPASS
    • Bakhoda, A.
  • 6
    • 70649092154
    • Rodinia: A benchmark suite for heterogeneous computing
    • S. Che et al. Rodinia: A benchmark suite for heterogeneous computing. In IISWC, 2009.
    • (2009) IISWC
    • Che, S.
  • 7
    • 84866876242
    • An accurate GPU performance model for effective control flow divergence optimization
    • Z. Cui et al. An accurate GPU performance model for effective control flow divergence optimization. In IPDPS, 2012.
    • (2012) IPDPS
    • Cui, Z.
  • 9
    • 70450231944
    • An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
    • S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In ISCA, 2009.
    • (2009) ISCA
    • Hong, S.; Kim, H.
  • 10
    • 84862069040
    • Real-time implementation and performance optimization of 3D sound localization on GPUs
    • Y. Liang et al. Real-time implementation and performance optimization of 3D sound localization on GPUs. In DATE, 2012.
    • (2012) DATE
    • Liang, Y.
  • 11
  • 12
    • 79959466764
    • Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
    • S. Ryoo et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In PPoPP, 2008.
    • (2008) PPoPP
    • Ryoo, S.
  • 13
    • 77952579552
    • Demystifying GPU microarchitecture through microbenchmarking
    • H. Wong et al. Demystifying GPU microarchitecture through microbenchmarking. In ISPASS, 2010.
    • (2010) ISPASS
    • Wong, H.
  • 14
    • 79955921273
    • A quantitative performance analysis model for GPU architectures
    • Y. Zhang and J. D. Owens. A quantitative performance analysis model for GPU architectures. In HPCA, 2011.
    • (2011) HPCA
    • Zhang, Y.; Owens, J.D.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.