



Volume 48, Issue 4, 2013, Pages 407-418

Improving GPGPU concurrency with elastic kernels

Author keywords

Concurrent Kernels; CUDA; GPGPU

Indexed keywords

BENCHMARK SUITES; BETTER PERFORMANCE; CONCURRENT EXECUTION; CONCURRENT KERNELS; CUDA; FINE-GRAINED CONTROL; GPGPU; SYSTEM THROUGHPUT;

EID: 84880112409     PISSN: 15232867     EISSN: None     Source Type: Journal    
DOI: 10.1145/2499368.2451160     Document Type: Conference Paper
Times cited: 110

References (22)
  • 1
    • 84860351763
    • The case for gpgpu spatial multitasking
    • J. Adriaens et al. The case for GPGPU spatial multitasking. In HPCA, 2012.
    • (2012) HPCA
    • Adriaens, J.1
  • 2
    • 70349169075
    • Analyzing cuda workloads using a detailed gpu simulator
    • A. Bakhoda et al. Analyzing CUDA Workloads Using a Detailed GPU Simulator. In ISPASS, 2009.
    • (2009) ISPASS
    • Bakhoda, A.1
  • 3
    • 70649092154
    • Rodinia: A benchmark suite for heterogeneous computing
    • S. Che et al. Rodinia: A benchmark suite for heterogeneous computing. In IISWC, 2009.
    • (2009) IISWC
    • Che, S.1
  • 5
    • 78650613386
    • A scalable concurrent malloc(3) implementation for freebsd
    • J. Evans. A Scalable Concurrent malloc(3) Implementation for FreeBSD. In BSDcan, 2006.
    • (2006) BSDcan
    • Evans, J.1
  • 6
    • 47249094055
    • System-level performance metrics for multiprogram workloads
    • S. Eyerman and L. Eeckhout. System-level Performance Metrics for Multiprogram Workloads. IEEE Micro, 28(3), 2008.
    • (2008) IEEE Micro , vol.28 , Issue.3
    • Eyerman, S.1    Eeckhout, L.2
  • 7
    • 84894883016
    • Fine-grained resource sharing for concurrent gpgpu kernels
    • C. Gregg et al. Fine-grained resource sharing for concurrent GPGPU kernels. In HotPar, 2012.
    • (2012) HotPar
    • Gregg, C.1
  • 9
    • 44849137198
    • Nvidia tesla: A unified graphics and computing architecture
    • E. Lindholm et al. NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro, 28(2):39-55, 2008.
    • (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
    • Lindholm, E.1
  • 14
    • 79960506159
    • Supporting gpu sharing in cloud environments with a transparent runtime consolidation framework
    • V. T. Ravi et al. Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework. In HPDC, 2011.
    • (2011) HPDC
    • Ravi, V.T.1
  • 16
    • 70449725275
    • Chunking parallel loops in the presence of synchronization
    • J. Shirako et al. Chunking parallel loops in the presence of synchronization. In ICS, 2009.
    • (2009) ICS
    • Shirako, J.1
  • 17
    • 77953978573
    • Efficient compilation of fine-grained spmd-threaded programs for multicore cpus
    • J. A. Stratton et al. Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs. In CGO, 2010.
    • (2010) CGO
    • Stratton, J.A.1
  • 18
    • 70449749047
    • Mcuda: An efficient implementation of cuda kernels for multi-core cpus
    • J. A. Stratton, S. S. Stone, and W.-m. W. Hwu. MCUDA: An efficient implementation of CUDA kernels for multi-core CPUs. In LCPC, 2008.
    • (2008) LCPC
    • Stratton, J.A.1    Stone, S.S.2    Hwu, W.-m.W.3
  • 19
    • 84875683628
    • TOP500.org
    • TOP500.org. The Top 500.
    • The Top 500
  • 20
    • 34547715870
    • Initial observations of the simultaneous multithreading pentium 4 processor
    • N. Tuck and D. M. Tullsen. Initial Observations of the Simultaneous Multithreading Pentium 4 Processor. In PACT, 2003.
    • (2003) PACT
    • Tuck, N.1    Tullsen, D.M.2
  • 22
    • 80052985746
    • Exploiting concurrent kernel execution on graphic processing units
    • L. Wang, M. Huang, and T. El-Ghazawi. Exploiting concurrent kernel execution on graphic processing units. In HPCS, 2011.
    • (2011) HPCS
    • Wang, L.1    Huang, M.2    El-Ghazawi, T.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.