SCOPUS Information Retrieval Platform

Volume 48, Issue 4, 2013, Pages 407-418

Improving GPGPU concurrency with elastic kernels

Author keywords

Concurrent Kernels; CUDA; GPGPU

Indexed keywords

BENCHMARK SUITES; BETTER PERFORMANCE; CONCURRENT EXECUTION; CONCURRENT KERNELS; CUDA; FINE-GRAINED CONTROL; GPGPU; SYSTEM THROUGHPUT;

INTERACTIVE COMPUTER SYSTEMS; PROGRAM PROCESSORS; TURNAROUND TIME;

CONCURRENCY CONTROL;

EID: 84880112409 PISSN: 15232867 EISSN: None Source Type: Journal
DOI: 10.1145/2499368.2451160 Document Type: Conference Paper

Times cited: 110

References (22)

1
- 84860351763
- The case for gpgpu spatial multitasking
- J. Adriaens et al. The case for GPGPU spatial multitasking. In HPCA, 2012.
- (2012) HPCA
- Adriaens, J.¹

2
- 70349169075
- Analyzing cuda workloads using a detailed gpu simulator
- A. Bakhoda et al. Analyzing CUDA Workloads Using a Detailed GPU Simulator. In ISPASS, 2009.
- (2009) ISPASS
- Bakhoda, A.¹

3
- 70649092154
- Rodinia: A benchmark suite for heterogeneous computing
- S. Che et al. Rodinia: A benchmark suite for heterogeneous computing. In IISWC, 2009.
- (2009) IISWC
- Che, S.¹

4
- 84875682017
- M. Desnoyers et al. LTTng-UST User Space Tracer.
- LTTng-UST User Space Tracer
- Desnoyers, M.¹

5
- 78650613386
- A scalable concurrent malloc(3) implementation for freebsd
- J. Evans. A Scalable Concurrent malloc(3) Implementation for FreeBSD. In BSDcan, 2006.
- (2006) BSDcan
- Evans, J.¹

6
- 47249094055
- System-level performance metrics for multiprogram workloads
- S. Eyerman and L. Eeckhout. System-level Performance Metrics for Multiprogram Workloads. IEEE Micro, 28(3), 2008.
- (2008) IEEE Micro , vol.28 , Issue.3
- Eyerman, S.¹ Eeckhout, L.²

7
- 84894883016
- Fine-grained resource sharing for concurrent gpgpu kernels
- C. Gregg et al. Fine-grained resource sharing for concurrent GPGPU kernels. In HotPar, 2012.
- (2012) HotPar
- Gregg, C.¹

8
- 79960526623
- Enabling task parallelism in the cuda scheduler
- M. Guevara et al. Enabling task parallelism in the CUDA scheduler. In Workshop on Programming Models for Emerging Architectures (PMEA), 2009.
- (2009) Workshop on Programming Models for Emerging Architectures (PMEA)
- Guevara, M.¹

9
- 44849137198
- Nvidia tesla: A unified graphics and computing architecture
- E. Lindholm et al. NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro, 28(2):39-55, 2008.
- (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
- Lindholm, E.¹

10
- 84875665328
- NVIDIA
- NVIDIA. Compute Command Line Profiler: User Guide.
- Compute Command Line Profiler: User Guide

11
- 84875678919
- NVIDIA
- NVIDIA. CUDA Occupancy Calculator.
- CUDA Occupancy Calculator

12
- 84878084176
- NVIDIA
- NVIDIA. NVIDIA CUDA C Programming Guide (version 4.2).
- NVIDIA CUDA C Programming Guide (Version 4.2)

13
- 84880085895
- NVIDIA
- NVIDIA. NVIDIA's Next Generation CUDA Compute Architecture: Kepler GK110.
- NVIDIA's Next Generation CUDA Compute Architecture: Kepler GK110

14
- 79960506159
- Supporting gpu sharing in cloud environments with a transparent runtime consolidation framework
- V. T. Ravi et al. Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework. In HPDC, 2011.
- (2011) HPDC
- Ravi, V.T.¹

15
- 84963971049
- S. Rennich. CUDA C/C++ Streams and Concurrency.
- CUDA C/C++ Streams and Concurrency
- Rennich, S.¹

16
- 70449725275
- Chunking parallel loops in the presence of synchronization
- J. Shirako et al. Chunking parallel loops in the presence of synchronization. In ICS, 2009.
- (2009) ICS
- Shirako, J.¹

17
- 77953978573
- Efficient compilation of fine-grained spmd-threaded programs for multicore cpus
- J. A. Stratton et al. Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs. In CGO, 2010.
- (2010) CGO
- Stratton, J.A.¹

18
- 70449749047
- Mcuda: An efficient implementation of cuda kernels for multi-core cpus
- J. A. Stratton, S. S. Stone, and W.-m. W. Hwu. MCUDA: An efficient implementation of CUDA kernels for multi-core CPUs. In LCPC, 2008.
- (2008) LCPC
- Stratton, J.A.¹ Stone, S.S.² Hwu, W.W.³

20
- 34547715870
- Initial observations of the simultaneous multithreading pentium 4 processor
- N. Tuck and D. M. Tullsen. Initial Observations of the Simultaneous Multithreading Pentium 4 Processor. In PACT, 2003.
- (2003) PACT
- Tuck, N.¹ Tullsen, D.M.²

21
- 79953080838
- Kernel fusion: An effective method for better power efficiency on multithreaded gpu
- G. Wang, Y. Lin, and W. Yi. Kernel fusion: An effective method for better power efficiency on multithreaded GPU. In Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing, GREENCOM-CPSCOM '10, 2010.
- (2010) Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing, GREENCOM-CPSCOM '10
- Wang, G.¹ Lin, Y.² Yi, W.³

22
- 80052985746
- Exploiting concurrent kernel execution on graphic processing units
- L. Wang, M. Huang, and T. El-Ghazawi. Exploiting concurrent kernel execution on graphic processing units. In HPCS, 2011.
- (2011) HPCS
- Wang, L.¹ Huang, M.² El-Ghazawi, T.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.