SCOPUS Information Retrieval Platform

IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD

Volume , Issue , 2013, Pages 516-523

An efficient compiler framework for cache bypassing on GPUs

(4) Xie, Xiaolong (a); Liang, Yun (a); Sun, Guangyu (a); Chen, Deming (b)

a PEKING UNIVERSITY (China)

b UNIVERSITY OF ILLINOIS AT URBANA CHAMPAIGN (United States)

Author keywords

Cache Bypassing; Compiler Optimization; GPU

Indexed keywords

CACHE BYPASSING; COMPILER OPTIMIZATIONS; GENERAL PURPOSE GPU; GPU; GRAPHICS PROCESSING UNITS; INSTRUCTION SET ARCHITECTURE; PERFORMANCE METRICS; SCRATCH PAD MEMORY;

ACCESS CONTROL; ALGORITHMS; COMPUTER AIDED DESIGN; COMPUTER ARCHITECTURE; COMPUTER GRAPHICS; PROGRAM COMPILERS; UBIQUITOUS COMPUTING;

CACHE MEMORY;

EID: 84893396474 PISSN: 10923152 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICCAD.2013.6691165 Document Type: Conference Paper

Times cited: (90)

References (35)

1
- 84893426008
- GE Intelligent Platforms. http://defense.ge-ip.com/products/hpec/c560.

2
- 84893408658
- Mosek. http://www.mosek.com/.

3
- 84893362632
- NVIDIA. Fermi GPUs www.nvidia.com/object/fermi-architecture.html.

4
- 84893389126
- NVIDIA. Kepler GPUs www.nvidia.com/object/nvidia-kepler.html.

5
- 84893347454
- NVIDIA. PTX Code http://docs.nvidia.com/cuda/pdf/ptx-isa-3.1.pdf.

6
- 84878610978
- Version 3.2
- NVIDIA. CUDA Programming Guide, Version 3.2.
- CUDA Programming Guide

7
- 84893425171
- NVIDIA. Profiler http://docs.nvidia.com/cuda/profiler-users-guide/index.html.

8
- 84893367579
- NVIDIA GPU Computing SDK. http://developer.nvidia.com/gpu-computing-sdk.

9
- 84893429325
- NVIDIA Tegra. http://www.nvidia.com/object/tegra.html.

10
- 84893381500
- Qualcomm Inc. http://www.qualcomm.com/snapdragon.

11
- 84893398220
- Samsung Inc. www.samsung.com/exynos.

12
- 0004116989
- McGraw-Hill Higher Education, 2nd edition
- T. Cormen, C. Stein, R. Rivest, and C. Leiserson. Introduction to Algorithms. McGraw-Hill Higher Education, 2nd edition, 2001.
- (2001) Introduction to Algorithms
- Cormen, T.¹ Stein, C.² Rivest, R.³ Leiserson, C.⁴

13
- 84866876242
- An accurate GPU performance model for effective control flow divergence optimization
- Z. Cui, Y. Liang, K. Rupnow, and D. Chen. An accurate GPU performance model for effective control flow divergence optimization. In IPDPS, 2012.
- (2012) IPDPS
- Cui, Z.¹ Liang, Y.² Rupnow, K.³ Chen, D.⁴

14
- 84863389330
- SHiP: Signature-based hit predictor for high performance caching
- C. J. Wu et al. SHiP: signature-based hit predictor for high performance caching. In Micro, 2011.
- (2011) Micro
- Wu, C.J.¹

15
- 84873470137
- Parboil: A revised benchmark suite for scientific and commercial throughput computing
- J. A. Stratton et al. Parboil: A revised benchmark suite for scientific and commercial throughput computing. In IMPACT Technical Report, 2012.
- (2012) IMPACT Technical Report
- Stratton, J.A.¹

16
- 49049088756
- GPU computing
- J. D. Owens et al. GPU computing. Proceedings of the IEEE, 2008.
- (2008) Proceedings of the IEEE
- Owens, J.D.¹

17
- 57349180412
- A compiler framework for optimization of affine loop nests for GPGPUs
- M. M. Baskaran et al. A compiler framework for optimization of affine loop nests for GPGPUs. In ICS, 2008.
- (2008) ICS
- Baskaran, M.M.¹

18
- 70649092154
- Rodinia: A benchmark suite for heterogeneous computing
- S. Che et al. Rodinia: A benchmark suite for heterogeneous computing. In IISWC, 2009.
- (2009) IISWC
- Che, S.¹

19
- 79959466764
- Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
- S. Ryoo et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In PPoPP, 2008.
- (2008) PPoPP
- Ryoo, S.¹

20
- 84948958301
- Compiler managed micro-cache bypassing for high performance EPIC processors
- Y. Wu et al. Compiler managed micro-cache bypassing for high performance EPIC processors. In Micro, 2002.
- (2002) Micro
- Wu, Y.¹

21
- 4444328501
- An integrated hardware/software approach for run-time scratchpad management
- P. Francesco et al. An integrated hardware/software approach for run-time scratchpad management. In DAC, 2004.
- (2004) DAC
- Francesco, P.¹

22
- 70450231944
- An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
- S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In ISCA, 2009.
- (2009) ISCA
- Hong, S.¹ Kim, H.²

23
- 84864068497
- Characterizing and improving the use of demand-fetched caches in GPUs
- W. Jia, K. A. Shaw, and M. Martonosi. Characterizing and improving the use of demand-fetched caches in GPUs. In ICS, 2012.
- (2012) ICS
- Jia, W.¹ Shaw, K.A.² Martonosi, M.³

24
- 41149104074
- Counter-based cache replacement and bypassing algorithms
- M. Kharbutli and Y. Solihin. Counter-based cache replacement and bypassing algorithms. In IEEE Transactions on Computers, 2008.
- (2008) IEEE Transactions on Computers
- Kharbutli, M.¹ Solihin, Y.²

25
- 80052655793
- CuMAPz: A tool to analyze memory access patterns in CUDA
- Y. Kim and A. Shrivastava. CuMAPz: A tool to analyze memory access patterns in CUDA. In DAC, 2011.
- (2011) DAC
- Kim, Y.¹ Shrivastava, A.²

26
- 84877739484
- Cache capacity aware thread scheduling for irregular memory access on many-core GPGPUs
- H. Kuo, T. Yen, B. C. Lai, and J. Jou. Cache capacity aware thread scheduling for irregular memory access on many-core GPGPUs. In ASPDAC, 2013.
- (2013) ASPDAC
- Kuo, H.¹ Yen, T.² Lai, B.C.³ Jou, J.⁴

27
- 84877777934
- Register and thread structure optimization for GPUs
- Y. Liang, Z. Cui, K. Rupnow, and D. Chen. Register and thread structure optimization for GPUs. In ASPDAC, 2013.
- (2013) ASPDAC
- Liang, Y.¹ Cui, Z.² Rupnow, K.³ Chen, D.⁴

28
- 84862069040
- Real-time implementation and performance optimization of 3D sound localization on GPUs
- Y. Liang et al. Real-time implementation and performance optimization of 3D sound localization on GPUs. In DATE, 2012.
- (2012) DATE
- Liang, Y.¹

29
- 63349099764
- Static analysis for fast and accurate design space exploration of caches
- Y. Liang and T. Mitra. Static analysis for fast and accurate design space exploration of caches. In CODES+ISSS, 2008.
- (2008) CODES+ISSS
- Liang, Y.¹ Mitra, T.²

30
- 66749155879
- Cache Bursts: A new approach for eliminating dead blocks and increasing cache efficiency
- H. Liu, M. Ferdman, J. Huh, and D. Burger. Cache Bursts: A new approach for eliminating dead blocks and increasing cache efficiency. In Micro, 2008.
- (2008) Micro
- Liu, H.¹ Ferdman, M.² Huh, J.³ Burger, D.⁴

31
- 78149251414
- Data layout transformation exploiting memory-level parallelism in structured grid many-core applications
- I. J. Sung, J. A. Stratton, and W. W. Hwu. Data layout transformation exploiting memory-level parallelism in structured grid many-core applications. In PACT, 2010.
- (2010) PACT
- Sung, I.J.¹ Stratton, J.A.² Hwu, W.W.³

32
- 47649086892
- Dynamic allocation for scratch-pad memory using compile-time decisions
- May
- S. Udayakumaran, A. Dominguez, and R. Barua. Dynamic allocation for scratch-pad memory using compile-time decisions. ACM Trans. Embed. Comput. Syst., 5(2):472-511, May 2006.
- (2006) ACM Trans. Embed. Comput. Syst., vol. 5, issue 2, pp. 472-511
- Udayakumaran, S.¹ Dominguez, A.² Barua, R.³

33
- 14944380022
- Using the compiler to improve cache replacement decisions
- Z. Wang, K. S. McKinley, A. L. Rosenberg, and C. C. Weems. Using the compiler to improve cache replacement decisions. In PACT, 2002.
- (2002) PACT
- Wang, Z.¹ McKinley, K.S.² Rosenberg, A.L.³ Weems, C.C.⁴

34
- 77954691442
- A GPGPU compiler for memory optimization and parallelism management
- Y. Yang, P. Xiang, J. Kong, and H. Zhou. A GPGPU compiler for memory optimization and parallelism management. In PLDI, 2010.
- (2010) PLDI
- Yang, Y.¹ Xiang, P.² Kong, J.³ Zhou, H.⁴

35
- 79953126288
- On-the-fly elimination of dynamic irregularities for GPU computing
- E. Z. Zhang, Y. Jiang, Z. Guo, K. Tian, and X. Shen. On-the-fly elimination of dynamic irregularities for GPU computing. In ASPLOS, 2011.
- (2011) ASPLOS
- Zhang, E.Z.¹ Jiang, Y.² Guo, Z.³ Tian, K.⁴ Shen, X.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.