SCOPUS Information Search Platform

Proceedings - International Symposium on Computer Architecture

Volume , Issue , 2012, Pages 61-71

CAPRI: Prediction of compaction-adequacy for handling control-divergence in GPGPU architectures

(2) Rhu, Minsoo (a); Erez, Mattan (a)

(a) The University of Texas at Austin (United States)

Author keywords

[No Author keywords available]

Indexed keywords

BASELINE DESIGN; CONTROL FLOWS; GENERAL PURPOSE; PERFORMANCE EVALUATION; PREDICTION ACCURACY;

COMPUTER ARCHITECTURE; DEGRADATION; PIPELINE CODES; PROGRAM PROCESSORS;

COMPACTION;

EID: 84864855982 PISSN: 10636897 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ISCA.2012.6237006 Document Type: Conference Paper

Times cited : (51)

References (31)

1
- 0020915645
- Conversion of control dependence to data dependence
- New York, NY, USA. ACM
- J. R. Allen, K. Kennedy, C. Porterfield, and J. Warren. Conversion of control dependence to data dependence. In Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '83, pages 177-189, New York, NY, USA, 1983. ACM.
- (1983) Proceedings of the 10th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL '83 , pp. 177-189
- Allen, J.R.¹ Kennedy, K.² Porterfield, C.³ Warren, J.⁴

2
- 84864843978
- AMD Corporation
- AMD Corporation. AMD Radeon HD 6900M Series Specifications, 2010.
- (2010) AMD Radeon HD 6900M Series Specifications

3
- 78149329064
- AMD Corporation, August
- AMD Corporation. ATI Stream Computing OpenCL Programming Guide, August 2010.
- (2010) ATI Stream Computing OpenCL Programming Guide

4
- 70349169075
- Analyzing CUDA workloads using a detailed GPU simulator
- April
- A. Bakhoda, G. Yuan, W. Fung, H. Wong, and T. Aamodt. Analyzing CUDA workloads using a detailed GPU simulator. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS-2009), April 2009.
- (2009) IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS-2009)
- Bakhoda, A.¹ Yuan, G.² Fung, W.³ Wong, H.⁴ Aamodt, T.⁵

5
- 0015330108
- The Illiac IV system
- April
- W. Bouknight, S. Denenberg, D. McIntyre, J. Randall, A. Sameh, and D. Slotnick. The Illiac IV System. In Proceedings of the IEEE, volume 60, pages 369-388, April 1972.
- (1972) Proceedings of the IEEE , vol.60 , pp. 369-388
- Bouknight, W.¹ Denenberg, S.² McIntyre, D.³ Randall, J.⁴ Sameh, A.⁵ Slotnick, D.⁶

6
- 70649092154
- Rodinia: A benchmark suite for heterogeneous computing
- October
- S. Che, M. Boyer, J. Meng, D. Tarjan, J. Sheaffer, S.-H. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In IEEE International Symposium on Workload Characterization (IISWC-2009), October 2009.
- (2009) IEEE International Symposium on Workload Characterization (IISWC-2009)
- Che, S.¹ Boyer, M.² Meng, J.³ Tarjan, D.⁴ Sheaffer, J.⁵ Lee, S.-H.⁶ Skadron, K.⁷

7
- 84863351470
- SIMD re-convergence at thread frontiers
- December
- G. Diamos, B. Ashbaugh, S. Maiyuran, A. Kerr, H. Wu, and S. Yalamanchili. SIMD Re-Convergence At Thread Frontiers. In 44th International Symposium on Microarchitecture (MICRO-44), December 2011.
- (2011) 44th International Symposium on Microarchitecture (MICRO-44)
- Diamos, G.¹ Ashbaugh, B.² Maiyuran, S.³ Kerr, A.⁴ Wu, H.⁵ Yalamanchili, S.⁶

8
- 79955923056
- Thread block compaction for efficient SIMT control flow
- February
- W. W. Fung and T. M. Aamodt. Thread Block Compaction for Efficient SIMT Control Flow. In 17th International Symposium on High Performance Computer Architecture (HPCA-17), February 2011.
- (2011) 17th International Symposium on High Performance Computer Architecture (HPCA-17)
- Fung, W.W.¹ Aamodt, T.M.²

9
- 47349104432
- Dynamic warp formation and scheduling for efficient GPU control flow
- December
- W.W. Fung, I. Sham, G. Yuan, and T. M. Aamodt. Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow. In 40th International Symposium on Microarchitecture (MICRO-40), December 2007.
- (2007) 40th International Symposium on Microarchitecture (MICRO-40)
- Fung, W.W.¹ Sham, I.² Yuan, G.³ Aamodt, T.M.⁴

10
- 78650817529
- Size matters: Space/time tradeoffs to improve GPGPU applications performance
- November
- A. Gharaibeh and M. Ripeanu. Size Matters: Space/Time Tradeoffs to Improve GPGPU Applications Performance. In 2010 International Conference for High Performance Computing, Networking, Storage and Analysis (SC-2010), November 2010.
- (2010) 2010 International Conference for High Performance Computing, Networking, Storage and Analysis (SC-2010)
- Gharaibeh, A.¹ Ripeanu, M.²

11
- 77954007548
- M. Giles. Jacobi iteration for a Laplace discretisation on a 3D structured grid, 2008.
- (2008) Jacobi Iteration for A Laplace Discretisation on A 3D Structured Grid
- Giles, M.¹

12
- 77952620490
- M. Giles and S. Xiaoke. Notes on using the NVIDIA 8800 GTX graphics card. http://people.maths.ox.ac.uk/gilesm/hpc/, 2008.
- (2008) Notes on Using the NVIDIA 8800 GTX Graphics Card
- Giles, M.¹ Xiaoke, S.²

13
- 38349041620
- Accelerating large graph algorithms on the GPU using CUDA
- P. Harish and P. Narayanan. Accelerating Large Graph Algorithms on the GPU Using CUDA. In High Performance Computing HiPC 2007, volume 4873, pages 197-208. 2007.
- (2007) High Performance Computing HiPC 2007 , vol.4873 , pp. 197-208
- Harish, P.¹ Narayanan, P.²

14
- 84864836708
- IMPACT Research Group. The Parboil Benchmark Suite, 2007
- IMPACT Research Group. The Parboil Benchmark Suite, 2007.

15
- 70449722984
- Intel Corporation, May
- Intel Corporation. Intel AVX: New Frontiers in Performance Improvements and Energy Efficiency, May 2009.
- (2009) Intel AVX: New Frontiers in Performance Improvements and Energy Efficiency

16
- 84864860920
- Intel Corporation, June
- Intel Corporation. Intel HD Graphics OpenSource Programmer Reference Manual, June 2011.
- (2011) Intel HD Graphics OpenSource Programmer Reference Manual

17
- 0034459255
- Efficient conditional operations for data-parallel architectures
- December
- U. Kapasi, W. Dally, S. Rixner, P. Mattson, J. Owens, and B. Khailany. Efficient conditional operations for data-parallel architectures. In 33rd International Symposium on Microarchitecture (MICRO-33), December 2000.
- (2000) 33rd International Symposium on Microarchitecture (MICRO-33)
- Kapasi, U.¹ Dally, W.² Rixner, S.³ Mattson, P.⁴ Owens, J.⁵ Khailany, B.⁶

18
- 77954976292
- Dynamic warp subdivision for integrated branch and memory divergence tolerance
- J. Meng, D. Tarjan, and K. Skadron. Dynamic warp subdivision for integrated branch and memory divergence tolerance. In 37th International Symposium on Computer Architecture (ISCA-37), 2010.
- (2010) 37th International Symposium on Computer Architecture (ISCA-37)
- Meng, J.¹ Tarjan, D.² Skadron, K.³

19
- 84863342255
- Improving GPU performance via large warps and two-level warp scheduling
- December
- V. Narasiman, C. Lee, M. Shebanow, R. Miftakhutdinov, O. Mutlu, and Y. Patt. Improving GPU Performance via Large Warps and Two-Level Warp Scheduling. In 44th International Symposium on Microarchitecture (MICRO-44), December 2011.
- (2011) 44th International Symposium on Microarchitecture (MICRO-44)
- Narasiman, V.¹ Lee, C.² Shebanow, M.³ Miftakhutdinov, R.⁴ Mutlu, O.⁵ Patt, Y.⁶

20
- 77951900491
- NVIDIA Corporation
- NVIDIA Corporation. NVIDIA's Next Generation CUDA Compute Architecture: Fermi, 2009.
- (2009) NVIDIA's Next Generation CUDA Compute Architecture: Fermi

21
- 35948991669
- NVIDIA Corporation
- NVIDIA Corporation. NVIDIA CUDA Programming Guide, 2011.
- (2011) NVIDIA CUDA Programming Guide

22
- 84860577839
- NVIDIA Corporation
- NVIDIA Corporation. PTX: Parallel Thread Execution ISA Version 2.3, 2011.
- (2011) PTX: Parallel Thread Execution ISA Version 2.3

23
- 84864861336
- NVIDIA Corporation
- NVIDIA Corporation. CUDA C/C++ SDK CODE Samples, 2011.
- (2011) CUDA C/C++ SDK CODE Samples

24
- 0017922490
- The CRAY-1 computer system
- January
- R. M. Russell. The CRAY-1 computer system. Commun. ACM, 21:63-72, January 1978.
- (1978) Commun. ACM , vol.21 , pp. 63-72
- Russell, R.M.¹

25
- 38849131252
- High-throughput sequence alignment using graphics processing units
- M. Schatz, C. Trapnell, A. Delcher, and A. Varshney. High-throughput sequence alignment using graphics processing units. BMC Bioinformatics, 8(1):474, 2007.
- (2007) BMC Bioinformatics , vol.8 , Issue.1 , pp. 474
- Schatz, M.¹ Trapnell, C.² Delcher, A.³ Varshney, A.⁴

26
- 49249086142
- Larrabee: A many-core x86 architecture for visual computing
- August
- L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan. Larrabee: A Many-core x86 Architecture for Visual Computing. ACM Trans. Graph., 27:18:1-18:15, August 2008.
- (2008) ACM Trans. Graph. , vol.27 , pp. 18:1-18:15
- Seiler, L.¹ Carmean, D.² Sprangle, E.³ Forsyth, T.⁴ Abrash, M.⁵ Dubey, P.⁶ Junkins, S.⁷ Lake, A.⁸ Sugerman, J.⁹ Cavin, R.¹⁰ Espasa, R.¹¹ Grochowski, E.¹² Juan, T.¹³ Hanrahan, P.¹⁴

27
- 0033727057
- Vector instruction set support for conditional operations
- J. E. Smith, G. Faanes, and R. Sugumar. Vector instruction set support for conditional operations. In 27th International Symposium on Computer Architecture (ISCA-27), 2000.
- (2000) 27th International Symposium on Computer Architecture (ISCA-27)
- Smith, J.E.¹ Faanes, G.² Sugumar, R.³

28
- 0003502903
- Morgan Kaufmann
- Steven Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.
- (1997) Advanced Compiler Design and Implementation
- Muchnick, S.¹

29
- 74049151553
- Increasing memory miss tolerance for SIMD cores
- D. Tarjan, J. Meng, and K. Skadron. Increasing memory miss tolerance for SIMD cores. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis (SC-09), 2009.
- (2009) Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis (SC-09)
- Tarjan, D.¹ Meng, J.² Skadron, K.³

30
- 30744459395
- RPU: A programmable ray processing unit for realtime ray tracing
- July
- S. Woop, J. Schmittler, and P. Slusallek. RPU: a programmable ray processing unit for realtime ray tracing. ACM Trans. Graph., 24:434-444, July 2005.
- (2005) ACM Trans. Graph. , vol.24 , pp. 434-444
- Woop, S.¹ Schmittler, J.² Slusallek, P.³

31
- 84863457471
- Characterization and transformation of unstructured control flow in GPU applications
- June
- H. Wu, G. Diamos, S. Li, and S. Yalamanchili. Characterization and Transformation of Unstructured Control Flow in GPU Applications. In 1st International Workshop on Characterizing Applications for Heterogeneous Exascale Systems, June 2011.
- (2011) 1st International Workshop on Characterizing Applications for Heterogeneous Exascale Systems
- Wu, H.¹ Diamos, G.² Li, S.³ Yalamanchili, S.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.