SCOPUS Information Retrieval Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 6011 LNCS, 2010, Pages 244-263

Automatic C-to-CUDA code generation for affine programs

(3) Baskaran, Muthu Manikandan a Ramanujam, J b Sadayappan, P a

a OHIO STATE UNIVERSITY (United States)

b LOUISIANA STATE UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

AFFINE PROGRAM; AUTOMATIC CODES; AUTOMATIC TRANSFORMATIONS; AUTOMATICALLY GENERATED; C CODES; CODE GENERATION; COMPILER OPTIMIZATIONS; COMPUTATIONAL POWER; COMPUTE UNIFIED DEVICE ARCHITECTURES; DATA ACCESS; GRAPHICS PROCESSING UNITS; INPUT PROGRAMS; MEMORY HIERARCHY; MULTI CORE; MULTI-LEVEL; MULTITHREADED; PARALLEL PROGRAMMING MODEL; PERFORMANCE IMPLEMENTATION; TRANSFORMATION SYSTEMS;

COMPUTER SOFTWARE; COSINE TRANSFORMS; OPTIMIZATION; PARALLEL PROGRAMMING; PROGRAM COMPILERS;

AUTOMATIC PROGRAMMING;

EID: 77951572335 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-642-11970-5_14 Document Type: Conference Paper

Times cited: 123

References (29)

1
- 84976766536
- Scanning polyhedra with do loops
- Ancourt, C., Irigoin, F.: Scanning polyhedra with do loops. In: PPoPP 1991, pp. 39-50 (1991)
- (1991) PPoPP 1991 , pp. 39-50
- Ancourt, C.¹ Irigoin, F.²

2
- 77951547168
- A Compiler Framework for Optimization of Affine Loop Nests for GPGPUs
- June
- Baskaran, M., Bondhugula, U., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: A Compiler Framework for Optimization of Affine Loop Nests for GPGPUs. In: ACM ICS (June 2008)
- (2008) ACM ICS
- Baskaran, M.¹ Bondhugula, U.² Krishnamoorthy, S.³ Ramanujam, J.⁴ Rountev, A.⁵ Sadayappan, P.⁶

3
- 79959456077
- Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories
- February
- Baskaran, M., Bondhugula, U., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories. In: ACM SIGPLAN PPoPP (February 2008)
- (2008) ACM SIGPLAN PPoPP
- Baskaran, M.¹ Bondhugula, U.² Krishnamoorthy, S.³ Ramanujam, J.⁴ Rountev, A.⁵ Sadayappan, P.⁶

4
- 10444289646
- Code generation in the polyhedral model is easier than you think
- Bastoul, C.: Code generation in the polyhedral model is easier than you think. In: PACT 2004, pp. 7-16 (2004)
- PACT 2004 , vol.2004 , pp. 7-16
- Bastoul, C.¹

5
- 47249156196
- Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
- Hendren, L. (ed.) CC 2008. Springer, Heidelberg
- Bondhugula, U., Baskaran, M., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In: Hendren, L. (ed.) CC 2008. LNCS, vol. 4959, pp. 132-146. Springer, Heidelberg (2008)
- (2008) LNCS , vol.4959 , pp. 132-146
- Bondhugula, U.¹ Baskaran, M.² Krishnamoorthy, S.³ Ramanujam, J.⁴ Rountev, A.⁵ Sadayappan, P.⁶

6
- 57349139452
- A practical automatic polyhedral parallelizer and locality optimizer
- Bondhugula, U., Hartono, A., Ramanujam, J., Sadayappan, P.: A practical automatic polyhedral parallelizer and locality optimizer. In: ACM SIGPLAN Programming Languages Design and Implementation, PLDI 2008 (2008)
- (2008) ACM SIGPLAN Programming Languages Design and Implementation, PLDI 2008
- Bondhugula, U.¹ Hartono, A.² Ramanujam, J.³ Sadayappan, P.⁴

7
- 77951602079
- CLooG: The Chunky Loop Generator, http://www.cloog.org

8
- 78651269052
- Understanding the efficiency of GPU algorithms for matrix-matrix multiplication
- Fatahalian, K., Sugerman, J., Hanrahan, P.: Understanding the efficiency of GPU algorithms for matrix-matrix multiplication. In: ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, pp. 133-137 (2004)
- (2004) ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware , pp. 133-137
- Fatahalian, K.¹ Sugerman, J.² Hanrahan, P.³

9
- 0026109335
- Dataflow analysis of array and scalar references
- Feautrier, P.: Dataflow analysis of array and scalar references. IJPP 20(1), 23-53 (1991)
- (1991) IJPP , vol.20 , Issue.1 , pp. 23-53
- Feautrier, P.¹

10
- 0026933251
- Some efficient solutions to the affine scheduling problem, part I: One-dimensional time
- Feautrier, P.: Some efficient solutions to the affine scheduling problem, part I: one-dimensional time. IJPP 21(5), 313-348 (1992)
- (1992) IJPP , vol.21 , Issue.5 , pp. 313-348
- Feautrier, P.¹

11
- 84957027384
- Automatic parallelization in the polytope model
- Perrin, G.-R., Darte, A. (eds.) The Data Parallel Programming Model. Springer, Heidelberg
- Feautrier, P.: Automatic parallelization in the polytope model. In: Perrin, G.-R., Darte, A. (eds.) The Data Parallel Programming Model. LNCS, vol. 1132, pp. 79-103. Springer, Heidelberg (1996)
- (1996) LNCS , vol.1132 , pp. 79-103
- Feautrier, P.¹

12
- 77951575756
- A memory model for scientific algorithms on graphics processors
- Löwe, W., Südholt, M. (eds.) SC 2006. Springer, Heidelberg
- Govindaraju, N.K., Larsen, S., Gray, J., Manocha, D.: A memory model for scientific algorithms on graphics processors. In: Löwe, W., Südholt, M. (eds.) SC 2006. LNCS, vol. 4089. Springer, Heidelberg (2006)
- (2006) LNCS , vol.4089
- Govindaraju, N.K.¹ Larsen, S.² Gray, J.³ Manocha, D.⁴

13
- 77951581325
- General-Purpose Computation Using Graphics Hardware, http://www.gpgpu.org/

14
- 33646559059
- Habilitation Thesis. FMI, University of Passau
- Griebl, M.: Automatic Parallelization of Loop Programs for Distributed Memory Architectures. Habilitation Thesis. FMI, University of Passau (2004)
- (2004) Automatic Parallelization of Loop Programs for Distributed Memory Architectures.
- Griebl, M.¹

15
- 85026986651
- Supernode partitioning
- Irigoin, F., Triolet, R.: Supernode partitioning. In: Proceedings of POPL 1988, pp. 319-329 (1988)
- (1988) Proceedings of POPL , vol.1988 , pp. 319-329
- Irigoin, F.¹ Triolet, R.²

16
- 44849094749
- Fast N-body Simulation with CUDA
- August
- Nyland, L., Harris, M., Prins, J.F.: Fast N-body Simulation with CUDA. GPU Gems 3 article (August 2007)
- (2007) GPU Gems 3 Article
- Nyland, L.¹ Harris, M.² Prins, J.F.³

17
- 67650081010
- OpenMP to GPGPU: A compiler framework for automatic translation and optimization
- Lee, S., Min, S.-J., Eigenmann, R.: OpenMP to GPGPU: A compiler framework for automatic translation and optimization. In: PPoPP 2009, pp. 101-110 (2009)
- (2009) PPoPP 2009 , pp. 101-110
- Lee, S.¹ Min, S.-J.² Eigenmann, R.³

18
- 4243731804
- PhD thesis, Stanford University August
- Lim, A.: Improving Parallelism And Data Locality With Affine Partitioning. PhD thesis, Stanford University (August 2001)
- (2001) Improving Parallelism and Data Locality with Affine Partitioning.
- Lim, A.¹

19
- 70450103746
- A cross-input adaptive framework for gpu programs optimizations
- May
- Liu, Y., Zhang, E.Z., Shen, X.: A cross-input adaptive framework for gpu programs optimizations. In: IPDPS (May 2009)
- (2009) IPDPS
- Liu, Y.¹ Zhang, E.Z.² Shen, X.³

20
- 77951584344
- NVIDIA CUDA, http://developer.nvidia.com/object/cuda.html

21
- 77951530755
- Parboil Benchmark Suite, http://impact.crhc.illinois.edu/parboil.php

22
- 84877715579
- Pluto: A polyhedral automatic parallelizer and locality optimizer for multicores http://pluto-compiler.sourceforge.net
- Pluto: A Polyhedral Automatic Parallelizer and Locality Optimizer for Multicores

23
- 34547683700
- Iterative optimization in the polyhedral model: Part I, one-dimensional time
- DOI 10.1109/CGO.2007.21, 4145111, International Symposium on Code Generation and Optimization, CGO 2007
- Pouchet, L.-N., Bastoul, C., Cohen, A., Vasilache, N.: Iterative optimization in the polyhedral model: Part I, one-dimensional time. In: CGO 2007, pp. 144-156 (2007) (Pubitemid 47214305)
- (2007) International Symposium on Code Generation and Optimization, CGO 2007 , pp. 144-156
- Pouchet, L.-N.¹ Bastoul, C.² Cohen, A.³ Vasilache, N.⁴

24
- 84976676720
- The Omega test: A fast and practical integer programming algorithm for dependence analysis
- Pugh, W.: The Omega test: a fast and practical integer programming algorithm for dependence analysis. Communications of the ACM 8, 102-114 (1992)
- (1992) Communications of the ACM , vol.8 , pp. 102-114
- Pugh, W.¹

25
- 0034299275
- Generation of efficient nested loops from polyhedra
- Quilleré, F., Rajopadhye, S.V., Wilde, D.: Generation of efficient nested loops from polyhedra. IJPP 28(5), 469-498 (2000)
- (2000) IJPP , vol.28 , Issue.5 , pp. 469-498
- Quilleré, F.¹ Rajopadhye, S.V.² Wilde, D.³

26
- 79959466764
- Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
- February
- Ryoo, S., Rodrigues, C., Baghsorkhi, S., Stone, S., Kirk, D., Hwu, W.: Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In: ACM SIGPLAN PPoPP 2008 (February 2008)
- (2008) ACM SIGPLAN PPoPP 2008
- Ryoo, S.¹ Rodrigues, C.² Baghsorkhi, S.³ Stone, S.⁴ Kirk, D.⁵ Hwu, W.⁶

27
- 51449106975
- Program optimization study on a 128-core GPU
- October
- Ryoo, S., Rodrigues, C., Stone, S., Baghsorkhi, S., Ueng, S., Hwu, W.: Program optimization study on a 128-core GPU. In: The First Workshop on General Purpose Processing on Graphics Processing Units (October 2007)
- (2007) The First Workshop on General Purpose Processing on Graphics Processing Units
- Ryoo, S.¹ Rodrigues, C.² Stone, S.³ Baghsorkhi, S.⁴ Ueng, S.⁵ Hwu, W.⁶

28
- 43449094719
- Program optimization space pruning for a multithreaded GPU
- DOI 10.1145/1356058.1356084, Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization
- Ryoo, S., Rodrigues, C., Stone, S., Baghsorkhi, S., Ueng, S., Stratton, J., Hwu, W.: Program optimization space pruning for a multithreaded GPU. In: CGO (2008) (Pubitemid 351667266)
- (2008) Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization , pp. 195-204
- Ryoo, S.¹ Rodrigues, C.I.² Stone, S.S.³ Baghsorkhi, S.S.⁴ Ueng, S.-Z.⁵ Stratton, J.A.⁶ Hwu, W.-M.W.⁷

29
- 67650016545
- Violated dependence analysis
- June
- Vasilache, N., Bastoul, C., Girbal, S., Cohen, A.: Violated dependence analysis. In: ACM ICS (June 2006)
- (2006) ACM ICS
- Vasilache, N.¹ Bastoul, C.² Girbal, S.³ Cohen, A.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.