2008, Pages 225-234

A compiler framework for optimization of affine loop nests for GPGPUs

Author keywords

Empirical tuning; GPU; Memory access optimization; Polyhedral model

Indexed keywords

AUTOMATIC PARALLELIZATION; COMPILER OPTIMIZATIONS; COMPUTATIONAL POWERS; DATA ACCESSES; DATA DEPENDENCES; DEVICE ARCHITECTURES; EMPIRICAL TUNING; GPU; LOOP NESTS; MEMORY ACCESS OPTIMIZATION; OPTIMAL PARAMETERS; PARALLEL ARCHITECTURES; PARALLEL CODES; PERFORMANCE OPTIMIZATIONS; POLYHEDRAL MODEL; PROGRAM TRANSFORMATIONS; PROGRAMMING MODELS; SHARED MEMORIES

EID: 57349180412     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1375527.1375562     Document Type: Conference Paper
Times cited: 171

References (28)
  • 1
    • 84976766536
    • Scanning polyhedra with do loops
    • C. Ancourt and F. Irigoin. Scanning polyhedra with do loops. In PPoPP'91, pages 39-50, 1991.
    • (1991) PPoPP'91 , pp. 39-50
    • Ancourt, C.1    Irigoin, F.2
  • 3
    • 10444289646
    • Code generation in the polyhedral model is easier than you think
    • C. Bastoul. Code generation in the polyhedral model is easier than you think. In PACT'04, pages 7-16, 2004.
    • (2004) PACT'04 , pp. 7-16
    • Bastoul, C.1
  • 9
    • 0026109335
    • Dataflow analysis of array and scalar references
    • P. Feautrier. Dataflow analysis of array and scalar references. IJPP, 20(1):23-53, 1991.
    • (1991) IJPP , vol.20 , Issue.1 , pp. 23-53
    • Feautrier, P.1
  • 10
    • 0026933251
    • Some efficient solutions to the affine scheduling problem, part I: One-dimensional time
    • P. Feautrier. Some efficient solutions to the affine scheduling problem, part I: one-dimensional time. IJPP, 21(5):313-348, 1992.
    • (1992) IJPP , vol.21 , Issue.5 , pp. 313-348
    • Feautrier, P.1
  • 11
    • 0001448065
    • Some efficient solutions to the affine scheduling problem, part II: Multidimensional time
    • P. Feautrier. Some efficient solutions to the affine scheduling problem, part II: multidimensional time. IJPP, 21(6):389-420, 1992.
    • (1992) IJPP , vol.21 , Issue.6 , pp. 389-420
    • Feautrier, P.1
  • 12
    • 34548292052
    • A memory model for scientific algorithms on graphics processors
    • N. K. Govindaraju, S. Larsen, J. Gray, and D. Manocha. A memory model for scientific algorithms on graphics processors. In SC'06, 2006.
    • (2006) SC'06
    • Govindaraju, N.K.1    Larsen, S.2    Gray, J.3    Manocha, D.4
  • 13
    • 57349162527
    • General-Purpose Computation Using Graphics Hardware
    • General-Purpose Computation Using Graphics Hardware. http://www.gpgpu.org/.
  • 14
    • 57349100116
    • Automatic Parallelization of Loop Programs for Distributed Memory Architectures
    • M. Griebl. Automatic Parallelization of Loop Programs for Distributed Memory Architectures. FMI, University of Passau, 2004. Habilitation Thesis.
    • (2004)
    • Griebl, M.1
  • 15
    • 57349101237
    • Data and computation transformations for Brook streaming applications on multiprocessors
    • S.-W. Liao, Z. Du, G. Wu, and G.-Y. Lueh. Data and computation transformations for Brook streaming applications on multiprocessors. In CGO'06, pages 196-207, 2006.
    • (2006) CGO'06 , pp. 196-207
    • Liao, S.-W.1    Du, Z.2    Wu, G.3    Lueh, G.-Y.4
  • 17
    • 0030645995
    • Maximizing parallelism and minimizing synchronization with affine transforms
    • A. W. Lim and M. S. Lam. Maximizing parallelism and minimizing synchronization with affine transforms. In POPL, pages 201-214, 1997.
    • (1997) POPL , pp. 201-214
    • Lim, A.W.1    Lam, M.S.2
  • 18
    • 57349189733
    • NVIDIA CUDA
    • NVIDIA CUDA. http://developer.nvidia.com/object/cuda.html.
  • 19
    • 57349128633
    • NVIDIA GeForce 8800
    • NVIDIA GeForce 8800. http://www.nvidia.com/page/geforce-8800.html.
  • 21
    • 34547683700
    • Iterative optimization in the polyhedral model: Part I, one-dimensional time
    • L.-N. Pouchet, C. Bastoul, A. Cohen, and N. Vasilache. Iterative optimization in the polyhedral model: Part I, one-dimensional time. In CGO'07, pages 144-156, 2007.
    • (2007) CGO'07 , pp. 144-156
    • Pouchet, L.-N.1    Bastoul, C.2    Cohen, A.3    Vasilache, N.4
  • 22
    • 84976676720
    • The Omega test: A fast and practical integer programming algorithm for dependence analysis
    • W. Pugh. The Omega test: a fast and practical integer programming algorithm for dependence analysis. Communications of the ACM, 35(8):102-114, Aug. 1992.
    • (1992) Communications of the ACM , vol.35 , Issue.8 , pp. 102-114
    • Pugh, W.1
  • 23
    • 0034299275
    • Generation of efficient nested loops from polyhedra
    • F. Quilleré, S. V. Rajopadhye, and D. Wilde. Generation of efficient nested loops from polyhedra. IJPP, 28(5):469-498, 2000.
    • (2000) IJPP , vol.28 , Issue.5 , pp. 469-498
    • Quilleré, F.1    Rajopadhye, S.V.2    Wilde, D.3
  • 24
    • 79959466764
    • Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
    • S. Ryoo, C. Rodrigues, S. Baghsorkhi, S. Stone, D. Kirk, and W. Hwu. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In ACM SIGPLAN PPoPP 2008, Feb. 2008.
    • (2008) ACM SIGPLAN PPoPP 2008
    • Ryoo, S.1    Rodrigues, C.2    Baghsorkhi, S.3    Stone, S.4    Kirk, D.5    Hwu, W.6
  • 26
    • 43449094719
    • Program optimization space pruning for a multithreaded GPU
    • S. Ryoo, C. Rodrigues, S. Stone, S. Baghsorkhi, S. Ueng, J. Stratton, and W. Hwu. Program optimization space pruning for a multithreaded GPU. In CGO, 2008.
  • 27
    • 33947595619
    • Accelerator: Using data parallelism to program GPUs for general-purpose uses
    • D. Tarditi, S. Puri, and J. Oglesby. Accelerator: using data parallelism to program GPUs for general-purpose uses. In ASPLOS-XII, pages 325-335, 2006.
    • (2006) ASPLOS-XII , pp. 325-335
    • Tarditi, D.1    Puri, S.2    Oglesby, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.