-
1
-
-
0023438847
-
Automatic translation of Fortran programs to vector form
-
ALLEN, R. AND KENNEDY, K. 1987. Automatic translation of Fortran programs to vector form. ACM Trans. Program. Lang. Syst. 9, 4, 491-542.
-
(1987)
ACM Trans. Program. Lang. Syst.
, vol.9
, Issue.4
, pp. 491-542
-
-
Allen, R.1
Kennedy, K.2
-
3
-
-
84979017283
-
Static compilation analysis for host-accelerator communication optimization
-
Springer
-
AMINI, M., COELHO, F., IRIGOIN, F., AND KERYELL, R. 2011. Static compilation analysis for host-accelerator communication optimization. In Workshop on Languages and Compilers for Parallel Computing (LCPC'11). Lecture Notes in Computer Science. Springer.
-
(2011)
Workshop on Languages and Compilers for Parallel Computing (LCPC'11). Lecture Notes in Computer Science
-
-
Amini, M.1
Coelho, F.2
Irigoin, F.3
Keryell, R.4
-
6
-
-
77951572335
-
Automatic C-to-CUDA code generation for affine programs
-
Springer
-
BASKARAN, M., RAMANUJAM, J., AND SADAYAPPAN, P. 2010. Automatic C-to-CUDA code generation for affine programs. In Compiler Construction (CC 10), Held as Part of the Joint European Conferences on Theory and Practice of Software, (ETAPS 10), Lecture Notes in Computer Science, vol. 6011. Springer, 244-263.
-
(2010)
Compiler Construction (CC 10), Held As Part of the Joint European Conferences on Theory and Practice of Software, (ETAPS 10), Lecture Notes in Computer Science
, vol.6011
, pp. 244-263
-
-
Baskaran, M.1
Ramanujam, J.2
Sadayappan, P.3
-
8
-
-
77951599594
-
The polyhedral model is more widely applicable than you think
-
Springer
-
BENABDERRAHMANE, M.-W., POUCHET, L.-N., COHEN, A., AND BASTOUL, C. 2010. The polyhedral model is more widely applicable than you think. In Proceedings of the International Conference on Compiler Construction (CC'10). Lecture Notes in Computer Science, vol. 6011. Springer.
-
(2010)
Proceedings of the International Conference on Compiler Construction (CC'10). Lecture Notes in Computer Science
, vol.6011
-
-
Benabderrahmane, M.-W.1
Pouchet, L.-N.2
Cohen, A.3
Bastoul, C.4
-
11
-
-
57349139452
-
A practical automatic polyhedral parallelizer and locality optimizer
-
BONDHUGULA, U., HARTONO, A., RAMANUJAM, J., AND SADAYAPPAN, P. 2008a. A practical automatic polyhedral parallelizer and locality optimizer. SIGPLAN Not. 43, 6, 101-113.
-
(2008)
SIGPLAN Not.
, vol.43
, Issue.6
, pp. 101-113
-
-
Bondhugula, U.1
Hartono, A.2
Ramanujam, J.3
Sadayappan, P.4
-
13
-
-
0032066690
-
Loop parallelization algorithms: From parallelism extraction to code generation
-
BOULET, P., DARTE, A., SILBER, G.-A., AND VIVIEN, F. 1998. Loop parallelization algorithms: From parallelism extraction to code generation. Parallel Comput. 24, 421-444.
-
(1998)
Parallel Comput.
, vol.24
, pp. 421-444
-
-
Boulet, P.1
Darte, A.2
Silber, G.-A.3
Vivien, F.4
-
14
-
-
70449959487
-
-
Tech. rep., USC Computer Science
-
CHEN, C., CHAME, J., AND HALL, M. 2008. A framework for composing high-level loop transformations. Tech. rep., USC Computer Science.
-
(2008)
A Framework for Composing High-level Loop Transformations
-
-
Chen, C.1
Chame, J.2
Hall, M.3
-
15
-
-
77949650907
-
Offload-Automating code migration to heterogeneous multicore systems
-
COOPER, P., DOLINSKY, U., DONALDSON, A. F., RICHARDS, A., RILEY, C., AND RUSSELL, G. 2010. Offload-Automating code migration to heterogeneous multicore systems. In Proceedings of International Conference on High-Performance Embedded Architectures and Compilers (HIPEAC). 337-352.
-
(2010)
Proceedings of International Conference on High-Performance Embedded Architectures and Compilers (HIPEAC)
, pp. 337-352
-
-
Cooper, P.1
Dolinsky, U.2
Donaldson, A.F.3
Richards, A.4
Riley, C.5
Russell, G.6
-
16
-
-
0026109335
-
Dataflow analysis of array and scalar references
-
FEAUTRIER, P. 1991. Dataflow analysis of array and scalar references. Int. J. Parallel Program. 20, 1, 23-53.
-
(1991)
Int. J. Parallel Program.
, vol.20
, Issue.1
, pp. 23-53
-
-
Feautrier, P.1
-
17
-
-
0026933251
-
Some efficient solutions to the affine scheduling problem. Part I. One-dimensional time
-
FEAUTRIER, P. 1992a. Some efficient solutions to the affine scheduling problem. Part I. One-dimensional time. Int. J. Parallel Program. 21, 313-347.
-
(1992)
Int. J. Parallel Program.
, vol.21
, pp. 313-347
-
-
Feautrier, P.1
-
18
-
-
0001448065
-
Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time
-
FEAUTRIER, P. 1992b. Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time. Int. J. Parallel Program. 21, 389-420.
-
(1992)
Int. J. Parallel Program.
, vol.21
, pp. 389-420
-
-
Feautrier, P.1
-
19
-
-
77949621946
-
Analysis of task offloading for accelerators
-
FERRER, R., BELTRAN, V., GONZÁLEZ, M., MARTORELL, X., AND AYGUADÉ, E. 2010. Analysis of task offloading for accelerators. In Proceedings of the International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC). 322-336.
-
(2010)
Proceedings of the International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC)
, pp. 322-336
-
-
Ferrer, R.1
Beltran, V.2
González, M.3
Martorell, X.4
Ayguadé, E.5
-
20
-
-
84871776762
-
Polly: Polyhedral optimization in LLVM
-
GROSSER, T., ZHENG, H., A, R., SIMBÜRGER, A., GRÖSSLINGER, A., AND POUCHET, L.-N. 2011. Polly: Polyhedral optimization in LLVM. In 1st International Workshop on Polyhedral Compilation Techniques (IMPACT'11).
-
(2011)
1st International Workshop on Polyhedral Compilation Techniques (IMPACT'11)
-
-
Grosser, T.1
Zheng, H.2
A, R.3
Simbürger, A.4
Grösslinger, A.5
Pouchet, L.-N.6
-
21
-
-
70350627685
-
Precise management of scratchpad memories for localising array accesses in scientific codes
-
Springer
-
GRÖSSLINGER, A. 2009. Precise management of scratchpad memories for localising array accesses in scientific codes. In CC'09. Springer, 236-250.
-
(2009)
CC'09
, pp. 236-250
-
-
Grösslinger, A.1
-
24
-
-
78650802947
-
OpenMPC: Extended OpenMP programming and tuning for GPUs
-
IEEE Computer Society, Washington, DC
-
LEE, S. AND EIGENMANN, R. 2010. OpenMPC: Extended OpenMP programming and tuning for GPUs. In Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10). IEEE Computer Society, Washington, DC, 1-11.
-
(2010)
Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10)
, pp. 1-11
-
-
Lee, S.1
Eigenmann, R.2
-
26
-
-
77952264175
-
A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction
-
ACM, New York
-
LEUNG, A., VASILACHE, N., MEISTER, B., BASKARAN, M., WOHLFORD, D., BASTOUL, C., AND LETHIN, R. 2010. A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction. In Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU'10). ACM, New York, 51-61.
-
(2010)
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU'10)
, pp. 51-61
-
-
Leung, A.1
Vasilache, N.2
Meister, B.3
Baskaran, M.4
Wohlford, D.5
Bastoul, C.6
Lethin, R.7
-
31
-
-
35948991669
-
-
NVIDIA Corporation
-
NVIDIA Corporation 2011. NVIDIA CUDA Programming guide 4.0. NVIDIA Corporation.
-
(2011)
NVIDIA CUDA Programming guide 4.0
-
-
-
34
-
-
84864028974
-
Apricot: An optimizing compiler and productivity tool for x86-compatible many-core coprocessors
-
RAVI, N., YANG, Y., BAO, T., AND CHAKRADHAR, S. 2012. Apricot: An optimizing compiler and productivity tool for x86-compatible many-core coprocessors. In International Conference on Supercomputing (ICS'12).
-
(2012)
International Conference on Supercomputing (ICS'12)
-
-
Ravi, N.1
Yang, Y.2
Bao, T.3
Chakradhar, S.4
-
35
-
-
35448985754
-
Parameterized tiled loops for free
-
RENGANARAYANAN, L., KIM, D., RAJOPADHYE, S., AND STROUT, M. 2007. Parameterized tiled loops for free. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI).
-
(2007)
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
-
-
Renganarayanan, L.1
Kim, D.2
Rajopadhye, S.3
Strout, M.4
-
36
-
-
79952576869
-
A programming language interface to describe transformations and code generation
-
Springer
-
RUDY, G., KHAN, M. M., HALL, M., CHEN, C., AND CHAME, J. 2011. A programming language interface to describe transformations and code generation. In Proceedings of the 23rd International Conference on Languages and Compilers for Parallel Computing (LCPC'10). Springer, 136-150.
-
(2011)
Proceedings of the 23rd International Conference on Languages and Compilers for Parallel Computing (LCPC'10)
, pp. 136-150
-
-
Rudy, G.1
Khan, M.M.2
Hall, M.3
Chen, C.4
Chame, J.5
-
37
-
-
79959466764
-
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
-
RYOO, S., RODRIGUES, C. I., BAGHSORKHI, S. S., STONE, S. S., KIRK, D. B., AND HWU, W.-M. 2008a. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In Proceedings of the Symposium on Principles and Practice of Parallel Programming (PPoPP'08).
-
(2008)
Proceedings of the Symposium on Principles and Practice of Parallel Programming (PPoPP'08)
-
-
Ryoo, S.1
Rodrigues, C.I.2
Baghsorkhi, S.S.3
Stone, S.S.4
Kirk, D.B.5
Hwu, W.-M.6
-
38
-
-
43449094719
-
Optimization space pruning for a multithreaded GPU
-
RYOO, S., RODRIGUES, C. I., STONE, S. S., BAGHSORKHI, S. S., UENG, S., STRATTON, J. A., AND HWU, W.-M. 2008b. Optimization space pruning for a multithreaded GPU. In Proceedings of the International Symposium on Code Generation and Optimization (CGO'08).
-
(2008)
Proceedings of the International Symposium on Code Generation and Optimization (CGO'08)
-
-
Ryoo, S.1
Rodrigues, C.I.2
Stone, S.S.3
Baghsorkhi, S.S.4
Ueng, S.5
Stratton, J.A.6
Hwu, W.-M.7
-
42
-
-
70449626135
-
Polyhedral-model guided loop-nest auto-vectorization
-
TRIFUNOVIĆ, K., NUZMAN, D., COHEN, A., ZAKS, A., AND ROSEN, I. 2009. Polyhedral-model guided loop-nest auto-vectorization. In International Conference on Parallel Architecture and Compilation Techniques (PACT'09).
-
(2009)
International Conference on Parallel Architecture and Compilation Techniques (PACT'09)
-
-
Trifunović, K.1
Nuzman, D.2
Cohen, A.3
Zaks, A.4
Rosen, I.5
-
43
-
-
77952597755
-
CUDA-lite: Reducing GPU programming complexity
-
UENG, S., LATHARA, M., BAGHSORKHI, S. S., AND HWU, W.-M. 2008. CUDA-lite: Reducing GPU programming complexity. In Proceedings of the Workshop on Languages and Compilers for Parallel Computing (LCPC'08).
-
(2008)
Proceedings of the Workshop on Languages and Compilers for Parallel Computing (LCPC'08)
-
-
Ueng, S.1
Lathara, M.2
Baghsorkhi, S.S.3
Hwu, W.-M.4
-
44
-
-
84859153100
-
Automatic restructuring of GPU kernels for exploiting inter-thread data locality
-
Springer
-
UNKULE, S., SHALTZ, C., AND QASEM, A. 2012. Automatic restructuring of GPU kernels for exploiting inter-thread data locality. In International Conference on Compiler Construction (CC'12). Lecture Notes in Computer Science, vol. 7210. Springer.
-
(2012)
International Conference on Compiler Construction (CC'12). Lecture Notes in Computer Science
, vol.7210
-
-
Unkule, S.1
Shaltz, C.2
Qasem, A.3
-
45
-
-
84872972843
-
Joint scheduling and layout optimization to enable multi-level vectorization
-
VASILACHE, N., MEISTER, B., BASKARAN, M., AND LETHIN, R. 2012. Joint scheduling and layout optimization to enable multi-level vectorization. In Proceedings of the International Workshop on Polyhedral Compilation Techniques (IMPACT'12).
-
(2012)
Proceedings of the International Workshop on Polyhedral Compilation Techniques (IMPACT'12)
-
-
Vasilache, N.1
Meister, B.2
Baskaran, M.3
Lethin, R.4
-
46
-
-
78149237521
-
isl: An integer set library for the polyhedral model
-
K. Fukuda, J. Hoeven, M. Joswig, and N. Takayama, Eds. Lecture Notes in Computer Science Series Springer
-
VERDOOLAEGE, S. 2010. isl: An integer set library for the polyhedral model. In International Conference on Mathematical Software (ICMS'10), K. Fukuda, J. Hoeven, M. Joswig, and N. Takayama, Eds. Lecture Notes in Computer Science Series, vol. 6327. Springer, 299-302.
-
(2010)
International Conference on Mathematical Software (ICMS'10)
, vol.6327
, pp. 299-302
-
-
Verdoolaege, S.1
-
48
-
-
13244279577
-
Minimizing development and maintenance costs in supporting persistently optimized BLAS
-
WHALEY, R. C. AND PETITET, A. 2005. Minimizing development and maintenance costs in supporting persistently optimized BLAS. Softw. Pract. Exper. 35, 2, 101-121. http://www.cs.utsa.edu/-whaley/papers/spercw04.ps+.
-
(2005)
Softw. Pract. Exper.
, vol.35
, Issue.2
, pp. 101-121
-
-
Whaley, R.C.1
Petitet, A.2
-
50
-
-
32844466554
-
An integrated Simdization framework using virtual vectors
-
WU, P., EICHENBERGER, A. E., WANG, A., AND ZHAO, P. 2005. An integrated Simdization framework using virtual vectors. In International Conference on Supercomputing (ICS'05).
-
(2005)
International Conference on Supercomputing (ICS'05)
-
-
Wu, P.1
Eichenberger, A.E.2
Wang, A.3
Zhao, P.4