-
1
-
-
0016313256
-
A comparison of list schedules for parallel processing systems
-
T. L. Adam, K. M. Chandy, and J. R. Dickson. A comparison of list schedules for parallel processing systems. Commun. ACM, 17(12):685-690, 1974.
-
(1974)
Commun. ACM
, vol.17
, Issue.12
, pp. 685-690
-
-
Adam, T.L.1
Chandy, K.M.2
Dickson, J.R.3
-
3
-
-
84976766536
-
Scanning polyhedra with do loops
-
C. Ancourt and F. Irigoin. Scanning polyhedra with do loops. In PPoPP'91, pages 39-50, 1991.
-
(1991)
PPoPP'91
, pp. 39-50
-
-
Ancourt, C.1
Irigoin, F.2
-
4
-
-
10444289646
-
Code generation in the polyhedral model is easier than you think
-
C. Bastoul. Code generation in the polyhedral model is easier than you think. In PACT'04, pages 7-16, 2004.
-
(2004)
PACT'04
, pp. 7-16
-
-
Bastoul, C.1
-
5
-
-
10444255848
-
Putting polyhedral loop transformations to work
-
C. Bastoul, A. Cohen, S. Girbal, S. Sharma, and O. Temam. Putting polyhedral loop transformations to work. In Workshop on Languages and Compilers for Parallel Computing (LCPC'03), pages 23-30, 2003.
-
(2003)
Workshop on Languages and Compilers for Parallel Computing (LCPC'03)
, pp. 23-30
-
-
Bastoul, C.1
Cohen, A.2
Girbal, S.3
Sharma, S.4
Temam, O.5
-
6
-
-
57349110181
-
Affine transformations for communication minimal parallelization and locality optimization of arbitrarily nested loop sequences
-
May
-
U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. Affine transformations for communication minimal parallelization and locality optimization of arbitrarily nested loop sequences. Technical Report OSU-CISRC-5/07-TR43, Ohio State University, May 2007.
-
(2007)
Technical Report OSU-CISRC-5/07-TR43, Ohio State University
-
-
Bondhugula, U.1
Baskaran, M.2
Krishnamoorthy, S.3
Ramanujam, J.4
Rountev, A.5
Sadayappan, P.6
-
7
-
-
57349145904
-
Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
-
Apr.
-
U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In International Conference on Compiler Construction (ETAPS CC), Apr. 2008.
-
(2008)
International Conference on Compiler Construction (ETAPS CC)
-
-
Bondhugula, U.1
Baskaran, M.2
Krishnamoorthy, S.3
Ramanujam, J.4
Rountev, A.5
Sadayappan, P.6
-
10
-
-
0032066690
-
Loop parallelization algorithms: From parallelism extraction to code generation
-
PII S0167819198000209
-
P. Boulet, A. Darte, G.-A. Silber, and F. Vivien. Loop parallelization algorithms: From parallelism extraction to code generation. Parallel Computing, 24(3-4):421-444, 1998. (Pubitemid 128413646)
-
(1998)
Parallel Computing
, vol.24
, Issue.3-4
, pp. 421-444
-
-
Boulet, P.1
Darte, A.2
Silber, G.-A.3
Vivien, F.4
-
11
-
-
36048997493
-
Multithreading for synchronization tolerance in matrix factorization
-
Proceedings of the SciDAC 2007 Conference
-
A. Buttari, J. Dongarra, P. Husbands, J. Kurzak, and K. Yelick. Multithreading for synchronization tolerance in matrix factorization. In Proceedings of the SciDAC 2007 Conference. Journal of Physics: Conference Series, 2007.
-
(2007)
Journal of Physics: Conference Series
-
-
Buttari, A.1
Dongarra, J.2
Husbands, P.3
Kurzak, J.4
Yelick, K.5
-
12
-
-
51049101584
-
A class of parallel tiled linear algebra algorithms for multicore architectures
-
September, Submitted to Parallel Computing. LAPACK Working Note 191
-
A. Buttari, J. Langou, J. Kurzak, and J. Dongarra. A class of parallel tiled linear algebra algorithms for multicore architectures. Technical Report UT-CS-07-600, Innovative Computing Laboratory, University of Tennessee Knoxville, September 2007. Submitted to Parallel Computing. LAPACK Working Note 191.
-
(2007)
Technical Report UT-CS-07-600, Innovative Computing Laboratory, University of Tennessee Knoxville
-
-
Buttari, A.1
Langou, J.2
Kurzak, J.3
Dongarra, J.4
-
13
-
-
0028744946
-
An efficient algorithm for the run-time parallelization of doacross loops
-
D.-K. Chen, J. Torrellas, and P.-C. Yew. An efficient algorithm for the run-time parallelization of doacross loops. In Supercomputing'94: Proceedings of the 1994 conference on Supercomputing, pages 518-527, Los Alamitos, CA, USA, 1994. IEEE Computer Society Press.
-
(1994)
Supercomputing'94: Proceedings of the 1994 conference on Supercomputing
, pp. 518-527
-
-
Chen, D.-K.1
Torrellas, J.2
Yew, P.-C.3
-
14
-
-
0038378430
-
Toward efficient and robust software speculative parallelization on multiprocessors
-
New York, NY, USA, ACM.
-
M. Cintra and D. R. Llanos. Toward efficient and robust software speculative parallelization on multiprocessors. In PPoPP'03: Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 13-24, New York, NY, USA, 2003. ACM.
-
(2003)
PPoPP'03: Proceedings of the Ninth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 13-24
-
-
Cintra, M.1
Llanos, D.R.2
-
16
-
-
0342782260
-
Combining retiming and scheduling techniques for loop parallelization and loop tiling
-
A. Darte, G.-A. Silber, and F. Vivien. Combining retiming and scheduling techniques for loop parallelization and loop tiling. Parallel Processing Letters, 7(4):379-392, 1997. (Pubitemid 127732656)
-
(1997)
Parallel Processing Letters
, vol.7
, Issue.4
, pp. 379-392
-
-
Darte, A.1
Silber, G.-A.2
Vivien, F.3
-
17
-
-
0031358458
-
Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs
-
A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. IJPP, 25(6):447-496, Dec. 1997. (Pubitemid 127507526)
-
(1997)
International Journal of Parallel Programming
, vol.25
, Issue.6
, pp. 447-496
-
-
Darte, A.1
Vivien, F.2
-
19
-
-
0026109335
-
Dataflow analysis of array and scalar references
-
P. Feautrier. Dataflow analysis of array and scalar references. IJPP, 20(1):23-53, 1991.
-
(1991)
IJPP
, vol.20
, Issue.1
, pp. 23-53
-
-
Feautrier, P.1
-
20
-
-
0026933251
-
Some efficient solutions to the affine scheduling problem. I. One-dimensional time
-
P. Feautrier. Some efficient solutions to the affine scheduling problem, part I: one-dimensional time. IJPP, 21(5):313-348, 1992. (Pubitemid 23705312)
-
(1992)
International Journal of Parallel Programming
, vol.21
, Issue.5
, pp. 313-347
-
-
Feautrier, P.1
-
21
-
-
0001448065
-
Some efficient solutions to the affine scheduling problem, part II: Multidimensional time
-
P. Feautrier. Some efficient solutions to the affine scheduling problem, part II: multidimensional time. IJPP, 21(6):389-420, 1992.
-
(1992)
IJPP
, vol.21
, Issue.6
, pp. 389-420
-
-
Feautrier, P.1
-
22
-
-
84957027384
-
Automatic parallelization in the polytope model
-
P. Feautrier. Automatic parallelization in the polytope model. In The Data Parallel Programming Model, pages 79-103, 1996.
-
(1996)
The Data Parallel Programming Model
, pp. 79-103
-
-
Feautrier, P.1
-
24
-
-
33746593747
-
Semi-automatic composition of loop transformations
-
June
-
S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello, M. Sigler, and O. Temam. Semi-automatic composition of loop transformations. IJPP, 34(3):261-317, June 2006.
-
(2006)
IJPP
, vol.34
, Issue.3
, pp. 261-317
-
-
Girbal, S.1
Vasilache, N.2
Bastoul, C.3
Cohen, A.4
Parello, D.5
Sigler, M.6
Temam, O.7
-
26
-
-
0025539983
-
Parallel processing of near fine grain tasks using static scheduling on OSCAR (Optimally Scheduled Advanced Multiprocessor)
-
Proc Supercomput 90
-
H. Kasahara, H. Honda, and S. Narita. Parallel processing of near fine grain tasks using static scheduling on OSCAR (Optimally Scheduled Advanced Multiprocessor). In Supercomputing'90: Proceedings of the 1990 ACM/IEEE conference on Supercomputing, pages 856-864, Washington, DC, USA, 1990. IEEE Computer Society. (Pubitemid 21675205)
-
(1990)
Supercomputing'90: Proceedings of the 1990 ACM/IEEE conference on Supercomputing
, pp. 856-864
-
-
Kasahara, H.1
Honda, H.2
Narita, S.3
-
27
-
-
0002050141
-
Static scheduling algorithms for allocating directed task graphs to multiprocessors
-
Y.-K. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Comput. Surv., 31(4):406-471, 1999.
-
(1999)
ACM Comput. Surv.
, vol.31
, Issue.4
, pp. 406-471
-
-
Kwok, Y.-K.1
Ahmad, I.2
-
28
-
-
0027829921
-
Improving the performance of runtime parallelization
-
S.-T. Leung and J. Zahorjan. Improving the performance of runtime parallelization. SIGPLAN Not., 28(7):83-91, 1993.
-
(1993)
SIGPLAN Not.
, vol.28
, Issue.7
, pp. 83-91
-
-
Leung, S.-T.1
Zahorjan, J.2
-
30
-
-
17644395320
-
Blocking and array contraction across arbitrarily nested loops using affine partitioning
-
A. Lim, S. Liao, and M. Lam. Blocking and array contraction across arbitrarily nested loops using affine partitioning. In ACM SIGPLAN PPoPP, pages 103-112, 2001. (Pubitemid 33720383)
-
(2001)
SIGPLAN Notices (ACM Special Interest Group on Programming Languages)
, vol.36
, Issue.7
, pp. 103-112
-
-
Lim, A.W.1
Liao, S.-W.2
Lam, M.S.3
-
31
-
-
0032662841
-
An affine partitioning algorithm to maximize parallelism and minimize communication
-
A. W. Lim, G. I. Cheong, and M. S. Lam. An affine partitioning algorithm to maximize parallelism and minimize communication. In ACM Intl. Conf. on Supercomputing, pages 228-237, 1999.
-
(1999)
ACM Intl. Conf. on Supercomputing
, pp. 228-237
-
-
Lim, A.W.1
Cheong, G.I.2
Lam, M.S.3
-
32
-
-
0032067773
-
Maximizing parallelism and minimizing synchronization with affine partitions
-
PII S0167819198000210
-
A. W. Lim and M. S. Lam. Maximizing parallelism and minimizing synchronization with affine partitions. Parallel Computing, 24(3-4):445-475, 1998. (Pubitemid 128413647)
-
(1998)
Parallel Computing
, vol.24
, Issue.3-4
, pp. 445-475
-
-
Lim, A.W.1
Lam, M.S.2
-
35
-
-
0027735065
-
Runtime compilation techniques for data partitioning and communication schedule reuse
-
New York, NY, USA, ACM.
-
R. Ponnusamy, J. Saltz, and A. Choudhary. Runtime compilation techniques for data partitioning and communication schedule reuse. In Supercomputing'93: Proceedings of the 1993 ACM/IEEE conference on Supercomputing, pages 361-370, New York, NY, USA, 1993. ACM.
-
(1993)
Supercomputing'93: Proceedings of the 1993 ACM/IEEE conference on Supercomputing
, pp. 361-370
-
-
Ponnusamy, R.1
Saltz, J.2
Choudhary, A.3
-
36
-
-
84976676720
-
The Omega test: A fast and practical integer programming algorithm for dependence analysis
-
Aug.
-
W. Pugh. The Omega test: a fast and practical integer programming algorithm for dependence analysis. Communications of the ACM, 35(8):102-114, Aug. 1992.
-
(1992)
Communications of the ACM
, vol.35
, Issue.8
, pp. 102-114
-
-
Pugh, W.1
-
37
-
-
0034299275
-
Generation of efficient nested loops from polyhedra
-
DOI 10.1023/A:1007554627716
-
F. Quilleré, S. V. Rajopadhye, and D. Wilde. Generation of efficient nested loops from polyhedra. IJPP, 28(5):469-498, 2000. (Pubitemid 30959586)
-
(2000)
International Journal of Parallel Programming
, vol.28
, Issue.5
, pp. 469-498
-
-
Quillere, F.1
Rajopadhye, S.2
Wilde, D.3
-
38
-
-
31844447800
-
Mitosis compiler: An infrastructure for speculative threading based on pre-computation slices
-
Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 05
-
C. G. Quinones, C. Madriles, J. Sánchez, P. Marcuello, A. González, and D. M. Tullsen. Mitosis compiler: An infrastructure for speculative threading based on pre-computation slices. In PLDI 05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, pages 269-279, 2005. (Pubitemid 43182906)
-
(2005)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
, pp. 269-279
-
-
Quinones, C.G.1
Madriles, C.2
Sanchez, J.3
Marcuello, P.4
Gonzalez, A.5
Tullsen, D.M.6
-
39
-
-
84976823223
-
The LRPD test: Speculative runtime parallelization of loops with privatization and reduction parallelization
-
L. Rauchwerger and D. Padua. The LRPD test: speculative runtime parallelization of loops with privatization and reduction parallelization. SIGPLAN Not., 30(6):218-232, 1995.
-
(1995)
SIGPLAN Not.
, vol.30
, Issue.6
, pp. 218-232
-
-
Rauchwerger, L.1
Padua, D.2
-
41
-
-
34548045548
-
Sensitivity analysis for automatic parallelization on multi-cores
-
DOI 10.1145/1274971.1275008, Proceedings of ICS07: 21st ACM International Conference on Supercomputing
-
S. Rus, M. Pennings, and L. Rauchwerger. Sensitivity analysis for automatic parallelization on multi-cores. In ICS'07: Proceedings of the 21st annual international conference on Supercomputing, pages 263-273, New York, NY, USA, 2007. ACM. (Pubitemid 47281623)
-
(2007)
Proceedings of the International Conference on Supercomputing
, pp. 263-273
-
-
Rus, S.1
Pennings, M.2
Rauchwerger, L.3
-
45
-
-
84976746768
-
Compile-time partitioning and scheduling of parallel programs
-
New York, NY, USA, ACM.
-
V. Sarkar and J. Hennessy. Compile-time partitioning and scheduling of parallel programs. In SIGPLAN'86: Proceedings of the 1986 SIGPLAN symposium on Compiler construction, pages 17-26, New York, NY, USA, 1986. ACM.
-
(1986)
SIGPLAN'86: Proceedings of the 1986 SIGPLAN symposium on Compiler construction
, pp. 17-26
-
-
Sarkar, V.1
Hennessy, J.2
-
46
-
-
33745804733
-
Polyhedral code generation in the real world
-
DOI 10.1007/11688839_16, Compiler Construction - 15th International Conference, CC 2006, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2006, Proceedings
-
N. Vasilache, C. Bastoul, and A. Cohen. Polyhedral code generation in the real world. In International Conference on Compiler Construction (ETAPS CC'06), pages 185-201, Mar. 2006. (Pubitemid 44019652)
-
(2006)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
, vol.3923
, pp. 185-201
-
-
Vasilache, N.1
Bastoul, C.2
Cohen, A.3