-
1
-
-
0032305438
-
Compiler optimizations for real time execution of loops on limited memory embedded systems
-
S. Anantharaman and S. Pande. Compiler optimizations for real time execution of loops on limited memory embedded systems. In IEEE Real-Time Systems Symposium, pages 154-164, 1998.
-
(1998)
IEEE Real-Time Systems Symposium
, pp. 154-164
-
-
Anantharaman, S.1
Pande, S.2
-
3
-
-
34547453716
-
Loop transformation methodologies for array-oriented memory management
-
F. Balasa, P. Kjeldsberg, M. Palkovic, A. Vandecappelle, and F. Catthoor. Loop transformation methodologies for array-oriented memory management. In 17th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP'06), pages 205-212, 2006.
-
(2006)
17th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP'06)
, pp. 205-212
-
-
Balasa, F.1
Kjeldsberg, P.2
Palkovic, M.3
Vandecappelle, A.4
Catthoor, F.5
-
4
-
-
0003713964
-
-
2nd Edition. Athena Scientific. ISBN 1-886529-00-0
-
D. P. Bertsekas. Nonlinear Programming: 2nd Edition. Athena Scientific. ISBN 1-886529-00-0.
-
Nonlinear Programming
-
-
Bertsekas, D.P.1
-
5
-
-
33751022080
-
Programming for parallelism and locality with hierarchically tiled arrays
-
G. Bikshandi, J. Guo, D. Hoeflinger, G. Almasi, B. B. Fraguela, M. J. Garzaran, D. Padua, and C. von Praun. Programming for parallelism and locality with hierarchically tiled arrays. In PPoPP, pages 48-57, 2006.
-
(2006)
PPoPP
, pp. 48-57
-
-
Bikshandi, G.1
Guo, J.2
Hoeflinger, D.3
Almasi, G.4
Fraguela, B.B.5
Garzaran, M.J.6
Padua, D.7
Von Praun, C.8
-
6
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC
-
J. Bilmes, K. Asanovic, C. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC. In Proc. ACM International Conference on Supercomputing, pages 340-347, 1997.
-
(1997)
Proc. ACM International Conference on Supercomputing
, pp. 340-347
-
-
Bilmes, J.1
Asanovic, K.2
Chin, C.3
Demmel, J.4
-
7
-
-
57349110181
-
Affine transformations for communication minimal parallelization and locality optimization of arbitrarily nested loop sequences
-
Ohio State University, May
-
U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. Affine transformations for communication minimal parallelization and locality optimization of arbitrarily nested loop sequences. Technical Report OSU-CISRC5/07-TR43, Ohio State University, May 2007.
-
(2007)
Technical Report OSU-CISRC5/07-TR43
-
-
Bondhugula, U.1
Baskaran, M.2
Krishnamoorthy, S.3
Ramanujam, J.4
Rountev, A.5
Sadayappan, P.6
-
8
-
-
0003502725
-
-
Kluwer Academic Publishers
-
F. Catthoor, K. Danckaert, C. Kulkarni, E. Brockmeyer, P. Kjeldsberg, T. Van Achteren, and T. Omnes. Data Access and Storage Management for Embedded Programmable Processors. Kluwer Academic Publishers, 2002.
-
(2002)
Data Access and Storage Management for Embedded Programmable Processors
-
-
Catthoor, F.1
Danckaert, K.2
Kulkarni, C.3
Brockmeyer, E.4
Kjeldsberg, P.5
Van Achteren, T.6
Omnes, T.7
-
9
-
-
0029717349
-
Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: Applications to analyze and transform scientific programs
-
P. Clauss. Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: applications to analyze and transform scientific programs. In ICS '96: Proceedings of the 10th International Conference on Supercomputing, pages 278-285, 1996.
-
(1996)
ICS '96: Proceedings of the 10th International Conference on Supercomputing
, pp. 278-285
-
-
Clauss, P.1
-
10
-
-
79959483988
-
-
CLooG: The Chunky Loop Generator, http://www.cloog.org.
-
-
-
-
11
-
-
0031358458
-
Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs
-
Dec.
-
A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. IJPP, 25(6):447-496, Dec. 1997.
-
(1997)
IJPP
, vol.25
, Issue.6
, pp. 447-496
-
-
Darte, A.1
Vivien, F.2
-
12
-
-
0346757617
-
A strategy for array management in local memory
-
Irvine, Calif., Cambridge, Mass.: MIT Press
-
C. Eisenbeis, W. Jalby, D. Windheiser, and F. Bodin. A strategy for array management in local memory. In Advances in Languages and Compilers for Parallel Computing, 1990 Workshop, pages 130-151, Irvine, Calif., 1990. Cambridge, Mass.: MIT Press.
-
(1990)
Advances in Languages and Compilers for Parallel Computing, 1990 Workshop
-
-
Eisenbeis, C.1
Jalby, W.2
Windheiser, D.3
Bodin, F.4
-
13
-
-
34548207355
-
Sequoia: Programming the memory hierarchy
-
K. Fatahalian, T. J. Knight, M. Houston, M. Erez, D. R. Horn, L. Leem, J. Y. Park, M. Ren, A. Aiken, W. J. Dally, and P. Hanrahan. Sequoia: Programming the memory hierarchy. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, 2006.
-
(2006)
Proceedings of the 2006 ACM/IEEE Conference on Supercomputing
-
-
Fatahalian, K.1
Knight, T.J.2
Houston, M.3
Erez, M.4
Horn, D.R.5
Leem, L.6
Park, J.Y.7
Ren, M.8
Aiken, A.9
Dally, W.J.10
Hanrahan, P.11
-
15
-
-
0026109335
-
Dataflow analysis of array and scalar references
-
P. Feautrier. Dataflow analysis of array and scalar references. IJPP, 20(1):23-53, 1991.
-
(1991)
IJPP
, vol.20
, Issue.1
, pp. 23-53
-
-
Feautrier, P.1
-
16
-
-
0026933251
-
Some efficient solutions to the affine scheduling problem: I. one-dimensional time
-
P. Feautrier. Some efficient solutions to the affine scheduling problem: I. one-dimensional time. IJPP, 21(5):313-348, 1992.
-
(1992)
IJPP
, vol.21
, Issue.5
, pp. 313-348
-
-
Feautrier, P.1
-
17
-
-
0001448065
-
Some efficient solutions to the affine scheduling problem: Part II, multidimensional time
-
P. Feautrier. Some efficient solutions to the affine scheduling problem: Part II, multidimensional time. IJPP, 21(6):389-420, 1992.
-
(1992)
IJPP
, vol.21
, Issue.6
, pp. 389-420
-
-
Feautrier, P.1
-
18
-
-
84957027384
-
Automatic parallelization in the polytope model
-
P. Feautrier. Automatic parallelization in the polytope model. In The Data Parallel Programming Model, pages 79-103, 1996.
-
(1996)
The Data Parallel Programming Model
, pp. 79-103
-
-
Feautrier, P.1
-
19
-
-
85015240805
-
On estimating and enhancing cache effectiveness
-
London, UK, Springer-Verlag
-
J. Ferrante, V. Sarkar, and W. Thrash. On estimating and enhancing cache effectiveness. In Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing, pages 328-343, London, UK, 1992. Springer-Verlag.
-
(1992)
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
, pp. 328-343
-
-
Ferrante, J.1
Sarkar, V.2
Thrash, W.3
-
20
-
-
84862940593
-
Strategies for cache and local memory management by global program transformation
-
New York, NY, USA, Springer-Verlag New York, Inc.
-
D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformation. In Proceedings of the 1st International Conference on Supercomputing, pages 229-254, New York, NY, USA, 1988. Springer-Verlag New York, Inc.
-
(1988)
Proceedings of the 1st International Conference on Supercomputing
, pp. 229-254
-
-
Gannon, D.1
Jalby, W.2
Gallivan, K.3
-
22
-
-
34547227870
-
Multiprocessor system-on-chip data reuse analysis for exploring customized memory hierarchies
-
I. Issenin, E. Brockmeyer, B. Durinck, and N. Dutt. Multiprocessor system-on-chip data reuse analysis for exploring customized memory hierarchies. In DAC '06: Proceedings of the 43rd annual conference on Design automation, pages 49-52, 2006.
-
(2006)
DAC '06: Proceedings of the 43rd Annual Conference on Design Automation
, pp. 49-52
-
-
Issenin, I.1
Brockmeyer, E.2
Durinck, B.3
Dutt, N.4
-
23
-
-
0242578180
-
A cost-effective implementation of multilevel tiling
-
M. Jiménez, J. M. Llabería, and A. Fernández. A cost-effective implementation of multilevel tiling. IEEE Trans. Parallel Distrib. Syst., 14(10):1006-1020, 2003.
-
(2003)
IEEE Trans. Parallel Distrib. Syst.
, vol.14
, Issue.10
, pp. 1006-1020
-
-
Jiménez, M.1
Llabería, J.M.2
Fernández, A.3
-
24
-
-
2142707258
-
Compiler-directed scratch pad memory optimization for embedded multiprocessors
-
M. Kandemir, I. Kadayif, A. Choudhary, J. Ramanujam, and I. Kolcu. Compiler-directed scratch pad memory optimization for embedded multiprocessors. IEEE Transactions on VLSI (TVLSI), 12(3):281-287, 2004.
-
(2004)
IEEE Transactions on VLSI (TVLSI)
, vol.12
, Issue.3
, pp. 281-287
-
-
Kandemir, M.1
Kadayif, I.2
Choudhary, A.3
Ramanujam, J.4
Kolcu, I.5
-
25
-
-
1242286076
-
A compiler based approach for dynamically managing scratch-pad memories in embedded systems
-
M. Kandemir, J. Ramanujam, M. Irwin, V. Narayanan, I. Kadayif, and A. Parikh. A compiler based approach for dynamically managing scratch-pad memories in embedded systems. IEEE Transactions on Computer-Aided Design, 23(2):243-260, 2004.
-
(2004)
IEEE Transactions on Computer-Aided Design
, vol.23
, Issue.2
, pp. 243-260
-
-
Kandemir, M.1
Ramanujam, J.2
Irwin, M.3
Narayanan, V.4
Kadayif, I.5
Parikh, A.6
-
26
-
-
56749175334
-
Multi-level tiling: M for the price of one
-
November
-
D. Kim, L. Renganarayana, D. Rostron, S. Rajopadhye, and M. M. Strout. Multi-level tiling: M for the price of one. In SC, November 2007.
-
(2007)
SC
-
-
Kim, D.1
Renganarayana, L.2
Rostron, D.3
Rajopadhye, S.4
Strout, M.M.5
-
27
-
-
35448944792
-
Effective automatic parallelization of stencil computations
-
July
-
S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan. Effective Automatic Parallelization of Stencil Computations. In ACM SIGPLAN PLDI 2007, July 2007.
-
(2007)
ACM SIGPLAN PLDI 2007
-
-
Krishnamoorthy, S.1
Baskaran, M.2
Bondhugula, U.3
Ramanujam, J.4
Rountev, A.5
Sadayappan, P.6
-
29
-
-
0030645995
-
Maximizing parallelism and minimizing synchronization with affine transforms
-
A. W. Lim and M. S. Lam. Maximizing parallelism and minimizing synchronization with affine transforms. In POPL '97, pages 201-214, 1997.
-
(1997)
POPL '97
, pp. 201-214
-
-
Lim, A.W.1
Lam, M.S.2
-
30
-
-
79959401728
-
-
NVIDIA CUDA. http://developer.nvidia.com/object/cuda.html.
-
-
-
-
31
-
-
33746967016
-
Data and memory optimization techniques for embedded systems
-
P. R. Panda, F. Catthoor, N. D. Dutt, K. Danckaert, E. Brockmeyer, C. Kulkarni, A. Vandecappelle, and P. G. Kjeldsberg. Data and memory optimization techniques for embedded systems. ACM Trans. Design Autom. Electr. Syst., 6(2):149-206, 2001.
-
(2001)
ACM Trans. Design Autom. Electr. Syst.
, vol.6
, Issue.2
, pp. 149-206
-
-
Panda, P.R.1
Catthoor, F.2
Dutt, N.D.3
Danckaert, K.4
Brockmeyer, E.5
Kulkarni, C.6
Vandecappelle, A.7
Kjeldsberg, P.G.8
-
33
-
-
34547683700
-
Iterative optimization in the polyhedral model: Part I, one-dimensional time
-
L.-N. Pouchet, C. Bastoul, A. Cohen, and N. Vasilache. Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time. In CGO '07, pages 144-156, 2007.
-
(2007)
CGO '07
, pp. 144-156
-
-
Pouchet, L.-N.1
Bastoul, C.2
Cohen, A.3
Vasilache, N.4
-
34
-
-
84976676720
-
The omega test: A fast and practical integer programming algorithm for dependence analysis
-
Aug
-
W. Pugh. The omega test: a fast and practical integer programming algorithm for dependence analysis. Communications of the ACM, 35(8):102-114, Aug. 1992.
-
(1992)
Communications of the ACM
, vol.35
, Issue.8
, pp. 102-114
-
-
Pugh, W.1
-
36
-
-
0034857668
-
Reducing memory requirements of nested loops for embedded systems
-
J. Ramanujam, J. Hong, M. Kandemir, and A. Narayan. Reducing memory requirements of nested loops for embedded systems. In DAC '01: Proceedings of the 38th conference on Design automation, pages 359-364, 2001.
-
(2001)
DAC '01: Proceedings of the 38th Conference on Design Automation
, pp. 359-364
-
-
Ramanujam, J.1
Hong, J.2
Kandemir, M.3
Narayan, A.4
-
37
-
-
34548752231
-
Towards optimal multi-level tiling for stencil computations
-
IEEE
-
L. Renganarayanan, M. Harthikote-Matha, R. Dewri, and S. V. Rajopadhye. Towards optimal multi-level tiling for stencil computations. In IPDPS, pages 1-10. IEEE, 2007.
-
(2007)
IPDPS
, pp. 1-10
-
-
Renganarayanan, L.1
Harthikote-Matha, M.2
Dewri, R.3
Rajopadhye, S.4
-
38
-
-
78650907365
-
Near-Optimal allocation of local memory arrays
-
HP Laboratories Palo Alto
-
R. Schreiber and D. C. Cronquist. Near-Optimal Allocation of Local Memory Arrays. Technical Report HPL-2004-24, HP Laboratories Palo Alto, 2004.
-
(2004)
Technical Report HPL-2004-24
-
-
Schreiber, R.1
Cronquist, D.C.2