1. http://www.peri-scidac.org/wiki/index.php/Main-Page
2. http://rosecompiler.org/
3. http://www.gnu.org/prep/standards/html-node/Errors.html
4. http://nek5000.mcs.anl.gov/index.php/Main-Page
6. Almagor, L., Cooper, K.D., Grosul, A., Harvey, T.J., Reeves, S.W., Subramanian, D., Torczon, L., Waterman, T.: Finding effective compilation sequences. In: Proceedings of ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems, LCTES 2004 (June 2004)
7. Anderson, E., Sorensen, D., Bai, Z., Dongarra, J., Greenbaum, A., McKenney, A., Croz, J.D., Hammarling, S., Demmel, J., Bischof, C.H.: LAPACK: A portable linear algebra library for high-performance computers. In: Proceedings of Supercomputing 1990 (November 1990)
8. Carr, S., Kennedy, K.: Improving the ratio of memory operations to floating-point operations in loops. ACM Transactions on Programming Languages and Systems 16(6), 1768-1810 (1994)
10. Chen, C., Chame, J., Hall, M.: CHiLL: A framework for composing high-level loop transformations. Technical Report 08-897, University of Southern California (June 2008)
12. Cooper, K.D., Subramanian, D., Torczon, L.: Adaptive optimizing compilers for the 21st century. The Journal of Supercomputing 23(1), 7-22 (2002). DOI 10.1023/A:1015729001611
13. Donadio, S., Brodman, J., Roeder, T., Yotov, K., Barthou, D., Cohen, A., Garzarán, M.J., Padua, D., Pingali, K.: A language for the compact representation of multiple program versions. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds.) LCPC 2005. LNCS, vol. 4339, pp. 136-151. Springer, Heidelberg (2006)
14. Frigo, M., Johnson, S.G.: The design and implementation of FFTW3. Proceedings of the IEEE: Special Issue on Program Generation, Optimization, and Platform Adaptation 93(2), 216-231 (2005)
15. Girbal, S., Vasilache, N., Bastoul, C., Cohen, A., Parello, D., Sigler, M., Temam, O.: Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies. International Journal of Parallel Programming 34(3), 261-317 (2006)
17. Herrero, J.R., Navarro, J.J.: Improving performance of hypermatrix Cholesky factorization. In: Kosch, H., Böszörményi, L., Hellwagner, H. (eds.) Euro-Par 2003. LNCS, vol. 2790, pp. 461-469. Springer, Heidelberg (2003)
18. Jiménez, M., Llabería, J.M., Fernández, A.: Register tiling in nonrectangular iteration spaces. ACM Transactions on Programming Languages and Systems 24(4), 409-453 (2002)
19. Kaushik, D.K., Gropp, W., Minkoff, M., Smith, B.: Improving the performance of tensor matrix vector multiplication in cumulative reaction probability based quantum chemistry codes. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V.K. (eds.) HiPC 2008. LNCS, vol. 5374, pp. 120-130. Springer, Heidelberg (2008)
22. Knijnenburg, P.M.W., Kisuki, T., Gallivan, K., O'Boyle, M.F.P.: The effect of cache models on iterative compilation for combined tiling and unrolling. Concurrency and Computation: Practice and Experience 16(2-3), 247-270 (2004)
24. Lee, Y., Diniz, P., Hall, M., Lucas, R.: Empirical optimization for a sparse linear solver: A case study. International Journal of Parallel Programming 33 (2005)
28. McKinley, K.S., Carr, S., Tseng, C.-W.: Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems 18(4), 424-453 (1996)
29. Norris, B., Hartono, A., Jessup, E., Siek, J.: Generating empirically optimized composed matrix kernels from MATLAB prototypes. In: Allen, G., Nabrzyski, J., Seidel, E., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) Computational Science - ICCS 2009. LNCS, vol. 5544, pp. 248-258. Springer, Heidelberg (2009)
30. Pop, S., Cohen, A., Bastoul, C., Girbal, S., Silber, G.-A., Vasilache, N.: GRAPHITE: Polyhedral analyses and optimizations for GCC. In: Proceedings of the 4th GCC Developers' Summit (June 2006)
31. Pouchet, L.-N., Bastoul, C., Cohen, A., Cavazos, J.: Iterative optimization in the polyhedral model: Part I, one-dimensional time. In: Proceedings of the International Symposium on Code Generation and Optimization (March 2007)
32. Pouchet, L.-N., Bastoul, C., Cohen, A., Vasilache, N.: Iterative optimization in the polyhedral model: Part II, multi-dimensional time. In: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation (June 2008)
33. Pugh, B., Rosser, E.: Iteration space slicing for locality. In: Carter, L., Ferrante, J. (eds.) LCPC 1999. LNCS, vol. 1863, p. 164. Springer, Heidelberg (1999)
34. Qasem, A., Jin, G., Mellor-Crummey, J.: Improving performance with integrated program transformations. Technical Report TR03-419, Rice University (October 2003)
36. Ren, M., Park, J.Y., Houston, M., Aiken, A., Dally, W.J.: A tuning framework for software-managed memory hierarchies. In: Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (October 2008)
39. Shin, J., Hall, M.W., Chame, J., Chen, C., Hovland, P.D.: Autotuning and specialization: Speeding up matrix multiply for small matrices with compiler technology. In: The Fourth International Workshop on Automatic Performance Tuning (October 2009)
40. Temam, O., Granston, E.D., Jalby, W.: To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts. In: Proceedings of Supercomputing 1993 (November 1993)
41. Tiwari, A., Chen, C., Chame, J., Hall, M., Hollingsworth, J.K.: A scalable auto-tuning framework for compiler optimization. In: Proceedings of the 24th International Parallel and Distributed Processing Symposium (April 2009)
43. Whaley, R.C., Petitet, A., Dongarra, J.J.: Automated empirical optimizations of software and the ATLAS project. Parallel Computing 27(1-2), 3-35 (2001). DOI 10.1016/S0167-8191(00)00087-9
47. Wolfe, M.: Data dependence and program restructuring. The Journal of Supercomputing 4(4), 321-344 (1991)
49. Yi, Q., Seymour, K., You, H., Vuduc, R., Quinlan, D.: POET: Parameterized optimizations for empirical tuning. In: Proceedings of the 21st International Parallel and Distributed Processing Symposium (March 2007)