-
1
-
-
84891398703
-
-
http://nek5000.mcs.anl.gov
-
-
-
-
2
-
-
84891415274
-
-
http://rosecompiler.org
-
-
-
-
3
-
-
84891421266
-
-
http://www.mcs.anl.gov/~jaewook/tune.html
-
-
-
-
4
-
-
84891448238
-
-
http://www.netlib.org/blas
-
-
-
-
5
-
-
34547678265
-
Loop optimization using hierarchical compilation and kernel decomposition
-
San Jose, CA
-
Barthou D, Donadio S, Carribault P, Duchateau A, Jalby W (2007) Loop optimization using hierarchical compilation and kernel decomposition. In International symposium on code generation and optimization, San Jose, CA
-
(2007)
International Symposium on Code Generation and Optimization
-
-
Barthou, D.1
Donadio, S.2
Carribault, P.3
Duchateau, A.4
Jalby, W.5
-
6
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
-
Vienna, Austria
-
Bilmes J, Asanovic K, Chin C-W, Demmel J (1997) Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology. In International conference on supercomputing, Vienna, Austria, pp 340-347
-
(1997)
International Conference on Supercomputing
, pp. 340-347
-
-
Bilmes, J.1
Asanovic, K.2
Chin, C.-W.3
Demmel, J.4
-
10
-
-
0036733153
-
Value-sensitive automatic code specialization for embedded software
-
Chung E-Y, Benini L, De Micheli G, Luculli G, Carilli M (2002) Value-sensitive automatic code specialization for embedded software. IEEE Trans Comput Aided Des Integr Circuits Syst 21(9):1051-1067
-
(2002)
IEEE Trans Comput Aided Des Integr Circuits Syst
, vol.21
, Issue.9
, pp. 1051-1067
-
-
Chung, E.-Y.1
Benini, L.2
De Micheli, G.3
Luculli, G.4
Carilli, M.5
-
12
-
-
0039435412
-
FLAME: Formal linear algebra methods environment
-
Gunnels JA, Gustavson FG, Henry GM, Van De Geijn RA (2001) FLAME: formal linear algebra methods environment. ACM Trans Math Software 27(4):422-455
-
(2001)
ACM Trans Math Software
, vol.27
, Issue.4
, pp. 422-455
-
-
Gunnels, J.A.1
Gustavson, F.G.2
Henry, G.M.3
Van De Geijn, R.A.4
-
13
-
-
84870211068
-
Loop transformation recipes for code generation and auto-tuning
-
October 8-10, 2009, University of Delaware, Newark, Delaware
-
Hall M, Chame J, Chen C, Shin J, Rudy G, Murtaza Khan M (2009) Loop transformation recipes for code generation and auto-tuning. The 22nd international workshop on languages and compilers for parallel computing, October 8-10, 2009, University of Delaware, Newark, Delaware
-
(2009)
The 22nd International Workshop on Languages and Compilers for Parallel Computing
-
-
Hall, M.1
Chame, J.2
Chen, C.3
Shin, J.4
Rudy, G.5
Murtaza Khan, M.6
-
17
-
-
58449097645
-
Improving the performance of tensor matrix vector multiplication in cumulative reaction probability based quantum chemistry codes
-
Springer, Berlin
-
Kaushik DK, Gropp W, Minkoff M, Smith B (2008) Improving the performance of tensor matrix vector multiplication in cumulative reaction probability based quantum chemistry codes. In 15th international conference on high performance computing (HiPC 2008), vol. 5374 of Lecture Notes in Computer Science, Springer, Berlin
-
(2008)
15th International Conference on High Performance Computing (HiPC 2008), Vol. 5374 of Lecture Notes in Computer Science
-
-
Kaushik, D.K.1
Gropp, W.2
Minkoff, M.3
Smith, B.4
-
18
-
-
0037266298
-
Combined selection of tile sizes and unroll factors using iterative compilation
-
Knijnenburg PMW, Kisuki T, O'Boyle MFP (2003) Combined selection of tile sizes and unroll factors using iterative compilation. J Supercomput 24(1):43-67
-
(2003)
J Supercomput
, vol.24
, Issue.1
, pp. 43-67
-
-
Knijnenburg, P.M.W.1
Kisuki, T.2
O'Boyle, M.F.P.3
-
20
-
-
19344368072
-
SPIRAL: Code generation for DSP transforms
-
Püschel M, Moura JMF, Johnson J, Padua D, Veloso M, Singer B, Xiong J, Franchetti F, Gaçić A, Voronenko Y, Chen K, Johnson RW, Rizzolo N (2005) SPIRAL: code generation for DSP transforms. Proc IEEE 93(2):232-275
-
(2005)
Proc IEEE
, vol.93
, Issue.2
, pp. 232-275
-
-
Püschel, M.1
Moura, J.M.F.2
Johnson, J.3
Padua, D.4
Veloso, M.5
Singer, B.6
Xiong, J.7
Franchetti, F.8
Gaçić, A.9
Voronenko, Y.10
Chen, K.11
Johnson, R.W.12
Rizzolo, N.13
-
21
-
-
70449844310
-
A scalable autotuning framework for compiler optimization
-
Rome, Italy
-
Tiwari A, Chen C, Chame J, Hall M, Hollingsworth JK (2009) A scalable autotuning framework for compiler optimization. In IPDPS, Rome, Italy
-
(2009)
IPDPS
-
-
Tiwari, A.1
Chen, C.2
Chame, J.3
Hall, M.4
Hollingsworth, J.K.5
-
23
-
-
24344485098
-
Oski: A library of automatically tuned sparse matrix kernels
-
Vuduc R, Demmel JW, Yelick KA (2005) Oski: a library of automatically tuned sparse matrix kernels. J Phys Conf Ser 16(1):521-530
-
(2005)
J Phys Conf Ser
, vol.16
, Issue.1
, pp. 521-530
-
-
Vuduc, R.1
Demmel, J.W.2
Yelick, K.A.3
-
25
-
-
34548765138
-
POET: Parameterized optimizations for empirical tuning
-
March 2007
-
Yi Q, Seymour K, You H, Vuduc R, Quinlan D (2007) POET: parameterized optimizations for empirical tuning. In IPDPS, Long Beach, CA, March 2007
-
(2007)
IPDPS, Long Beach, CA
-
-
Yi, Q.1
Seymour, K.2
You, H.3
Vuduc, R.4
Quinlan, D.5
-
26
-
-
20744459570
-
Is search really necessary to generate high-performance BLAS?
-
Yotov K, Li X, Ren G, Garzarán MJ, Padua D, Pingali K, Stodghill P (2005) Is search really necessary to generate high-performance BLAS? Proc IEEE 93(2):358-386
-
(2005)
Proc IEEE
, vol.93
, Issue.2
, pp. 358-386
-
-
Yotov, K.1
Li, X.2
Ren, G.3
Garzarán, M.J.4
Padua, D.5
Pingali, K.6
Stodghill, P.7