-
1
-
-
0344908850
-
Automatic intra-register vectorization for the Intel architecture
-
Apr
-
A. Bik, M. Girkar, P. Grey, and X. Tian. Automatic intra-register vectorization for the Intel architecture. International Journal of Parallel Programming, 30(2):65-98, Apr. 2002.
-
(2002)
International Journal of Parallel Programming
, vol.30
, Issue.2
, pp. 65-98
-
-
Bik, A.1
Girkar, M.2
Grey, P.3
Tian, X.4
-
2
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
-
June
-
J. Bilmes, K. Asanović, C.-W. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology. In Proceedings of the 1997 ACM International Conference on Supercomputing, June 1997.
-
(1997)
Proceedings of the 1997 ACM International Conference on Supercomputing
-
-
Bilmes, J.1
Asanović, K.2
Chin, C.-W.3
Demmel, J.4
-
3
-
-
0028549474
-
Improving the ratio of memory operations to floating-point operations in loops
-
Nov
-
S. Carr and K. Kennedy. Improving the ratio of memory operations to floating-point operations in loops. ACM Transactions on Programming Languages and Systems, 16(6):1768-1810, Nov. 1994.
-
(1994)
ACM Transactions on Programming Languages and Systems
, vol.16
, Issue.6
, pp. 1768-1810
-
-
Carr, S.1
Kennedy, K.2
-
7
-
-
17644409855
-
An optimizer for multimedia instruction sets
-
Stanford University, USA, Aug
-
G. Cheong and M. Lam. An optimizer for multimedia instruction sets. In The Second SUIF Compiler Workshop, Stanford University, USA, Aug. 1997.
-
(1997)
The Second SUIF Compiler Workshop
-
-
Cheong, G.1
Lam, M.2
-
8
-
-
17144430151
-
Optimizing for reduced code space using genetic algorithms
-
May
-
K. D. Cooper, P. J. Schielke, and D. Subramanian. Optimizing for reduced code space using genetic algorithms. In Proceedings of ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES'99), May 1999.
-
(1999)
Proceedings of ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES'99)
-
-
Cooper, K.D.1
Schielke, P.J.2
Subramanian, D.3
-
12
-
-
34548762396
-
High-performance implementation of the level-3 BLAS
-
Technical Report TR-06-23, Department of Computer Science, University of Texas at Austin
-
K. Goto and R. van de Geijn. High-performance implementation of the level-3 BLAS. Technical Report TR-06-23, Department of Computer Science, University of Texas at Austin, 2006.
-
(2006)
-
-
Goto, K.1
van de Geijn, R.2
-
13
-
-
0442295621
-
The effect of cache models on iterative compilation for combined tiling and unrolling
-
Mar
-
P. M. W. Knijnenburg, T. Kisuki, K. Gallivan, and M. F. P. O'Boyle. The effect of cache models on iterative compilation for combined tiling and unrolling. Concurrency and Computation: Practice and Experience, 16(2-3):247-270, Mar. 2004.
-
(2004)
Concurrency and Computation: Practice and Experience
, vol.16
, Issue.2-3
, pp. 247-270
-
-
Knijnenburg, P.M.W.1
Kisuki, T.2
Gallivan, K.3
O'Boyle, M.F.P.4
-
21
-
-
34548789419
-
Better tiling and array contraction for compiling scientific programs
-
Nov
-
G. Pike and P. N. Hilfinger. Better tiling and array contraction for compiling scientific programs. In Proceedings of Supercomputing '02, Nov. 2002.
-
(2002)
Proceedings of Supercomputing '02
-
-
Pike, G.1
Hilfinger, P.N.2
-
25
-
-
34548772288
-
-
Compiler Optimizations for Architectures Supporting Superword-level Parallelism
-
J. Shin. Compiler Optimizations for Architectures Supporting Superword-level Parallelism. PhD thesis, Dept. of Computer Science, USC, Aug. 2005.
-
-
-
-
28
-
-
0141696394
-
Stochastic search for signal processing algorithm optimization
-
Nov
-
B. Singer and M. Veloso. Stochastic search for signal processing algorithm optimization. In Proceedings of Supercomputing '01, Nov. 2001.
-
(2001)
Proceedings of Supercomputing '01
-
-
Singer, B.1
Veloso, M.2
-
31
-
-
0027764718
-
To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts
-
Nov
-
O. Temam, E. D. Granston, and W. Jalby. To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts. In Proceedings of Supercomputing '93, Nov. 1993.
-
(1993)
Proceedings of Supercomputing '93
-
-
Temam, O.1
Granston, E.D.2
Jalby, W.3
-
33
-
-
0343462141
-
Automated empirical optimization of software and the ATLAS project
-
Jan
-
R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimization of software and the ATLAS project. Parallel Computing, 27(1-2):3-35, Jan. 2001.
-
(2001)
Parallel Computing
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.J.3