-
1
-
-
0343462141
-
Automated empirical optimizations of software and the ATLAS project
-
R. C. Whaley, A. Petitet, and J. Dongarra, "Automated empirical optimizations of software and the ATLAS project," Parallel Computing, vol. 27, no. 1-2, pp. 3-35, 2001.
-
(2001)
Parallel Computing
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.3
-
2
-
-
20744449792
-
The design and implementation of FFTW3
-
M. Frigo and S. G. Johnson, "The design and implementation of FFTW3," Proc. of the IEEE, vol. 93, no. 2, pp. 216-231, 2005.
-
(2005)
Proc. of the IEEE
, vol.93
, Issue.2
, pp. 216-231
-
-
Frigo, M.1
Johnson, S.G.2
-
3
-
-
19344368072
-
SPIRAL: Code generation for DSP transforms
-
M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo, "SPIRAL: Code generation for DSP transforms," Proc. of the IEEE, vol. 93, no. 2, pp. 232-275, 2005.
-
(2005)
Proc. of the IEEE
, vol.93
, Issue.2
, pp. 232-275
-
-
Püschel, M.1
Moura, J.M.F.2
Johnson, J.3
Padua, D.4
Veloso, M.5
Singer, B.6
Xiong, J.7
Franchetti, F.8
Gacic, A.9
Voronenko, Y.10
Chen, K.11
Johnson, R.W.12
Rizzolo, N.13
-
4
-
-
84857557751
-
Optimizing program locality through CMEs and GAs
-
X. Vera, J. Abella, A. Gonzalez, and J. Llosa, "Optimizing program locality through CMEs and GAs," in Proc. Parallel Architectures and Compilation Techniques (PACT), 2003, pp. 68-78.
-
(2003)
Proc. Parallel Architectures and Compilation Techniques (PACT)
, pp. 68-78
-
-
Vera, X.1
Abella, J.2
Gonzalez, A.3
Llosa, J.4
-
5
-
-
70449690852
-
Optimal tile size selection guided by analytical models
-
B. B. Fraguela, M. G. Carmueja, and D. Andrade, "Optimal tile size selection guided by analytical models," in Proc. Parallel Computing (ParCo), 2005, pp. 565-572.
-
(2005)
Proc. Parallel Computing (ParCo)
, pp. 565-572
-
-
Fraguela, B.B.1
Carmueja, M.G.2
Andrade, D.3
-
6
-
-
20744459570
-
Is search really necessary to generate high performance BLAS?
-
K. Yotov, X. Li, G. Ren, M. Garzaran, D. Padua, K. Pingali, and P. Stodghill, "Is search really necessary to generate high performance BLAS?" Proc. of the IEEE, vol. 93, no. 2, 2005.
-
(2005)
Proc. of the IEEE
, vol.93
, Issue.2
-
-
Yotov, K.1
Li, X.2
Ren, G.3
Garzaran, M.4
Padua, D.5
Pingali, K.6
Stodghill, P.7
-
7
-
-
0141496142
-
Learning to construct fast signal processing implementations
-
B. Singer and M. Veloso, "Learning to construct fast signal processing implementations," J. Machine Learning Research, special issue on ICML, vol. 3, pp. 887-919, 2003.
-
(2003)
J. Machine Learning Research, Special Issue on ICML
, vol.3
, pp. 887-919
-
-
Singer, B.1
Veloso, M.2
-
8
-
-
20744445511
-
Scanning the issue: Special issue on program generation, optimization, and platform adaptation
-
J. M. F. Moura, M. Püschel, D. Padua, and J. Dongarra, "Scanning the issue: Special issue on program generation, optimization, and platform adaptation," Proc. of the IEEE, vol. 93, no. 2, pp. 211-215, 2005.
-
(2005)
Proc. of the IEEE
, vol.93
, Issue.2
, pp. 211-215
-
-
Moura, J.M.F.1
Püschel, M.2
Padua, D.3
Dongarra, J.4
-
9
-
-
0001714824
-
Cache miss equations: A compiler framework for analyzing and tuning memory behavior
-
July
-
S. Ghosh, M. Martonosi, and S. Malik, "Cache miss equations: A compiler framework for analyzing and tuning memory behavior," ACM Trans. Programming Languages and Systems, vol. 21, no. 4, pp. 702-745, July 1999.
-
(1999)
ACM Trans. Programming Languages and Systems
, vol.21
, Issue.4
, pp. 702-745
-
-
Ghosh, S.1
Martonosi, M.2
Malik, S.3
-
10
-
-
0034832018
-
Exact analysis of the cache behavior of nested loops
-
S. Chatterjee, E. Parker, P. J. Hanlon, and A. R. Lebeck, "Exact analysis of the cache behavior of nested loops," in Proc. Programming Language Design and Implementation (PLDI), 2001, pp. 286-297.
-
(2001)
Proc. Programming Language Design and Implementation (PLDI)
, pp. 286-297
-
-
Chatterjee, S.1
Parker, E.2
Hanlon, P.J.3
Lebeck, A.R.4
-
11
-
-
84948974925
-
Compile-time based performance prediction
-
C. Cascaval, L. D. Rose, D. A. Padua, and D. A. Reed, "Compile-time based performance prediction," in Proc. Languages and Compilers for Parallel Computing (LCPC), 1999, pp. 365-379.
-
(1999)
Proc. Languages and Compilers for Parallel Computing (LCPC)
, pp. 365-379
-
-
Cascaval, C.1
Rose, L.D.2
Padua, D.A.3
Reed, D.A.4
-
12
-
-
0037340135
-
Probabilistic miss equations: Evaluating memory hierarchy performance
-
B. B. Fraguela, R. Doallo, and E. L. Zapata, "Probabilistic miss equations: Evaluating memory hierarchy performance," IEEE Trans. Computers, vol. 52, no. 3, pp. 321-336, 2003.
-
(2003)
IEEE Trans. Computers
, vol.52
, Issue.3
, pp. 321-336
-
-
Fraguela, B.B.1
Doallo, R.2
Zapata, E.L.3
-
13
-
-
31844432305
-
Formal loop merging for signal transforms
-
F. Franchetti, Y. Voronenko, and M. Püschel, "Formal loop merging for signal transforms," in Proc. Programming Language Design and Implementation (PLDI), 2005, pp. 315-326.
-
(2005)
Proc. Programming Language Design and Implementation (PLDI)
, pp. 315-326
-
-
Franchetti, F.1
Voronenko, Y.2
Püschel, M.3
-
14
-
-
38049144052
-
A rewriting system for the vectorization of signal transforms
-
F. Franchetti, Y. Voronenko, and M. Püschel, "A rewriting system for the vectorization of signal transforms," in Proc. High Performance Computing for Computational Science (VECPAR), 2006, pp. 363-377.
-
(2006)
Proc. High Performance Computing for Computational Science (VECPAR)
, pp. 363-377
-
-
Franchetti, F.1
Voronenko, Y.2
Püschel, M.3
-
15
-
-
67650568215
-
Computer generation of general size linear transform libraries
-
Y. Voronenko, F. de Mesmay, and M. Püschel, "Computer generation of general size linear transform libraries," in Proc. Code Generation and Optimization (CGO), 2009, pp. 102-113.
-
(2009)
Proc. Code Generation and Optimization (CGO)
, pp. 102-113
-
-
Voronenko, Y.1
De Mesmay, F.2
Püschel, M.3
-
16
-
-
0026137116
-
The cache performance and optimizations of blocked algorithms
-
M. S. Lam, E. E. Rothberg, and M. E. Wolf, "The cache performance and optimizations of blocked algorithms," in Proc. Architectural Support for Programming Languages and Operating Systems (ASPLOS), 1991, pp. 63-74.
-
(1991)
Proc. Architectural Support for Programming Languages and Operating Systems (ASPLOS)
, pp. 63-74
-
-
Lam, M.S.1
Rothberg, E.E.2
Wolf, M.E.3
-
17
-
-
0028429842
-
Cache interference phenomena
-
O. Temam, C. Fricker, and W. Jalby, "Cache interference phenomena," in Proc. Measurement and Modeling of Computer Systems (SIGMETRICS), 1994, pp. 261-271.
-
(1994)
Proc. Measurement and Modeling of Computer Systems (SIGMETRICS)
, pp. 261-271
-
-
Temam, O.1
Fricker, C.2
Jalby, W.3