-
[1] E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide. Society for Industrial and Applied Mathematics, Philadelphia, PA, 3rd edition, 1999.
-
[2] M. Baskaran, A. Hartono, T. Henretty, J. Ramanujam, and P. Sadayappan. Parameterized Tiling Revisited. In IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2010.
-
[3] J. Bilmes, K. Asanovic, C.-W. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology. In International Conference on Supercomputing, pages 340-347, 1997.
-
[4] L. S. Blackford, J. Choi, A. Cleary, A. Petitet, R. C. Whaley, J. Demmel, I. Dhillon, K. Stanley, J. Dongarra, S. Hammarling, G. Henry, and D. Walker. ScaLAPACK: A portable linear algebra library for distributed memory computers - design issues and performance. In Supercomputing '96: Proceedings of the 1996 ACM/IEEE Conference on Supercomputing (CDROM), page 5, Washington, DC, USA, 1996. IEEE Computer Society.
-
[6] G. I. Fann, G. Beylkin, R. J. Harrison, and K. E. Jordan. Singular operators in multiwavelet bases. IBM Journal of Research and Development, 48(2):161-172, 2004.
-
[7] M. Frigo. FFTW: An adaptive software architecture for the FFT. In Proceedings of the ICASSP Conference, volume 3, page 1381, 1998.
-
[8] K. Goto and R. A. van de Geijn. Anatomy of high-performance matrix multiplication. ACM Trans. Math. Softw., 34(3):1-25, 2008.
-
[9] K. Goto and R. van de Geijn. High-performance implementation of the level-3 BLAS. ACM Trans. Math. Softw., 35(1):1-14, 2008.
-
[10] R. J. Harrison, G. I. Fann, T. Yanai, and G. Beylkin. Multiresolution quantum chemistry in multiwavelet bases. In International Conference on Computational Science, pages 103-110, 2003.
-
[11] R. J. Harrison, G. I. Fann, T. Yanai, Z. Gan, and G. Beylkin. Multiresolution quantum chemistry: Basic theory and initial applications. Journal of Chemical Physics, 121(23):11587-11598, 2004. DOI 10.1063/1.1791051.
-
[12] A. Hartono, M. M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, J. Ramanujam, and P. Sadayappan. Parametric Multi-Level Tiling of Imperfectly Nested Loops. In ACM International Conference on Supercomputing, 2009.
-
[13] M. Hohenauer, F. Engel, R. Leupers, G. Ascheid, and H. Meyr. A SIMD optimization framework for retargetable compilers. ACM Trans. Archit. Code Optim., 6(1):1-27, 2009.
-
[16] D. Nuzman and A. Zaks. Outer-loop vectorization: Revisited for short SIMD architectures. In PACT '08: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, pages 2-11, New York, NY, USA, 2008. ACM.
-
[18] M. Püschel, J. M. F. Moura, J. R. Johnson, D. Padua, M. M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. In Proceedings of the IEEE, Special Issue on Program Generation, Optimization, and Platform Adaptation, volume 93, pages 216-231, Feb 2005.
-
[19] L. Renganarayanan, D. Kim, S. Rajopadhye, and M. M. Strout. Parameterized tiled loops for free. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 405-414, New York, NY, USA, 2007. ACM.
-
[20] J. Shin, M. W. Hall, J. Chame, C. Chen, P. F. Fischer, and P. D. Hovland. Speeding up Nek5000 with autotuning and specialization. In ICS '10: Proceedings of the 24th ACM International Conference on Supercomputing, pages 253-262, New York, NY, USA, 2010. ACM.
-
[21] K. Trifunovic, D. Nuzman, A. Cohen, A. Zaks, and I. Rosen. Polyhedral-model guided loop-nest auto-vectorization. In PACT '09: Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques, pages 327-337, Washington, DC, USA, 2009. IEEE Computer Society.
-
[22] R. Vuduc, J. Demmel, and K. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. In Proceedings of SciDAC 2005, volume 16 of Journal of Physics: Conference Series, pages 521-530. Institute of Physics Publishing, June 2005.
-
[23] R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimization of software and the ATLAS project. Parallel Computing, 27(1-2):3-35, 2001. Also available as University of Tennessee LAPACK Working Note #147, UTCS-00-448, 2000. www.netlib.org/lapack/lawns/lawn147.ps.
-
[24] T. Yanai, G. I. Fann, Z. Gan, R. J. Harrison, and G. Beylkin. Multiresolution quantum chemistry in multiwavelet bases: Analytic derivatives for Hartree-Fock and density functional theory. Journal of Chemical Physics, 121(7):2866-2876, 2004.