-
1
-
-
70449756058
-
-
Lawrence Berkeley National Laboratory, Tech. Rep. LBNL-55291
-
H. Simon, L. Oliker, A. Canning, J. Carter, S. Ethier, and J. Shalf, "Evaluation of leading scalar and vector architectures for scientific computations," http://repositories.cdlib.org/lbnl/LBNL-55291, Lawrence Berkeley National Laboratory, Tech. Rep. LBNL-55291, 2004.
-
(2004)
Evaluation of leading scalar and vector architectures for scientific computations
-
-
Simon, H.1
Oliker, L.2
Canning, A.3
Carter, J.4
Ethier, S.5
Shalf, J.6
-
2
-
-
38049177237
-
Performance evaluation of scientific applications on modern parallel vector systems
-
VECPAR, J. Daydé, J. M. L. M. Palma, A. L. G. A. Coutinho, E. Pacitti, and J. C. Lopes, Eds, 4395. Springer
-
J. Carter, L. Oliker, and J. Shalf, "Performance evaluation of scientific applications on modern parallel vector systems," in VECPAR, ser. Lecture Notes in Computer Science, M. J. Daydé, J. M. L. M. Palma, A. L. G. A. Coutinho, E. Pacitti, and J. C. Lopes, Eds., vol. 4395. Springer, 2006, pp. 490-503.
-
(2006)
ser. Lecture Notes in Computer Science
, vol.4395
, pp. 490-503
-
-
Carter, J.1
Oliker, L.2
Shalf, J.3
-
3
-
-
0025402476
-
A set of level 3 basic linear algebra subprograms
-
J. J. Dongarra, J. D. Croz, I. S. Duff, and S. Hammarling, "A set of level 3 basic linear algebra subprograms," ACM Trans. Math. Soft., vol. 16, pp. 1-17, 1990.
-
(1990)
ACM Trans. Math. Soft
, vol.16
, pp. 1-17
-
-
Dongarra, J.J.1
Croz, J.D.2
Duff, I.S.3
Hammarling, S.4
-
6
-
-
84943297310
-
Automatically tuned linear algebra software
-
Washington, DC, USA: IEEE Computer Society
-
R. C. Whaley and J. J. Dongarra, "Automatically tuned linear algebra software," in Supercomputing '98: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing (CDROM). Washington, DC, USA: IEEE Computer Society, 1998, pp. 1-27.
-
(1998)
Supercomputing '98: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing (CDROM)
, pp. 1-27
-
-
Whaley, R.C.1
Dongarra, J.J.2
-
7
-
-
0343462141
-
Automated empirical optimization of software and the ATLAS project
-
R. C. Whaley, A. Petitet, and J. J. Dongarra, "Automated empirical optimization of software and the ATLAS project," Parallel Computing, vol. 27, no. 1-2, pp. 3-35, 2001.
-
(2001)
Parallel Computing
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.J.3
-
8
-
-
0003706460
-
-
Philadelphia, PA, USA: Society for Industrial and Applied Mathematics
-
E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. DuCroz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen, LAPACK's user's guide. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 1992.
-
(1992)
LAPACK's user's guide
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Demmel, J.4
Dongarra, J.5
DuCroz, J.6
Greenbaum, A.7
Hammarling, S.8
McKenney, A.9
Ostrouchov, S.10
Sorensen, D.11
-
9
-
-
24344485098
-
-
R. Vuduc, J. Demmel, and K. Yelick, OSKI: A library of automatically tuned sparse matrix kernels, in Proceedings of SciDAC 2005, ser. Journal of Physics: Conference Series, 16. Institute of Physics Publishing, June 2005, pp. 521-530.
-
R. Vuduc, J. Demmel, and K. Yelick, "OSKI: A library of automatically tuned sparse matrix kernels," in Proceedings of SciDAC 2005, ser. Journal of Physics: Conference Series, vol. 16. Institute of Physics Publishing, June 2005, pp. 521-530.
-
-
-
-
10
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
-
J. Bilmes, K. Asanovic, C.-W. Chin, and J. Demmel, "Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology," in International Conference on Supercomputing, 1997, pp. 340-347.
-
(1997)
International Conference on Supercomputing
, pp. 340-347
-
-
Bilmes, J.1
Asanovic, K.2
Chin, C.-W.3
Demmel, J.4
-
11
-
-
0031636309
-
FFTW: An adaptive software architecture for the FFT
-
M. Frigo, "FFTW: An adaptive software architecture for the FFT," in Proceedings of the ICASSP Conference, vol. 3, 1998, p. 1381.
-
(1998)
Proceedings of the ICASSP Conference
, vol.3
, pp. 1381
-
-
Frigo, M.1
-
12
-
-
19344368072
-
SPIRAL: Code generation for DSP transforms
-
Feb
-
M. Puschel, J. M. F. Moura, J. R. Johnson, D. Padua, M. M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo, "SPIRAL: Code generation for DSP transforms," in Proceedings of the IEEE, Special Issue on Program Generation, Optimization, and Platform Adaptation, vol. 93, Feb 2005, pp. 216-231.
-
(2005)
Proceedings of the IEEE, Special Issue on Program Generation, Optimization, and Platform Adaptation
, vol.93
, pp. 216-231
-
-
Puschel, M.1
Moura, J.M.F.2
Johnson, J.R.3
Padua, D.4
Veloso, M.M.5
Singer, B.W.6
Xiong, J.7
Franchetti, F.8
Gacic, A.9
Voronenko, Y.10
Chen, K.11
Johnson, R.W.12
Rizzolo, N.13
-
13
-
-
70449783253
-
-
K. Goto, "GotoBLAS," http://www.tacc.utexas.edu/resources/software/, 2007.
-
(2007)
GotoBLAS
-
-
Goto, K.1
-
14
-
-
0036679608
-
HPCVIEW: A tool for top-down analysis of node performance
-
Aug
-
J. Mellor-Crummey, R. Fowler, G. Marin, and N. Tallent, "HPCVIEW: A tool for top-down analysis of node performance," The Journal of Supercomputing, vol. 23, no. 1, pp. 81-104, Aug 2002.
-
(2002)
The Journal of Supercomputing
, vol.23
, Issue.1
, pp. 81-104
-
-
Mellor-Crummey, J.1
Fowler, R.2
Marin, G.3
Tallent, N.4
-
15
-
-
84947901718
-
A rational approach to portable high performance: The basic linear algebra instruction set (BLAIS) and the fixed algorithm size template (FAST) library
-
ECOOP Workshops, S. Demeyer and J. Bosch, Eds, Springer
-
J. G. Siek and A. Lumsdaine, "A rational approach to portable high performance: The basic linear algebra instruction set (BLAIS) and the fixed algorithm size template (FAST) library," in ECOOP Workshops, ser. Lecture Notes in Computer Science, S. Demeyer and J. Bosch, Eds., vol. 1543. Springer, 1998, pp. 468-469.
-
(1998)
ser. Lecture Notes in Computer Science
, vol.1543
, pp. 468-469
-
-
Siek, J.G.1
Lumsdaine, A.2
-
16
-
-
77952410868
-
POET: Parameterized optimizations for empirical tuning
-
IEEE Computer Society, March
-
Q. Yi, K. Seymour, H. You, R. Vuduc, and D. Quinlan, "POET: Parameterized optimizations for empirical tuning," in Workshop on Performance Optimization of High-Level Languages and Libraries (POHLL). IEEE Computer Society, March 2007, pp. 1-8.
-
(2007)
Workshop on Performance Optimization of High-Level Languages and Libraries (POHLL)
, pp. 1-8
-
-
Yi, Q.1
Seymour, K.2
You, H.3
Vuduc, R.4
Quinlan, D.5
-
17
-
-
43949129775
-
Language for the compact representation of multiple program versions
-
Proceedings of Languages and Compilers for Parallel Computing LCPC05, Germany: Springer-Verlag
-
S. Donadio, J. Brodman, T. Roeder, K. Yotov, D. Barthou, A. Cohen, M. J. Garzarán, D. Padua, and K. Pingali, "Language for the compact representation of multiple program versions," in Proceedings of Languages and Compilers for Parallel Computing (LCPC05), ser. Lecture Notes in Computer Science. Germany: Springer-Verlag, 2006, no. 4339, pp. 136-151.
-
(2006)
ser. Lecture Notes in Computer Science
, Issue.4339
, pp. 136-151
-
-
Donadio, S.1
Brodman, J.2
Roeder, T.3
Yotov, K.4
Barthou, D.5
Cohen, A.6
Garzarán, M.J.7
Padua, D.8
Pingali, K.9
-
18
-
-
20744452343
-
Broadway: A compiler for exploiting the domain-specific semantics of software libraries
-
July
-
C. Lin and S. Z. Guyer, "Broadway: A compiler for exploiting the domain-specific semantics of software libraries," Proceedings of the IEEE, vol. 93, no. 2, pp. 342-357, July 2005.
-
(2005)
Proceedings of the IEEE
, vol.93
, Issue.2
, pp. 342-357
-
-
Lin, C.1
Guyer, S.Z.2
-
19
-
-
68849088002
-
Telescoping languages project description
-
http://telescoping.rice.edu
-
K. Kennedy et al., "Telescoping languages project description," http://telescoping.rice.edu/, 2006.
-
(2006)
-
-
Kennedy, K.1
-
20
-
-
20744444866
-
Telescoping languages: A system for automatic generation of domain languages
-
K. Kennedy, B. Broom, A. Chauhan, R. Fowler, J. Garvin, C. Koelbel, C. McCosh, and J. Mellor-Crummey, "Telescoping languages: A system for automatic generation of domain languages," Proceedings of the IEEE, vol. 93, no. 3, pp. 387-408, 2005.
-
(2005)
Proceedings of the IEEE
, vol.93
, Issue.3
, pp. 387-408
-
-
Kennedy, K.1
Broom, B.2
Chauhan, A.3
Fowler, R.4
Garvin, J.5
Koelbel, C.6
McCosh, C.7
Mellor-Crummey, J.8
-
22
-
-
70449756060
-
-
T. Veldhuizen, Expression templates, C++ Report, 7, no. 5, pp. 26-31, June 1995.
-
T. Veldhuizen, "Expression templates," C++ Report, vol. 7, no. 5, pp. 26-31, June 1995.
-
-
-
-
26
-
-
70449935802
-
-
Orio project, trac.mcs.anl.gov/projects/performance/orio
-
"Orio project," trac.mcs.anl.gov/projects/performance/orio, 2008.
-
(2008)
-
-
-
27
-
-
0032251894
-
Convergence properties of the Nelder-Mead simplex method in low dimensions
-
J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright, "Convergence properties of the Nelder-Mead simplex method in low dimensions," SIAM Journal on Optimization, vol. 9, pp. 112-147, 1998.
-
(1998)
SIAM Journal on Optimization
, vol.9
, pp. 112-147
-
-
Lagarias, J.C.1
Reeds, J.A.2
Wright, M.H.3
Wright, P.E.4
-
28
-
-
0034545694
-
Direct search methods: Then and now
-
R. M. Lewis, Michael, and W. Trosset, "Direct search methods: Then and now," Journal of Computational and Applied Mathematics, vol. 124, pp. 200-0, 2000.
-
(2000)
Journal of Computational and Applied Mathematics
, vol.124
, pp. 200-200
-
-
Lewis, R.M.1
Michael2
Trosset, W.3
-
29
-
-
26444479778
-
Optimization by simulated annealing
-
S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.
-
(1983)
Science
, vol.220
, pp. 671-680
-
-
Kirkpatrick, S.1
Gelatt, C.D.2
Vecchi, M.P.3
-
30
-
-
0348126362
-
Optimized unrolling of nested loops
-
V. Sarkar, "Optimized unrolling of nested loops," Int. J. Parallel Program., vol. 29, no. 5, pp. 545-581, 2001.
-
(2001)
Int. J. Parallel Program
, vol.29
, Issue.5
, pp. 545-581
-
-
Sarkar, V.1
-
31
-
-
74049164978
-
A practical automatic polyhedral program optimization system
-
Jun
-
U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, "A practical automatic polyhedral program optimization system," in ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Jun. 2008.
-
(2008)
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
-
-
Bondhugula, U.1
Hartono, A.2
Ramanujam, J.3
Sadayappan, P.4
-
32
-
-
84976736522
-
gprof: A call graph execution profiler
-
Jun
-
S. L. Graham, P. B. Kessler, and M. K. McKusick, "gprof: A call graph execution profiler," SIGPLAN Notices, vol. 17, no. 6, p. 120, Jun. 1982.
-
(1982)
SIGPLAN Notices
, vol.17
, Issue.6
, pp. 120
-
-
Graham, S.L.1
Kessler, P.B.2
McKusick, M.K.3
-
33
-
-
70449759963
-
-
S. Balay, K. Buschelman, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, B. F. Smith, and H. Zhang, PETSc Users Manual, Argonne National Laboratory, Tech. Rep. ANL-95/11 - Revision 2.1.5, 2004.
-
S. Balay, K. Buschelman, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, B. F. Smith, and H. Zhang, "PETSc Users Manual," Argonne National Laboratory, Tech. Rep. ANL-95/11 - Revision 2.1.5, 2004.
-
-
-
-
34
-
-
10044233808
-
Automatic performance tuning of sparse matrix kernels
-
Ph.D. dissertation, University of California, Berkeley, December
-
R. W. Vuduc, "Automatic performance tuning of sparse matrix kernels," Ph.D. dissertation, University of California, Berkeley, December 2003.
-
(2003)
-
-
Vuduc, R.W.1
-
35
-
-
0347017866
-
Pseudo-transient continuation and differential-algebraic equations
-
T. S. Coffey, C. T. Kelley, and D. E. Keyes, "Pseudo-transient continuation and differential-algebraic equations," SIAM J. Sci. Comput., vol. 25, no. 2, pp. 553-569, 2003.
-
(2003)
SIAM J. Sci. Comput
, vol.25
, Issue.2
, pp. 553-569
-
-
Coffey, T.S.1
Kelley, C.T.2
Keyes, D.E.3
-
36
-
-
70449871737
-
-
The Pluto automatic parallelizer, sourceforge.net/projects/pluto-compiler.
-
"The Pluto automatic parallelizer," sourceforge.net/projects/pluto-compiler.
-
-
-