-
2
-
-
67650786281
-
PetaBricks: A language and compiler for algorithmic choice
-
J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao, A. Edelman, and S. Amarasinghe. PetaBricks: A language and compiler for algorithmic choice. In PLDI '09: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, 2009.
-
(2009)
PLDI '09: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation
-
-
Ansel, J.1
Chan, C.2
Wong, Y.L.3
Olszewski, M.4
Zhao, Q.5
Edelman, A.6
Amarasinghe, S.7
-
3
-
-
23044532053
-
-
S. Bhowmick, P. Raghavan, and K. Teranishi. A combinatorial scheme for developing efficient composite solvers. In ICCS '02: Proceedings of the International Conference on Computational Science-Part II, pages 325-334, London, UK, 2002. Springer-Verlag.
-
-
-
-
4
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
-
J. Bilmes, K. Asanovic, C.-W. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology. In ICS '97: Proceedings of the 11th International Conference on Supercomputing, pages 340-347, 1997.
-
(1997)
ICS '97: Proceedings of the 11th international conference on Supercomputing
, pp. 340-347
-
-
Bilmes, J.1
Asanovic, K.2
Chin, C.-W.3
Demmel, J.4
-
5
-
-
70350771127
-
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
-
Piscataway, NJ, USA, IEEE Press
-
K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter, L. Oliker, D. Patterson, J. Shalf, and K. Yelick. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, pages 1-12, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing
, pp. 1-12
-
-
Datta, K.1
Murphy, M.2
Volkov, V.3
Williams, S.4
Carter, J.5
Oliker, L.6
Patterson, D.7
Shalf, J.8
Yelick, K.9
-
6
-
-
0003252789
-
Applied Numerical Linear Algebra
-
August
-
J. W. Demmel. Applied Numerical Linear Algebra. SIAM, August 1997.
-
(1997)
SIAM
-
-
Demmel, J.W.1
-
8
-
-
20744449792
-
-
M. Frigo and S. G. Johnson. The design and implementation of FFTW3. Proceedings of the IEEE, 93(2):216-231, February 2005. Invited paper, special issue on "Program Generation, Optimization, and Platform Adaptation".
-
-
-
-
9
-
-
0031622953
-
The implementation of the Cilk-5 multithreaded language
-
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Montreal, Quebec, Canada, June 1998
-
M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 212-223, Montreal, Quebec, Canada, June 1998. Proceedings published in ACM SIGPLAN Notices, Vol. 33, No. 5, May 1998.
-
(1998)
Proceedings published ACM SIGPLAN Notices
, vol.33
, Issue.5
, pp. 212-223
-
-
Frigo, M.1
Leiserson, C.E.2
Randall, K.H.3
-
10
-
-
84949647432
-
-
E. Im and K. Yelick. Optimizing sparse matrix computations for register reuse in SPARSITY. In Proceedings of the International Conference on Computational Science, volume 2073 of LNCS, pages 127-136. Springer, 2001.
-
-
-
-
12
-
-
12344325118
-
Install-time system for automatic generation of optimized parallel sorting algorithms
-
M. Olszewski and M. Voss. Install-time system for automatic generation of optimized parallel sorting algorithms. In PDPTA, pages 17-23, 2004.
-
(2004)
PDPTA
, pp. 17-23
-
-
Olszewski, M.1
Voss, M.2
-
13
-
-
74049105014
-
SPIRAL: Code generation for DSP transforms
-
M. Puschel, J. M. F. Moura, J. R. Johnson, D. Padua, M. M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. In Proceedings of the IEEE.
-
Proceedings of the IEEE
-
-
Puschel, M.1
Moura, J.M.F.2
Johnson, J.R.3
Padua, D.4
Veloso, M.M.5
Singer, B.W.6
Xiong, J.7
Franchetti, F.8
Gacic, A.9
Voronenko, Y.10
Chen, K.11
Johnson, R.W.12
Rizzolo, N.13
-
14
-
-
78649765479
-
Tiling optimizations for 3D scientific computations
-
CDROM, Washington, DC, USA, IEEE Computer Society
-
G. Rivera and C.-W. Tseng. Tiling optimizations for 3D scientific computations. In Supercomputing '00: Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), page 32, Washington, DC, USA, 2000. IEEE Computer Society.
-
(2000)
Supercomputing '00: Proceedings of the 2000 ACM/IEEE conference on Supercomputing
, pp. 32
-
-
Rivera, G.1
Tseng, C.-W.2
-
16
-
-
24344485098
-
-
R. Vuduc, J. W. Demmel, and K. A. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. In Proceedings of SciDAC 2005, Journal of Physics: Conference Series, San Francisco, CA, USA, June 2005. Institute of Physics Publishing.
-
-
-
-
17
-
-
84943297310
-
Automatically tuned linear algebra software
-
Washington, DC, USA, IEEE Computer Society
-
R. C. Whaley and J. J. Dongarra. Automatically tuned linear algebra software. In Supercomputing '98: Proceedings of the 1998 ACM/IEEE conference on Supercomputing (CDROM), pages 1-27, Washington, DC, USA, 1998. IEEE Computer Society.
-
(1998)
Supercomputing '98: Proceedings of the 1998 ACM/IEEE conference on Supercomputing (CDROM)
, pp. 1-27
-
-
Whaley, R.C.1
Dongarra, J.J.2
-
18
-
-
13244279577
-
Minimizing development and maintenance costs in supporting persistently optimized BLAS
-
February
-
R. C. Whaley and A. Petitet. Minimizing development and maintenance costs in supporting persistently optimized BLAS. Software: Practice and Experience, 35(2):101-121, February 2005.
-
(2005)
Software: Practice and Experience
, vol.35
, Issue.2
, pp. 101-121
-
-
Whaley, R.C.1
Petitet, A.2