-
1
-
-
60649099576
-
Optimizing matrix multiplication for a short-vector simd architecture-cell processor
-
J. Kurzak, W. Alvaro, and J. Dongarra. Optimizing matrix multiplication for a short-vector simd architecture-cell processor. Parallel Comput., 35(3):138-150, 2009.
-
(2009)
Parallel Comput.
, vol.35
, Issue.3
, pp. 138-150
-
-
Kurzak, J.1
Alvaro, W.2
Dongarra, J.3
-
2
-
-
56749158843
-
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
-
S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In Proc. 2007 ACM/IEEE Conference on Supercomputing, 2007.
-
Proc. 2007 ACM/IEEE Conference on Supercomputing, 2007
-
-
Williams, S.1
Oliker, L.2
Vuduc, R.3
Shalf, J.4
Yelick, K.5
Demmel, J.6
-
3
-
-
74049143158
-
Implementing sparse matrix-vector multiplication on throughput-oriented processors
-
New York, NY, USA
-
N. Bell and M. Garland. Implementing sparse matrix-vector multiplication on throughput-oriented processors. In SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pages 1-11, New York, NY, USA, 2009.
-
(2009)
SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
, pp. 1-11
-
-
Bell, N.1
Garland, M.2
-
6
-
-
20744452904
-
Self-adapting linear algebra algorithms and software
-
J. Demmel, J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. C. Whaley, R. Vuduc, and K. Yelick. Self-adapting linear algebra algorithms and software. Proceedings of the IEEE, 93(2):293-312, 2005.
-
(2005)
Proceedings of the IEEE
, vol.93
, Issue.2
, pp. 293-312
-
-
Demmel, J.1
Dongarra, J.2
Eijkhout, V.3
Fuentes, E.4
Petitet, A.5
Whaley, R.C.6
Vuduc, R.7
Yelick, K.8
-
7
-
-
1542501019
-
Sparsity: Optimization framework for sparse matrix kernels
-
Eun-Jin Im, Katherine Yelick, and Richard Vuduc. Sparsity: Optimization framework for sparse matrix kernels. Int. J. High Perform. Comput. Appl., 18(1):135-158, 2004.
-
(2004)
Int. J. High Perform. Comput. Appl.
, vol.18
, Issue.1
, pp. 135-158
-
-
Im, E.-J.1
Yelick, K.2
Vuduc, R.3
-
8
-
-
0242533311
-
Sparse matrix solvers on the gpu: Conjugate gradients and multigrid
-
J. Bolz, I. Farmer, E. Grinspun, and P. Schröder. Sparse matrix solvers on the gpu: Conjugate gradients and multigrid. ACM Trans. Graph., 22(3):917-924, 2003.
-
(2003)
ACM Trans. Graph.
, vol.22
, Issue.3
, pp. 917-924
-
-
Bolz, J.1
Farmer, I.2
Grinspun, E.3
Schröder, P.4
-
9
-
-
77957679421
-
Model-driven autotuning of sparse matrix-vector multiply on gpus
-
New York, NY, USA
-
J. W. Choi, A. Singh, and R. W. Vuduc. Model-driven autotuning of sparse matrix-vector multiply on gpus. In PPoPP '10: Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 115-126, New York, NY, USA, 2010.
-
(2010)
PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 115-126
-
-
Choi, J.W.1
Singh, A.2
Vuduc, R.W.3
-
11
-
-
74049114159
-
Auto-tuning 3-d fft library for cuda gpus
-
New York, NY, USA
-
A. Nukada and S. Matsuoka. Auto-tuning 3-d fft library for cuda gpus. In SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pages 1-10, New York, NY, USA, 2009.
-
(2009)
SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
, pp. 1-10
-
-
Nukada, A.1
Matsuoka, S.2