-
1
-
-
85031252197
-
Achieving high sustained performance in an unstructured mesh CFD application
-
New York, NY, USA, ACM
-
W. K. Anderson, W. D. Gropp, D. K. Kaushik, D. E. Keyes, and B. F. Smith. Achieving high sustained performance in an unstructured mesh CFD application. In SC '99: Proceedings of the 1999 ACM/IEEE conference on Supercomputing, page 69, New York, NY, USA, 1999. ACM.
-
(1999)
SC '99: Proceedings of the 1999 ACM/IEEE conference on Supercomputing
, pp. 69
-
-
Anderson, W.K.1
Gropp, W.D.2
Kaushik, D.K.3
Keyes, D.E.4
Smith, B.F.5
-
2
-
-
0003473816
-
-
SIAM, Philadelphia
-
R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. M. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia, 1994.
-
(1994)
Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods
-
-
Barrett, R.1
Berry, M.2
Chan, T.F.3
Demmel, J.4
Donato, J.M.5
Dongarra, J.6
Eijkhout, V.7
Pozo, R.8
Romine, C.9
van der Vorst, H.10
-
3
-
-
34547626759
-
High throughput compression of double-precision floating-point data
-
Washington, DC, USA, IEEE Computer Society
-
M. Burtscher and P. Ratanaworabhan. High throughput compression of double-precision floating-point data. In DCC '07: Proceedings of the 2007 Data Compression Conference, pages 293-302, Washington, DC, USA, 2007. IEEE Computer Society.
-
(2007)
DCC '07: Proceedings of the 2007 Data Compression Conference
, pp. 293-302
-
-
Burtscher, M.1
Ratanaworabhan, P.2
-
4
-
-
0003197949
-
University of Florida sparse matrix collection
-
T. Davis. University of Florida sparse matrix collection. NA Digest, 97(23):7, 1997.
-
(1997)
NA Digest
, vol.97
, Issue.23
, pp. 7
-
-
Davis, T.1
-
5
-
-
47349103843
-
Understanding the performance of sparse matrix-vector multiplication
-
G. Goumas, K. Kourtis, N. Anastopoulos, V. Karakasis, and N. Koziris. Understanding the performance of sparse matrix-vector multiplication. In PDP '08: Proceedings of the 16th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2008.
-
(2008)
PDP '08: Proceedings of the 16th Euromicro International Conference on Parallel, Distributed and Network-based Processing
-
-
Goumas, G.1
Kourtis, K.2
Anastopoulos, N.3
Karakasis, V.4
Koziris, N.5
-
8
-
-
84949647432
-
Optimizing sparse matrix computations for register reuse in SPARSITY
-
E. Im and K. Yelick. Optimizing sparse matrix computations for register reuse in SPARSITY. Lecture Notes in Computer Science, 2073:127-136, 2001.
-
(2001)
Lecture Notes in Computer Science
, vol.2073
, pp. 127-136
-
-
Im, E.1
Yelick, K.2
-
10
-
-
34548206782
-
Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)
-
New York, NY, USA, ACM
-
J. Langou, J. Langou, P. Luszczek, J. Kurzak, A. Buttari, and J. Dongarra. Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems). In SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing, page 113, New York, NY, USA, 2006. ACM.
-
(2006)
SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing
, pp. 113
-
-
Langou, J.1
Langou, J.2
Luszczek, P.3
Kurzak, J.4
Buttari, A.5
Dongarra, J.6
-
11
-
-
10044248780
-
Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply
-
15-18 Aug
-
B. Lee, R. Vuduc, J. Demmel, and K. Yelick. Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply. In ICPP '04: Proceedings of the International Conference on Parallel Processing, pages 169-176 vol. 1, 15-18 Aug. 2004.
-
(2004)
ICPP '04: Proceedings of the International Conference on Parallel Processing
, vol.1
, pp. 169-176
-
-
Lee, B.1
Vuduc, R.2
Demmel, J.3
Yelick, K.4
-
13
-
-
3042618790
-
-
Improving the locality of the sparse matrix-vector product on shared memory multiprocessors
-
J. C. Pichel, D. B. Heras, J. C. Cabaleiro, and F. F. Rivera. Improving the locality of the sparse matrix-vector product on shared memory multiprocessors. In PDP, pages 66-71. IEEE Computer Society, 2004.
-
-
-
-
14
-
-
25644439819
-
Performance optimization of irregular codes based on the combination of reordering and blocking techniques
-
J. C. Pichel, D. B. Heras, J. C. Cabaleiro, and F. F. Rivera. Performance optimization of irregular codes based on the combination of reordering and blocking techniques. Parallel Computing, 31(8-9):858-876, 2005.
-
(2005)
Parallel Computing
, vol.31
, Issue.8-9
, pp. 858-876
-
-
Pichel, J.C.1
Heras, D.B.2
Cabaleiro, J.C.3
Rivera, F.F.4
-
15
-
-
85031264203
-
Improving performance of sparse matrix-vector multiplication
-
Portland, OR, Nov
-
A. Pinar and M. T. Heath. Improving performance of sparse matrix-vector multiplication. In Supercomputing'99, Portland, OR, Nov. 1999. ACM SIGARCH and IEEE.
-
(1999)
ACM SIGARCH and IEEE
-
-
Pinar, A.1
Heath, M.T.2
-
16
-
-
56749108298
-
-
SPARSKIT: A basic tool kit for sparse matrix computations
-
Y. Saad. SPARSKIT: A basic tool kit for sparse matrix computations. Technical report, Computer Science Department, University of Minnesota, Minneapolis, MN 55455, June 1994. Version 2.
-
-
-
-
18
-
-
33644639675
-
Sparse matrix storage revisited
-
New York, NY, USA, ACM
-
M. Silva. Sparse matrix storage revisited. In CF '05: Proceedings of the 2nd conference on Computing frontiers, pages 230-235, New York, NY, USA, 2005. ACM.
-
(2005)
CF '05: Proceedings of the 2nd conference on Computing frontiers
, pp. 230-235
-
-
Silva, M.1
-
19
-
-
0031269220
-
Improving the memory-system performance of sparse-matrix vector multiplication
-
S. Toledo. Improving the memory-system performance of sparse-matrix vector multiplication. IBM Journal of Research and Development, 41(6):711-725, 1997.
-
(1997)
IBM Journal of Research and Development
, vol.41
, Issue.6
, pp. 711-725
-
-
Toledo, S.1
-
20
-
-
84990830919
-
Performance optimizations and bounds for sparse matrix-vector multiply
-
Baltimore, MD, Nov
-
R. Vuduc, J. Demmel, K. Yelick, S. Kamil, R. Nishtala, and B. Lee. Performance optimizations and bounds for sparse matrix-vector multiply. In Supercomputing, Baltimore, MD, Nov. 2002.
-
(2002)
Supercomputing
-
-
Vuduc, R.1
Demmel, J.2
Yelick, K.3
Kamil, S.4
Nishtala, R.5
Lee, B.6
-
21
-
-
33646389518
-
Fast sparse matrix-vector multiplication by exploiting variable block structure
-
High Performance Computing and Communications, Springer
-
R. W. Vuduc and H. Moon. Fast sparse matrix-vector multiplication by exploiting variable block structure. In High Performance Computing and Communications, volume 3726 of Lecture Notes in Computer Science, pages 807-816. Springer, 2005.
-
(2005)
Lecture Notes in Computer Science
, vol.3726
, pp. 807-816
-
-
Vuduc, R.W.1
Moon, H.2
-
23
-
-
34547468948
-
Accelerating sparse matrix computations via data compression
-
New York, NY, USA, ACM Press
-
J. Willcock and A. Lumsdaine. Accelerating sparse matrix computations via data compression. In ICS '06: Proceedings of the 20th annual international conference on Supercomputing, pages 307-316, New York, NY, USA, 2006. ACM Press.
-
(2006)
ICS '06: Proceedings of the 20th annual international conference on Supercomputing
, pp. 307-316
-
-
Willcock, J.1
Lumsdaine, A.2
-
24
-
-
56749158843
-
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
-
Reno, NV, Nov
-
S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In SC '07: Proceedings of the 2007 ACM/IEEE conference on Supercomputing, Reno, NV, Nov. 2007.
-
(2007)
SC '07: Proceedings of the 2007 ACM/IEEE conference on Supercomputing
-
-
Williams, S.1
Oliker, L.2
Vuduc, R.3
Shalf, J.4
Yelick, K.5
Demmel, J.6