-
1
-
-
85031252197
-
Achieving high sustained performance in an unstructured mesh CFD application
-
ACM, New York, NY
-
ANDERSON, W. K., GROPP, W. D., KAUSHIK, D. K., KEYES, D. E., AND SMITH, B. F. 1999. Achieving high sustained performance in an unstructured mesh CFD application. In Proceedings of the ACM/IEEE Conference on Supercomputing (SC'99). ACM, New York, NY, 69.
-
(1999)
Proceedings of the ACM/IEEE Conference on Supercomputing (SC'99)
, pp. 69
-
-
Anderson, W.K.1
Gropp, W.D.2
Kaushik, D.K.3
Keyes, D.E.4
Smith, B.F.5
-
2
-
-
35648995516
-
The landscape of parallel computing research: A view from Berkeley
-
ASANOVIC, K., BODIK, R., CATANZARO, B. C., GEBIS, J. J., HUSBANDS, P., KEUTZER, K., PATTERSON, D. A., PLISHKER, W. L., SHALF, J., WILLIAMS, S. W., AND YELICK, K. A. 2006. The landscape of parallel computing research: a view from Berkeley. Tech. rep. UCB/EECS-2006-183, EECS Department, University of California, Berkeley.
-
(2006)
Tech. Rep. UCB/EECS-2006-183, EECS Department, University of California, Berkeley
-
-
Asanovic, K.1
Bodik, R.2
Catanzaro, B.C.3
Gebis, J.J.4
Husbands, P.5
Keutzer, K.6
Patterson, D.A.7
Plishker, W.L.8
Shalf, J.9
Williams, S.W.10
Yelick, K.A.11
-
3
-
-
57649208491
-
A performance evaluation of the Nehalem quad-core processor for scientific computing
-
BARKER, K., DAVIS, K., HOISIE, A., KERBYSON, D. J., LANG, M., PAKIN, S., AND SANCHO, J. C. 2008. A performance evaluation of the Nehalem quad-core processor for scientific computing. Parall. Process. Lett.
-
(2008)
Parall. Process. Lett
-
-
Barker, K.1
Davis, K.2
Hoisie, A.3
Kerbyson, D.J.4
Lang, M.5
Pakin, S.6
Sancho, J.C.7
-
4
-
-
0003473816
-
-
SIAM, Philadelphia, PA
-
BARRETT, R., BERRY, M., CHAN, T. F., DEMMEL, J., DONATO, J. M., DONGARRA, J., EIJKHOUT, V., POZO, R., ROMINE, C., AND VAN DER VORST, H. 1994. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia, PA.
-
(1994)
Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods
-
-
Barrett, R.1
Berry, M.2
Chan, T.F.3
Demmel, J.4
Donato, J.M.5
Dongarra, J.6
Eijkhout, V.7
Pozo, R.8
Romine, C.9
Van der Vorst, H.10
-
5
-
-
78650279432
-
Pattern-based sparse matrix representation for memory-efficient SMVM kernels
-
ACM, New York, NY
-
BELGIN, M., BACK, G., AND RIBBENS, C. J. 2009. Pattern-based sparse matrix representation for memory-efficient SMVM kernels. In Proceedings of the 23rd International Conference on Supercomputing (ICS'09). ACM, New York, NY, 100-109.
-
(2009)
Proceedings of the 23rd International Conference on Supercomputing (ICS'09)
, pp. 100-109
-
-
Belgin, M.1
Back, G.2
Ribbens, C.J.3
-
6
-
-
34547626759
-
High throughput compression of double-precision floating-point data
-
IEEE Computer Society, Los Alamitos, CA
-
BURTSCHER, M. AND RATANAWORABHAN, P. 2007. High throughput compression of double-precision floating-point data. In Proceedings of the Data Compression Conference (DCC'07). IEEE Computer Society, Los Alamitos, CA, 293-302.
-
(2007)
Proceedings of the Data Compression Conference (DCC'07)
, pp. 293-302
-
-
Burtscher, M.1
Ratanaworabhan, P.2
-
7
-
-
84947911090
-
Decomposing irregularly sparse matrices for parallel matrix-vector multiplication
-
CATALYUEREK, U. V. AND AYKANAT, C. 1996. Decomposing irregularly sparse matrices for parallel matrix-vector multiplication. Lecture Notes in Computer Science, vol. 1117, 75-86.
-
(1996)
Lecture Notes in Computer Science
, vol.1117
, pp. 75-86
-
-
Catalyuerek, U.V.1
Aykanat, C.2
-
9
-
-
0003197949
-
University of Florida sparse matrix collection
-
DAVIS, T. 1997. University of Florida sparse matrix collection. NA Digest 97, 23, 7.
-
(1997)
NA Digest
, vol.97
, Issue.23
, pp. 7
-
-
Davis, T.1
-
10
-
-
20344401552
-
Chip makers turn to multicore processors
-
GEER, D. 2005. Chip makers turn to multicore processors. IEEE Comput. 38, 5, 11-13.
-
(2005)
IEEE Comput.
, vol.38
, Issue.5
, pp. 11-13
-
-
Geer, D.1
-
12
-
-
79951987528
-
Performance evaluation of the sparse matrix-vector multiplication on modern architectures
-
GOUMAS, G., KOURTIS, K., ANASTOPOULOS, N., KARAKASIS, V., AND KOZIRIS, N. 2008. Performance evaluation of the sparse matrix-vector multiplication on modern architectures. J. Supercomput.
-
(2008)
J. Supercomput
-
-
Goumas, G.1
Kourtis, K.2
Anastopoulos, N.3
Karakasis, V.4
Koziris, N.5
-
13
-
-
33646015987
-
Synergistic processing in Cell's multicore architecture
-
GSCHWIND, M., HOFSTEE, H. P., FLACHS, B. K., HOPKINS, M., WATANABE, Y., AND YAMAZAKI, T. 2006. Synergistic processing in Cell's multicore architecture. IEEE Micro 26, 2, 10-24.
-
(2006)
IEEE Micro
, vol.26
, Issue.2
, pp. 10-24
-
-
Gschwind, M.1
Hofstee, H.P.2
Flachs, B.K.3
Hopkins, M.4
Watanabe, Y.5
Yamazaki, T.6
-
15
-
-
42549168687
-
Exploring the cache design space for large scale CMPs
-
HSU, L., IYER, R., MAKINENI, S., REINHARDT, S., AND NEWELL, D. 2005. Exploring the cache design space for large scale CMPs. ACM SIGARCH Comput. Architect. News 33, 4, 24-33.
-
(2005)
ACM SIGARCH Comput. Architect. News
, vol.33
, Issue.4
, pp. 24-33
-
-
Hsu, L.1
Iyer, R.2
Makineni, S.3
Reinhardt, S.4
Newell, D.5
-
17
-
-
84949647432
-
Optimizing sparse matrix computations for register reuse in SPARSITY
-
IM, E. AND YELICK, K. 2001. Optimizing sparse matrix computations for register reuse in SPARSITY. Lecture Notes in Computer Science, vol. 2073, 127-136.
-
(2001)
Lecture Notes in Computer Science
, vol.2073
, pp. 127-136
-
-
Im, E.1
Yelick, K.2
-
19
-
-
55849145179
-
Improving the performance of multithreaded sparse matrix-vector multiplication using index and value compression
-
KOURTIS, K., GOUMAS, G., AND KOZIRIS, N. 2008a. Improving the performance of multithreaded sparse matrix-vector multiplication using index and value compression. In Proceedings of the 37th International Conference on Parallel Processing (ICPP'08), 511-519.
-
(2008)
Proceedings of the 37th International Conference on Parallel Processing (ICPP'08)
, pp. 511-519
-
-
Kourtis, K.1
Goumas, G.2
Koziris, N.3
-
20
-
-
55849146932
-
Optimizing sparse matrix-vector multiplication using index and value compression
-
ACM, New York, NY
-
KOURTIS, K., GOUMAS, G., AND KOZIRIS, N. 2008b. Optimizing sparse matrix-vector multiplication using index and value compression. In Proceedings of the Conference on Computing Frontiers (CF'08). ACM, New York, NY, 87-96.
-
(2008)
Proceedings of the Conference on Computing Frontiers (CF'08)
, pp. 87-96
-
-
Kourtis, K.1
Goumas, G.2
Koziris, N.3
-
21
-
-
34548206782
-
Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)
-
ACM, New York, NY
-
LANGOU, J., LANGOU, J., LUSZCZEK, P., KURZAK, J., BUTTARI, A., AND DONGARRA, J. 2006. Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems). In Proceedings of the ACM/IEEE Conference on Supercomputing (SC'06). ACM, New York, NY, 113.
-
(2006)
Proceedings of the ACM/IEEE Conference on Supercomputing (SC'06)
, pp. 113
-
-
Langou, J.1
Langou, J.2
Luszczek, P.3
Kurzak, J.4
Buttari, A.5
Dongarra, J.6
-
22
-
-
10044248780
-
Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply
-
LEE, B., VUDUC, R., DEMMEL, J., AND YELICK, K. 2004. Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply. In Proceedings of the International Conference on Parallel Processing (ICPP'04). Vol. 1, 169-176.
-
(2004)
Proceedings of the International Conference on Parallel Processing (ICPP'04)
, vol.1
, pp. 169-176
-
-
Lee, B.1
Vuduc, R.2
Demmel, J.3
Yelick, K.4
-
23
-
-
2942628343
-
Optimizing sparse matrix-vector product computations using unroll and jam
-
MELLOR-CRUMMEY, J. AND GARVIN, J. 2004. Optimizing sparse matrix-vector product computations using unroll and jam. Int. J. High Perform. Comput. Appl. 18, 2, 225.
-
(2004)
Int. J. High Perform. Comput. Appl.
, vol.18
, Issue.2
, pp. 225
-
-
Mellor-Crummey, J.1
Garvin, J.2
-
24
-
-
33646829256
-
Streaming sparse matrix compression/decompression
-
MOLONEY, D., GERAGHTY, D., MCSWEENEY, C., AND MCELROY, C. 2005. Streaming sparse matrix compression/decompression. In Proceedings of the High Performance Embedded Architectures and Compilers, 1st International Conference (HiPEAC'05). 116-129.
-
(2005)
Proceedings of the High Performance Embedded Architectures and Compilers, 1st International Conference (HiPEAC'05)
, pp. 116-129
-
-
Moloney, D.1
Geraghty, D.2
McSweeney, C.3
McElroy, C.4
-
25
-
-
3042618790
-
Improving the locality of the sparse matrix-vector product on shared memory multiprocessors
-
PICHEL, J. C., HERAS, D. B., CABALEIRO, J. C., AND RIVERA, F. F. 2004. Improving the locality of the sparse matrix-vector product on shared memory multiprocessors. In Proceedings of the Euromicro Conference on Parallel, Distributed, and Network-Based Processing. 66.
-
(2004)
Proceedings of the Euromicro Conference on Parallel, Distributed, and Network-Based Processing
, pp. 66
-
-
Pichel, J.C.1
Heras, D.B.2
Cabaleiro, J.C.3
Rivera, F.F.4
-
26
-
-
85031264203
-
Improving performance of sparse matrix-vector multiplication
-
ACM SIGARCH and IEEE
-
PINAR, A. AND HEATH, M. 1999. Improving performance of sparse matrix-vector multiplication. In Proceedings of the Supercomputing'99 Conference. ACM SIGARCH and IEEE.
-
(1999)
Proceedings of the Supercomputing'99 Conference
-
-
Pinar, A.1
Heath, M.2
-
27
-
-
0003550735
-
-
Tech. rep., Computer Science Department, University of Minnesota, Minneapolis, MN
-
SAAD, Y. 1994. SPARSKIT: A basic tool kit for sparse matrix computations. Tech. rep., Computer Science Department, University of Minnesota, Minneapolis, MN.
-
(1994)
SPARSKIT: A Basic Tool Kit for Sparse Matrix Computations
-
-
Saad, Y.1
-
30
-
-
0031269220
-
Improving the memory-system performance of sparse-matrix vector multiplication
-
TOLEDO, S. 1997. Improving the memory-system performance of sparse-matrix vector multiplication. IBM J. Resear. Devel. 41, 6, 711-725.
-
(1997)
IBM J. Resear. Devel.
, vol.41
, Issue.6
, pp. 711-725
-
-
Toledo, S.1
-
31
-
-
84990830919
-
Performance optimizations and bounds for sparse matrix-vector multiply
-
VUDUC, R., DEMMEL, J., YELICK, K., KAMIL, S., NISHTALA, R., AND LEE, B. 2002. Performance optimizations and bounds for sparse matrix-vector multiply. In Proceedings of the Supercomputing Conference'02. 26-26.
-
(2002)
Proceedings of the Supercomputing Conference'02
, pp. 26-26
-
-
Vuduc, R.1
Demmel, J.2
Yelick, K.3
Kamil, S.4
Nishtala, R.5
Lee, B.6
-
32
-
-
33646389518
-
Fast sparse matrix-vector multiplication by exploiting variable block structure
-
Lecture Notes in Computer Science. Springer
-
VUDUC, R. W. AND MOON, H. 2005. Fast sparse matrix-vector multiplication by exploiting variable block structure. In Proceedings of the Conference on High Performance Computing and Communications. Lecture Notes in Computer Science, vol. 3726. Springer, 807-816.
-
(2005)
Proceedings of the Conference on High Performance Computing and Communications
, vol.3726
, pp. 807-816
-
-
Vuduc, R.W.1
Moon, H.2
-
34
-
-
34547468948
-
Accelerating sparse matrix computations via data compression
-
ACM Press, New York, NY
-
WILLCOCK, J. AND LUMSDAINE, A. 2006. Accelerating sparse matrix computations via data compression. In Proceedings of the 20th Annual International Conference on Supercomputing (ICS'06). ACM Press, New York, NY, 307-316.
-
(2006)
Proceedings of the 20th Annual International Conference on Supercomputing (ICS'06)
, pp. 307-316
-
-
Willcock, J.1
Lumsdaine, A.2
-
35
-
-
56749158843
-
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
-
WILLIAMS, S., OLIKER, L., VUDUC, R., SHALF, J., YELICK, K., AND DEMMEL, J. 2007. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In Proceedings of the ACM/IEEE Conference on Supercomputing.
-
(2007)
Proceedings of the ACM/IEEE Conference on Supercomputing
-
-
Williams, S.1
Oliker, L.2
Vuduc, R.3
Shalf, J.4
Yelick, K.5
Demmel, J.6