1. NVIDIA. NVIDIA CUDA Programming Guide (2nd edn). NVIDIA Corporation, July 2008.
2. Im EJ, Yelick KA, Vuduc RW. Sparsity: Optimization framework for sparse matrix kernels. IJHPCA 2004; 18(1): 135-158.
3. Vuduc R, Demmel JW, Yelick KA. OSKI: A library of automatically tuned sparse matrix kernels. Proceedings of SciDAC'05. Journal of Physics: Conference Series, Institute of Physics Publishing: San Francisco, CA, U.S.A., 2005.
4. Buttari A, Eijkhout V, Langou J, Filippone S. Performance optimization and modeling of blocked sparse kernels. IJHPCA 2007; 21(4): 467-484. DOI 10.1177/1094342007083801
5. Zein AHE, Rendell AP. From sparse matrix to optimal GPU CUDA sparse matrix vector product implementation. CCGRID, Washington, DC, U.S.A. IEEE, 2010; 808-813.
6. Zein AE, McCreath E, Rendell AP, Smola AJ. Performance evaluation of the NVIDIA GeForce 8800 GTX GPU for machine learning. ICCS (1) (Lecture Notes in Computer Science, vol. 5101), Bubak M, van Albada GD, Dongarra J, Sloot PMA (eds.). Springer: Berlin, 2008; 466-475.
7. Davis TA. University of Florida sparse matrix collection. NA Digest 1994; 92.
8. Tesla S2050 GPU computing system. Available at: http://www.nvidia.com/object/product-tesla-S2050-us.html [August 2010].
9. NVIDIA. NVIDIA's next generation CUDA compute architecture: Fermi. White Paper, June 2010. Available at: http://www.nvidia.com/content/PDF/fermi-white-papers/NVIDIA-Fermi-Compute-Architecture-Whitepaper.pdf [August 2010].
11. Baskaran MM, Bordawekar R. Optimizing sparse matrix-vector multiplication on GPUs. IBM Research Report 2009; RC24704.
12. Williams S, Oliker L, Vuduc RW, Shalf J, Yelick KA, Demmel J. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. SC, Verastegui B (ed.). ACM Press: New York, 2007; 38.
14. Sengupta S, Harris M, Zhang Y, Owens JD. Scan primitives for GPU computing. Graphics Hardware, Segal M, Aila T (eds.). Eurographics Association: Aire-la-Ville, Switzerland, 2007; 97-106.
15. Armstrong W, Rendell AP. Reinforcement learning for automated performance tuning: Initial evaluation for sparse matrix format selection. CLUSTER. IEEE: New York, 2008; 411-420.
16. Lee VW, Kim C, Chhugani J, Deisher M, Kim D, Nguyen AD, Satish N, Smelyanskiy M, Chennupaty S, Hammarlund P, et al. Debunking the 100X GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU. ISCA, Seznec A, Weiser UC, Ronen R (eds.). ACM: New York, 2010; 451-460.