[1] The Open Group Base Specifications, Issue 6: POSIX Threads (pthread.h). IEEE Std 1003.1, 2004. http://www.opengroup.org/onlinepubs/009695399/basedefs/pthread.h.html.
[4] Muthu Manikandan Baskaran and Rajesh Bordawekar. Optimizing sparse matrix-vector multiplication on GPUs using compile-time and run-time strategies. Technical Report RC24704 (W0812-047), IBM T.J. Watson Research Center, Yorktown Heights, NY, USA, December 2008.
[5] Nathan Bell and Michael Garland. Implementing sparse matrix-vector multiplication on throughput-oriented processors. In Proc. ACM/IEEE Conf. Supercomputing (SC), Portland, OR, USA, November 2009.
[6] Guy E. Blelloch, Michael A. Heroux, and Marco Zagha. Segmented operations for sparse matrix computations on vector multiprocessors. Technical Report, Carnegie Mellon University, Department of Computer Science, Pittsburgh, PA, USA, August 1993.
[7] Jee Whan Choi, Amik Singh, and Richard W. Vuduc. Model-driven autotuning of sparse matrix-vector multiply on GPUs. In Proc. ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP), Bangalore, India, January 2010.
[8] Eun-Jin Im, Katherine Yelick, and Richard Vuduc. Sparsity: Optimization framework for sparse matrix kernels. Int’l J. of High Performance Computing Applications (IJHPCA), 18(1): 135-158, February 2004.
[9] NVIDIA. NVIDIA’s next generation CUDA compute architecture: Fermi™, v1.1. Whitepaper (electronic), September 2009. http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf.
[12] Shubhabrata Sengupta, Mark Harris, Yao Zhang, and John D. Owens. Scan primitives for GPU computing. In Proc. ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware, San Diego, CA, USA, 2007.
[13] Richard W. Vuduc. Automatic performance tuning of sparse matrix kernels. PhD thesis, University of California, Berkeley, CA, USA, January 2004.
[14] Sam Williams, Leonid Oliker, Richard Vuduc, John Shalf, Katherine Yelick, and James Demmel. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In Proc. ACM/IEEE Conf. Supercomputing (SC), 2007.
[15] Sam Williams, Richard Vuduc, Leonid Oliker, John Shalf, Katherine Yelick, and James Demmel. Optimizing sparse matrix-vector multiply on emerging multicore platforms. Parallel Computing (ParCo), 35(3): 178-194, March 2009. Extends conference version: http://dx.doi.org/10.1145/1362622.1362674.
[16] Samuel Webb Williams. Auto-tuning performance on multicore computers. UCB/EECS-2008-164, University of California, Berkeley, CA, USA, December 2008.