-
1
-
-
84896893237
-
CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications
-
Y. Yang and H. Zhou, "CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications," in Proc. of PPoPP 2014.
-
(2014)
Proc. of PPoPP
-
-
Yang, Y.1
Zhou, H.2
-
2
-
-
84946029581
-
Characterization and analysis of dynamic parallelism in unstructured GPU applications
-
J. Wang and S. Yalamanchili, "Characterization and Analysis of Dynamic Parallelism in Unstructured GPU Applications," in Proc. of IISWC 2014.
-
(2014)
Proc. of IISWC
-
-
Wang, J.1
Yalamanchili, S.2
-
3
-
-
84884875169
-
Deploying graph algorithms on GPUs: An adaptive solution
-
D. Li and M. Becchi, "Deploying Graph Algorithms on GPUs: An Adaptive Solution," in Proc. of IPDPS 2013.
-
(2013)
Proc. of IPDPS
-
-
Li, D.1
Becchi, M.2
-
4
-
-
84976484929
-
General transformations for GPU execution of tree traversals
-
M. Goldfarb, Y. Jo, and M. Kulkarni, "General transformations for GPU execution of tree traversals," in Proc. of SC 2013.
-
(2013)
Proc. of SC
-
-
Goldfarb, M.1
Jo, Y.2
Kulkarni, M.3
-
5
-
-
60649099910
-
Accelerating large graph algorithms on the GPU using CUDA
-
P. Harish and P. J. Narayanan, "Accelerating large graph algorithms on the GPU using CUDA," in Proc. of HiPC 2007.
-
(2007)
Proc. of HiPC
-
-
Harish, P.1
Narayanan, P.J.2
-
6
-
-
84976497139
-
Betweenness centrality on GPUs and heterogeneous architectures
-
A. E. Sarıyüce, et al., "Betweenness Centrality on GPUs and Heterogeneous Architectures," in Proc. of GPGPU-6 2013.
-
(2013)
Proc. of GPGPU-6
-
-
Sarıyüce, A.E.1
-
8
-
-
84976478010
-
Efficient sparse matrix-vector multiplication on GPUs using the CSR storage format
-
J. L. Greathouse and M. Daga, "Efficient sparse matrix-vector multiplication on GPUs using the CSR storage format," in Proc. of SC 2014.
-
(2014)
Proc. of SC
-
-
Greathouse, J.L.1
Daga, M.2
-
11
-
-
79952783409
-
Ordered vs. Unordered: A comparison of parallelism and work-efficiency in irregular algorithms
-
M. A. Hassaan, M. Burtscher, and K. Pingali, "Ordered vs. unordered: A comparison of parallelism and work-efficiency in irregular algorithms," in Proc. of PPoPP 2011.
-
(2011)
Proc. of PPoPP
-
-
Hassaan, M.A.1
Burtscher, M.2
Pingali, K.3
-
12
-
-
77956200064
-
An effective GPU implementation of breadth-first search
-
L. Luo, M. Wong, and W.-M. Hwu, "An effective GPU implementation of breadth-first search," in Proc. of DAC 2010.
-
(2010)
Proc. of DAC
-
-
Luo, L.1
Wong, M.2
Hwu, W.-M.3
-
14
-
-
84856541553
-
Efficient parallel graph exploration on multi-core CPU and GPU
-
S. Hong, T. Oguntebi, and K. Olukotun, "Efficient Parallel Graph Exploration on Multi-Core CPU and GPU," in Proc. of PACT 2011.
-
(2011)
Proc. of PACT
-
-
Hong, S.1
Oguntebi, T.2
Olukotun, K.3
-
15
-
-
84884887302
-
On graphs, GPUs, and blind dating: A workload to processor matchmaking quest
-
A. Gharaibeh, et al., "On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest," in Proc. of IPDPS 2013.
-
(2013)
Proc. of IPDPS
-
-
Gharaibeh, A.1
-
17
-
-
84875980776
-
Data-driven versus topology-driven irregular computations on GPUs
-
R. Nasre, M. Burtscher, and K. Pingali, "Data-driven versus Topology-driven Irregular Computations on GPUs," in Proc. of IPDPS 2013.
-
(2013)
Proc. of IPDPS
-
-
Nasre, R.1
Burtscher, M.2
Pingali, K.3
-
20
-
-
79952811127
-
Accelerating CUDA graph algorithms at maximum warp
-
S. Hong, et al., "Accelerating CUDA graph algorithms at maximum warp," in Proc. of PPoPP 2011.
-
(2011)
Proc. of PPoPP
-
-
Hong, S.1
-
22
-
-
80053287330
-
Computing strongly connected components in parallel on CUDA
-
J. Barnat, et al., "Computing Strongly Connected Components in Parallel on CUDA," in Proc. of IPDPS 2011.
-
(2011)
Proc. of IPDPS
-
-
Barnat, J.1
-
23
-
-
84893628986
-
Pannotia: Understanding irregular GPGPU graph applications
-
S. Che, et al., "Pannotia: Understanding irregular GPGPU graph applications," in Proc. of IISWC 2013.
-
(2013)
Proc. of IISWC
-
-
Che, S.1
-
24
-
-
84960150259
-
CuSha: Vertex-centric graph processing on GPUs
-
F. Khorasani, et al., "CuSha: Vertex-centric graph processing on GPUs," in Proc. of HPDC 2014.
-
(2014)
Proc. of HPDC
-
-
Khorasani, F.1
-
25
-
-
0025380943
-
Compiling collection-oriented languages onto massively parallel computers
-
G. E. Blelloch and G. W. Sabot, "Compiling collection-oriented languages onto massively parallel computers," J. Parallel Distrib. Comput., vol. 8, no. 2, pp. 119-134, 1990.
-
(1990)
J. Parallel Distrib. Comput., vol. 8, no. 2, pp. 119-134
-
-
Blelloch, G.E.1
Sabot, G.W.2
-
28
-
-
84976466502
-
Performance impact of dynamic parallelism on different clustering algorithms and the new GPU architecture
-
J. DiMarco and M. Taufer, "Performance Impact of Dynamic Parallelism on Different Clustering Algorithms and the New GPU Architecture," in Proc. of SPIE Defense, Security, and Sensing Symp. 2013.
-
(2013)
Proc. of SPIE Defense, Security, and Sensing Symp.
-
-
DiMarco, J.1
Taufer, M.2