-
1
-
-
78649824847
-
Exploiting memory access patterns to improve memory performance in data-parallel architectures
-
B. Jang et al. Exploiting memory access patterns to improve memory performance in data-parallel architectures. IEEE Trans. Parallel Distrib. Syst., 22(1):105-118, 2011.
-
(2011)
IEEE Trans. Parallel Distrib. Syst.
, vol.22
, Issue.1
, pp. 105-118
-
-
Jang, B.1
-
2
-
-
84870725376
-
Policy-based tuning for performance portability and library co-optimization
-
D. Merrill et al. Policy-based tuning for performance portability and library co-optimization. In InPar, pages 1-10, 2012.
-
(2012)
InPar
, pp. 1-10
-
-
Merrill, D.1
-
4
-
-
84937693610
-
PORPLE: An extensible optimizer for portable data placement on GPU
-
G. Chen et al. PORPLE: An extensible optimizer for portable data placement on GPU. In MICRO, pages 88-100, 2014.
-
(2014)
MICRO
, pp. 88-100
-
-
Chen, G.1
-
5
-
-
84961314978
-
Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures
-
H.-S. Kim et al. Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures. In CGO, pages 257-268, 2015.
-
(2015)
CGO
, pp. 257-268
-
-
Kim, H.-S.1
-
6
-
-
67650786281
-
PetaBricks: A language and compiler for algorithmic choice
-
J. Ansel et al. PetaBricks: A language and compiler for algorithmic choice. In PLDI, pages 38-49, 2009.
-
(2009)
PLDI
, pp. 38-49
-
-
Ansel, J.1
-
7
-
-
84859143447
-
Improving performance of OpenCL on CPUs
-
R. Karrenberg and S. Hack. Improving Performance of OpenCL on CPUs. In CC, pages 1-20, 2012.
-
(2012)
CC
, pp. 1-20
-
-
Karrenberg, R.1
Hack, S.2
-
9
-
-
84975230376
-
Dysel: Lightweight dynamic selection for kernel-based data-parallel programming model
-
in press
-
L.-W. Chang et al. Dysel: Lightweight dynamic selection for kernel-based data-parallel programming model. In ASPLOS, 2016 (in press).
-
(2016)
ASPLOS
-
-
Chang, L.-W.1
-
10
-
-
1542396679
-
Spiral: A generator for platform-adapted libraries of signal processing algorithms
-
M. Puschel et al. Spiral: A generator for platform-adapted libraries of signal processing algorithms. International Journal of High Performance Computing Applications, 18(1):21-45, 2004.
-
(2004)
International Journal of High Performance Computing Applications
, vol.18
, Issue.1
, pp. 21-45
-
-
Puschel, M.1
-
12
-
-
0343462141
-
Automated empirical optimizations of software and the ATLAS project
-
R. C. Whaley et al. Automated empirical optimizations of software and the ATLAS project. Parallel Computing, 27(1):3-35, 2001.
-
(2001)
Parallel Computing
, vol.27
, Issue.1
, pp. 3-35
-
-
Whaley, R.C.1
-
13
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
S. Che et al. Rodinia: A benchmark suite for heterogeneous computing. In IISWC, pages 44-54, 2009.
-
(2009)
IISWC
, pp. 44-54
-
-
Che, S.1