-
1
-
-
77749337497
-
An adaptive performance modeling tool for GPU architectures
-
S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, and W.-m. Hwu. An adaptive performance modeling tool for GPU architectures. In PPoPP, 2010.
-
(2010)
PPoPP
-
-
Baghsorkhi, S.S.1
Delahaye, M.2
Patel, S.J.3
Gropp, W.D.4
Hwu, W.-M.5
-
2
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
October
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In International Symposium on Workload Characterization (IISWC), October 2009.
-
(2009)
International Symposium on Workload Characterization (IISWC)
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Lee, S.-H.6
Skadron, K.7
-
3
-
-
79959598339
-
Modeling the performance of an algebraic multigrid cycle on HPC platforms
-
H. Gahvari, A. H. Baker, M. Schulz, U. M. Yang, K. E. Jordan, and W. Gropp. Modeling the performance of an algebraic multigrid cycle on HPC platforms. In ICS, 2011.
-
(2011)
ICS
-
-
Gahvari, H.1
Baker, A.H.2
Schulz, M.3
Yang, U.M.4
Jordan, K.E.5
Gropp, W.6
-
5
-
-
0026186967
-
An implementation of interprocedural bounded regular section analysis
-
P. Havlak and K. Kennedy. An implementation of interprocedural bounded regular section analysis. IEEE Trans. Parallel Distrib. Syst., 2, 1991.
-
(1991)
IEEE Trans. Parallel Distrib. Syst.
, vol.2
-
-
Havlak, P.1
Kennedy, K.2
-
6
-
-
70450231944
-
An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
-
S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In ISCA, 2009.
-
(2009)
ISCA
-
-
Hong, S.1
Kim, H.2
-
7
-
-
79959904195
-
Automatic CPU-GPU communication management and optimization
-
T. B. Jablin, P. Prabhu, J. A. Jablin, N. P. Johnson, S. R. Beard, and D. I. August. Automatic CPU-GPU communication management and optimization. In PLDI, 2011.
-
(2011)
PLDI
-
-
Jablin, T.B.1
Prabhu, P.2
Jablin, J.A.3
Johnson, N.P.4
Beard, S.R.5
August, D.I.6
-
8
-
-
83155186310
-
Modern potentials and the properties of condensed 4He
-
M. H. Kalos, M. A. Lee, P. A. Whitlock, and G. V. Chester. Modern potentials and the properties of condensed 4He. Phys. Rev. B 24, 115, 1981.
-
(1981)
Phys. Rev. B
, vol.24
, pp. 115
-
-
Kalos, M.H.1
Lee, M.A.2
Whitlock, P.A.3
Chester, G.V.4
-
9
-
-
70349100958
-
-
The OpenCL Specification, Version 1.0
-
Khronos Group Std. The OpenCL Specification, Version 1.0. http://www.khronos.org/registry/cl/specs/opencl-1.0.33.pdf, 2009.
-
(2009)
The OpenCL Specification
-
-
-
10
-
-
77952204218
-
A performance prediction model for the CUDA GPGPU platform
-
K. Kothapalli, R. Mukherjee, M. S. Rehman, S. Patidar, P. J. Narayanan, and K. Srinathan. A performance prediction model for the CUDA GPGPU platform. In HiPC, 2009.
-
(2009)
HiPC
-
-
Kothapalli, K.1
Mukherjee, R.2
Rehman, M.S.3
Patidar, S.4
Narayanan, P.J.5
Srinathan, K.6
-
11
-
-
34547288276
-
Accurate and efficient regression modeling for microarchitectural performance and power prediction
-
B. C. Lee and D. M. Brooks. Accurate and efficient regression modeling for microarchitectural performance and power prediction. In ASPLOS XII, 2006.
-
(2006)
ASPLOS XII
-
-
Lee, B.C.1
Brooks, D.M.2
-
12
-
-
66749185800
-
CPR: Composable performance regression for scalable multiprocessor models
-
B. C. Lee, J. Collins, H. Wang, and D. Brooks. CPR: Composable performance regression for scalable multiprocessor models. In MICRO, 2008.
-
(2008)
MICRO
-
-
Lee, B.C.1
Collins, J.2
Wang, H.3
Brooks, D.4
-
13
-
-
34748909426
-
Methods of inference and learning for performance modeling of parallel applications
-
Benjamin C. Lee, David M. Brooks, Bronis R. de Supinski, Martin Schulz, Karan Singh, and Sally A. McKee. Methods of inference and learning for performance modeling of parallel applications. In PPoPP, 2007.
-
(2007)
PPoPP
-
-
Lee, B.C.1
Brooks, D.M.2
De Supinski, B.R.3
Schulz, M.4
Singh, K.5
McKee, S.A.6
-
14
-
-
77954995885
-
Debunking the 100x GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU
-
V. W. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim, A. D. Nguyen, N. Satish, M. Smelyanskiy, S. Chennupaty, P. Hammarlund, R. Singhal, and P. Dubey. Debunking the 100x GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. In ISCA, 2010.
-
(2010)
ISCA
-
-
Lee, V.W.1
Kim, C.2
Chhugani, J.3
Deisher, M.4
Kim, D.5
Nguyen, A.D.6
Satish, N.7
Smelyanskiy, M.8
Chennupaty, S.9
Hammarlund, P.10
Singhal, R.11
Dubey, P.12
-
15
-
-
83155184571
-
GROPHECY: GPU performance projection from CPU code skeletons
-
Jiayuan Meng, Vitali A. Morozov, Kalyan Kumaran, Venkatram Vishwanath, and Thomas D. Uram. GROPHECY: GPU performance projection from CPU code skeletons. In SC, 2011.
-
(2011)
SC
-
-
Meng, J.1
Morozov, V.A.2
Kumaran, K.3
Vishwanath, V.4
Uram, T.D.5
-
17
-
-
79959617900
-
MDR: Performance model driven runtime for heterogeneous parallel platforms
-
J. A. Pienaar, A. Raghunathan, and S. Chakradhar. MDR: performance model driven runtime for heterogeneous parallel platforms. In ICS, 2011.
-
(2011)
ICS
-
-
Pienaar, J.A.1
Raghunathan, A.2
Chakradhar, S.3
-
18
-
-
0036819106
-
Quantum Monte Carlo calculations of A=9,10 nuclei
-
S. C. Pieper, K. Varga, and R. B. Wiringa. Quantum Monte Carlo calculations of A=9,10 nuclei. In Phys. Rev. C 66, 044310-1:14, 2002.
-
(2002)
Phys. Rev. C
, vol.66
, pp. 044310-1 to 044310-14
-
-
Pieper, S.C.1
Varga, K.2
Wiringa, R.B.3
-
19
-
-
0242505770
-
A framework for performance modeling and prediction
-
A. Snavely, L. Carrington, N. Wolter, J. Labarta, R. Badia, and A. Purkayastha. A framework for performance modeling and prediction. In SC, 2002.
-
(2002)
SC
-
-
Snavely, A.1
Carrington, L.2
Wolter, N.3
Labarta, J.4
Badia, R.5
Purkayastha, A.6
-
20
-
-
33845442055
-
Cross-platform performance prediction of parallel applications using partial execution
-
L. T. Yang, X. Ma, and F. Mueller. Cross-platform performance prediction of parallel applications using partial execution. In SC, 2005.
-
(2005)
SC
-
-
Yang, L.T.1
Ma, X.2
Mueller, F.3
-
21
-
-
79955921273
-
A quantitative performance analysis model for GPU architectures
-
Y. Zhang and J. D. Owens. A quantitative performance analysis model for GPU architectures. In HPCA, 2011.
-
(2011)
HPCA
-
-
Zhang, Y.1
Owens, J.D.2