-
1
-
-
84858377399
-
-
http://clang.llvm.org/.
-
-
-
-
3
-
-
77957561221
-
An adaptive performance modeling tool for GPU architectures
-
S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, and W.-m. W. Hwu. An adaptive performance modeling tool for GPU architectures. In Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 105-114, 2010.
-
(2010)
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 105-114
-
-
Baghsorkhi, S.S.1
Delahaye, M.2
Patel, S.J.3
Gropp, W.D.4
Hwu, W.-M.W.5
-
4
-
-
70349169075
-
Analyzing CUDA workloads using a detailed GPU simulator
-
A. Bakhoda, G. Yuan, W. Fung, H. Wong, and T. Aamodt. Analyzing CUDA workloads using a detailed GPU simulator. In Performance Analysis of Systems and Software, 2009. ISPASS 2009. IEEE International Symposium on, pages 163-174, 2009.
-
(2009)
Performance Analysis of Systems and Software, 2009. ISPASS 2009. IEEE International Symposium on
, pp. 163-174
-
-
Bakhoda, A.1
Yuan, G.2
Fung, W.3
Wong, H.4
Aamodt, T.5
-
6
-
-
0029547346
-
The M-Machine multicomputer
-
M. Fillo, S. W. Keckler, W. J. Dally, N. P. Carter, A. Chang, Y. Gurevich, and W. S. Lee. The M-Machine multicomputer. In Proceedings of the 28th annual international symposium on Microarchitecture, pages 146-156, 1995.
-
(1995)
Proceedings of the 28th Annual International Symposium on Microarchitecture
, pp. 146-156
-
-
Fillo, M.1
Keckler, S.W.2
Dally, W.J.3
Carter, N.P.4
Chang, A.5
Gurevich, Y.6
Lee, W.S.7
-
8
-
-
0024107186
-
Accurate low-cost methods for performance evaluation of cache memory systems
-
S. Laha, J. H. Patel, and R. K. Iyer. Accurate low-cost methods for performance evaluation of cache memory systems. IEEE Trans. Comput., 37:1325-1336.
-
IEEE Trans. Comput.
, vol.37
, pp. 1325-1336
-
-
Laha, S.1
Patel, J.H.2
Iyer, R.K.3
-
10
-
-
84858377949
-
VAMPIR: Visualization and analysis of MPI resources
-
W. Nagel, A. Arnold, M. Weber, H. Hoppe, and K. Solchenbach. VAMPIR: Visualization and analysis of MPI resources. KFA, ZAM, 1996.
-
(1996)
KFA, ZAM
-
-
Nagel, W.1
Arnold, A.2
Weber, M.3
Hoppe, H.4
Solchenbach, K.5
-
11
-
-
77951154340
-
The GPU computing era
-
March
-
J. Nickolls and W. J. Dally. The GPU computing era. IEEE Micro, 30:56-69, March 2010.
-
(2010)
IEEE Micro
, vol.30
, pp. 56-69
-
-
Nickolls, J.1
Dally, W.J.2
-
14
-
-
33751095034
-
PARAVER: A tool to visualise and analyze parallel code
-
V. Pillet, J. Labarta, T. Cortes, and S. Girona. PARAVER: A tool to visualise and analyze parallel code. In Proceedings of WoTUG-18: Transputer and occam Developments, volume 44, pages 17-31, 1995.
-
(1995)
Proceedings of WoTUG-18: Transputer and Occam Developments
, vol.44
, pp. 17-31
-
-
Pillet, V.1
Labarta, J.2
Cortes, T.3
Girona, S.4
-
15
-
-
43449094719
-
Program optimization space pruning for a multithreaded GPU
-
DOI 10.1145/1356058.1356084, Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization
-
S. Ryoo, C. I. Rodrigues, S. S. Stone, S. S. Baghsorkhi, S.-Z. Ueng, J. A. Stratton, and W.-m. W. Hwu. Program optimization space pruning for a multithreaded GPU. In Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization, pages 195-204, 2008.
-
(2008)
Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization
, pp. 195-204
-
-
Ryoo, S.1
Rodrigues, C.I.2
Stone, S.S.3
Baghsorkhi, S.S.4
Ueng, S.-Z.5
Stratton, J.A.6
Hwu, W.-M.W.7
-
16
-
-
52249099896
-
Next-generation performance counters: Towards monitoring over thousand concurrent events
-
V. Salapura, K. Ganesan, A. Gara, M. Gschwind, J. C. Sexton, and R. E. Walkup. Next-generation performance counters: Towards monitoring over thousand concurrent events. In Proceedings of the ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and software, pages 139-146, 2008.
-
(2008)
Proceedings of the ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and software
, pp. 139-146
-
-
Salapura, V.1
Ganesan, K.2
Gara, A.3
Gschwind, M.4
Sexton, J.C.5
Walkup, R.E.6
-
19
-
-
84877021547
-
Multi-processor performance on the tera MTA
-
A. Snavely, L. Carter, J. Boisseau, A. Majumdar, K. S. Gatlin, N. Mitchell, J. Feo, and B. Koblenz. Multi-processor performance on the tera MTA. In Proceedings of the 1998 ACM/IEEE conference on Supercomputing, pages 1-8, 1998.
-
(1998)
Proceedings of the 1998 ACM/IEEE Conference on Supercomputing
, pp. 1-8
-
-
Snavely, A.1
Carter, L.2
Boisseau, J.3
Majumdar, A.4
Gatlin, K.S.5
Mitchell, N.6
Feo, J.7
Koblenz, B.8
-
20
-
-
0003423822
-
-
Prentice-Hall, Inc. Upper Saddle River, NJ, USA
-
H. Stark and J. Woods. Probability, random processes, and estimation theory for engineers. Prentice-Hall, Inc. Upper Saddle River, NJ, USA, 1986.
-
(1986)
Probability, Random Processes, and Estimation Theory for Engineers
-
-
Stark, H.1
Woods, J.2