[1] Introducing TITAN. http://www.olcf.ornl.gov/titan/.
[2] Swiss National Supercomputing Centre. http://www.cscs.ch/.
[3] M. Burtscher, R. Nasre, and K. Pingali. A quantitative study of irregular programs on GPUs. In IISWC, 2012.
[4] Macsim. http://code.google.com/p/macsim/.
[5] H. Patil, R. S. Cohn, M. Charney, R. Kapoor, A. Sun, and A. Karunanidhi. Pinpointing representative portions of large Intel Itanium programs with dynamic instrumentation. In MICRO, 2004.
[7] T. E. Carlson, W. Heirman, and L. Eeckhout. Sampled simulation of multi-threaded applications. In ISPASS, 2013.
[8] CUDA Documentation. http://www.nvidia.com/object/cudadevelop.html.
[9] C.-K. Luk et al. Pin: Building customized program analysis tools with dynamic instrumentation. In PLDI, 2005.
[10] J. Lau, J. Sampson, E. Perelman, G. Hamerly, and B. Calder. The strong correlation between code signatures and performance. In ISPASS, 2005.
[11] G. Diamos, A. Kerr, S. Yalamanchili, and N. Clark. Ocelot: A dynamic compiler for bulk-synchronous applications in heterogeneous systems. In PACT, 2010.
[13] X. E. Chen and T. M. Aamodt. A first-order fine-grained multithreaded throughput model. In HPCA, 2009.
[14] Z. Yu, L. Eeckhout, N. Goswami, T. Li, L. John, H. Jin, and C. Xu. Accelerating GPGPU architecture simulation. In SIGMETRICS, 2013.
[15] S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In ISCA, 2009.
[16] J. Sim, A. Dasgupta, H. Kim, and R. Vuduc. A performance analysis framework for identifying potential benefits in GPGPU applications. In PPoPP, 2012.
[17] S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, and W. W. Hwu. An adaptive performance modeling tool for GPU architectures. In PPoPP, 2010.