[1] Y. P. Zhang, T. Jeong, F. Chen, H. P. Wu, R. Nitzsche, and G. R. Gao, "A Study of the On-Chip Interconnection Network for the IBM Cyclops64 Multi-Core Architecture," in Int'l Parallel and Distributed Processing Symp., 2006.
[2] C. M. Wittenbrink, E. Kilgariff, and A. Prabhu, "Fermi GF100 GPU Architecture," IEEE Micro, vol. 31, pp. 50-59, 2011.
[4] H.-K. Kuo, K.-T. Chen, B.-C. C. Lai, and J.-Y. Jou, "Thread Affinity Mapping for Irregular Data Access on Shared Cache GPGPU," in Asia and South Pacific Design Automation Conf., 2012.
[6] R. Das, M. Uysal, J. Saltz, and Y.-S. Hwang, "Communication Optimizations for Irregular Scientific Computations on Distributed Memory Architectures," J. Parallel Distrib. Comput., vol. 22, pp. 462-478, 1994.
[8] E. Z. Zhang, Y. Jiang, Z. Guo, K. Tian, and X. Shen, "On-the-Fly Elimination of Dynamic Irregularities for GPU Computing," in Int'l Conf. Architectural Support for Programming Languages and Operating Systems, 2011.
[9] A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, "Analyzing CUDA Workloads Using a Detailed GPU Simulator," in Int'l Symp. Performance Analysis of Systems and Software, 2009.
[14] K. L. Krause, V. Y. Shen, and H. D. Schwetman, "Analysis of Several Task-Scheduling Algorithms for a Model of Multiprogramming Computer Systems," J. ACM, vol. 22, pp. 522-550, 1975.
[15] ITC'99 Benchmarks. Available: http://www.cad.polito.it/downloads/tools/itc99.html
[16] G. F. Diamos, A. R. Kerr, S. Yalamanchili, and N. Clark, "Ocelot: A Dynamic Optimization Framework for Bulk-Synchronous Applications in Heterogeneous Systems," in Int'l Conf. Parallel Architectures and Compilation Techniques, 2010.