-
1
-
-
78651550268
-
Scalable parallel programming with CUDA
-
March
-
J. Nickolls, I. Buck, M. Garland, and K. Skadron, "Scalable parallel programming with CUDA," Queue, vol. 6, pp. 40-53, March 2008.
-
(2008)
Queue
, vol.6
, pp. 40-53
-
-
Nickolls, J.1
Buck, I.2
Garland, M.3
Skadron, K.4
-
2
-
-
84863710946
-
-
"OpenCL," http://www.khronos.org/opencl/.
-
OpenCL
-
-
-
3
-
-
78149231331
-
MapCG: Writing parallel program portable between cpu and gpu
-
New York, NY, USA
-
C. Hong et al., "MapCG: writing parallel program portable between cpu and gpu," in PACT '10, New York, NY, USA, 2010, pp. 217-226.
-
(2010)
PACT '10
, pp. 217-226
-
-
Hong, C.1
-
4
-
-
78149233155
-
Ocelot: A dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
-
New York, NY, USA
-
G. F. Diamos et al., "Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems," in PACT '10, New York, NY, USA, 2010, pp. 353-364.
-
(2010)
PACT '10
, pp. 353-364
-
-
Diamos, G.F.1
-
5
-
-
72049125355
-
Coordinating the use of GPU and CPU for improving performance of compute intensive applications
-
G. Teodoro et al., "Coordinating the use of GPU and CPU for improving performance of compute intensive applications," in CLUSTER '09, 2009, pp. 1-10.
-
(2009)
CLUSTER '09
, pp. 1-10
-
-
Teodoro, G.1
-
6
-
-
57349153933
-
Harmony: An execution model and runtime for heterogeneous many core systems
-
New York, NY, USA
-
G. F. Diamos and S. Yalamanchili, "Harmony: an execution model and runtime for heterogeneous many core systems," in HPDC '08, New York, NY, USA, 2008, pp. 197-200.
-
(2008)
HPDC '08
, pp. 197-200
-
-
Diamos, G.F.1
Yalamanchili, S.2
-
7
-
-
76749140917
-
Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping
-
New York, NY, USA
-
C.-K. Luk, S. Hong, and H. Kim, "Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping," in MICRO '09, New York, NY, USA, 2009, pp. 45-55.
-
(2009)
MICRO '09
, pp. 45-55
-
-
Luk, C.-K.1
Hong, S.2
Kim, H.3
-
8
-
-
77954709868
-
Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations
-
New York, NY, USA
-
V. T. Ravi, W. Ma, D. Chiu, and G. Agrawal, "Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations," in ICS '10, New York, NY, USA, 2010, pp. 137-146.
-
(2010)
ICS '10
, pp. 137-146
-
-
Ravi, V.T.1
Ma, W.2
Chiu, D.3
Agrawal, G.4
-
9
-
-
77954927300
-
Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory
-
New York, NY, USA
-
M. Becchi et al., "Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory," in SPAA '10, New York, NY, USA, 2010, pp. 82-91.
-
(2010)
SPAA '10
, pp. 82-91
-
-
Becchi, M.1
-
11
-
-
84877719088
-
-
"PGI CUDA-X86 Compiler," http://www.pgroup.com/resources/cuda-x86.htm.
-
PGI CUDA-X86 Compiler
-
-
-
12
-
-
79251597562
-
Swan: A tool for porting cuda programs to opencl
-
M. Harvey and G. D. Fabritiis, "Swan: A tool for porting cuda programs to opencl," Computer Physics Communications, vol. 182, no. 4, pp. 1093-1099, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0010465511000117
-
(2011)
Computer Physics Communications
, vol.182
, Issue.4
, pp. 1093-1099
-
-
Harvey, M.1
Fabritiis, G.D.2
-
13
-
-
78650802947
-
OpenMPC: Extended openmp programming and tuning for gpus
-
Washington, DC, USA: IEEE Computer Society
-
S. Lee and R. Eigenmann, "OpenMPC: Extended openmp programming and tuning for gpus," in Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 1-11. [Online]. Available: http://dx.doi.org/10.1109/SC.2010.36
-
(2010)
SC '10
, pp. 1-11
-
-
Lee, S.1
Eigenmann, R.2
-
14
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
Washington, DC, USA
-
S. Che et al., "Rodinia: A benchmark suite for heterogeneous computing," in IISWC '09, Washington, DC, USA, 2009, pp. 44-54.
-
(2009)
IISWC '09
, pp. 44-54
-
-
Che, S.1
-
16
-
-
77952256778
-
Modeling GPU-CPU workloads and systems
-
A. Kerr et al., "Modeling GPU-CPU workloads and systems," in GPGPU '10, 2010, pp. 31-42.
-
(2010)
GPGPU '10
, pp. 31-42
-
-
Kerr, A.1
-
17
-
-
23944523844
-
Parallel job scheduling - A status report
-
D. G. Feitelson, L. Rudolph, and U. Schwiegelshohn, "Parallel job scheduling - a status report," in JSSPP, 2004, pp. 1-16.
-
(2004)
JSSPP
, pp. 1-16
-
-
Feitelson, D.G.1
Rudolph, L.2
Schwiegelshohn, U.3
-
18
-
-
0030149947
-
Effective distributed scheduling of parallel workloads
-
May
-
A. C. Dusseau, R. H. Arpaci, and D. E. Culler, "Effective distributed scheduling of parallel workloads," SIGMETRICS Perform. Eval. Rev., vol. 24, pp. 25-36, May 1996.
-
(1996)
SIGMETRICS Perform. Eval. Rev.
, vol.24
, pp. 25-36
-
-
Dusseau, A.C.1
Arpaci, R.H.2
Culler, D.E.3
-
19
-
-
0027721450
-
Performance analysis of job scheduling policies in parallel supercomputing environments
-
New York, NY, USA
-
V. K. Naik, M. S. Squillante, and S. K. Setia, "Performance analysis of job scheduling policies in parallel supercomputing environments," in Supercomputing '93, New York, NY, USA, 1993, pp. 824-833.
-
(1993)
Supercomputing '93
, pp. 824-833
-
-
Naik, V.K.1
Squillante, M.S.2
Setia, S.K.3
-
20
-
-
0242656076
-
Scheduling of parallel jobs in a heterogeneous multi-site environment
-
G. Sabin, R. Kettimuthu, A. Rajan, and P. Sadayappan, "Scheduling of parallel jobs in a heterogeneous multi-site environment," in Proc. of the 9th International Workshop on Job Scheduling Strategies for Parallel Processing, 2003, pp. 87-104.
-
Proc. of the 9th International Workshop on Job Scheduling Strategies for Parallel Processing, 2003
, pp. 87-104
-
-
Sabin, G.1
Kettimuthu, R.2
Rajan, A.3
Sadayappan, P.4
-
21
-
-
27544449350
-
Assessment and enhancement of meta-schedulers for multi-site job sharing
-
Washington, DC, USA
-
G. Sabin, V. Sahasrabudhe, and P. Sadayappan, "Assessment and enhancement of meta-schedulers for multi-site job sharing," in HPDC '05, Washington, DC, USA, 2005, pp. 144-153.
-
(2005)
HPDC '05
, pp. 144-153
-
-
Sabin, G.1
Sahasrabudhe, V.2
Sadayappan, P.3
-
22
-
-
4644370318
-
Single-isa heterogeneous multi-core architectures for multithreaded workload performance
-
Washington, DC, USA: IEEE Computer Society
-
R. Kumar et al., "Single-isa heterogeneous multi-core architectures for multithreaded workload performance," in ISCA '04. Washington, DC, USA: IEEE Computer Society, 2004, pp. 64-75.
-
(2004)
ISCA '04
, pp. 64-75
-
-
Kumar, R.1
-
23
-
-
34247331460
-
Dynamic thread assignment on heterogeneous multiprocessor architectures
-
New York, NY, USA
-
M. Becchi and P. Crowley, "Dynamic thread assignment on heterogeneous multiprocessor architectures," in CF '06, New York, NY, USA, 2006, pp. 29-40.
-
(2006)
CF '06
, pp. 29-40
-
-
Becchi, M.1
Crowley, P.2
-
25
-
-
83455220920
-
Comprehensive performance monitoring for gpu cluster systems
-
Washington, DC, USA: IEEE Computer Society
-
K. Furlinger, N. J. Wright, and D. Skinner, "Comprehensive performance monitoring for gpu cluster systems," in Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, ser. IPDPSW '11. Washington, DC, USA: IEEE Computer Society, 2011, pp. 1377-1386. [Online]. Available: http://dx.doi.org/10.1109/IPDPS.2011.289
-
(2011)
IPDPSW '11
, pp. 1377-1386
-
-
Furlinger, K.1
Wright, N.J.2
Skinner, D.3
-
26
-
-
84866869010
-
MATE-CG: A mapreduce-like framework for accelerating data-intensive computations on heterogeneous clusters
-
to appear
-
W. Jiang and G. Agrawal, "MATE-CG: A mapreduce-like framework for accelerating data-intensive computations on heterogeneous clusters," in IPDPS '12 (to appear), 2012.
-
(2012)
IPDPS '12
-
-
Jiang, W.1
Agrawal, G.2
-
27
-
-
84871429328
-
-
"Torque Resource Manager," http://www.clusterresources.com/products/torque-resource-manager.php.
-
Torque Resource Manager
-
-
-
28
-
-
0032591264
-
Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems
-
Washington, DC, USA: IEEE Computer Society
-
M. Maheswaran, S. Ali, H. J. Siegel, D. Hensgen, and R. F. Freund, "Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems," in Proceedings of the Eighth Heterogeneous Computing Workshop, ser. HCW '99. Washington, DC, USA: IEEE Computer Society, 1999, pp. 30-. [Online]. Available: http://dl.acm.org/citation.cfm?id=795690.797893
-
(1999)
HCW '99
, pp. 30
-
-
Maheswaran, M.1
Ali, S.2
Siegel, H.J.3
Hensgen, D.4
Freund, R.F.5
-
29
-
-
0002691736
-
The elusive goal of workload characterization
-
March
-
A. B. Downey and D. G. Feitelson, "The elusive goal of workload characterization," SIGMETRICS Perform. Eval. Rev., vol. 26, pp. 14-29, March 1999. [Online]. Available: http://doi.acm.org/10.1145/309746.309750
-
(1999)
SIGMETRICS Perform. Eval. Rev.
, vol.26
, pp. 14-29
-
-
Downey, A.B.1
Feitelson, D.G.2
-
30
-
-
0006547373
-
Scheduling resources in multi-user, heterogeneous, computing environments with smartnet
-
Washington, DC, USA: IEEE Computer Society
-
R. F. t. Freund, "Scheduling resources in multi-user, heterogeneous, computing environments with smartnet," in Proceedings of the Seventh Heterogeneous Computing Workshop, ser. HCW '98. Washington, DC, USA: IEEE Computer Society, 1998, pp. 3-. [Online]. Available: http://dl.acm.org/citation.cfm?id=795689.797878
-
(1998)
HCW '98
, pp. 3
-
-
Freund, R.F.T.1
-
31
-
-
0036802314
-
Using moldability to improve the performance of supercomputer jobs
-
W. Cirne and F. Berman, "Using moldability to improve the performance of supercomputer jobs," JPDC, vol. 62, no. 10, pp. 1571-1601, 2002.
-
(2002)
JPDC
, vol.62
, Issue.10
, pp. 1571-1601
-
-
Cirne, W.1
Berman, F.2
-
32
-
-
34248186212
-
Effective selection of partition sizes for moldable scheduling of parallel jobs
-
S. Srinivasan, V. Subramani, R. Kettimuthu, P. Holenarsipur, and P. Sadayappan, "Effective selection of partition sizes for moldable scheduling of parallel jobs," in HiPC, 2002, pp. 174-183.
-
(2002)
HiPC
, pp. 174-183
-
-
Srinivasan, S.1
Subramani, V.2
Kettimuthu, R.3
Holenarsipur, P.4
Sadayappan, P.5
-
33
-
-
0032202051
-
DPS: Dynamic priority scheduling heuristic for heterogeneous computing systems
-
I. Ahmad, M. Dhodhi, and R. Ul-Mustafa, "DPS: Dynamic priority scheduling heuristic for heterogeneous computing systems," IEE Proceedings - Computers and Digital Techniques, vol. 145, no. 6, pp. 411-418, 1998. [Online]. Available: http://link.aip.org/link/?ICE/145/411/1
-
(1998)
IEE Proceedings - Computers and Digital Techniques
, vol.145
, Issue.6
, pp. 411-418
-
-
Ahmad, I.1
Dhodhi, M.2
Ul-Mustafa, R.3
-
34
-
-
84934343585
-
Realistic modeling and synthesis of resources for computational grids
-
Y. S. Kee, H. Casanova, and A. Chien, "Realistic modeling and synthesis of resources for computational grids," in SC '04, 2004, pp. 54-63.
-
(2004)
SC '04
, pp. 54-63
-
-
Kee, Y.S.1
Casanova, H.2
Chien, A.3
-
35
-
-
34548270092
-
Improving grid resource allocation via integrated selection and binding
-
Y. S. Kee, K. Yocum, A. A. Chien, and H. Casanova, "Improving grid resource allocation via integrated selection and binding," in SC '06, 2006.
-
(2006)
SC '06
-
-
Kee, Y.S.1
Yocum, K.2
Chien, A.A.3
Casanova, H.4
-
36
-
-
23944436115
-
New grid scheduling and rescheduling methods in the grads project
-
F. Berman et al., "New grid scheduling and rescheduling methods in the grads project," International Journal of Parallel Programming, pp. 209-229, 2005.
-
(2005)
International Journal of Parallel Programming
, pp. 209-229
-
-
Berman, F.1
-
37
-
-
84983561277
-
Scheduling parallel applications on utility grids: Time and cost trade-off management
-
S. K. Garg, R. Buyya, and H. J. Siegel, "Scheduling parallel applications on utility grids: Time and cost trade-off management," in ACSC '09, 2009, pp. 139-147.
-
(2009)
ACSC '09
, pp. 139-147
-
-
Garg, S.K.1
Buyya, R.2
Siegel, H.J.3