-
1
-
-
84866918568
-
-
NVIDIA. CUDA C Programming Guide 7.5. https://docs.nvidia.com/cuda/cuda-c-programming-guide/; 2016.
-
(2016)
CUDA C Programming Guide 7.5
-
-
-
2
-
-
84900624911
-
Red Fox: An execution environment for relational query processing on GPUs
-
CGO '14, Orlando, FL, USA, ACM
-
Wu H, Diamos G, Sheard T, et al. Red Fox: An execution environment for relational query processing on GPUs. Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14. Orlando, FL, USA: ACM; 2014:44:44–44:54.
-
(2014)
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
, pp. 44:44-44:54
-
-
Wu, H.1
Diamos, G.2
Sheard, T.3
-
3
-
-
79955948842
-
Data parallel three-dimensional Cahn-Hilliard field equation simulation on GPUs with CUDA
-
Las Vegas, Nevada, USA
-
Playne DP, Hawick KA. Data parallel three-dimensional Cahn-Hilliard field equation simulation on GPUs with CUDA. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA, Las Vegas, Nevada, USA; 2009.
-
(2009)
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA
-
-
Playne, D.P.1
Hawick, K.A.2
-
4
-
-
84908118037
-
Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems
-
Yamazaki I, Dong T, Solc R, Tomov S, Dongarra J, Schulthess T. Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems. Concurrency Computat: Pract Exper. 2014;26(16):2652–2666.
-
(2014)
Concurrency Computat: Pract Exper
, vol.26
, Issue.16
, pp. 2652-2666
-
-
Yamazaki, I.1
Dong, T.2
Solc, R.3
Tomov, S.4
Dongarra, J.5
Schulthess, T.6
-
6
-
-
77953133957
-
Parallel option pricing with Fourier space time-stepping method on graphics processing units
-
Surkov V. Parallel option pricing with Fourier space time-stepping method on graphics processing units. Parallel Comput. 2010;36(7):372–380.
-
(2010)
Parallel Comput
, vol.36
, Issue.7
, pp. 372-380
-
-
Surkov, V.1
-
7
-
-
84879422035
-
Performance modeling of microsecond scale biological molecular dynamics simulations on heterogeneous architectures
-
Agarwal PK, Hampton S, Poznanovic J, Ramanathan A, Alam SR, Crozier PS. Performance modeling of microsecond scale biological molecular dynamics simulations on heterogeneous architectures. Concurrency Computat: Pract Exper. 2013;25(10):1356–1375.
-
(2013)
Concurrency Computat: Pract Exper
, vol.25
, Issue.10
, pp. 1356-1375
-
-
Agarwal, P.K.1
Hampton, S.2
Poznanovic, J.3
Ramanathan, A.4
Alam, S.R.5
Crozier, P.S.6
-
8
-
-
0242571753
-
Slurm: Simple Linux utility for resource management
-
Berlin, Heidelberg, Springer Berlin Heidelberg
-
Yoo AB, Jette MA, Grondona M. Slurm: Simple Linux utility for resource management. Job Scheduling Strategies for Parallel Processing: 9th International Workshop, JSSPP 2003, Seattle, WA, USA, June 24, 2003. Revised Paper. Berlin, Heidelberg: Springer Berlin Heidelberg; 2003:44–60.
-
(2003)
Job Scheduling Strategies for Parallel Processing: 9th International Workshop, JSSPP 2003, Seattle, WA, USA, June 24, 2003. Revised Paper
, pp. 44-60
-
-
Yoo, A.B.1
Jette, M.A.2
Grondona, M.3
-
9
-
-
84978076653
-
Remote GPU virtualization: Is it useful
-
Barcelona, Spain, IEEE Computer Society
-
Silla F, Prades J, Iserte S, Reaño C. Remote GPU virtualization: Is it useful. The 2nd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era. Barcelona, Spain: IEEE Computer Society; 2016:41–48.
-
(2016)
The 2nd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era
, pp. 41-48
-
-
Silla, F.1
Prades, J.2
Iserte, S.3
Reaño, C.4
-
11
-
-
84876533447
-
DS-CUDA: A middleware to use many GPUs in the cloud environment
-
SCC '12, IEEE Computer Society, Washington, DC, USA
-
Oikawa M, Kawai A, Nomura K, Yasuoka K, Yoshikawa K, Narumi T. DS-CUDA: A middleware to use many GPUs in the cloud environment. Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC '12. IEEE Computer Society, Washington, DC, USA; 2012:1207–1214.
-
(2012)
Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis
, pp. 1207-1214
-
-
Oikawa, M.1
Kawai, A.2
Nomura, K.3
Yasuoka, K.4
Yoshikawa, K.5
Narumi, T.6
-
12
-
-
84867268537
-
-
Euro-Par 2010 - Parallel Processing, Ischia, Italy, Springer
-
Giunta G, Montella R, Agrillo G, Coviello G. A GPGPU transparent virtualization component for high performance computing clouds. Euro-Par 2010 - Parallel Processing. Ischia, Italy: Springer; 2010.
-
(2010)
A GPGPU transparent virtualization component for high performance computing clouds
-
-
Giunta, G.1
Montella, R.2
Agrillo, G.3
Coviello, G.4
-
13
-
-
70450031611
-
vCUDA: GPU accelerated high performance computing in virtual machines
-
IPDPS 2009, Rome, Italy, IEEE
-
Shi L, Chen H, Sun J. vCUDA: GPU accelerated high performance computing in virtual machines. IEEE International Symposium on Parallel & Distributed Processing, 2009. IPDPS 2009. Rome, Italy: IEEE; 2009:1–11.
-
(2009)
IEEE International Symposium on Parallel & Distributed Processing
, pp. 1-11
-
-
Shi, L.1
Chen, H.2
Sun, J.3
-
14
-
-
70349123351
-
GViM: GPU-accelerated virtual machines
-
Nuremberg, Germany, ACM
-
Gupta V, Gavrilovska A, Schwan K, et al. GViM: GPU-accelerated virtual machines. Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing, Nuremberg, Germany: ACM; 2009:17–24.
-
(2009)
Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing
, pp. 17-24
-
-
Gupta, V.1
Gavrilovska, A.2
Schwan, K.3
-
15
-
-
84908669300
-
A complete and efficient CUDA-sharing solution for HPC clusters
-
Peña AJ, Reaño C, Silla F, Mayo R, Quintana-Orti ES, Duato J. A complete and efficient CUDA-sharing solution for HPC clusters. Parallel Comput. 2014;40:574–588.
-
(2014)
Parallel Comput
, vol.40
, pp. 574-588
-
-
Peña, A.J.1
Reaño, C.2
Silla, F.3
Mayo, R.4
Quintana-Orti, E.S.5
Duato, J.6
-
16
-
-
85020087021
-
-
Accessed July 1, 2016
-
CUDA API Reference Manual 7.5. https://developer.nvidia.com/cuda-toolkit; 2016. Accessed July 1, 2016.
-
(2016)
CUDA API Reference Manual 7.5
-
-
-
17
-
-
79960181836
-
Shadowfax: Scaling in heterogeneous cluster systems via GPGPU assemblies
-
VTDC '11, ACM, New York, NY, USA
-
Merritt AM, Gupta V, Verma A, Gavrilovska A, Schwan K. Shadowfax: Scaling in heterogeneous cluster systems via GPGPU assemblies. Proceedings of the 5th International Workshop on Virtualization Technologies in Distributed Computing, VTDC '11. ACM, New York, NY, USA; 2011:3–10.
-
(2011)
Proceedings of the 5th International Workshop on Virtualization Technologies in Distributed Computing
, pp. 3-10
-
-
Merritt, A.M.1
Gupta, V.2
Verma, A.3
Gavrilovska, A.4
Schwan, K.5
-
21
-
-
84981309714
-
Local and remote GPUs perform similar with EDR 100G InfiniBand
-
Middleware Industry '15, Vancouver, Canada
-
Reaño C, Silla F, Shainer G, Schultz S. Local and remote GPUs perform similar with EDR 100G InfiniBand. Proceedings of the Industrial Track of the 16th International Middleware Conference, Middleware Industry '15, Vancouver, Canada; 2015.
-
(2015)
Proceedings of the Industrial Track of the 16th International Middleware Conference
-
-
Reaño, C.1
Silla, F.2
Shainer, G.3
Schultz, S.4
-
22
-
-
84941790414
-
Improving the user experience of the rCUDA remote GPU virtualization framework
-
Reaño C, Silla F, Castello A, et al. Improving the user experience of the rCUDA remote GPU virtualization framework. Concurrency Computat: Pract Exper. 2015;27(14):3746–3770.
-
(2015)
Concurrency Computat: Pract Exper
, vol.27
, Issue.14
, pp. 3746-3770
-
-
Reaño, C.1
Silla, F.2
Castello, A.3
-
24
-
-
78651415181
-
GPU-BLAST: Using graphics processors to accelerate protein sequence alignment
-
Vouzis PD, Sahinidis NV. GPU-BLAST: Using graphics processors to accelerate protein sequence alignment. Bioinformatics. 2011;27(2):182–188.
-
(2011)
Bioinformatics
, vol.27
, Issue.2
, pp. 182-188
-
-
Vouzis, P.D.1
Sahinidis, N.V.2
-
25
-
-
84855431216
-
Implementing molecular dynamics on hybrid high performance computers: Particle-particle particle-mesh
-
Brown WM, Kohlmeyer A, Plimpton SJ, Tharrington AN. Implementing molecular dynamics on hybrid high performance computers: Particle-particle particle-mesh. Comput Phys Commun. 2012;183(3):449–459.
-
(2012)
Comput Phys Commun
, vol.183
, Issue.3
, pp. 449-459
-
-
Brown, W.M.1
Kohlmeyer, A.2
Plimpton, S.J.3
Tharrington, A.N.4
-
26
-
-
77956339477
-
CUDA-MEME: Accelerating motif discovery in biological sequences using CUDA-enabled graphics processing units
-
Liu Y, Schmidt B, Liu W, Maskell DL. CUDA-MEME: Accelerating motif discovery in biological sequences using CUDA-enabled graphics processing units. Pattern Recogn Lett. 2010;31(14):2170–2177.
-
(2010)
Pattern Recogn Lett
, vol.31
, Issue.14
, pp. 2170-2177
-
-
Liu, Y.1
Schmidt, B.2
Liu, W.3
Maskell, D.L.4
-
27
-
-
84875592758
-
GROMACS 4.5: A high-throughput and highly parallel open source molecular simulation toolkit
-
Pronk S, Páll S, Schulz R, et al. GROMACS 4.5: A high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics. 2013;29(7):845–854.
-
(2013)
Bioinformatics
, vol.29
, Issue.7
, pp. 845-854
-
-
Pronk, S.1
Páll, S.2
Schulz, R.3
-
28
-
-
84855722896
-
Barracuda - a fast short read sequence aligner using graphics processing units
-
Klus P, Lam S, Lyberg D, et al. Barracuda - a fast short read sequence aligner using graphics processing units. BMC Res Not. 2012;5(1):27.
-
(2012)
BMC Res Not
, vol.5
, Issue.1
, pp. 27
-
-
Klus, P.1
Lam, S.2
Lyberg, D.3
-
29
-
-
2942538300
-
Versatile and open software for comparing large genomes
-
Kurtz S, Phillippy A, Delcher AL, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5(2):R12.
-
(2004)
Genome Biol
, vol.5
, Issue.2
, pp. R12
-
-
Kurtz, S.1
Phillippy, A.2
Delcher, A.L.3
-
30
-
-
79955702502
-
Libsvm: A library for support vector machines
-
Chang C-C, Lin C-J. Libsvm: A library for support vector machines. ACM Trans Intell Syst Technol. May 2011;2(3):27:1–27:27.
-
(2011)
ACM Trans Intell Syst Technol
, vol.2
, Issue.3
, pp. 27:1-27:27
-
-
Chang, C.-C.1
Lin, C.-J.2
-
31
-
-
27344436659
-
Scalable molecular dynamics with NAMD
-
Phillips JC, Braun R, Wang W, et al. Scalable molecular dynamics with NAMD. J Comput Chem. 2005;26(16):1781–1802.
-
(2005)
J Comput Chem
, vol.26
, Issue.16
, pp. 1781-1802
-
-
Phillips, J.C.1
Braun, R.2
Wang, W.3
-
32
-
-
85020097885
-
-
Accessed July 1, 2016
-
NVIDIA Popular GPU-Accelerated Applications Catalog. http://www.nvidia.es/content/tesla/pdf/gpu-accelerated-applications-for-hpc.pdf; 2016. Accessed July 1, 2016.
-
(2016)
NVIDIA Popular GPU-Accelerated Applications Catalog
-
-
-
33
-
-
85116180112
-
-
7th IEEE International Conference on Cloud Computing (CLOUD 2014), Anchorage, AK, USA
-
Walters JP, Younge AJ, Kang D-I, et al. GPU-passthrough performance: A comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL applications. 7th IEEE International Conference on Cloud Computing (CLOUD 2014), Anchorage, AK, USA; 2014.
-
(2014)
GPU-passthrough performance: A comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL applications
-
-
Walters, J.P.1
Younge, A.J.2
Kang, D.-I.3
-
34
-
-
84874254048
-
-
2012 IEEE 4th International Conference on Cloud Computing Technology and Science (CLOUDCOM), Taipei, Taiwan, IEEE
-
Yang C-T, Wang H-Y, Ou W-S, Liu Y-T, Hsu C-H. On implementation of GPU virtualization using PCI pass-through. 2012 IEEE 4th International Conference on Cloud Computing Technology and Science (CLOUDCOM), Taipei, Taiwan: IEEE; 2012:711–716.
-
(2012)
On implementation of GPU virtualization using PCI pass-through
, pp. 711-716
-
-
Yang, C.-T.1
Wang, H.-Y.2
Ou, W.-S.3
Liu, Y.-T.4
Hsu, C.-H.5
-
36
-
-
84878141243
-
Exploiting GPUs in virtual machine for BioCloud
-
Jo H, Jeong J, Lee M, Choi DH. Exploiting GPUs in virtual machine for BioCloud. BioMed Res Int. vol. 2013, Article ID 939460, 11 pages, 2013. doi:10.1155/2013/939460.
-
(2013)
BioMed Res Int
, vol.2013
, pp. 11
-
-
Jo, H.1
Jeong, J.2
Lee, M.3
Choi, D.H.4
-
37
-
-
84963767039
-
CUDA acceleration for Xen virtual machines in Infiniband clusters with rCUDA
-
PPoPP '16, Barcelona, Spain
-
Prades J, Reaño C, Silla F. CUDA acceleration for Xen virtual machines in Infiniband clusters with rCUDA. Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '16, Barcelona, Spain; 2016.
-
(2016)
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
-
-
Prades, J.1
Reaño, C.2
Silla, F.3
-
39
-
-
85000625507
-
CUDASW++ 3.0: Accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions
-
Liu Y, Wirawan A, Schmidt B. CUDASW++ 3.0: Accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions. BMC Bioinformatics. 2013;14(1):1–10.
-
(2013)
BMC Bioinformatics
, vol.14
, Issue.1
, pp. 1-10
-
-
Liu, Y.1
Wirawan, A.2
Schmidt, B.3
-
40
-
-
77950975351
-
CheCUDA: A checkpoint/restart tool for CUDA applications
-
Hiroshima, Japan
-
Takizawa H, Sato K, Komatsu K, Kobayashi H. CheCUDA: A checkpoint/restart tool for CUDA applications. Proceedings of the 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies, Hiroshima, Japan; 2009.
-
(2009)
Proceedings of the 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies
-
-
Takizawa, H.1
Sato, K.2
Komatsu, K.3
Kobayashi, H.4