Volume 45, Issue 5, 2017, Pages 1142-1163

On the Virtualization of CUDA Based GPU Remoting on ARM and X86 Machines in the GVirtuS Framework

Author keywords

ARM; Cloud; GPGPU; HPC; Virtualization

Indexed keywords

ARM PROCESSORS; CLOUDS; DATA HANDLING; PROGRAM PROCESSORS; SCHEDULING; VIRTUAL REALITY;

EID: 84991112040     PISSN: 08857458     EISSN: None     Source Type: Journal    
DOI: 10.1007/s10766-016-0462-1     Document Type: Article
Times cited: (30)

References (41)
  • 3
    • 77958566946
    • Efficient sparse matrix-vector multiplication on CUDA
    • Bell, N., Garland, M.: Efficient sparse matrix-vector multiplication on CUDA. NVIDIA Technical Report NVR-2008-004, Nvidia Corporation (2008)
    • (2008) NVIDIA Technical Report NVR-2008-004, Nvidia Corporation
    • Bell, N.1    Garland, M.2
  • 6
    • 0002806690
    • OpenMP: an industry standard API for shared-memory programming
    • Dagum, L., Menon, R.: OpenMP: an industry standard API for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)
    • (1998) IEEE Comput. Sci. Eng. , vol.5 , Issue.1 , pp. 46-55
    • Dagum, L.1    Menon, R.2
  • 8
    • 84867278912
    • SIaaS-sensing instrument as a service using cloud computing to turn physical instrument into ubiquitous service
    • Di Lauro, R., Lucarelli, F., Montella, R.: SIaaS-sensing instrument as a service using cloud computing to turn physical instrument into ubiquitous service. In: 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications. IEEE, pp. 861–862 (2012)
    • (2012) 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications , pp. 861-862
    • Di Lauro, R.1    Lucarelli, F.2    Montella, R.3
  • 10
    • 33749237984
    • pPOM: a nested, scalable, parallel and Fortran 90 implementation of the Princeton Ocean Model
    • Giunta, G., Mariani, P., Montella, R., Riccio, A.: pPOM: a nested, scalable, parallel and Fortran 90 implementation of the Princeton Ocean Model. Environ. Model. Softw. 22(1), 117–122 (2007)
    • (2007) Environ. Model. Softw. , vol.22 , Issue.1 , pp. 117-122
    • Giunta, G.1    Mariani, P.2    Montella, R.3    Riccio, A.4
  • 13
    • 84874423459
    • A GPU accelerated high performance cloud computing infrastructure for grid computing based virtual environmental laboratory
    • Giunta, G., Montella, R., Laccetti, G., Isaila, F., Blas, F.: A GPU accelerated high performance cloud computing infrastructure for grid computing based virtual environmental laboratory. Adv. Grid Comput. 35–43 (2011)
    • (2011) Adv. Grid Comput. , pp. 35-43
    • Giunta, G.1    Montella, R.2    Laccetti, G.3    Isaila, F.4    Blas, F.5
  • 18
    • 77951610849
    • Accelerating high performance applications with CUDA and MPI
    • Karunadasa, N.P., Ranasinghe, D.N.: Accelerating high performance applications with CUDA and MPI. In: 2009 International Conference on Industrial and Information Systems (ICIIS), pp. 331–336. IEEE (2009)
    • (2009) 2009 International Conference on Industrial and Information Systems (ICIIS) , pp. 331-336
    • Karunadasa, N.P.1    Ranasinghe, D.N.2
  • 20
    • 84901266785
    • The high performance internet of things: using GVirtuS to share high-end GPUs with ARM based cluster computing nodes
    • Springer, Berlin, Heidelberg
    • Laccetti, G., Montella, R., Palmieri, C., Pelliccia, V.: The high performance internet of things: using GVirtuS to share high-end GPUs with ARM based cluster computing nodes. In: Parallel Processing and Applied Mathematics 2013, LNCS, vol. 8384, pp. 734–744. Springer, Berlin, Heidelberg (2013)
    • (2013) Parallel Processing and Applied Mathematics 2013, LNCS, vol. 8384 , pp. 734-744
    • Laccetti, G.1    Montella, R.2    Palmieri, C.3    Pelliccia, V.4
  • 22
    • 77951106340
    • CUDASW++ 2.0: enhanced Smith–Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions
    • Liu, Y., Schmidt, B., Maskell, D.L.: CUDASW++ 2.0: enhanced Smith–Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions. BMC Res. Notes 3(1), 93 (2010)
    • (2010) BMC Res. Notes , vol.3 , Issue.1 , pp. 93
    • Liu, Y.1    Schmidt, B.2    Maskell, D.L.3
  • 23
    • 43349092363
    • CUDA compatible GPU cards as efficient hardware accelerators for Smith–Waterman sequence alignment
    • Manavski, S.A., Valle, G.: CUDA compatible GPU cards as efficient hardware accelerators for Smith–Waterman sequence alignment. BMC Bioinf. 9(2), 1 (2008)
    • (2008) BMC Bioinf. , vol.9 , Issue.2 , pp. 1
    • Manavski, S.A.1    Valle, G.2
  • 25
    • 84896398134
    • Using hybrid grid/cloud computing technologies for environmental data elastic storage, processing, and provisioning
    • Springer, USA
    • Montella, R., Foster, I.: Using hybrid grid/cloud computing technologies for environmental data elastic storage, processing, and provisioning. In: Handbook of Cloud Computing, pp. 595–618. Springer, USA (2010)
    • (2010) Handbook of Cloud Computing , pp. 595-618
    • Montella, R.1    Foster, I.2
  • 27
    • 84896394744
    • Virtualizing high-end GPGPUs on ARM clusters for the next generation of high performance cloud computing
    • Montella, R., Giunta, G., Laccetti, G.: Virtualizing high-end GPGPUs on ARM clusters for the next generation of high performance cloud computing. Cluster Comput. 17(1), 139–152 (2014)
    • (2014) Cluster Comput. , vol.17 , Issue.1 , pp. 139-152
    • Montella, R.1    Giunta, G.2    Laccetti, G.3
  • 34
    • 85119102126
    • POSTER: Boosting the performance of remote GPU virtualization using InfiniBand connect-IB and PCIe 3.0
    • Reaño, C., Silla, F., Pena, A.J., Shainer, G., Schultz, S., Castello, A., Quintana-Orti, E.S., Duato, J.: POSTER: Boosting the performance of remote GPU virtualization using InfiniBand connect-IB and PCIe 3.0. In: 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 266–267. IEEE (2014)
    • (2014) 2014 IEEE International Conference on Cluster Computing (CLUSTER) , pp. 266-267
    • Reaño, C.1    Silla, F.2    Pena, A.J.3    Shainer, G.4    Schultz, S.5    Castello, A.6    Quintana-Orti, E.S.7    Duato, J.8
  • 35
    • 84860524424
    • vCUDA: GPU-accelerated high-performance computing in virtual machines
    • Shi, L., Chen, H., Sun, J., Li, K.: vCUDA: GPU-accelerated high-performance computing in virtual machines. IEEE Trans. Comput. 61(6), 804–816 (2012)
    • (2012) IEEE Trans. Comput. , vol.61 , Issue.6 , pp. 804-816
    • Shi, L.1    Chen, H.2    Sun, J.3    Li, K.4
  • 36
    • 51449118065
    • A performance study of general-purpose applications on graphics processors using CUDA
    • Che, S., Boyer, M., Meng, J., Tarjan, D., Sheaffer, J.W., Skadron, K.: A performance study of general-purpose applications on graphics processors using CUDA. J. Parallel Distrib. Comput. 68(10), 1370–1380 (2008)
    • (2008) J. Parallel Distrib. Comput. , vol.68 , Issue.10 , pp. 1370-1380
    • Che, S.1    Boyer, M.2    Meng, J.3    Tarjan, D.4    Sheaffer, J.W.5    Skadron, K.6
  • 38
    • 84962916916
    • Effective multi-GPU communication using multiple CUDA streams and threads
    • Sourouri, M., Gillberg, T., Baden, S.B., Cai, X.: Effective multi-GPU communication using multiple CUDA streams and threads. In: 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), pp. 981–986. IEEE (2014)
    • (2014) 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS) , pp. 981-986
    • Sourouri, M.1    Gillberg, T.2    Baden, S.B.3    Cai, X.4
  • 40
    • 70350771131
    • Benchmarking GPUs to tune dense linear algebra
    • Volkov, V., Demmel, J.W.: Benchmarking GPUs to tune dense linear algebra. In: International Conference for High Performance Computing, Networking, Storage and Analysis 2008, SC 2008, pp. 1–11. IEEE (2008)
    • (2008) International Conference for High Performance Computing, Networking, Storage and Analysis 2008, SC 2008 , pp. 1-11
    • Volkov, V.1    Demmel, J.W.2
  • 41
    • 78149358756
    • Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters
    • Yang, C., Huang, C., Lin, C.: Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters. Comput. Phys. Commun. 182(1), 266–269 (2011)
    • (2011) Comput. Phys. Commun. , vol.182 , Issue.1 , pp. 266-269
    • Yang, C.1    Huang, C.2    Lin, C.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.