SCOPUS Information Retrieval Platform

International Journal of Parallel Programming

Volume 45, Issue 5, 2017, Pages 1142-1163

On the Virtualization of CUDA Based GPU Remoting on ARM and X86 Machines in the GVirtuS Framework

(10) Montella, Raffaele (a); Giunta, Giulio (a); Laccetti, Giuliano (b); Lapegna, Marco (b); Palmieri, Carlo (a); Ferraro, Carmine (a); Pelliccia, Valentina (a); Hong, Cheol Ho (c); Spence, Ivor (c); Nikolopoulos, Dimitrios S. (c)

a UNIVERSITY OF NAPLES PARTHENOPE (Italy)

b UNIVERSITY OF NAPLES FEDERICO II (Italy)

c QUEEN'S UNIVERSITY BELFAST (United Kingdom)

Author keywords

ARM; Cloud; GPGPU; HPC; Virtualization

Indexed keywords

ARM PROCESSORS; CLOUDS; DATA HANDLING; PROGRAM PROCESSORS; SCHEDULING; VIRTUAL REALITY;

COMPUTING CLUSTERS; DISTRIBUTED CLOUDS; GPGPU; HARDWARE PLATFORM; HUMAN MACHINE INTERACTION; PROGRAMMING MODELS; TRANSPARENT LAYERS; VIRTUALIZATIONS;

BIG DATA;

EID: 84991112040 PISSN: 08857458 EISSN: None Source Type: Journal
DOI: 10.1007/s10766-016-0462-1 Document Type: Article

Times cited : (30)

References (41)

1
- 77953989275
- Shared device driver model for virtualized mobile handsets
- Armand, F., Gien, M., Maigné, G., Mardinian, G.: Shared device driver model for virtualized mobile handsets. In: Proceedings of the First Workshop on Virtualization in Mobile Computing, pp. 12–16. ACM (2008)
- (2008) Proceedings of the First Workshop on Virtualization in Mobile Computing, pp. 12–16. ACM
- Armand, F.¹ Gien, M.² Maigné, G.³ Mardinian, G.⁴

2
- 13444273448
- The universal protein resource (UniProt)
- Bairoch, A.M., Apweiler, R., Wu, C.H., Barker, W.C., Boeckmann, B., Ferro Rojas, S., Gasteiger, E., et al.: The universal protein resource (UniProt). Nucleic Acids Res. 33(Database issue), D154–D159 (2005)
- (2005) Nucleic Acids Res , vol.33 , pp. D154-D159
- Bairoch, A.M.¹ Apweiler, R.² Wu, C.H.³ Barker, W.C.⁴ Boeckmann, B.⁵ Ferro Rojas, S.⁶ Gasteiger, E.⁷

3
- 77958566946
- Efficient sparse matrix-vector multiplication on CUDA. NVIDIA Technical Report NVR-2008-004
- Bell, N., Garland, M.: Efficient sparse matrix-vector multiplication on CUDA. NVIDIA Technical Report NVR-2008-004, Nvidia Corporation (2008)
- (2008) Nvidia Corporation
- Bell, N.¹ Garland, M.²

4
- 24944576239
- A performance contract system in a grid enabling, component based programming environment
- Caruso, P., Laccetti, G., Lapegna, M.: A performance contract system in a grid enabling, component based programming environment. In: Advances in Grid Computing-EGC 2005, LNCS, vol. 3470, pp. 982–992. Springer (2005)
- (2005) Advances in Grid Computing-EGC 2005, LNCS, vol. 3470, pp. 982–992. Springer
- Caruso, P.¹ Laccetti, G.² Lapegna, M.³

5
- 84908663379
- On the use of remote GPUs and low-power processors for the acceleration of scientific applications
- Castello, A., Duato, J., Mayo, R., Peña, A.J., Quintana-Ortí, E.S., Roca, V., Silla, F.: On the use of remote GPUs and low-power processors for the acceleration of scientific applications. In: The Fourth International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies (ENERGY), pp. 57–62 (2014)
- (2014) The Fourth International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies (ENERGY) , pp. 57-62
- Castello, A.¹ Duato, J.² Mayo, R.³ Peña, A.J.⁴ Quintana-Ortí, E.S.⁵ Roca, V.⁶ Silla, F.⁷

6
- 0002806690
- OpenMP: an industry standard API for shared-memory programming
- Dagum, L., Menon, R.: OpenMP: an industry standard API for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)
- (1998) IEEE Comput. Sci. Eng. , vol.5 , Issue.1 , pp. 46-55
- Dagum, L.¹ Menon, R.²

7
- 84867244965
- Virtualizing general purpose GPUs for high performance cloud computing: an application to a fluid simulator
- Di Lauro, R., Giannone, F., Ambrosio, L., Montella, R.: Virtualizing general purpose GPUs for high performance cloud computing: an application to a fluid simulator. In: IEEE 10th International Symposium on Proceedings of Parallel and Distributed Processing with Applications (ISPA), pp. 863–864 (2012)
- (2012) IEEE 10th International Symposium on Proceedings of Parallel and Distributed Processing with Applications (ISPA) , pp. 863-864
- Di Lauro, R.¹ Giannone, F.² Ambrosio, L.³ Montella, R.⁴

8
- 84867278912
- SIaaS-sensing instrument as a service using cloud computing to turn physical instrument into ubiquitous service
- Di Lauro, R., Lucarelli, F., Montella, R.: SIaaS-sensing instrument as a service using cloud computing to turn physical instrument into ubiquitous service. In: 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications. IEEE, pp. 861–862 (2012)
- (2012) 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications. IEEE , pp. 861-862
- Di Lauro, R.¹ Lucarelli, F.² Montella, R.³

9
- 62949225623
- Cloud computing and grid computing 360-degree compared
- Foster, I., Zhao, Y., Raicu, I., Lu, S.: Cloud computing and grid computing 360-degree compared. In: IEEE Grid Computing Environments Workshop GCE 08, pp. 1–10 (2008)
- (2008) IEEE Grid Computing Environments Workshop GCE 08 , pp. 1-10
- Foster, I.¹ Zhao, Y.² Raicu, I.³ Lu, S.⁴

10
- 33749237984
- pPOM: a nested, scalable, parallel and Fortran 90 implementation of the Princeton Ocean Model
- Giunta, G., Mariani, P., Montella, R., Riccio, A.: pPOM: a nested, scalable, parallel and Fortran 90 implementation of the Princeton Ocean Model. Environ. Model. Softw. 22(1), 117–122 (2007)
- (2007) Environ. Model. Softw. , vol.22 , Issue.1 , pp. 117-122
- Giunta, G.¹ Mariani, P.² Montella, R.³ Riccio, A.⁴

11
- 53749092570
- Parallel computing experiences with CUDA
- Garland, M., Le Grand, S., Nickolls, J., Anderson, J., Hardwick, J., Morton, S., Phillips, E., Zhang, Y., Volkov, V.: Parallel computing experiences with CUDA. IEEE Micro 28(4), 13–27 (2008)
- (2008) IEEE Micro , vol.28 , Issue.4 , pp. 13-27
- Garland, M.¹ Le Grand, S.² Nickolls, J.³ Anderson, J.⁴ Hardwick, J.⁵ Morton, S.⁶ Phillips, E.⁷ Zhang, Y.⁸ Volkov, V.⁹

12
- 78349273083
- A GPGPU transparent virtualization component for high performance computing clouds
- Giunta, G., Montella, R., Agrillo, G., Coviello, G.: A GPGPU transparent virtualization component for high performance computing clouds. In: EuroPar 2010 Parallel Processing, LNCS, vol. 6271, no. 2, pp. 379–391. Springer (2010)
- (2010) EuroPar 2010 Parallel Processing, LNCS, vol. 6271, no. 2,. Springer , pp. 379-391
- Giunta, G.¹ Montella, R.² Agrillo, G.³ Coviello, G.⁴

13
- 84874423459
- A GPU accelerated high performance cloud computing infrastructure for grid computing based virtual environmental laboratory
- Giunta, G., Montella, R., Laccetti, G., Isaila, F., Blas, F.: A GPU accelerated high performance cloud computing infrastructure for grid computing based virtual environmental laboratory. Adv. Grid Comput. 35–43 (2011)
- (2011) Adv. Grid Comput. , pp. 35-43
- Giunta, G.¹ Montella, R.² Laccetti, G.³ Isaila, F.⁴ Blas, F.⁵

14
- 84974728189
- MPICH2: a new start for MPI implementations
- Gropp, W.: MPICH2: a new start for MPI implementations. In: Recent Advances in Parallel Virtual Machine and Message Passing Interface 2002, LNCS, vol. 2474, p. 7. Springer (2002)
- (2002) Recent Advances in Parallel Virtual Machine and Message Passing Interface 2002, LNCS, vol. 2474, p. 7. Springer
- Gropp, W.¹

15
- 70349123351
- GViM: GPU-accelerated virtual machines
- Gupta, V., Gavrilovska, A., Schwan, K., Kharche, H., Tolia, N., Talwar, V., Ranganathan, P.: GViM: GPU-accelerated virtual machines. In: Proceedings of the 3rd ACM Workshop on System-Level Virtualization for High Performance Computing, pp. 17–24. ACM (2009)
- (2009) Proceedings of the 3rd ACM Workshop on System-Level Virtualization for High Performance Computing,. ACM , pp. 17-24
- Gupta, V.¹ Gavrilovska, A.² Schwan, K.³ Kharche, H.⁴ Tolia, N.⁵ Talwar, V.⁶ Ranganathan, P.⁷

16
- 84965004939
- NVIDIA GRID: Graphics Accelerated VDI with the Visual Performance of a Workstation
- Herrera, A.: NVIDIA GRID: Graphics Accelerated VDI with the Visual Performance of a Workstation. Nvidia Corp, Santa Clara (2014)
- (2014) Nvidia Corp, Santa Clara
- Herrera, A.¹

17
- 84894216021
- Kawai, A., Yasuoka, K., Yoshikawa, K., Narumi, T.: Distributed-shared CUDA: virtualization of large-scale GPU systems for programmability and reliability (2012)
- (2012) Distributed-shared CUDA: virtualization of large-scale GPU systems for programmability and reliability
- Kawai, A.¹ Yasuoka, K.² Yoshikawa, K.³ Narumi, T.⁴

18
- 77951610849
- Accelerating high performance applications with CUDA and MPI
- Karunadasa, N.P., Ranasinghe, D.N.: Accelerating high performance applications with CUDA and MPI. In: 2009 International Conference on Industrial and Information Systems (ICIIS), pp. 331–336. IEEE (2009)
- (2009) 2009 International Conference on Industrial and Information Systems (ICIIS). IEEE , pp. 331-336
- Karunadasa, N.P.¹ Ranasinghe, D.N.²

19
- 84983201882
- GPUswap: enabling oversubscription of GPU memory through transparent swapping
- Kehne, J., Metter, J., Bellosa, F.: GPUswap: enabling oversubscription of GPU memory through transparent swapping. In: Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pp. 65–77. ACM (2015)
- (2015) Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments,. ACM , pp. 65-77
- Kehne, J.¹ Metter, J.² Bellosa, F.³

20
- 84901266785
- The high performance internet of things: using GVirtuS to share high-end GPUs with ARM based cluster computing nodes
- Laccetti, G., Montella, R., Palmieri, C., Pelliccia, V.: The high performance internet of things: using GVirtuS to share high-end GPUs with ARM based cluster computing nodes. In: Parallel Processing and Applied Mathematics 2013, LNCS, vol. 8384, pp. 734–744. Springer, Berlin, Heidelberg (2013)
- (2013) Parallel Processing and Applied Mathematics 2013, LNCS, vol. 8384 , pp. 734-744
- Laccetti, G.¹ Montella, R.² Palmieri, C.³ Pelliccia, V.⁴

21
- 70450031959
- An efficient implementation of Smith–Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases
- Ligowski, L., Rudnicki, W.: An efficient implementation of Smith–Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases. In: IEEE International Symposium on Parallel and Distributed Processing 2009, IPDPS 2009, pp. 1–8. IEEE (2009)
- (2009) IEEE International Symposium on Parallel and Distributed Processing 2009, IPDPS 2009,. IEEE , pp. 1-8
- Ligowski, L.¹ Rudnicki, W.²

22
- 77951106340
- CUDASW++ 2.0: enhanced Smith–Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions
- Liu, Y., Schmidt, B., Maskell, D.L.: CUDASW++ 2.0: enhanced Smith–Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions. BMC Res. Notes 3(1), 93 (2010)
- (2010) BMC Res. Notes , vol.3 , Issue.1 , pp. 93
- Liu, Y.¹ Schmidt, B.² Maskell, D.L.³

23
- 43349092363
- CUDA compatible GPU cards as efficient hardware accelerators for Smith–Waterman sequence alignment
- Manavski, S.A., Valle, G.: CUDA compatible GPU cards as efficient hardware accelerators for Smith–Waterman sequence alignment. BMC Bioinf. 9(2), 1 (2008)
- (2008) BMC Bioinf. , vol.9 , Issue.2 , pp. 1
- Manavski, S.A.¹ Valle, G.²

24
- 85027040537
- Martinez-Noriega, E.J., Josafat, E., Kawai, A., Yoshikawa, K., Yasuoka, K., Narumi, T.: CUDA Enabled for Android Tablets through DS-CUDA (2013)
- (2013) CUDA Enabled for Android Tablets through DS-CUDA
- Martinez-Noriega, E.J.¹ Josafat, E.² Kawai, A.³ Yoshikawa, K.⁴ Yasuoka, K.⁵ Narumi, T.⁶

25
- 84896398134
- Using hybrid grid/cloud computing technologies for environmental data elastic storage, processing, and provisioning
- Montella, R., Foster, I.: Using hybrid grid/cloud computing technologies for environmental data elastic storage, processing, and provisioning. In: Handbook of Cloud Computing, pp. 595–618. Springer, USA (2010)
- (2010) Handbook of Cloud Computing , pp. 595-618
- Montella, R.¹ Foster, I.²

26
- 84865204787
- A general-purpose virtualization service for HPC on cloud computing: an application to GPUs
- Montella, R., Coviello, G., Giunta, G., Laccetti, G., Isaila, F., Blas, J.G.: A general-purpose virtualization service for HPC on cloud computing: an application to GPUs. In: International Conference on Parallel Processing and Applied Mathematics, pp. 740–749. Springer, Berlin, Heidelberg (2011)
- (2011) International Conference on Parallel Processing and Applied Mathematics , pp. 740-749
- Montella, R.¹ Coviello, G.² Giunta, G.³ Laccetti, G.⁴ Isaila, F.⁵ Blas, J.G.⁶

27
- 84896394744
- Virtualizing high-end GPGPUs on ARM clusters for the next generation of high performance cloud computing
- Montella, R., Giunta, G., Laccetti, G.: Virtualizing high-end GPGPUs on ARM clusters for the next generation of high performance cloud computing. Cluster Comput. 17(1), 139–152 (2014)
- (2014) Cluster Comput. , vol.17 , Issue.1 , pp. 139-152
- Montella, R.¹ Giunta, G.² Laccetti, G.³

28
- 84944874118
- FACE IT: A science gateway for food security research
- Montella, R., Kelly, D., Xiong, W., Brizius, A., Elliott, J., Madduri, R., Maheshwari, K., et al.: FACE IT: A science gateway for food security research. Concurr. Comput. Pract. Exp. 27(16), 4423–4436 (2015)
- (2015) Concurr. Comput. Pract. Exp. , vol.27 , Issue.16 , pp. 4423-4436
- Montella, R.¹ Kelly, D.² Xiong, W.³ Brizius, A.⁴ Elliott, J.⁵ Madduri, R.⁶ Maheshwari, K.⁷

29
- 84964461702
- Virtualizing CUDA enabled GPGPUs on ARM clusters
- Montella, R., Giunta, G., Laccetti, G., Lapegna, M., Palmieri, C., Ferraro, C., Pelliccia, V.: Virtualizing CUDA enabled GPGPUs on ARM clusters. In: Parallel Processing and Applied Mathematics 2015, LNCS, vol. 9574, Springer, Berlin, Heidelberg (2016)
- (2016) Parallel Processing and Applied Mathematics 2015, LNCS, vol. 9574
- Montella, R.¹ Giunta, G.² Laccetti, G.³ Lapegna, M.⁴ Palmieri, C.⁵ Ferraro, C.⁶ Pelliccia, V.⁷

30
- 84868273935
- SOLE: linking research papers with science objects
- Pham, Q., Malik, T., Foster, I., Di Lauro, R., Montella, R., SOLE: linking research papers with science objects. In: Provenance and Annotation of Data and Processes 2012, LNCS, vol. 7525, pp. 203–208. Springer, Berlin, Heidelberg (2012)
- (2012) Provenance and Annotation of Data and Processes 2012, LNCS, vol. 7525 , pp. 203-208
- Pham, Q.¹ Malik, T.² Foster, I.³ Di Lauro, R.⁴ Montella, R.⁵

31
- 84963767039
- CUDA acceleration for Xen virtual machines in infiniband clusters with rCUDA
- Prades, J., Reaño, C., Silla, F.: CUDA acceleration for Xen virtual machines in infiniband clusters with rCUDA. In: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, p. 35. ACM (2016)
- (2016) Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, p. 35. ACM
- Prades, J.¹ Reaño, C.² Silla, F.³

32
- 84899651411
- Tibidabo: making the case for an ARM-based HPC system
- Rajovic, N., Rico, A., Puzovic, N., Adeniyi-Jones, C., Ramirez, A.: Tibidabo: making the case for an ARM-based HPC system. Fut. Gener. Comput. Syst. 36, 322–334 (2014)
- (2014) Fut. Gener. Comput. Syst. , vol.36 , pp. 322-334
- Rajovic, N.¹ Rico, A.² Puzovic, N.³ Adeniyi-Jones, C.⁴ Ramirez, A.⁵

33
- 84893593068
- Influence of InfiniBand FDR on the performance of remote GPU virtualization
- Reaño, C., Mayo, R., Quintana-Ortí, E.S., Silla, F., Duato, J., Peña, A.J.: Influence of InfiniBand FDR on the performance of remote GPU virtualization. In: Proceedings of the 2013 IEEE International Conference on Cluster Computing, Indianapolis, USA (2013)
- (2013) Proceedings of the 2013 IEEE International Conference on Cluster Computing, Indianapolis, USA

34
- 85119102126
- POSTER: Boosting the performance of remote GPU virtualization using InfiniBand connect-IB and PCIe 3.0
- Reaño, C., Silla, F., Peña, A.J., Shainer, G., Schultz, S., Castello, A., Quintana-Ortí, E.S., Duato, J.: POSTER: Boosting the performance of remote GPU virtualization using InfiniBand connect-IB and PCIe 3.0. In: 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 266–267. IEEE (2014)
- (2014) 2014 IEEE International Conference on Cluster Computing (CLUSTER). IEEE , pp. 266-267
- Reaño, C.¹ Silla, F.² Peña, A.J.³ Shainer, G.⁴ Schultz, S.⁵ Castello, A.⁶ Quintana-Ortí, E.S.⁷ Duato, J.⁸

35
- 84860524424
- vCUDA: GPU-accelerated high-performance computing in virtual machines
- Shi, L., Chen, H., Sun, J., Li, K.: vCUDA: GPU-accelerated high-performance computing in virtual machines. IEEE Trans. Comput. 61(6), 804–816 (2012)
- (2012) IEEE Trans. Comput. , vol.61 , Issue.6 , pp. 804-816
- Shi, L.¹ Chen, H.² Sun, J.³ Li, K.⁴

36
- 51449118065
- A performance study of general-purpose applications on graphics processors using CUDA
- Shuai, C., Boyer, M., Meng, J., Tarjan, D., Sheaffer, J.W., Skadron, K.: A performance study of general-purpose applications on graphics processors using CUDA. J. Parallel Distrib. Comput. 68(10), 1370–1380 (2008)
- (2008) J. Parallel Distrib. Comput. , vol.68 , Issue.10 , pp. 1370-1380
- Shuai, C.¹ Boyer, M.² Meng, J.³ Tarjan, D.⁴ Sheaffer, J.W.⁵ Skadron, K.⁶

37
- 70649092154
- Rodinia: a benchmark suite for heterogeneous computing
- Shuai, C., Boyer, M., Meng, J., Tarjan, D., Sheaffer, J.W., Lee, S.-H., Skadron, K.: Rodinia: a benchmark suite for heterogeneous computing. In: Proceedings of the IEEE International Symposium on Workload Characterization—IISWC 2009, pp. 44–54 (2009)
- (2009) Proceedings of the IEEE International Symposium on Workload Characterization, IISWC 2009 , pp. 44-54
- Shuai, C.¹ Boyer, M.² Meng, J.³ Tarjan, D.⁴ Sheaffer, J.W.⁵ Lee, S.-H.⁶ Skadron, K.⁷

38
- 84962916916
- Effective multi-GPU communication using multiple CUDA streams and threads
- Sourouri, M., Gillberg, T., Baden, S.B., Cai, X.: Effective multi-GPU communication using multiple CUDA streams and threads. In: 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), pp. 981–986. IEEE (2014)
- (2014) 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS). IEEE , pp. 981-986
- Sourouri, M.¹ Gillberg, T.² Baden, S.B.³ Cai, X.⁴

39
- 84900621040
- Experiences accelerating MATLAB systems biology applications
- Szafaryn, L.G., Skadron, K., Saucerman, J.J.: Experiences accelerating MATLAB systems biology applications. In: Proceedings of the workshop on biomedicine in computing: systems, architectures, and circuits (BiC) 2009, in conjunction with the 36th IEEE/ACM International Symposium on Computer Architecture (ISCA) (2009)
- (2009) Proceedings of the workshop on biomedicine in computing: systems, architectures, and circuits (BiC) 2009, in conjunction with the 36th IEEE/ACM International Symposium on Computer Architecture (ISCA)
- Szafaryn, L.G.¹ Skadron, K.² Saucerman, J.J.³

40
- 70350771131
- Benchmarking GPUs to tune dense linear algebra
- Volkov, V., Demmel, J.W.: Benchmarking GPUs to tune dense linear algebra. In: International Conference for High Performance Computing, Networking, Storage and Analysis 2008, SC 2008, pp. 1–11. IEEE (2008)
- (2008) International Conference for High Performance Computing, Networking, Storage and Analysis 2008, SC 2008. IEEE , pp. 1-11
- Volkov, V.¹ Demmel, J.W.²

41
- 78149358756
- Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters
- Yang, C., Huang, C., Lin, C.: Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters. Comput. Phys. Commun. 182(1), 266–269 (2011)
- (2011) Comput. Phys. Commun. , vol.182 , Issue.1 , pp. 266-269
- Yang, C.¹ Huang, C.² Lin, C.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.