SCOPUS Information Search Platform

Proceedings - 19th International Euromicro Conference on Parallel, Distributed, and Network-Based Processing, PDP 2011

Volume , Issue , 2011, Pages 223-230

Programming GPU clusters with shared memory abstraction in software

(2) Karantasis, Konstantinos I.ᵃ; Polychronopoulos, Eleftherios D.ᵃ

ᵃ UNIVERSITY OF PATRAS (Greece)

Author keywords

CUDA; GPU Clusters; OpenMP; Pleiad; Software DSM

Indexed keywords

CUDA; GPU CLUSTERS; OPENMP; PLEIAD; SOFTWARE DSM;

ABSTRACTING; APPLICATION PROGRAMMING INTERFACES (API); MIDDLEWARE; PARALLEL PROCESSING SYSTEMS; PROGRAM PROCESSORS;

SOFTWARE DESIGN;

EID: 79955023145 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/PDP.2011.91 Document Type: Conference Paper

Times cited : (11)

References (27)

1
- 38549121575
- The future of microprocessors
- K. Olukotun and L. Hammond, "The Future of Microprocessors," Queue, vol. 3, no. 7, pp. 26-29, 2005.
- (2005) Queue , vol.3 , Issue.7 , pp. 26-29
- Olukotun, K.¹ Hammond, L.²

2
- 85024275309
- Software and the concurrency revolution
- H. Sutter and J. Larus, "Software and the Concurrency Revolution," Queue, vol. 3, no. 7, pp. 54-62, 2005.
- (2005) Queue , vol.3 , Issue.7 , pp. 54-62
- Sutter, H.¹ Larus, J.²

3
- 77952123736
- A 48-core IA-32 message-passing processor with DVFS in 45nm CMOS
- San Francisco, California, USA, Feb.
- J. Howard, S. Dighe, Y. Hoskote, S. Vangal, D. Finan, G. Ruhl, D. Jenkins, H. Wilson, N. Borkar, G. Schrom, F. Pailet, S. Jain, T. Jacob, S. Yada, S. Marella, P. Salihundam, V. Erraguntla, M. Konow, M. Riepen, G. Droege, J. Lindemann, M. Gries, T. Apel, K. Henriss, T. Lund-Larsen, S. Steibl, S. Borkar, V. De, R. V. D. Wijngaart, and T. Mattson, "A 48-Core IA-32 Message-Passing Processor with DVFS in 45nm CMOS," in IEEE International Solid-State Circuits Conference, San Francisco, California, USA, Feb. 2010.
- (2010) IEEE International Solid-state Circuits Conference
- Howard, J.¹ Dighe, S.² Hoskote, Y.³ Vangal, S.⁴ Finan, D.⁵ Ruhl, G.⁶ Jenkins, D.⁷ Wilson, H.⁸ Borkar, N.⁹ Schrom, G.¹⁰ Pailet, F.¹¹ Jain, S.¹² Jacob, T.¹³ Yada, S.¹⁴ Marella, S.¹⁵ Salihundam, P.¹⁶ Erraguntla, V.¹⁷ Konow, M.¹⁸ Riepen, M.¹⁹ Droege, G.²⁰ more..

4
- 44849137198
- NVIDIA Tesla: A unified graphics and computing architecture
- DOI 10.1109/MM.2008.31
- E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro, vol. 28, no. 2, pp. 39-55, 2008. (Pubitemid 351796170)
- (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
- Lindholm, E.¹ Nickolls, J.² Oberman, S.³ Montrym, J.⁴

5
- 41649101136
- NVIDIA CUDA Jun. [Online]. Available
- NVIDIA CUDA, Compute Unified Device Architecture, Programming Guide (Version 3.1), NVIDIA Corporation, Jun. 2010. [Online]. Available: http://developer.nvidia.com/object/cuda-3-1-downloads.html
- (2010) Compute Unified Device Architecture, Programming Guide (Version 3.1)

6
- 74349092397
- OpenCL [Online]. Available
- OpenCL, OpenCL - the Open Standard for Parallel Programming of Heterogeneous Systems, Khronos Group, 2009. [Online]. Available: http://www.khronos.org/opencl/
- (2009) OpenCL - The Open Standard for Parallel Programming of Heterogeneous Systems

7
- 23944462603
- GPU cluster for high performance computing
- Washington, DC, USA: IEEE Computer Society
- Z. Fan, F. Qiu, A. Kaufman, and S. Yoakum-Stover, "GPU Cluster for High Performance Computing," in SC '04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing. Washington, DC, USA: IEEE Computer Society, 2004, p. 47.
- (2004) SC '04: Proceedings of the 2004 ACM/IEEE Conference on Supercomputing , pp. 47
- Fan, Z.¹ Qiu, F.² Kaufman, A.³ Yoakum-Stover, S.⁴

8
- 35748969304
- Exploring weak scalability for FEM calculations on a GPU-enhanced cluster
- DOI 10.1016/j.parco.2007.09.002, PII S0167819107001020, High-Performance Computing Using Accelerators
- D. Göddeke, R. Strzodka, J. Mohd-Yusof, P. McCormick, S. H. M. Buijssen, M. Grajewski, and S. Turek, "Exploring weak scalability for FEM calculations on a GPU-enhanced cluster," Parallel Computing, vol. 33, no. 10-11, pp. 685-699, 2007. (Pubitemid 350051061)
- (2007) Parallel Computing , vol.33 , Issue.10-11 , pp. 685-699
- Göddeke, D.¹ Strzodka, R.² Mohd-Yusof, J.³ McCormick, P.⁴ Buijssen, S.H.M.⁵ Grajewski, M.⁶ Turek, S.⁷

9
- 78649859889
- An MPI-CUDA implementation for massively parallel incompressible flow computations on multi-GPU clusters
- Orlando, Florida, USA, Jan.
- D. A. Jacobsen, J. C. Thibault, and I. Senocak, "An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters," in 48th AIAA Aerospace Sciences Meeting, Orlando, Florida, USA, Jan. 2010.
- (2010) 48th AIAA Aerospace Sciences Meeting
- Jacobsen, D.A.¹ Thibault, J.C.² Senocak, I.³

10
- 49249120833
- Using GPUs to improve multigrid solver performance on a cluster
- Nov.
- D. Göddeke, R. Strzodka, J. Mohd-Yusof, P. S. McCormick, H. Wobker, C. Becker, and S. Turek, "Using GPUs to Improve Multigrid Solver Performance on a Cluster," International Journal of Computational Science and Engineering, vol. 4, no. 1, pp. 36-55, Nov. 2008.
- (2008) International Journal of Computational Science and Engineering , vol.4 , Issue.1 , pp. 36-55
- Göddeke, D.¹ Strzodka, R.² Mohd-Yusof, J.³ McCormick, P.S.⁴ Wobker, H.⁵ Becker, C.⁶ Turek, S.⁷

11
- 70350754499
- Adapting a message-driven parallel application to GPU-accelerated clusters
- Piscataway, NJ, USA: IEEE Press
- J. C. Phillips, J. E. Stone, and K. Schulten, "Adapting a message-driven parallel application to GPU-accelerated clusters," in SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing. Piscataway, NJ, USA: IEEE Press, 2008, pp. 1-9.
- (2008) SC '08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing , pp. 1-9
- Phillips, J.C.¹ Stone, J.E.² Schulten, K.³

12
- 77954917614
- An MPI-stream hybrid programming model for computational clusters
- IEEE International Symposium on
- E. P. Mancini, G. Marsh, and D. K. Panda, "An MPI-Stream Hybrid Programming Model for Computational Clusters," Cluster Computing and the Grid, IEEE International Symposium on, vol. 0, pp. 323-330, 2010.
- (2010) Cluster Computing and the Grid , pp. 323-330
- Mancini, E.P.¹ Marsh, G.² Panda, D.K.³

13
- 77952251540
- An asymmetric distributed shared memory model for heterogeneous parallel systems
- New York, NY, USA: ACM
- I. Gelado, J. E. Stone, J. Cabezas, S. Patel, N. Navarro, and W.-m. W. Hwu, "An asymmetric distributed shared memory model for heterogeneous parallel systems," in ASPLOS '10: Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems. New York, NY, USA: ACM, 2010, pp. 347-358.
- (2010) ASPLOS '10: Proceedings of the Fifteenth Edition of ASPLOS on Architectural Support for Programming Languages and Operating Systems , pp. 347-358
- Gelado, I.¹ Stone, J.E.² Cabezas, J.³ Patel, S.⁴ Navarro, N.⁵ Hwu, W.-M.W.⁶

14
- 42949143017
- Zippy: A framework for computation and visualization on a GPU cluster
- DOI 10.1111/j.1467-8659.2008.01131.x
- Z. Fan, F. Qiu, and A. E. Kaufman, "Zippy: A Framework for Computation and Visualization on a GPU Cluster," Comput. Graph. Forum, vol. 27, no. 2, pp. 341-350, 2008. (Pubitemid 351612851)
- (2008) Computer Graphics Forum , vol.27 , Issue.2 , pp. 341-350
- Fan, Z.¹ Qiu, F.² Kaufman, A.E.³

15
- 67649334609
- CUDASA: Compute unified device and systems architecture
- K.-L. M. Daniel Weiskopf, Jean M. Favre, Ed. Eurographics Association
- M. Strengert, C. Müller, C. Dachsbacher, and T. Ertl, "CUDASA: Compute Unified Device and Systems Architecture," in Eurographics 2008 Symposium on Parallel Graphics and Visualization (EGPGV'08), K.-L. M. Daniel Weiskopf, Jean M. Favre, Ed. Eurographics Association, 2008, pp. 49-56.
- (2008) Eurographics 2008 Symposium on Parallel Graphics and Visualization (EGPGV'08) , pp. 49-56
- Strengert, M.¹ Müller, C.² Dachsbacher, C.³ Ertl, T.⁴

16
- 78649830827
- A package for OpenCL based heterogeneous computing on clusters with many GPU devices
- 2010 IEEE International Conference on, Sep.
- A. Barak, T. Ben-Nun, E. Levy, and A. Shiloh, "A package for OpenCL based heterogeneous computing on clusters with many GPU devices," in Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS), 2010 IEEE International Conference on, Sep. 2010, pp. 1-7.
- (2010) Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS) , pp. 1-7
- Barak, A.¹ Ben-Nun, T.² Levy, E.³ Shiloh, A.⁴

17
- 0032021963
- The MOSIX multicomputer operating system for high performance cluster computing
- PII S0167739X9700037X
- A. Barak and O. La'adan, "The MOSIX multicomputer operating system for high performance cluster computing," Future Gener. Comput. Syst., vol. 13, pp. 361-372, Mar. 1998. (Pubitemid 128395367)
- (1998) Future Generation Computer Systems , vol.13 , Issue.4-5 , pp. 361-372
- Barak, A.¹ La'adan, O.²

18
- 0030083764
- TreadMarks: Shared memory computing on networks of workstations
- C. Amza, A. L. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel, "TreadMarks: Shared Memory Computing on Networks of Workstations," Computer, vol. 29, no. 2, pp. 18-28, 1996. (Pubitemid 126522507)
- (1996) Computer , vol.29 , Issue.2 , pp. 18-28
- Amza, C.¹ Cox, A.L.² Dwarkadas, S.³ Keleher, P.⁴ Lu, H.⁵ Rajamony, R.⁶ Yu, W.⁷ Zwaenepoel, W.⁸

19
- 78649813794
- Pleiad: A cross-environment middleware providing efficient multithreading on clusters
- New York, NY, USA: ACM
- K. I. Karantasis and E. D. Polychronopoulos, "Pleiad: a cross-environment middleware providing efficient multithreading on clusters," in CF '09: Proceedings of the 6th ACM conference on Computing frontiers. New York, NY, USA: ACM, 2009, pp. 109-116.
- (2009) CF '09: Proceedings of the 6th ACM Conference on Computing Frontiers , pp. 109-116
- Karantasis, K.I.¹ Polychronopoulos, E.D.²

20
- 70350678845
- JCUDA: A programmer-friendly interface for accelerating java programs with CUDA
- Berlin, Heidelberg: Springer-Verlag
- Y. Yan, M. Grossman, and V. Sarkar, "JCUDA: A Programmer-Friendly Interface for Accelerating Java Programs with CUDA," in Euro-Par '09: Proceedings of the 15th International Euro-Par Conference on Parallel Processing. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 887-899.
- (2009) Euro-par '09: Proceedings of the 15th International Euro-par Conference on Parallel Processing , pp. 887-899
- Yan, Y.¹ Grossman, M.² Sarkar, V.³

21
- 67650692011
- [Online]. Available
- The IMPACT Research Group, Parboil benchmark suite, 2009. [Online]. Available: http://impact.crhc.illinois.edu/parboil.php
- (2009) Parboil Benchmark Suite

22
- 77952273045
- The scalable heterogeneous computing (SHOC) benchmark suite
- New York, NY, USA: ACM
- A. Danalis, G. Marin, C. McCurdy, J. S. Meredith, P. C. Roth, K. Spafford, V. Tipparaju, and J. S. Vetter, "The Scalable Heterogeneous Computing (SHOC) benchmark suite," in GPGPU '10: Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units. New York, NY, USA: ACM, 2010, pp. 63-74.
- (2010) GPGPU '10: Proceedings of the 3rd Workshop on General-purpose Computation on Graphics Processing Units , pp. 63-74
- Danalis, A.¹ Marin, G.² McCurdy, C.³ Meredith, J.S.⁴ Roth, P.C.⁵ Spafford, K.⁶ Tipparaju, V.⁷ Vetter, J.S.⁸

23
- 0001457509
- Some methods for classification and analysis of MultiVariate observations
- L. M. L. Cam and J. Neyman, Eds. University of California Press
- J. B. MacQueen, "Some Methods for Classification and Analysis of MultiVariate Observations," in Proc. of the fifth Berkeley Symposium on Mathematical Statistics and Probability, L. M. L. Cam and J. Neyman, Eds., vol. 1. University of California Press, 1967, pp. 281-297.
- (1967) Proc. of the Fifth Berkeley Symposium on Mathematical Statistics and Probability , vol.1 , pp. 281-297
- MacQueen, J.B.¹

24
- 0030161771
- Efficient implementation of weighted ENO schemes
- DOI 10.1006/jcph.1996.0130
- G.-S. Jiang and C.-W. Shu, "Efficient implementation of weighted ENO schemes," J. Comput. Phys., vol. 126, no. 1, pp. 202-228, 1996. (Pubitemid 126162204)
- (1996) Journal of Computational Physics , vol.126 , Issue.1 , pp. 202-228
- Jiang, G.-S.¹ Shu, C.-W.²

25
- 84861134713
- A fast double precision CFD code using CUDA
- J. M. Cohen and J. Molemaker, "A Fast Double Precision CFD Code Using CUDA," in 21st International Conference on Parallel Computational Fluid Dynamics (ParCFD2009), 2009.
- (2009) 21st International Conference on Parallel Computational Fluid Dynamics (ParCFD2009)
- Cohen, J.M.¹ Molemaker, J.²

26
- 33750939148
- The UCI KDD archive of large data sets for data mining research and experimentation
- S. D. Bay, D. Kibler, M. J. Pazzani, and P. Smyth, "The UCI KDD archive of large data sets for data mining research and experimentation," SIGKDD Explor. Newsl., vol. 2, no. 2, pp. 81-85, 2000.
- (2000) SIGKDD Explor. Newsl. , vol.2 , Issue.2 , pp. 81-85
- Bay, S.D.¹ Kibler, D.² Pazzani, M.J.³ Smyth, P.⁴

27
- 35348919277
- NU-minebench 2.0
- Center for Ultra-Scale Computing and Information Security Aug. [Online]. Available
- J. Pisharath, Y. Liu, W.-k. Liao, A. Choudhary, G. Memik, and J. Parhi, "NU-Minebench 2.0," Center for Ultra-Scale Computing and Information Security (CUCIS) Northwestern University, Tech. Rep. CUCIS-2005-08-01, Aug. 2005. [Online]. Available: http://cucis.ece.northwestern.edu/techreports/pdf/CUCIS-2004-08-001.pdf
- (2005) Northwestern University, Tech. Rep. CUCIS-2005-08-01
- Pisharath, J.¹ Liu, Y.² Liao, W.-K.³ Choudhary, A.⁴ Memik, G.⁵ Parhi, J.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.