SCOPUS Information Search Platform

ACM SIGPLAN Notices

Volume 46, Issue 8, 2011, Pages 277-287

Achieving a single compute device image in OpenCL for multiple GPUs

(4) Kim, Jungwon ᵃ; Kim, Honggyu ᵃ; Lee, Joo Hwan ᵃ; Lee, Jaejin ᵃ

ᵃ Seoul National University (South Korea)

Author keywords

Access range analysis; Compilers; OpenCL; Runtime; Virtual device memory; Workload distribution

Indexed keywords

OPENCL; RANGE ANALYSIS; RUNTIMES; VIRTUAL DEVICES; WORK-LOAD DISTRIBUTION; BENCHMARKING; EQUIPMENT; PROGRAM PROCESSORS

EID: 80053994164 PISSN: 15232867 EISSN: None Source Type: Journal
DOI: 10.1145/2038037.1941591 Document Type: Conference Paper

Times cited : (61)

References (29)

1
- 79952792548
- ATI Stream Software Development Kit (SDK) v2.1. AMD, 2010. http://developer.amd.com/gpu/atistreamsdk/pages/default.aspx.
- (2010) ATI Stream Software Development Kit (SDK) v2.1. AM

2
- 85060036181
- Validity of the single processor approach to achieving large scale computing capabilities
- ACM
- G. M. Amdahl. Validity of the single processor approach to achieving large scale computing capabilities. In AFIPS '67 (Spring): Proceedings of the April 18-20, 1967, spring joint computer conference, pages 483-485. ACM, 1967.
- (1967) AFIPS '67 (Spring): Proceedings of the April 18-20, 1967, Spring Joint Computer Conference , pp. 483-485
- Amdahl, G.M.¹

3
- 63549095070
- The PARSEC benchmark suite: Characterization and architectural implications
- ACM, October
- C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In PACT '08: Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pages 72-81. ACM, October 2008.
- (2008) PACT '08: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques , pp. 72-81
- Bienia, C.¹ Kumar, S.² Singh, J.P.³ Li, K.⁴

4
- 84947585636
- The SPMD model: Past, present and future
- January
- F. Darema. The SPMD Model: Past, Present and Future. Lecture Notes in Computer Science, 2131(1):1-1, January 2001.
- (2001) Lecture Notes in Computer Science , vol.2131 , Issue.1 , pp. 1-1
- Darema, F.¹

5
- 57349092386
- CUBA: An architecture for efficient CPU/co-processor data communication
- ACM, June
- I. Gelado, J. H. Kelm, S. Ryoo, S. S. Lumetta, N. Navarro, and W.-m. W. Hwu. CUBA: An architecture for efficient CPU/co-processor data communication. In ICS '08: Proceedings of the 22nd annual international conference on Supercomputing, pages 299-308. ACM, June 2008.
- (2008) ICS '08: Proceedings of the 22nd annual international conference on Supercomputing , pp. 299-308
- Gelado, I.¹ Kelm, J.H.² Ryoo, S.³ Lumetta, S.S.⁴ Navarro, N.⁵ Hwu, W.-M.W.⁶

6
- 78149276036
- Twin peaks: A software platform for heterogeneous computing on general-purpose and graphics processors
- ACM
- J. Gummaraju, L. Morichetti, M. Houston, B. Sander, B. R. Gaster, and B. Zheng. Twin peaks: A software platform for heterogeneous computing on general-purpose and graphics processors. In PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques, pages 205-216. ACM, 2010.
- (2010) PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques , pp. 205-216
- Gummaraju, J.¹ Morichetti, L.² Houston, M.³ Sander, B.⁴ Gaster, B.R.⁵ Zheng, B.⁶

7
- 77952342828
- Khronos OpenCL Working Group, Khronos Group
- Khronos OpenCL Working Group. The OpenCL Specification Version 1.0. Khronos Group, 2009. http://www.khronos.org/opencl.
- (2009) The OpenCL Specification Version 1.0

8
- 77951157944
- Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. ISBN 0123814723 9780123814722
- D. B. Kirk and W.-m. W. Hwu. Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2010. ISBN 0123814723, 9780123814722.
- (2010) Programming Massively Parallel Processors: A Hands-on Approach
- Kirk, D.B.¹ Hwu, W.-M.W.²

9
- 3042658703
- LLVM: A compilation framework for lifelong program analysis & transformation
- Washington, DC, USA, March, IEEE Computer Society
- C. Lattner and V. Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In CGO '04: Proceedings of the international symposium on Code generation and optimization, pages 75-86, Washington, DC, USA, March 2004. IEEE Computer Society.
- (2004) CGO '04: Proceedings of the International Symposium on Code Generation and Optimization , pp. 75-86
- Lattner, C.¹ Adve, V.²

10
- 78149255519
- An OpenCL framework for heterogeneous multicores with local memory
- ACM
- J. Lee, J. Kim, S. Seo, S. Kim, J. Park, H. Kim, T. T. Dao, Y. Cho, S. J. Seo, S. H. Lee, S. M. Cho, H. J. Song, S.-B. Suh, and J.-D. Choi. An OpenCL framework for heterogeneous multicores with local memory. In PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques, pages 193-204. ACM, 2010.
- (2010) PACT '10: Proceedings of the 19th international conference on Parallel Architectures and Compilation Techniques , pp. 193-204
- Lee, J.¹ Kim, J.² Seo, S.³ Kim, S.⁴ Park, J.⁵ Kim, H.⁶ Dao, T.T.⁷ Cho, Y.⁸ Seo, S.J.⁹ Lee, S.H.¹⁰ Cho, S.M.¹¹ Song, H.J.¹² Suh, S.-B.¹³ Choi, J.-D.¹⁴

11
- 0003502903
- Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, ISBN 1-55860-320-4
- S. S. Muchnick. Advanced compiler design and implementation. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1997. ISBN 1-55860-320-4.
- (1997) Advanced Compiler Design and Implementation.
- Muchnick, S.S.¹

12
- 80053992838
- NASA Advanced Supercomputing Division. NAS Parallel Benchmarks.
- NASA Advanced Supercomputing Division. NAS Parallel Benchmarks. http://www.nas.nasa.gov/Resources/Software/npb.html.

13
- 79952776981
- NVIDIA
- NVIDIA Fermi Compute Architecture White Paper. NVIDIA, 2009. http://www.nvidia.com/content/PDF/fermi-white-papers/NVIDIA-Fermi-Compute-Architecture-Whitepaper.pdf.
- (2009) NVIDIA Fermi Compute Architecture White Paper

14
- 78650247683
- NVIDIA, May
- NVIDIA CUDA C Best Practices Guide 3.1. NVIDIA, May 2010.
- (2010) NVIDIA CUDA C Best Practices Guide 3.1

15
- 35948991669
- NVIDIA, July
- NVIDIA CUDA C Programming Guide 3.1.1. NVIDIA, July 2010.
- (2010) NVIDIA CUDA C Programming Guide 3.1.1

16
- 79952804884
- NVIDIA, July
- NVIDIA CUDA Zone. NVIDIA, July 2010. http://www.nvidia.com/object/cuda-home-new.html.
- (2010) NVIDIA CUDA Zone.

17
- 79952784236
- NVIDIA, June
- NVIDIA GPU Computing Software Development Kit. NVIDIA, June 2010. http://developer.nvidia.com/object/cuda-3-1-downloads.html.
- (2010) NVIDIA GPU Computing Software Development Kit

18
- 79952805269
- NVIDIA
- Tesla M2050/M2070 GPU Computing Module. NVIDIA, 2010. http://www.nvidia.com/object/product-tesla-M2050-M2070-us.html.
- (2010) Tesla M2050/M2070 GPU Computing Module

19
- 70350754499
- Adapting a message-driven parallel application to GPU-accelerated clusters
- Piscataway, NJ, USA, November, IEEE Press
- J. C. Phillips, J. E. Stone, and K. Schulten. Adapting a message-driven parallel application to GPU-accelerated clusters. In SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, pages 1-9, Piscataway, NJ, USA, November 2008. IEEE Press.
- (2008) SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing , pp. 1-9
- Phillips, J.C.¹ Stone, J.E.² Schulten, K.³

20
- 67650021816
- Solving dense linear systems on platforms with multiple hardware accelerators
- ACM
- G. Quintana-Ortí, F. D. Igual, E. S. Quintana-Ortí, and R. A. van de Geijn. Solving dense linear systems on platforms with multiple hardware accelerators. In PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 121-130. ACM, 2009.
- (2009) PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming , pp. 121-130
- Quintana-Ortí, G.¹ Igual, F.D.² Quintana-Ortí, E.S.³ Van De Geijn, R.A.⁴

21
- 77749280360
- The LOFAR correlator: Implementation and performance analysis
- ACM
- J. W. Romein, P. C. Broekema, J. D. Mol, and R. V. van Nieuwpoort. The LOFAR correlator: Implementation and performance analysis. In PPoPP '10: Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 169-178. ACM, 2010.
- (2010) PPoPP '10: Proceedings of the 15th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming , pp. 169-178
- Romein, J.W.¹ Broekema, P.C.² Mol, J.D.³ Van Nieuwpoort, R.V.⁴

22
- 70449793037
- Exploring the multiple-GPU design space
- May
- D. Schaa and D. Kaeli. Exploring the multiple-GPU design space. In IPDPS '09: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, pages 1-12, May 2009.
- (2009) IPDPS '09: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing , pp. 1-12
- Schaa, D.¹ Kaeli, D.²

23
- 67649334609
- CUDASA: Compute unified device and systems architecture
- Eurographics Association, April
- M. Strengert, C. Müller, C. Dachsbacher, and T. Ertl. CUDASA: Compute Unified Device and Systems Architecture. In Eurographics Symposium on Parallel Graphics and Visualization (EGPGV08), pages 49-56. Eurographics Association, April 2008.
- (2008) Eurographics Symposium on Parallel Graphics and Visualization (EGPGV08) , pp. 49-56
- Strengert, M.¹ Müller, C.² Dachsbacher, C.³ Ertl, T.⁴

24
- 80054001882
- The IMPACT Research Group. Parboil Benchmark suite
- The IMPACT Research Group. Parboil Benchmark suite. http://impact.crhc.illinois.edu/parboil.php, 2009.
- (2009)

25
- 0004096330
- Technical report, Amsterdam, The Netherlands
- F. Tip. A Survey of Program Slicing Techniques. Technical report, Amsterdam, The Netherlands, 1994.
- (1994) A Survey of Program Slicing Techniques
- Tip, F.¹

26
- 70350771131
- Benchmarking GPUs to tune dense linear algebra
- Piscataway, NJ, USA, IEEE Press
- V. Volkov and J. W. Demmel. Benchmarking GPUs to tune dense linear algebra. In SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, pages 1-11, Piscataway, NJ, USA, 2008. IEEE Press.
- (2008) SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing , pp. 1-11
- Volkov, V.¹ Demmel, J.W.²

27
- 85050273691
- Program slicing
- Piscataway, NJ, USA, IEEE Press
- M. Weiser. Program Slicing. In ICSE '81: Proceedings of the 5th International Conference on Software Engineering, pages 439-449, Piscataway, NJ, USA, 1981. IEEE Press.
- (1981) ICSE '81: Proceedings of the 5th International Conference on Software Engineering , pp. 439-449
- Weiser, M.¹

28
- 78649488776
- Adaptive optimization for petascale heterogeneous CPU/GPU computing
- Los Alamitos, CA, USA, IEEE Computer Society
- C. Yang, F. Wang, Y. Du, J. Chen, J. Liu, H. Yi, and K. Lu. Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing. In IEEE Cluster '10: Proceedings of IEEE International Conference on Cluster Computing, pages 19-28, Los Alamitos, CA, USA, 2010. IEEE Computer Society.
- (2010) IEEE Cluster '10: Proceedings of IEEE International Conference on Cluster Computing , pp. 19-28
- Yang, C.¹ Wang, F.² Du, Y.³ Chen, J.⁴ Liu, J.⁵ Yi, H.⁶ Lu, K.⁷

29
- 77957600490
- A GPGPU compiler for memory optimization and parallelism management
- ACM, June
- Y. Yang, P. Xiang, J. Kong, and H. Zhou. A GPGPU compiler for memory optimization and parallelism management. In PLDI '10: Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation, pages 86-97. ACM, June 2010.
- (2010) PLDI '10: Proceedings of the 2010 ACM SIGPLAN conference on Programming Language Design and Implementation , pp. 86-97
- Yang, Y.¹ Xiang, P.² Kong, J.³ Zhou, H.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.