SCOPUS Information Search Platform

Proceedings - 2011 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2011

2011, Pages 25-32

Skeleton-based automatic parallelization of image processing algorithms for GPUs

(3) Nugteren, Cedric (a); Corporaal, Henk (a); Mesman, Bart (a)

(a) EINDHOVEN UNIVERSITY OF TECHNOLOGY (Netherlands)

Author keywords

[No Author keywords available]

Indexed keywords

AUTOMATIC PARALLELIZATION; AUTOMATICALLY GENERATED; CODE GENERATORS; COMPLETE CLASSIFICATION; DATA THROUGHPUT; DOMAIN SPECIFIC; EFFICIENT IMPLEMENTATION; GRAPHICS PROCESSING UNITS; HARDWARE EFFICIENCY; HIGH PERFORMANCE COMPUTING; HIGH-QUALITY SOLUTIONS; IMAGE PROCESSING ALGORITHM; ON CHIP MEMORY; PARALLEL COMPUTATION; PARALLELIZATIONS; SKELETONIZATION;

ALGORITHMS; COMPUTER SIMULATION; COMPUTER SOFTWARE SELECTION AND EVALUATION; EMBEDDED SYSTEMS; FLOCCULATION; IMAGE CODING; IMAGING SYSTEMS; MUSCULOSKELETAL SYSTEM; OPTIMIZATION; PROGRAM PROCESSORS;

COMPUTATIONAL EFFICIENCY;

EID: 80155136237 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/SAMOS.2011.6045441 Document Type: Conference Paper

Times cited : (13)

References (22)

1
- 70350662847
- GpuCV: An opensource GPU-accelerated framework for image processing and computer vision
- ACM
- Y. Allusse, P. Horain, A. Agarwal, and C. Saipriyadarshan. GpuCV: an opensource GPU-accelerated framework for image processing and computer vision. In MM '08: Proceeding of the 16th ACM international conference on Multimedia, pages 1089-1092. ACM, 2008.
- (2008) MM '08: Proceeding of the 16th ACM International Conference on Multimedia , pp. 1089-1092
- Allusse, Y.¹ Horain, P.² Agarwal, A.³ Saipriyadarshan, C.⁴

2
- 0036683380
- Towards a general framework for FPGA based image processing using hardware skeletons
- K. Benkrid, D. Crookes, and A. Benkrid. Towards a general framework for FPGA based image processing using hardware skeletons. Parallel Computing, 28(7-8):1141-1154, 2002.
- (2002) Parallel Computing , vol.28 , Issue.7-8 , pp. 1141-1154
- Benkrid, K.¹ Crookes, D.² Benkrid, A.³

3
- 80155193587
- PhD thesis, Eindhoven University of Technology
- W. Caarls. Automated Design of Application-Specific Smart Camera Architectures. PhD thesis, Eindhoven University of Technology, 2008.
- (2008) Automated Design of Application-Specific Smart Camera Architectures.
- Caarls, W.¹

4
- 33847140637
- Algorithmic skeletons for stream programming in embedded heterogeneous parallel image processing applications
- W. Caarls, P. Jonker, and H. Corporaal. Algorithmic skeletons for stream programming in embedded heterogeneous parallel image processing applications. In Parallel and Distributed Processing Symposium. IPDPS 2006. 20th International, page 9 pp., 2006.
- (2006) Parallel and Distributed Processing Symposium. IPDPS 2006. 20th International
- Caarls, W.¹ Jonker, P.² Corporaal, H.³

5
- 0003587629
- MIT Press, Cambridge, MA, USA
- M. Cole. Algorithmic Skeletons: Structured Management of Parallel Computation. MIT Press, Cambridge, MA, USA, 1991.
- (1991) Algorithmic Skeletons: Structured Management of Parallel Computation
- Cole, M.¹

6
- 80155175370
- High-performance SIMT code generation in an active visual effects library
- ACM
- J. L. Cornwall, L. Howes, P. H. Kelly, P. Parsonage, and B. Nicoletti. High-performance SIMT code generation in an active visual effects library. In CF '09: Proceedings of the 6th ACM conference on Computing frontiers, pages 175-184. ACM, 2009.
- (2009) CF '09: Proceedings of the 6th ACM Conference on Computing Frontiers , pp. 175-184
- Cornwall, J.L.¹ Howes, L.² Kelly, P.H.³ Parsonage, P.⁴ Nicoletti, B.⁵

7
- 78349252088
- SkePU: A multi-backend skeleton programming library for multi-GPU systems
- New York, NY, USA, ACM
- J. Enmyren and C. W. Kessler. SkePU: a multi-backend skeleton programming library for multi-GPU systems. In Proceedings of the fourth international workshop on High-level parallel programming and applications, HLPP '10, pages 5-14, New York, NY, USA, 2010. ACM.
- (2010) Proceedings of the Fourth International Workshop on High-level Parallel Programming and Applications, HLPP '10 , pp. 5-14
- Enmyren, J.¹ Kessler, C.W.²

8
- 78149258346
- Understanding throughput-oriented architectures
- November
- M. Garland and D. B. Kirk. Understanding throughput-oriented architectures. Communications of the ACM, 53:58-66, November 2010.
- (2010) Communications of the ACM , vol.53 , pp. 58-66
- Garland, M.¹ Kirk, D.B.²

9
- 67650673468
- hiCUDA: A high-level directive-based language for GPU programming
- ACM
- T. D. Han and T. S. Abdelrahman. hiCUDA: a high-level directive-based language for GPU programming. In GPGPU-2: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, pages 52-61. ACM, 2009.
- (2009) GPGPU-2: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units , pp. 52-61
- Han, T.D.¹ Abdelrahman, T.S.²

10
- 67650661447
- Technical report, NVIDIA
- M. Harris. Optimizing parallel reduction in CUDA. Technical report, NVIDIA, 2008.
- (2008) Optimizing Parallel Reduction in CUDA
- Harris, M.¹

11
- 79955675214
- A design pattern language for engineering (parallel) software
- K. Keutzer and T. Mattson. A Design Pattern Language for Engineering (Parallel) Software. In Intel Technology Journal, 2010.
- (2010) Intel Technology Journal
- Keutzer, K.¹ Mattson, T.²

12
- 80155140858
- GPU Kernels as data-parallel array computations in Haskell
- S. Lee, V. Grover, M. Chakravarty, and G. Keller. GPU Kernels as Data-Parallel Array Computations in Haskell. In EPHAM '09: Exploiting Parallelism using GPUs and other Hardware-Assisted Methods, 2009.
- (2009) EPHAM '09: Exploiting Parallelism Using GPUs and Other Hardware-Assisted Methods
- Lee, S.¹ Grover, V.² Chakravarty, M.³ Keller, G.⁴

13
- 84937421176
- Automatic SIMD parallelization of embedded applications based on pattern recognition
- A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors
- R. Manniesing, I. Karkowski, and H. Corporaal. Automatic SIMD Parallelization of Embedded Applications Based on Pattern Recognition. In A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors, Euro-Par 2000 Parallel Processing, pages 349-356, 2000.
- (2000) Euro-Par 2000 Parallel Processing , pp. 349-356
- Manniesing, R.¹ Karkowski, I.² Corporaal, H.³

14
- 80155175368
- Analyzing CUDA's compiler through the visualization of decoded GPU binaries
- C. Nugteren, B. Mesman, and H. Corporaal. Analyzing CUDA's Compiler through the Visualization of Decoded GPU Binaries. In ODES-8: Proceedings of the 8th Workshop on Optimizations for DSP and Embedded Systems at CGO '10, 2010.
- (2010) ODES-8: Proceedings of the 8th Workshop on Optimizations for DSP and Embedded Systems at CGO '10
- Nugteren, C.¹ Mesman, B.² Corporaal, H.³

15
- 80955152874
- The top 10 innovations in the new Fermi architecture, and the top 3 next challenges
- D. Patterson. The Top 10 Innovations in the New Fermi Architecture, and the Top 3 Next Challenges. NVIDIA Whitepaper, 2009.
- (2009) NVIDIA Whitepaper
- Patterson, D.¹

16
- 57349086588
- Histogram calculation in CUDA
- V. Podlozhnyuk. Histogram calculation in CUDA. Technical report, NVIDIA, 2007.
- (2007) Technical Report NVIDIA
- Podlozhnyuk, V.¹

17
- 61349093320
- Image convolution with CUDA
- V. Podlozhnyuk. Image Convolution with CUDA. Technical report, NVIDIA, 2007.
- (2007) Technical Report NVIDIA
- Podlozhnyuk, V.¹

18
- 72449173321
- A skeletal parallel framework with fusion optimizer for GPGPU programming
- Z. Hu, editor, Programming Languages and Systems. Springer Berlin Heidelberg
- S. Sato and H. Iwasaki. A Skeletal Parallel Framework with Fusion Optimizer for GPGPU Programming. In Z. Hu, editor, Programming Languages and Systems, volume 5904 of Lecture Notes in Computer Science, pages 79-94. Springer Berlin Heidelberg, 2009.
- (2009) Lecture Notes in Computer Science , vol.5904 , pp. 79-94
- Sato, S.¹ Iwasaki, H.²

19
- 78650298274
- GPGPU kernel implementation and refinement using Obsidian
- ICCS 2010
- J. Svensson, K. Claessen, and M. Sheeran. GPGPU kernel implementation and refinement using Obsidian. Procedia Computer Science, 1(1):2059-2068, 2010. ICCS 2010.
- (2010) Procedia Computer Science , vol.1 , Issue.1 , pp. 2059-2068
- Svensson, J.¹ Claessen, K.² Sheeran, M.³

20
- 80155175375
- TunaCode
- TunaCode. CUVIlib. http://www.cuvilib.com.

21
- 58449127539
- CUDA-Lite: Reducing GPU programming complexity
- Languages and Compilers for Parallel Computing. Springer Berlin
- S.-Z. Ueng, M. Lathara, S. Baghsorkhi, and W.-m. Hwu. CUDA-Lite: Reducing GPU Programming Complexity. In Languages and Compilers for Parallel Computing, volume 5335 of Lecture Notes in Computer Science, pages 1-15. Springer Berlin, 2008.
- (2008) Lecture Notes in Computer Science , vol.5335 , pp. 1-15
- Ueng, S.-Z.¹ Lathara, M.² Baghsorkhi, S.³ Hwu, W.-M.⁴

22
- 77952281697
- Implementing the PGI Accelerator model
- USA. ACM
- M. Wolfe. Implementing the PGI Accelerator model. In Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, pages 43-50, USA, 2010. ACM.
- (2010) Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10 , pp. 43-50
- Wolfe, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.