SCOPUS Information Retrieval Platform

IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium

Volume , Issue , 2009, Pages

A framework for efficient and scalable execution of domain-specific templates on GPUs

(3) Sundaram, Narayanan a,b Raghunathan, Anand a,c Chakradhar, Srimat T a

a NEC LABORATORIES AMERICA (United States)

b UNIVERSITY OF CALIFORNIA (United States)

c PURDUE UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ABSTRACTION LEVEL; CODE GENERATORS; COMPUTING INDUSTRY; CONVOLUTIONAL NEURAL NETWORK; CRITICAL PROBLEMS; DOMAIN SPECIFIC; GPU COMPUTING; GPU IMPLEMENTATION; GPU PROGRAMMING; GRAPHICS CARD; GRAPHICS PROCESSING UNITS; INPUT DATAS; LARGE DATA; LARGE DATASETS; MANY-CORE COMPUTING; MEMORY FOOTPRINT; OPERATOR-SPLITTING; PARALLEL OPERATORS; PERFORMANCE IMPROVEMENTS; SOFTWARE FRAMEWORKS; VIDEO ANALYSIS;

COMPUTER SCIENCE; DATA TRANSFER; DISTRIBUTED PARAMETER NETWORKS; EDGE DETECTION; NEURAL NETWORKS; PARALLEL PROGRAMMING; PROGRAM PROCESSORS; SCALABILITY;

COMPUTER GRAPHICS EQUIPMENT;

EID: 70450029523 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/IPDPS.2009.5161039 Document Type: Conference Paper

Times cited : (39)

References (21)

1
- 79958011619
- CUDA Data Parallel Primitives Library. http://www.gpgpu.org/developer/cudpp.
- CUDA Data Parallel Primitives Library

2
- 70450101161
- GPGPU community website
- GPGPU community website. http://www.gpgpu.org.

3
- 70450117428
- Torch5 library. http://torch5.sourceforge.net.
- Torch5 library

4
- 70449932975
- Advanced Micro Devices, Inc
- Advanced Micro Devices, Inc. AMD Stream Computing SDK. http://ati.amd.com/technology/streamcomputing/index.html.
- AMD Stream Computing SDK

5
- 10644248153
- Brook for GPUs: Stream computing on graphics hardware
- I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan. Brook for GPUs: Stream computing on graphics hardware. In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers, pages 777-786, 2004.
- (2004) SIGGRAPH '04: ACM SIGGRAPH 2004 Papers , pp. 777-786
- Buck, I.¹ Foley, T.² Horn, D.³ Sugerman, J.⁴ Fatahalian, K.⁵ Houston, M.⁶ Hanrahan, P.⁷

6
- 63549103331
- A map reduce framework for programming graphics processors
- April
- B. Catanzaro, N. Sundaram, and K. Keutzer. A map reduce framework for programming graphics processors. In Proc. Third Workshop on Software Tools for Multi Core Systems (STMCS), April 2008.
- (2008) Proc. Third Workshop on Software Tools for Multi Core Systems (STMCS)
- Catanzaro, B.¹ Sundaram, N.² Keutzer, K.³

7
- 77957966090
- Grading nuclear pleomorphism on histological micrographs
- E. Cosatto, M. Miller, H. P. Graf, and J. S. Meyer. Grading nuclear pleomorphism on histological micrographs. In Proc. Int. Conf. Pattern Recognition, pages 1-4, 2008.
- (2008) Proc. Int. Conf. Pattern Recognition , pp. 1-4
- Cosatto, E.¹ Miller, M.² Graf, H.P.³ Meyer, J.S.⁴

8
- 70450036077
- P. Dubey. A Platform 2015 Workload Model: Recognition, Mining and Synthesis Moves Computers to the Era of Tera, 2007. ftp://download.intel.com/technology/computing/archinnov/platform2015/download/RMS.pdf.

9
- 33749557305
- Translating pseudo-Boolean constraints into SAT
- Jan
- N. Een and N. Sorensson. Translating pseudo-Boolean constraints into SAT. Journal on Satisfiability, Boolean Modeling and Computation, 2:1-25, Jan 2006.
- (2006) Journal on Satisfiability, Boolean Modeling and Computation , vol.2 , pp. 1-25
- Een, N.¹ Sorensson, N.²

10
- 57349092386
- CUBA: An architecture for efficient CPU/coprocessor data communication
- Jun
- I. Gelado, J. Kelm, S. Ryoo, S. Lumetta, N. Navarro, and W. mei Hwu. CUBA: An architecture for efficient CPU/coprocessor data communication. In ICS '08: Proceedings of the 22nd annual international conference on Supercomputing, Jun 2008.
- (2008) ICS '08: Proceedings of the 22nd annual international conference on Supercomputing
- Gelado, I.¹ Kelm, J.² Ryoo, S.³ Lumetta, S.⁴ Navarro, N.⁵ mei Hwu, W.⁶

11
- 63549097654
- Mars: A mapreduce framework for graphics processors
- October
- B. He, W. Fang, Q. Luo, N. K. Govindarajulu, and T. Wang. Mars: A mapreduce framework for graphics processors. In Proc. Int. Conf. on Parallel Architectures and Compilation Techniques (PACT), October 2008.
- (2008) Proc. Int. Conf. on Parallel Architectures and Compilation Techniques (PACT)
- He, B.¹ Fang, W.² Luo, Q.³ Govindarajulu, N.K.⁴ Wang, T.⁵

12
- 34748865391
- T. J. Knight, J. Young Park, M. Ren, M. Houston, M. Erez, K. Fatahalian, A. Aiken, W. J. Dally, and P. Hanrahan. Compilation for explicitly managed memory hierarchies. In PPoPP '07: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and practice of parallel programming, March 2007.

13
- 67650081010
- OpenMP to GPGPU: A compiler framework for automatic translation and optimization
- ACM Press, Feb
- S. Lee, S.-J. Min, and R. Eigenmann. OpenMP to GPGPU: A compiler framework for automatic translation and optimization. In Proc. of the ACM Symposium on Principles and Practice of Parallel Programming (PPOPP'09). ACM Press, Feb. 2009.
- (2009) Proc. of the ACM Symposium on Principles and Practice of Parallel Programming (PPOPP'09)
- Lee, S.¹ Min, S.-J.² Eigenmann, R.³

14
- 68149168035
- Merge: A programming model for heterogeneous multi-core systems
- Mar
- M. Linderman, J. Collins, H. Wang, and T. Meng. Merge: A programming model for heterogeneous multi-core systems. In ASPLOS XIII: Proceedings of the 13th international conference on Architectural support for programming languages and operating systems, Mar 2008.
- (2008) ASPLOS XIII: Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
- Linderman, M.¹ Collins, J.² Wang, H.³ Meng, T.⁴

15
- 70450109401
- NVIDIA Corporation. NVIDIA CUDA, 2007. http://nvidia.com/cuda.

16
- 70450115089
- Toward automatic parallelization and auto-tuning of affine kernels for GPUs
- July
- J. Ramanujam. Toward automatic parallelization and auto-tuning of affine kernels for GPUs. In Workshop on Automatic Tuning for Petascale Systems, July 2008.
- (2008) Workshop on Automatic Tuning for Petascale Systems
- Ramanujam, J.¹

17
- 70450107592
- RapidMind, Inc
- RapidMind, Inc. Rapidmind Multi-core Development Platform, http://www.rapidmind.net.
- Rapidmind Multi-core Development Platform

18
- 79959466764
- S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W. mei W. Hwu. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, pages 73-82, New York, NY, USA, 2008. ACM.

19
- 43449094719
- Program optimization space pruning for a multithreaded GPU
- New York, NY, USA, ACM
- S. Ryoo, C. I. Rodrigues, S. S. Stone, S. S. Baghsorkhi, S.-Z. Ueng, J. A. Stratton, and W. mei W. Hwu. Program optimization space pruning for a multithreaded GPU. In CGO '08: Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization, pages 195-204, New York, NY, USA, 2008. ACM.
- (2008) CGO '08: Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization , pp. 195-204
- Ryoo, S.¹ Rodrigues, C.I.² Stone, S.S.³ Baghsorkhi, S.S.⁴ Ueng, S.-Z.⁵ Stratton, J.A.⁶ Hwu, W. mei W.⁷

20
- 33947595619
- Accelerator: Using data parallelism to program GPUs for general-purpose uses
- Oct
- D. Tarditi, S. Puri, and J. Oglesby. Accelerator: Using data parallelism to program GPUs for general-purpose uses. In ASPLOS-XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, Oct 2006.
- (2006) ASPLOS-XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
- Tarditi, D.¹ Puri, S.² Oglesby, J.³

21
- 35448978324
- EXOCHI: Architecture and programming environment for a heterogeneous multi-core multithreaded system
- Jun
- P. Wang, J. Collins, G. Chinya, H. Jiang, X. Tian, M. Girkar, N. Yang, G.-Y. Lueh, and H. Wang. EXOCHI: Architecture and programming environment for a heterogeneous multi-core multithreaded system. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, Jun 2007.
- (2007) PLDI '07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
- Wang, P.¹ Collins, J.² Chinya, G.³ Jiang, H.⁴ Tian, X.⁵ Girkar, M.⁶ Yang, N.⁷ Lueh, G.-Y.⁸ Wang, H.⁹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.