SCOPUS Information Retrieval Platform

Proceedings of the 2009 CGO - 7th International Symposium on Code Generation and Optimization

Volume , Issue , 2009, Pages 200-209

Software pipelined execution of stream programs on GPUs

(3) Udupa, Abhishek (a); Govindarajan, R. (a); Thazhuthaveetil, Matthew J. (a)

(a) Indian Institute of Science (India)

Author keywords

CUDA; GPU programming; Software pipelining; Stream programming

Indexed keywords

BANDWIDTH LIMITATION; COMMUNICATION CHANNEL; CUDA; DATA PARALLELISM; GENERAL PURPOSE; GPU PROGRAMMING; GRAPHICS PROCESSING UNITS; HIGH MEMORY BANDWIDTH; INTEGER LINEAR PROGRAMS; MULTICORE ARCHITECTURES; PROGRAMMING MODELS; SINGLE-THREADED; SOFTWARE PIPELINE; SOFTWARE PIPELINING; STREAM PROGRAMMING; STREAMING APPLICATIONS;

NETWORK COMPONENTS; PIPELINE PROCESSING SYSTEMS; PIPELINES; PROGRAM PROCESSORS; SCHEDULING;

INTEGER PROGRAMMING;

EID: 67650563116 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/CGO.2009.20 Document Type: Conference Paper

Times cited : (83)

References (23)

1
- 67650552214
- NVIDIA CUDA Programming Guide
- NVIDIA CUDA Programming Guide. [Online]. Available: http://www.nvidia.com/cuda

2
- 43449094719
- Program optimization space pruning for a multithreaded GPU
- DOI 10.1145/1356058.1356084, Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization
- S. Ryoo, C. I. Rodrigues, S. S. Stone, S. S. Baghsorkhi, S.-Z. Ueng, J. A. Stratton, and W.-M. W. Hwu, "Program Optimization Space Pruning for a Multithreaded GPU," in CGO '08: Proc. of the sixth annual IEEE/ACM Intl. Symp. on Code Generation and Optimization, 2008, pp. 195-204. (Pubitemid 351667266)
- (2008) Proceedings of the 2008 CGO - Sixth International Symposium on Code Generation and Optimization , pp. 195-204
- Ryoo, S.¹ Rodrigues, C.I.² Stone, S.S.³ Baghsorkhi, S.S.⁴ Ueng, S.-Z.⁵ Stratton, J.A.⁶ Hwu, W.-M.W.⁷

3
- 67650530896
- ATI CTM Guide
- ATI CTM Guide. [Online]. Available: http://ati.amd.com/companyinfo/researcher/documents/ATI CTM Guide.pdf

4
- 67650543629
- NVIDIA CUDA
- NVIDIA CUDA. [Online]. Available: http://www.nvidia.com/cuda

5
- 34547423880
- Exploiting Coarse-grained Task, Data, and Pipeline Parallelism in Stream Programs
- M. I. Gordon, W. Thies, and S. Amarasinghe, "Exploiting Coarse-grained Task, Data, and Pipeline Parallelism in Stream Programs," in ASPLOS-XII: Proc. of the 12th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, 2006, pp. 151-162.
- ASPLOS-XII: Proc. of the 12th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, 2006 , pp. 151-162
- Gordon, M.I.¹ Thies, W.² Amarasinghe, S.³

6
- 84959045524
- StreamIt: A Language for Streaming Applications
- W. Thies, M. Karczmarek, and S. P. Amarasinghe, "StreamIt: A Language for Streaming Applications," in CC '02: Proc. of the 11th Intl. Conf. on Compiler Construction, 2002, pp. 179-196.
- CC '02: Proc. of the 11th Intl. Conf. on Compiler Construction, 2002 , pp. 179-196
- Thies, W.¹ Karczmarek, M.² Amarasinghe, S.P.³

7
- 10644248153
- Brook for GPUs: Stream computing on graphics hardware
- DOI 10.1145/1015706.1015800, ACM Transactions on Graphics - Proceedings of ACM SIGGRAPH 2004
- I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan, "Brook for GPUs: Stream Computing on Graphics Hardware," ACM Trans. on Graphics, vol.23, no.3, pp. 777-786, 2004. (Pubitemid 40163782)
- (2004) ACM Transactions on Graphics , vol.23 , Issue.3 , pp. 777-786
- Buck, I.¹ Foley, T.² Horn, D.³ Sugerman, J.⁴ Fatahalian, K.⁵ Houston, M.⁶ Hanrahan, P.⁷

8
- 33947595619
- Accelerator: Using data parallelism to program GPUs for general-purpose uses
- DOI 10.1145/1168857.1168898, ASPLOS XII: Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems
- D. Tarditi, S. Puri, and J. Oglesby, "Accelerator: Using Data Parallelism to Program GPUs for General-Purpose Uses," in ASPLOS-XII: Proc. of the 12th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, 2006, pp. 325-335. (Pubitemid 47168412)
- (2006) International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS , pp. 325-335
- Tarditi, D.¹ Puri, S.² Oglesby, J.³

9
- 57349172999
- Orchestrating the Execution of Stream Programs on Multicore Platforms
- M. Kudlur and S. Mahlke, "Orchestrating the Execution of Stream Programs on Multicore Platforms," in PLDI '08: Proc. of the 2008 ACM SIGPLAN Conf. on Programming Language Design and Implementation, 2008, pp. 114-124.
- PLDI '08: Proc. of the 2008 ACM SIGPLAN Conf. on Programming Language Design and Implementation, 2008 , pp. 114-124
- Kudlur, M.¹ Mahlke, S.²

10
- 29144520630
- Optimizing Stream Programs using Linear State Space Analysis
- S. Agrawal, W. Thies, and S. Amarasinghe, "Optimizing Stream Programs using Linear State Space Analysis," in CASES '05: Proc. of the 2005 Intl. Conf. on Compilers, Architectures and Synthesis for Embedded Systems, 2005, pp. 126-136.
- CASES '05: Proc. of the 2005 Intl. Conf. on Compilers, Architectures and Synthesis for Embedded Systems, 2005 , pp. 126-136
- Agrawal, S.¹ Thies, W.² Amarasinghe, S.³

11
- 0023138886
- Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing
- E. A. Lee and D. G. Messerschmitt, "Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing," IEEE Trans. on Computers, vol.36, no.1, pp. 24-35, 1987. (Pubitemid 17517473)
- (1987) IEEE Transactions on Computers , vol.C-36 , Issue.1 , pp. 24-35
- Lee, E.A.¹ Messerschmitt, D.G.²

12
- 0028731856
- Looped Schedules for Dataflow Descriptions of Multirate Signal Processing Algorithms
- S. S. Bhattacharyya and E. A. Lee, "Looped Schedules for Dataflow Descriptions of Multirate Signal Processing Algorithms," Formal Methods in System Design, vol.5, no.3, pp. 183-205, 1994.
- (1994) Formal Methods in System Design , vol.5 , Issue.3 , pp. 183-205
- Bhattacharyya, S.S.¹ Lee, E.A.²

13
- 0242696254
- Phased Scheduling of Stream Programs
- M. Karczmarek, W. Thies, and S. Amarasinghe, "Phased Scheduling of Stream Programs," in LCTES '03: Proc. of the 2003 ACM SIGPLAN Conf. on Language, Compiler, and Tool Support for Embedded Systems, 2003, pp. 103-112.
- LCTES '03: Proc. of the 2003 ACM SIGPLAN Conf. on Language, Compiler, and Tool Support for Embedded Systems, 2003 , pp. 103-112
- Karczmarek, M.¹ Thies, W.² Amarasinghe, S.³

14
- 0028768026
- Minimizing Register Requirements under Resource-constrained Rate-optimal Software Pipelining
- R. Govindarajan, E. R. Altman, and G. R. Gao, "Minimizing Register Requirements Under Resource-constrained Rate-optimal Software Pipelining," in MICRO 27: Proc. of the 27th annual Intl. Symp. on Microarchitecture, 1994, pp. 85-94.
- MICRO 27: Proc. of the 27th Annual Intl. Symp. on Microarchitecture, 1994 , pp. 85-94
- Govindarajan, R.¹ Altman, E.R.² Gao, G.R.³

15
- 85017274167
- A Novel Framework for Multi-rate Scheduling in DSP Applications
- R. Govindarajan and G. Gao, "A Novel Framework for Multi-rate Scheduling in DSP Applications," in ASAP '93: Proc. of the 1993 Intl. Conf. on Application-Specific Array Processors, Oct 1993, pp. 77-88.
- ASAP '93: Proc. of the 1993 Intl. Conf. on Application-Specific Array Processors, Oct 1993 , pp. 77-88
- Govindarajan, R.¹ Gao, G.²

16
- 0026976353
- Code generation schema for modulo scheduled loops
- B. R. Rau, M. S. Schlansker, and P. P. Tirumalai, "Code Generation Schema for Modulo Scheduled Loops," in MICRO 25: Proc. of the 25th annual Intl. Symp. on Microarchitecture, 1992, pp. 158-169. (Pubitemid 23633740)
- (1992) Proceedings of the 25th Annual International Symposium on Microarchitecture , pp. 158-169
- Rau, B.R.¹ Schlansker, M.S.² Tirumalai, P.P.³

17
- 67650513069
- StreamIt Home Page
- StreamIt Home Page. [Online]. Available: http://www.cag.lcs.mit.edu/streamit/

18
- 2942564428
- Buffer merging - A powerful technique for reducing memory requirements of synchronous dataflow specifications
- DOI 10.1145/989995.989999
- P. K. Murthy and S. S. Bhattacharyya, "Buffer Merging-A Powerful Technique for Reducing Memory Requirements of Synchronous Dataflow Specifications," ACM Trans. on Design and Automation of Electronic Systems, vol.9, no.2, pp. 212-237, 2004. (Pubitemid 38732390)
- (2004) ACM Transactions on Design Automation of Electronic Systems , vol.9 , Issue.2 , pp. 212-237
- Murthy, P.K.¹ Bhattacharyya, S.S.²

19
- 80455123249
- Well-Behaved Dataflow Programs for DSP Computation
- G. Gao, R. Govindarajan, and P. Panangaden, "Well-Behaved Dataflow Programs for DSP Computation," ICASSP-92: IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, vol.5, pp. 561-564, Mar 1992.
- (1992) ICASSP-92: IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, 1992 , vol.5 , pp. 561-564
- Gao, G.¹ Govindarajan, R.² Panangaden, P.³

20
- 0028602312
- Minimizing Memory Requirements in Rate-optimal Schedules
- R. Govindarajan, G. Gao, and P. Desai, "Minimizing Memory Requirements in Rate-optimal Schedules," in ASAP '94: Proc. of the 1994 Intl. Conf. on Application Specific Array Processors, Aug 1994, pp. 75-86.
- ASAP '94: Proc. of the 1994 Intl. Conf. on Application Specific Array Processors, Aug 1994 , pp. 75-86
- Govindarajan, R.¹ Gao, G.² Desai, P.³

21
- 0036959649
- A Stream Compiler for Communication-Exposed Architectures
- M. I. Gordon, W. Thies, M. Karczmarek, J. Lin, A. S. Meli, A. A. Lamb, C. Leger, J. Wong, H. Hoffmann, D. Maze, and S. Amarasinghe, "A Stream Compiler for Communication-Exposed Architectures," in ASPLOS-X: Proc. of the 10th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, 2002, pp. 291-303.
- ASPLOS-X: Proc. of the 10th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, 2002 , pp. 291-303
- Gordon, M.I.¹ Thies, W.² Karczmarek, M.³ Lin, J.⁴ Meli, A.S.⁵ Lamb, A.A.⁶ Leger, C.⁷ Wong, J.⁸ Hoffmann, H.⁹ Maze, D.¹⁰ Amarasinghe, S.¹¹

22
- 67650528532
- A Lightweight Streaming Layer for Multicore Execution
- D. Zhang, Q. J. Li, R. Rabbah, and S. Amarasinghe, "A Lightweight Streaming Layer for Multicore Execution," SIGARCH Computer Architecture News, vol.36, no.2, pp. 18-27, 2008.
- (2008) SIGARCH Computer Architecture News , vol.36 , Issue.2 , pp. 18-27
- Zhang, D.¹ Li, Q.J.² Rabbah, R.³ Amarasinghe, S.⁴

23
- 79959466764
- Optimization Principles and Application Performance Evaluation of a Multithreaded GPU using CUDA
- S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W.-M. W. Hwu, "Optimization Principles and Application Performance Evaluation of a Multithreaded GPU using CUDA," in PPoPP '08: Proc. of the 13th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, 2008, pp. 73-82.
- PPoPP '08: Proc. of the 13th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, 2008 , pp. 73-82
- Ryoo, S.¹ Rodrigues, C.I.² Baghsorkhi, S.S.³ Stone, S.S.⁴ Kirk, D.B.⁵ Hwu, W.-M.W.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.