SCOPUS Information Search Platform

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)

Volume , Issue , 2012, Pages 13-22

Adaptive input-aware compilation for graphics engines

(5) Samadi, Mehrzad a Hormati, Amir b Mehrara, Mojtaba c Lee, Janghaeng a Mahlke, Scott a

a UNIVERSITY OF MICHIGAN (United States)

b MICROSOFT RESEARCH (United States)

c NVIDIA (United States)

Author keywords

Compiler; GPU; Optimization; Portability; Streaming

Indexed keywords

APPLICATION DESIGN; CODE GENERATION; COMFORT ZONE; COMPILER; GPU; GRAPHICS ENGINE; GRAPHICS PROCESSING UNITS; HIGH PERFORMANCE COMPUTATION; MEMORY HIERARCHY; PERFORMANCE TUNING; PROGRAMMABILITY; RUNTIMES;

ACOUSTIC STREAMING; COMPUTER ARCHITECTURE; COMPUTER GRAPHICS; COMPUTER GRAPHICS EQUIPMENT; COMPUTER PROGRAMMING; COMPUTER SOFTWARE PORTABILITY; HIGH LEVEL LANGUAGES; PROGRAM COMPILERS;

OPTIMIZATION;

EID: 84863443917 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2254064.2254067 Document Type: Conference Paper

Times cited : (21)

References (34)

1
- 77951572335
- Automatic C-to-CUDA code generation for affine programs
- M. M. Baskaran, J. Ramanujam, and P. Sadayappan. Automatic C-to-CUDA code generation for affine programs. In Proc. of the 19th International Conference on Compiler Construction, pages 244-263, 2010.
- (2010) Proc. of the 19th International Conference on Compiler Construction , pp. 244-263
- Baskaran, M.M.¹ Ramanujam, J.² Sadayappan, P.³

2
- 10644248153
- Brook for GPUs: Stream computing on graphics hardware
- I. Buck et al. Brook for GPUs: Stream computing on graphics hardware. ACM Transactions on Graphics, 23(3):777-786, Aug. 2004.
- (2004) ACM Transactions on Graphics , vol.23 , Issue.3 , pp. 777-786
- Buck, I.¹

3
- 80053989560
- Copperhead: Compiling an embedded data parallel language
- B. Catanzaro, M. Garland, and K. Keutzer. Copperhead: compiling an embedded data parallel language. In Proc. of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 47-56, 2011.
- (2011) Proc. of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming , pp. 47-56
- Catanzaro, B.¹ Garland, M.² Keutzer, K.³

4
- 53749089739
- Fast support vector machine training and classification on graphics processors
- B. Catanzaro, N. Sundaram, and K. Keutzer. Fast support vector machine training and classification on graphics processors. In Proc. of the 25th International Conference on Machine learning, pages 104-111, 2008.
- (2008) Proc. of the 25th International Conference on Machine Learning , pp. 104-111
- Catanzaro, B.¹ Sundaram, N.² Keutzer, K.³

5
- 77952280338
- Compiling python to a hybrid execution environment
- R. Garg and J. N. Amaral. Compiling python to a hybrid execution environment. In Proc. of the 3rd Workshop on General Purpose Processing on Graphics Processing Units, pages 19-30, 2010.
- (2010) Proc. of the 3rd Workshop on General Purpose Processing on Graphics Processing Units , pp. 19-30
- Garg, R.¹ Amaral, J.N.²

6
- 34547423880
- Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
- M. I. Gordon, W. Thies, and S. Amarasinghe. Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. In 14th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 151-162, 2006.
- (2006) 14th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 151-162
- Gordon, M.I.¹ Thies, W.² Amarasinghe, S.³

7
- 79952593905
- CnC-CUDA: Declarative programming for GPUs
- M. Grossman, A. Simion, Z. Budimlić, and V. Sarkar. CnC-CUDA: Declarative Programming for GPUs. In Proc. of the 23rd Workshop on Languages and Compilers for Parallel Computing, pages 230-245, 2010.
- (2010) Proc. of the 23rd Workshop on Languages and Compilers for Parallel Computing , pp. 230-245
- Grossman, M.¹ Simion, A.² Budimlić, Z.³ Sarkar, V.⁴

8
- 79952031801
- HiCUDA: High-level GPGPU programming
- T. Han and T. Abdelrahman. hiCUDA: High-level GPGPU programming. IEEE Transactions on Parallel and Distributed Systems, 22(1):52-61, 2010.
- (2010) IEEE Transactions on Parallel and Distributed Systems , vol.22 , Issue.1 , pp. 52-61
- Han, T.¹ Abdelrahman, T.²

9
- 78149231331
- MapCG: Writing parallel program portable between CPU and GPU
- C. Hong, D. Chen, W. Chen, W. Zheng, and H. Lin. MapCG: writing parallel program portable between CPU and GPU. In Proc. of the 19th International Conference on Parallel Architectures and Compilation Techniques, pages 217-226, 2010.
- (2010) Proc. of the 19th International Conference on Parallel Architectures and Compilation Techniques , pp. 217-226
- Hong, C.¹ Chen, D.² Chen, W.³ Zheng, W.⁴ Lin, H.⁵

10
- 70450231944
- An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
- S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In Proc. of the 36th Annual International Symposium on Computer Architecture, pages 152-163, 2009.
- (2009) Proc. of the 36th Annual International Symposium on Computer Architecture , pp. 152-163
- Hong, S.¹ Kim, H.²

11
- 77952252026
- Macross: Macro-simdization of streaming applications
- A. Hormati, Y. Choi, M. Woh, M. Kudlur, R. Rabbah, T. Mudge, and S. Mahlke. Macross: Macro-simdization of streaming applications. In 18th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 285-296, 2010.
- (2010) 18th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 285-296
- Hormati, A.¹ Choi, Y.² Woh, M.³ Kudlur, M.⁴ Rabbah, R.⁵ Mudge, T.⁶ Mahlke, S.⁷

12
- 79953071805
- Sponge: Portable stream programming on graphics engines
- A. H. Hormati, M. Samadi, M. Woh, T. Mudge, and S. Mahlke. Sponge: portable stream programming on graphics engines. In 19th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 381-392, 2011.
- (2011) 19th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 381-392
- Hormati, A.H.¹ Samadi, M.² Woh, M.³ Mudge, T.⁴ Mahlke, S.⁵

13
- 57349172999
- Orchestrating the execution of stream programs on multicore platforms
- M. Kudlur and S. Mahlke. Orchestrating the execution of stream programs on multicore platforms. In Proc. of the '08 Conference on Programming Language Design and Implementation, pages 114-124, June 2008.
- (2008) Proc. of the '08 Conference on Programming Language Design and Implementation , pp. 114-124
- Kudlur, M.¹ Mahlke, S.²

14
- 70350583252
- OpenMP to GPGPU: A compiler framework for automatic translation and optimization
- S. Lee, S.-J. Min, and R. Eigenmann. OpenMP to GPGPU: a compiler framework for automatic translation and optimization. In Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 101-110, 2009.
- (2009) Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming , pp. 101-110
- Lee, S.¹ Min, S.-J.² Eigenmann, R.³

15
- 77954995885
- Debunking the 100x GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU
- V. W. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim, A. D. Nguyen, N. Satish, M. Smelyanskiy, S. Chennupaty, P. Hammarlund, R. Singhal, and P. Dubey. Debunking the 100x GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. In Proc. of the 37th Annual International Symposium on Computer Architecture, pages 451-460, 2010.
- (2010) Proc. of the 37th Annual International Symposium on Computer Architecture , pp. 451-460
- Lee, V.W.¹ Kim, C.² Chhugani, J.³ Deisher, M.⁴ Kim, D.⁵ Nguyen, A.D.⁶ Satish, N.⁷ Smelyanskiy, M.⁸ Chennupaty, S.⁹ Hammarlund, P.¹⁰ Singhal, R.¹¹ Dubey, P.¹²

16
- 70449633228
- Automatic parallelization for graphics processing units
- A. Leung, O. Lhoták, and G. Lashari. Automatic parallelization for graphics processing units. In Proc. of the 7th International Conference on Principles and Practice of Programming in Java, pages 91-100, 2009.
- (2009) Proc. of the 7th International Conference on Principles and Practice of Programming in Java , pp. 91-100
- Leung, A.¹ Lhoták, O.² Lashari, G.³

17
- 3042561893
- A dynamically tuned sorting library
- X. Li, M. J. Garzarán, and D. Padua. A dynamically tuned sorting library. In Proc. of the 2004 International Symposium on Code Generation and Optimization, pages 111-, 2004.
- (2004) Proc. of the 2004 International Symposium on Code Generation and Optimization , pp. 111
- Li, X.¹ Garzarán, M.J.² Padua, D.³

18
- 70450103746
- A cross-input adaptive framework for GPU program optimizations
- Y. Liu, E. Z. Zhang, and X. Shen. A cross-input adaptive framework for GPU program optimizations. In 2009 IEEE International Symposium on Parallel and Distributed Processing, pages 1-10, 2009.
- (2009) 2009 IEEE International Symposium on Parallel and Distributed Processing , pp. 1-10
- Liu, Y.¹ Zhang, E.Z.² Shen, X.³

19
- 70449723385
- Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
- J. Meng and K. Skadron. Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs. In Proc. of the 2009 International Conference on Supercomputing, pages 256-265, 2009.
- (2009) Proc. of the 2009 International Conference on Supercomputing , pp. 256-265
- Meng, J.¹ Skadron, K.²

20
- 79953108478
- NVIDIA. GPUs Are Only Up To 14 Times Faster than CPUs says Intel, 2010. http://blogs.nvidia.com/ntersect/2010/06/gpus-are-only-upto-14-times-faster-than-cpus-says-intel.html.
- (2010) GPUs Are Only up to 14 Times Faster than CPUs Says Intel

21
- 79959876216
- Automatic compilation of MATLAB programs for synergistic execution on heterogeneous processors
- A. Prasad, J. Anantpur, and R. Govindarajan. Automatic compilation of MATLAB programs for synergistic execution on heterogeneous processors. In Proc. of the '11 Conference on Programming Language Design and Implementation, pages 152-163, 2011.
- (2011) Proc. of the '11 Conference on Programming Language Design and Implementation , pp. 152-163
- Prasad, A.¹ Anantpur, J.² Govindarajan, R.³

22
- 77954709868
- Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations
- V. T. Ravi, W. Ma, D. Chiu, and G. Agrawal. Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations. In Proc. of the 2010 International Conference on Supercomputing, pages 137-146, 2010.
- (2010) Proc. of the 2010 International Conference on Supercomputing , pp. 137-146
- Ravi, V.T.¹ Ma, W.² Chiu, D.³ Agrawal, G.⁴

23
- 51849142544
- Efficient stream reduction on the GPU
- D. Roger, U. Assarsson, and N. Holzschuch. Efficient stream reduction on the GPU. In Proc. of the 1st Workshop on General Purpose Processing on Graphics Processing Units, pages 1-4, 2007.
- (2007) Proc. of the 1st Workshop on General Purpose Processing on Graphics Processing Units , pp. 1-4
- Roger, D.¹ Assarsson, U.² Holzschuch, N.³

24
- 79959466764
- Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
- S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W.-M. W. Hwu. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In Proc. of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 73-82, 2008.
- (2008) Proc. of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming , pp. 73-82
- Ryoo, S.¹ Rodrigues, C.I.² Baghsorkhi, S.S.³ Stone, S.S.⁴ Kirk, D.B.⁵ Hwu, W.-M.W.⁶

25
- 33947595619
- Accelerator: Using data parallelism to program GPUs for general-purpose uses
- D. Tarditi, S. Puri, and J. Oglesby. Accelerator: using data parallelism to program GPUs for general-purpose uses. In 14th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 325-335, 2006.
- (2006) 14th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 325-335
- Tarditi, D.¹ Puri, S.² Oglesby, J.³

26
- 78149262760
- An empirical characterization of stream programs and its implications for language and compiler design
- W. Thies and S. Amarasinghe. An empirical characterization of stream programs and its implications for language and compiler design. In Proc. of the 19th International Conference on Parallel Architectures and Compilation Techniques, pages 365-376, 2010.
- (2010) Proc. of the 19th International Conference on Parallel Architectures and Compilation Techniques , pp. 365-376
- Thies, W.¹ Amarasinghe, S.²

27
- 84959045524
- StreamIt: A language for streaming applications
- W. Thies, M. Karczmarek, and S. P. Amarasinghe. StreamIt: A language for streaming applications. In Proc. of the 2002 International Conference on Compiler Construction, pages 179-196, 2002.
- (2002) Proc. of the 2002 International Conference on Compiler Construction , pp. 179-196
- Thies, W.¹ Karczmarek, M.² Amarasinghe, S.P.³

28
- 31844454218
- A framework for adaptive algorithm selection in STAPL
- Proceedings of the 2005 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 05
- N. Thomas, G. Tanase, O. Tkachyshyn, J. Perdue, N. M. Amato, and L. Rauchwerger. A framework for adaptive algorithm selection in STAPL. In Proc. of the 10th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 277-288, 2005. (Pubitemid 43182854)
- (2005) Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP , pp. 277-288
- Thomas, N.¹ Tanase, G.² Tkachyshyn, O.³ Perdue, J.⁴ Amato, N.M.⁵ Rauchwerger, L.⁶

29
- 78650086311
- An input-centric paradigm for program dynamic optimizations
- K. Tian, Y. Jiang, E. Z. Zhang, and X. Shen. An input-centric paradigm for program dynamic optimizations. In Proceedings of the OOPSLA'10, pages 125-139, 2010.
- (2010) Proceedings of the OOPSLA'10 , pp. 125-139
- Tian, K.¹ Jiang, Y.² Zhang, E.Z.³ Shen, X.⁴

30
- 67650563116
- Software pipelined execution of stream programs on GPUs
- A. Udupa, R. Govindarajan, and M. J. Thazhuthaveetil. Software pipelined execution of stream programs on GPUs. In Proc. of the 2009 International Symposium on Code Generation and Optimization, pages 200-209, 2009.
- (2009) Proc. of the 2009 International Symposium on Code Generation and Optimization , pp. 200-209
- Udupa, A.¹ Govindarajan, R.² Thazhuthaveetil, M.J.³

31
- 77952281697
- Implementing the PGI accelerator model
- M. Wolfe. Implementing the PGI accelerator model. In Proc. of the 3rd Workshop on General Purpose Processing on Graphics Processing Units, pages 43-50, 2010.
- (2010) Proc. of the 3rd Workshop on General Purpose Processing on Graphics Processing Units , pp. 43-50
- Wolfe, M.¹

32
- 78249259656
- Exploiting more parallelism from applications having generalized reductions on GPU architectures
- X.-L. Wu, N. Obeid, and W.-M. Hwu. Exploiting more parallelism from applications having generalized reductions on GPU architectures. In Proc. of the 2010 10th International Conference on Computers and Information Technology, pages 1175-1180, 2010.
- (2010) Proc. of the 2010 10th International Conference on Computers and Information Technology , pp. 1175-1180
- Wu, X.-L.¹ Obeid, N.² Hwu, W.-M.³

33
- 77954691442
- A GPGPU compiler for memory optimization and parallelism management
- Y. Yang, P. Xiang, J. Kong, and H. Zhou. A GPGPU compiler for memory optimization and parallelism management. In Proc. of the '10 Conference on Programming Language Design and Implementation, pages 86-97, 2010.
- (2010) Proc. of the '10 Conference on Programming Language Design and Implementation , pp. 86-97
- Yang, Y.¹ Xiang, P.² Kong, J.³ Zhou, H.⁴

34
- 58449127539
- CUDALite: Reducing GPU programming complexity
- S.-Z. Ueng, M. Lathara, S. S. Baghsorkhi, and W.-M. W. Hwu. CUDALite: Reducing GPU programming complexity. In Proc. of the 21st Workshop on Languages and Compilers for Parallel Computing, pages 1-15, 2008.
- (2008) Proc. of the 21st Workshop on Languages and Compilers for Parallel Computing , pp. 1-15
- Ueng, S.-Z.¹ Lathara, M.² Baghsorkhi, S.S.³ Hwu, W.-M.W.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.