SCOPUS Information Retrieval Platform

ACM International Conference Proceeding Series

Volume , Issue , 2011, Pages

A framework for dynamically instrumenting GPU compute applications within GPU Ocelot

(5) Farooqui, Naila a; Kerr, Andrew b; Diamos, Gregory b; Yalamanchili, S b; Schwan, K a

a Georgia Institute of Technology (United States)

b GEORGIA INSTITUTE OF TECHNOLOGY (United States)

Author keywords

CUDA; dynamic binary compilation; GPGPU; GPU computing; instrumentation; Ocelot; OpenCL; Parboil; PTX; Rodinia

Indexed keywords

CUDA; GPGPU; GPU COMPUTING; INSTRUMENTATION; OCELOT; OPENCL; PARBOIL; PTX; RODINIA;

DATA STRUCTURES; INSTRUMENTS; PROGRAM PROCESSORS;

COMPUTER GRAPHICS EQUIPMENT;

EID: 79955069636 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1964179.1964192 Document Type: Conference Paper

Times cited : (15)

References (20)

1
- 70349100958
- KHRONOS OpenCL Working Group December
- KHRONOS OpenCL Working Group. The OpenCL Specification, December 2008.
- (2008) The OpenCL Specification

2
- 67650694407
- NVIDIA NVIDIA Corporation, Santa Clara, California, 2.1 edition, October
- NVIDIA. NVIDIA CUDA Compute Unified Device Architecture. NVIDIA Corporation, Santa Clara, California, 2.1 edition, October 2008.
- (2008) NVIDIA CUDA Compute Unified Device Architecture

3
- 70649102016
- NVIDIA NVIDIA Corporation, Santa Clara, California, 1.3 edition, October
- NVIDIA. NVIDIA Compute PTX: Parallel Thread Execution. NVIDIA Corporation, Santa Clara, California, 1.3 edition, October 2008.
- (2008) NVIDIA Compute PTX: Parallel Thread Execution

4
- 70649115322
- June
- Gregory Diamos, Andrew Kerr, and Sudhakar Yalamanchili. Gpuocelot: A binary translation framework for ptx, June 2009. http://code.google.com/p/gpuocelot/.
- (2009) Gpuocelot: A Binary Translation Framework for Ptx
- Diamos, G.¹ Kerr, A.² Yalamanchili, S.³

5
- 78149233155
- Ocelot: A dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
- New York, NY, USA, ACM
- Gregory Diamos, Andrew Kerr, Sudhakar Yalamanchili, and Nathan Clark. Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems. In Proceedings of the 19th international conference on Parallel architectures and compilation techniques, PACT '10, pages 353-364, New York, NY, USA, 2010. ACM.
- (2010) Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT '10 , pp. 353-364
- Diamos, G.¹ Kerr, A.² Yalamanchili, S.³ Clark, N.⁴

6
- 84856622869
- Caracal: Dynamic translation of runtime environments for gpus
- To appear
- Rodrigo Dominguez, Dana Schaa, and David Kaeli. Caracal: Dynamic translation of runtime environments for gpus. In Proceedings of the 4th Workshop on General-Purpose Computation on Graphics Processing Units, 2011. To appear.
- Proceedings of the 4th Workshop on General-Purpose Computation on Graphics Processing Units, 2011
- Dominguez, R.¹ Schaa, D.² Kaeli, D.³

7
- 0026243790
- Efficiently computing static single assignment form and the control dependence graph
- Oct
- Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems, 13(4):451-490, Oct 1991.
- (1991) ACM Transactions on Programming Languages and Systems , vol.13 , Issue.4 , pp. 451-490
- Cytron, R.¹ Ferrante, J.² Rosen, B.K.³ Wegman, M.N.⁴ Zadeck, F.K.⁵

8
- 70649104826
- A characterization and analysis of ptx kernels
- Andrew Kerr, Gregory Diamos, and Sudhakar Yalamanchili. A characterization and analysis of ptx kernels. Workload Characterization, 2009. IISWC 2009. IEEE International Symposium on, 2009.
- Workload Characterization, 2009. IISWC 2009. IEEE International Symposium On, 2009
- Kerr, A.¹ Diamos, G.² Yalamanchili, S.³

9
- 3042658703
- LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
- Chris Lattner and Vikram Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), Palo Alto, California, Mar 2004.
- Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), Palo Alto, California, Mar 2004
- Lattner, C.¹ Adve, V.²

10
- 67650692011
- IMPACT
- IMPACT. The parboil benchmark suite, 2007.
- (2007) The Parboil Benchmark Suite

11
- 77951900491
- NVIDIA Corporation white paper, NVIDIA, November
- NVIDIA Corporation. Nvidia's next generation compute architecture: Fermi. white paper, NVIDIA, November 2009.
- (2009) Nvidia's next Generation Compute Architecture: Fermi

12
- 79955071201
- NVIDIA NVIDIA Corporation, Santa Clara, California, 1.0 edition, October
- NVIDIA. NVIDIA Compute Visual Profiler. NVIDIA Corporation, Santa Clara, California, 1.0 edition, October 2010.
- (2010) NVIDIA Compute Visual Profiler

13
- 70349123351
- Gvim: Gpu-accelerated virtual machines
- New York, NY, USA, ACM
- Vishakha Gupta, Ada Gavrilovska, Karsten Schwan, Harshvardhan Kharche, Niraj Tolia, Vanish Talwar, and Parthasarathy Ranganathan. Gvim: Gpu-accelerated virtual machines. In Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing, HPCVirt '09, pages 17-24, New York, NY, USA, 2009. ACM.
- (2009) Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing, HPCVirt '09 , pp. 17-24
- Gupta, V.¹ Gavrilovska, A.² Schwan, K.³ Kharche, H.⁴ Tolia, N.⁵ Talwar, V.⁶ Ranganathan, P.⁷

14
- 77954994853
- An integrated gpu power and performance model
- Sunpyo Hong and Hyesoon Kim. An integrated gpu power and performance model. Computer Architecture. IEEE International Symposium on, 2010.
- Computer Architecture. IEEE International Symposium On, 2010
- Hong, S.¹ Kim, H.²

15
- 31944440969
- Pin: Building customized program analysis tools with dynamic instrumentation
- New York, NY, USA, ACM
- Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, PLDI '05, pages 190-200, New York, NY, USA, 2005. ACM.
- (2005) Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '05 , pp. 190-200
- Luk, C.-K.¹ Cohn, R.² Muth, R.³ Patil, H.⁴ Klauser, A.⁵ Lowney, G.⁶ Wallace, S.⁷ Reddi, V.J.⁸ Hazelwood, K.⁹

16
- 77749265837
- Automated dynamic analysis of cuda programs
- Michael Boyer, Kevin Skadron, and Westley Weimer. Automated dynamic analysis of cuda programs. Third Workshop on Software Tools for MultiCore Systems (STMCS), 2008.
- Third Workshop on Software Tools for MultiCore Systems (STMCS), 2008
- Boyer, M.¹ Skadron, K.² Weimer, W.³

17
- 70349169075
- Analyzing cuda workloads using a detailed gpu simulator
- Ali Bakhoda, George Yuan, Wilson W. L. Fung, Henry Wong, and Tor M. Aamodt. Analyzing cuda workloads using a detailed gpu simulator. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Boston, MA, USA, April 2009.
- IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Boston, MA, USA, April 2009
- Bakhoda, A.¹ Yuan, G.² Fung, W.W.L.³ Wong, H.⁴ Aamodt, T.M.⁵

18
- 70649096881
- Technical Report hal-00359342
- Sylvain Collange, David Defour, and David Parello. Barra, a modular functional gpu simulator for gpgpu. Technical Report hal-00359342, 2009.
- (2009) Barra, A Modular Functional Gpu Simulator for Gpgpu
- Collange, S.¹ Defour, D.² Parello, D.³

19
- 79955921273
- A quantitative performance analysis model for gpu architectures
- Yao Zhang and John D. Owens. A quantitative performance analysis model for gpu architectures. In Proceedings of the 17th IEEE International Symposium on High-Performance Computer Architecture (HPCA 17), February 2011.
- Proceedings of the 17th IEEE International Symposium on High-Performance Computer Architecture (HPCA 17), February 2011
- Zhang, Y.¹ Owens, J.D.²

20
- 70450231944
- An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness
- Sunpyo Hong and Hyesoon Kim. An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness. SIGARCH Comput. Archit. News, 37(3):152-163, 2009.
- (2009) SIGARCH Comput. Archit. News , vol.37 , Issue.3 , pp. 152-163
- Hong, S.¹ Kim, H.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.