2012, Pages 23-32

Dynamic compilation of data-parallel kernels for vector processors

Author keywords

[No Author keywords available]

Indexed keywords

ARCHITECTURE DESIGNERS; CONTROL-FLOW; DATA PARALLEL; DYNAMIC COMPILATION; FUNCTIONAL UNITS; GPU COMPUTING; INSTRUCTION SET EXTENSION; MICRO-BENCHMARK; MODERN PROCESSORS; MULTI CORE; OVER CURRENT; PERFORMANCE IMPROVEMENTS; PERFORMANCE SCALABILITY; POWER EFFICIENCY; PROGRAM TRANSFORMATIONS; REAL-WORLD APPLICATION; SOFTWARE PARALLELISM; STATE OF THE ART; VECTOR PROCESSORS;

EID: 84863449186     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2259016.2259020     Document Type: Conference Paper
Times cited : (10)

References (24)
  • 1
    • 70449098063
    • Intel Corporation, Number 248966-018 in Intel 64 and IA-32 Optimization Manual, Intel Corporation, March
    • Intel Corporation. Intel 64 and IA-32 Architectures Optimization Reference Manual. Number 248966-018 in Intel 64 and IA-32 Optimization Manual. Intel Corporation, March 2009.
    • (2009) Intel 64 and IA-32 Architectures Optimization Reference Manual
  • 3
    • 70349100958
    • KHRONOS OpenCL Working Group, December
    • KHRONOS OpenCL Working Group. The OpenCL Specification, December 2008.
    • (2008) The OpenCL Specification
  • 4
    • 67650694407
    • NVIDIA, NVIDIA Corporation, Santa Clara, California, 2.1 edition, October
    • NVIDIA. NVIDIA CUDA Compute Unified Device Architecture. NVIDIA Corporation, Santa Clara, California, 2.1 edition, October 2008.
    • (2008) NVIDIA CUDA Compute Unified Device Architecture
  • 5
    • 77953978573
    • Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
    • Toronto, Canada, April
    • John Stratton and Vinod Grover et al. Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs. In CGO 2010, Toronto, Canada, April 2010.
    • (2010) CGO 2010
    • Stratton, J.1    Grover, V.2
  • 6
    • 78149276036
    • Twin peaks: A software platform for heterogeneous computing on general-purpose and graphics processors
    • New York, NY, USA, ACM
    • Jayanth Gummaraju and Laurent Morichetti et al. Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. PACT'10, pages 205-216, New York, NY, USA, 2010. ACM.
    • (2010) PACT'10 , pp. 205-216
    • Gummaraju, J.1    Morichetti, L.2
  • 7
    • 78149255519
    • An OpenCL framework for heterogeneous multicores with local memory
    • New York, NY, USA, ACM
    • Jaejin Lee and Jungwon Kim et al. An OpenCL framework for heterogeneous multicores with local memory. PACT'10, pages 193-204, New York, NY, USA, 2010. ACM.
    • (2010) PACT'10 , pp. 193-204
    • Lee, J.1    Kim, J.2
  • 9
    • 70649102016
    • NVIDIA, NVIDIA Corporation, Santa Clara, California, 1.3 edition, October
    • NVIDIA. NVIDIA Compute PTX: Parallel Thread Execution. NVIDIA Corporation, Santa Clara, California, 1.3 edition, October 2008.
    • (2008) NVIDIA Compute PTX: Parallel Thread Execution
  • 10
    • 57649106258
    • Larrabee: A many-core x86 architecture for visual computing
    • pages 18:1-18:15, New York, NY, USA, ACM
    • Larry Seiler and Doug Carmean et al. Larrabee: a many-core x86 architecture for visual computing. In ACM SIGGRAPH 2008 papers, SIGGRAPH'08, pages 18:1-18:15, New York, NY, USA, 2008. ACM.
    • (2008) ACM SIGGRAPH 2008 Papers, SIGGRAPH'08
    • Seiler, L.1    Carmean, D.2
  • 12
    • 84856559490
    • Dynamic detection of uniform and affine vectors in GPGPU computations
    • Université de Perpignan, June
    • Sylvain Collange and David Defour et al. Dynamic detection of uniform and affine vectors in GPGPU computations. Technical report, Université de Perpignan, University of California Davis, June 2009.
    • (2009) Technical Report, University of California Davis
    • Collange, S.1    Defour, D.2
  • 14
    • 70649104826
    • A characterization and analysis of PTX kernels
    • Austin, TX, USA, October
    • Andrew Kerr, Gregory Diamos, and Sudhakar Yalamanchili. A characterization and analysis of PTX kernels. In IISWC'09, Austin, TX, USA, October 2009.
    • (2009) IISWC'09
    • Kerr, A.1    Diamos, G.2    Yalamanchili, S.3
  • 16
    • 78149233155
    • Ocelot: A dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
    • New York, NY, USA, ACM
    • Gregory Diamos, Andrew Kerr, Sudhakar Yalamanchili, and Nathan Clark. Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems. PACT'10, pages 353-364, New York, NY, USA, 2010. ACM.
    • (2010) PACT'10 , pp. 353-364
    • Diamos, G.1    Kerr, A.2    Yalamanchili, S.3    Clark, N.4
  • 17
    • 84863474058
    • The Parboil benchmark suite
    • IMPACT. The Parboil benchmark suite, 2007.
    • (2007) IMPACT
  • 18
    • 70350771131
    • Benchmarking GPUs to tune dense linear algebra
    • Piscataway, NJ, USA
    • Vasily Volkov and James W. Demmel. Benchmarking GPUs to tune dense linear algebra. In Supercomputing'08, Piscataway, NJ, USA, 2008.
    • (2008) Supercomputing'08
    • Volkov, V.1    Demmel, J.W.2
  • 19
    • 79957502935
    • Whole-function vectorization
    • Ralf Karrenberg and Sebastian Hack. Whole-function vectorization. CGO, 2011.
    • (2011) CGO
    • Karrenberg, R.1    Hack, S.2
  • 21
    • 79951700098
    • Improving SIMT efficiency of global rendering algorithms with architectural support for dynamic micro-kernels
    • Washington, DC, USA
    • Michael Steffen and Joseph Zambreno. Improving SIMT efficiency of global rendering algorithms with architectural support for dynamic micro-kernels. MICRO'43, Washington, DC, USA, 2010.
    • (2010) MICRO'43
    • Steffen, M.1    Zambreno, J.2
  • 24
    • 79951702599
    • Efficient selection of vector instructions using dynamic programming
    • Washington, DC, USA, IEEE Computer Society
    • Rajkishore Barik, J. Zhao, and V. Sarkar. Efficient selection of vector instructions using dynamic programming. MICRO'43, pages 201-212, Washington, DC, USA, 2010. IEEE Computer Society.
    • (2010) MICRO'43 , pp. 201-212
    • Barik, R.1    Zhao, J.2    Sarkar, V.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.