



Volume 7210 LNCS, 2012, Pages 21-40

Automatic restructuring of GPU kernels for exploiting inter-thread data locality

Author keywords

[No Author keywords available]

Indexed keywords

DATA LOCALITY; DIRECT IMPACT; MEMORY PERFORMANCE; MULTI-THREADING; ON CURRENTS; REGISTER PRESSURE; SHARED MEMORIES; SOFTWARE FRAMEWORKS;

EID: 84859153100     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-28652-0_2     Document Type: Conference Paper
Times cited: 25

References (28)
  • 1
    • 84859145371
    • CUDA PTX ISA, http://www.nvidia.com/content/CUDAptxisa1.4.pdf
    • CUDA PTX ISA
  • 5
    • 34547309668
    • Version 3.0. NVIDIA
    • CUDA Programming Guide, Version 3.0. NVIDIA (2010)
    • (2010) CUDA Programming Guide
  • 7
    • 84925509670
    • Optimizing Compilers for Modern Architectures
    • Allen, R., Kennedy, K.: Optimizing Compilers for Modern Architectures. Morgan Kaufmann (2002)
    • (2002) Morgan Kaufmann
    • Allen, R.1    Kennedy, K.2
  • 8
    • 77951572335
    • Automatic C-to-CUDA Code Generation for Affine Programs
    • Gupta, R. (ed.) CC 2010. Springer, Heidelberg
    • Baskaran, M.M., Ramanujam, J., Sadayappan, P.: Automatic C-to-CUDA Code Generation for Affine Programs. In: Gupta, R. (ed.) CC 2010. LNCS, vol. 6011, pp. 244-263. Springer, Heidelberg (2010)
    • (2010) LNCS , vol.6011 , pp. 244-263
    • Baskaran, M.M.1    Ramanujam, J.2    Sadayappan, P.3
  • 11
    • 0028549474
    • Improving the ratio of memory operations to floating-point operations in loops
    • Carr, S., Kennedy, K.: Improving the ratio of memory operations to floating-point operations in loops. ACM Transactions on Programming Languages and Systems 16(6), 1768-1810 (1994)
    • (1994) ACM Transactions on Programming Languages and Systems , vol.16 , Issue.6 , pp. 1768-1810
    • Carr, S.1    Kennedy, K.2
  • 13
    • 0023565191
    • What's in a name? -or- The value of renaming for parallelism detection and storage allocation
    • Cytron, R., Ferrante, J.: What's in a name? -or- the value of renaming for parallelism detection and storage allocation. In: ICPP 1987, pp. 19-27 (1987)
    • (1987) ICPP 1987 , pp. 19-27
    • Cytron, R.1    Ferrante, J.2
  • 18
    • 79952608669
    • Optimizing and Auto-tuning Belief Propagation on the GPU
    • Cooper, K., Mellor-Crummey, J., Sarkar, V. (eds.) LCPC 2010. Springer, Heidelberg
    • Grauer-Gray, S., Cavazos, J.: Optimizing and Auto-tuning Belief Propagation on the GPU. In: Cooper, K., Mellor-Crummey, J., Sarkar, V. (eds.) LCPC 2010. LNCS, vol. 6548, pp. 121-135. Springer, Heidelberg (2011)
    • (2011) LNCS , vol.6548 , pp. 121-135
    • Grauer-Gray, S.1    Cavazos, J.2
  • 20
    • 79952583455
    • Accelerating GPU Kernels for Dense Linear Algebra
    • Palma, J.M.L.M., Daydé, M., Marques, O., Lopes, J.C. (eds.) VECPAR 2010. Springer, Heidelberg
    • Nath, R., Tomov, S., Dongarra, J.: Accelerating GPU Kernels for Dense Linear Algebra. In: Palma, J.M.L.M., Daydé, M., Marques, O., Lopes, J.C. (eds.) VECPAR 2010. LNCS, vol. 6449, pp. 83-92. Springer, Heidelberg (2011)
    • (2011) LNCS , vol.6449 , pp. 83-92
    • Nath, R.1    Tomov, S.2    Dongarra, J.3
  • 25
    • 60949098907
    • Optimization of sparse matrix-vector multiplication on emerging multicore platforms
    • Williams, S., Oliker, L., Vuduc, R., Shalf, J., Yelick, K., Demmel, J.: Optimization of sparse matrix-vector multiplication on emerging multicore platforms. Parallel Comput. 35(3), 178-194 (2009)
    • (2009) Parallel Comput , vol.35 , Issue.3 , pp. 178-194
    • Williams, S.1    Oliker, L.2    Vuduc, R.3    Shalf, J.4    Yelick, K.5    Demmel, J.6
  • 26
    • 58449092097
    • Exploring the Optimization Space of Dense Linear Algebra Kernels
    • Amaral, J.N. (ed.) LCPC 2008. Springer, Heidelberg
    • Yi, Q., Qasem, A.: Exploring the Optimization Space of Dense Linear Algebra Kernels. In: Amaral, J.N. (ed.) LCPC 2008. LNCS, vol. 5335, pp. 343-355. Springer, Heidelberg (2008)
    • (2008) LNCS , vol.5335 , pp. 343-355
    • Yi, Q.1    Qasem, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.