2013, Pages 1080-1087

A multi-level optimization method for stencil computation on the domain that is bigger than memory capacity of GPU

Author keywords

GPU memory capacity; multi level optimization; stencil computation; temporal blocking

Indexed keywords

DISTRIBUTED COMPUTER SYSTEMS; PROBLEM SOLVING

EID: 84899705665     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPSW.2013.58     Document Type: Conference Paper
Times cited: 13

References (11)
  • 3
    • 84893593562
    • Physis: An implicitly-parallel programming model for stencil computing on large-scale GPU-Accelerated supercomputers
    • Naoya Maruyama, Tatsuo Nomura, Kento Sato, and Satoshi Matsuoka, "Physis: An implicitly-parallel programming model for stencil computing on large-scale GPU-Accelerated supercomputers," IEEE SC11, 2011.
    • (2011) IEEE SC11
    • Maruyama, N.1    Nomura, T.2    Sato, K.3    Matsuoka, S.4
  • 6
    • 70449657442
    • Efficient temporal blocking for stencil computations by multicore-Aware wavefront parallelization
    • Gerhard Wellein, Georg Hager, Thomas Zeiser, Markus Wittmann and Holger Fehske, "Efficient temporal blocking for stencil computations by multicore-Aware wavefront parallelization," Computer Software and Applications Conference, vol. 1, pp. 579-586, 2009.
    • (2009) Computer Software and Applications Conference , vol.1 , pp. 579-586
    • Wellein, G.1    Hager, G.2    Zeiser, T.3    Wittmann, M.4    Fehske, H.5
  • 7
    • 84893521469
    • Performance model for automatic optimization of communication in data-parallel stencil computations
    • 2012-HPC-135
    • Tomoki Kawamura, Naoya Maruyama, and Satoshi Matsuoka, "Performance model for automatic optimization of communication in data-parallel stencil computations," IPSJ SIG Technical Report, vol. 2012-HPC-135, 8 pages, 2012.
    • (2012) IPSJ SIG Technical Report , pp. 8
    • Kawamura, T.1    Maruyama, N.2    Matsuoka, S.3
  • 8
    • 79958272014
    • 3.5-D blocking optimization for stencil computations on modern CPUs and GPUs
    • Anthony Nguyen, Nadathur Satish, Jatin Chhugani, Changkyu Kim, and Pradeep Dubey, "3.5-D blocking optimization for stencil computations on modern CPUs and GPUs," IEEE SC10, 2010.
    • (2010) IEEE SC10
    • Nguyen, A.1    Satish, N.2    Chhugani, J.3    Kim, C.4    Dubey, P.5
  • 9
    • 79953768747
    • Overcoming the GPU memory limitation on FDTD through the use of overlapping subgrids
    • Leonardo Mattes and Sergio Kofuji, "Overcoming the GPU memory limitation on FDTD through the use of overlapping subgrids," ICMMT, pp. 1536-1539, 2010.
    • (2010) ICMMT , pp. 1536-1539
    • Mattes, L.1    Kofuji, S.2
  • 10
    • 77954903012
    • The use of overlapping subgrids to accelerate the FDTD on GPU devices
    • Leonardo Mattes and Sergio Kofuji, "The use of overlapping subgrids to accelerate the FDTD on GPU devices," Radar Conference, pp. 807-810, 2010.
    • (2010) Radar Conference , pp. 807-810
    • Mattes, L.1    Kofuji, S.2
  • 11


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.