2014, Pages 437-446

TBPoint: Reducing simulation time for large-scale GPGPU kernels

Author keywords

GPGPU; performance modeling; sampling; simulation

Indexed keywords

DISTRIBUTED PARAMETER NETWORKS; RANDOM ERRORS; SAMPLING

EID: 84906709211     PISSN: 15302075     EISSN: 23321237     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPS.2014.53     Document Type: Conference Paper
Times cited: 16

References (17)
  • 1
    • 84906668459
    • Introducing TITAN
    • Introducing TITAN. http://www.olcf.ornl.gov/titan/.
  • 2
    • 84906718295
    • Swiss national supercomputing centre
    • Swiss national supercomputing centre. http://www.cscs.ch/.
  • 3
    • 84873458159
    • A quantitative study of irregular programs on GPUs
    • M. Burtscher, R. Nasre, and K. Pingali. A quantitative study of irregular programs on GPUs. In IISWC, 2012.
    • (2012) IISWC
    • Burtscher, M.; Nasre, R.; Pingali, K.
  • 4
    • 84906668460
    • Macsim. http://code.google.com/p/macsim/.
  • 5
    • 21644454187
    • Pinpointing representative portions of large Intel Itanium programs with dynamic instrumentation
    • H. Patil, R. S. Cohn, M. Charney, R. Kapoor, A. Sun, and A. Karunanidhi. Pinpointing representative portions of large Intel Itanium programs with dynamic instrumentation. In MICRO, 2004.
    • (2004) MICRO
    • Patil, H.; Cohn, R.S.; Charney, M.; Kapoor, R.; Sun, A.; Karunanidhi, A.
  • 6
    • 0036953769
    • Automatically characterizing large scale program behavior
    • T. Sherwood, E. Perelman, G. Hamerly, and B. Calder. Automatically characterizing large scale program behavior. In ASPLOS, 2002.
    • (2002) ASPLOS
    • Sherwood, T.; Perelman, E.; Hamerly, G.; Calder, B.
  • 7
    • 84881442631
    • Sampled simulation of multi-threaded applications
    • T. E. Carlson, W. Heirman, and L. Eeckhout. Sampled simulation of multi-threaded applications. In ISPASS, 2013.
    • (2013) ISPASS
    • Carlson, T.E.; Heirman, W.; Eeckhout, L.
  • 8
    • 84906684431
    • CUDA Documentation
    • CUDA Documentation. http://www.nvidia.com/object/cudadevelop.html.
  • 9
    • 31944440969
    • Pin: Building customized program analysis tools with dynamic instrumentation
    • Chi-Keung Luk et al. Pin: Building customized program analysis tools with dynamic instrumentation. In PLDI, 2005.
    • (2005) PLDI
    • Luk, C.
  • 10
  • 11
    • 78149233155
    • Ocelot: A dynamic compiler for bulk-synchronous applications in heterogeneous systems
    • G. Diamos, A. Kerr, S. Yalamanchili, and N. Clark. Ocelot: A dynamic compiler for bulk-synchronous applications in heterogeneous systems. In PACT, 2010.
    • (2010) PACT
    • Diamos, G.; Kerr, A.; Yalamanchili, S.; Clark, N.
  • 13
    • 64949101685
    • A first-order fine-grained multithreaded throughput model
    • Xi E. Chen and Tor M. Aamodt. A first-order fine-grained multithreaded throughput model. In HPCA, 2009.
    • (2009) HPCA
    • Chen, X.E.; Aamodt, T.M.
  • 15
    • 70450231944
    • An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
    • S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In ISCA, 2009.
    • (2009) ISCA
    • Hong, S.; Kim, H.
  • 16
    • 84863347222
    • A performance analysis framework for identifying potential benefits in GPGPU applications
    • J. Sim, A. Dasgupta, H. Kim, and R. Vuduc. A performance analysis framework for identifying potential benefits in GPGPU applications. In PPoPP, 2012.
    • (2012) PPoPP
    • Sim, J.; Dasgupta, A.; Kim, H.; Vuduc, R.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.