2013, Pages 1097-1106

Improving GPU performance prediction with data transfer modeling

Author keywords

[No Author keywords available]

Indexed keywords

DATA TRANSFER; PROGRAM PROCESSORS

EID: 84896838047     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPSW.2013.236     Document Type: Conference Paper
Times cited: 46

References (21)
  • 3
  • 5
    • An implementation of interprocedural bounded regular section analysis
    • P. Havlak and K. Kennedy. An implementation of interprocedural bounded regular section analysis. IEEE Trans. Parallel Distrib. Syst., 2, 1991.
    • (1991) IEEE Trans. Parallel Distrib. Syst., vol.2
    • Havlak, P.; Kennedy, K.
  • 6
    • An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
    • S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. In ISCA, 2009.
    • (2009) ISCA
    • Hong, S.; Kim, H.
  • 8
    • Modern potentials and the properties of condensed 4He
    • M. H. Kalos, M. A. Lee, P. A. Whitlock, and G. V. Chester. Modern potentials and the properties of condensed 4He. Phys. Rev. B 24, 115, 1981.
    • (1981) Phys. Rev. B, vol.24, p. 115
    • Kalos, M.H.; Lee, M.A.; Whitlock, P.A.; Chester, G.V.
  • 9
    • The OpenCL Specification, Version 1.0
    • Khronos Group Std. The OpenCL Specification, Version 1.0. http://www.khronos.org/registry/cl/specs/opencl-1.0.33.pdf, 2009.
    • (2009) The OpenCL Specification
  • 11
    • Accurate and efficient regression modeling for microarchitectural performance and power prediction
    • B. C. Lee and D. M. Brooks. Accurate and efficient regression modeling for microarchitectural performance and power prediction. In ASPLOS XII, 2006.
    • (2006) ASPLOS XII
    • Lee, B.C.; Brooks, D.M.
  • 12
    • CPR: Composable performance regression for scalable multiprocessor models
    • B. C. Lee, J. Collins, H. Wang, and D. Brooks. CPR: Composable performance regression for scalable multiprocessor models. In MICRO, 2008.
    • (2008) MICRO
    • Lee, B.C.; Collins, J.; Wang, H.; Brooks, D.
  • 13
    • Methods of inference and learning for performance modeling of parallel applications
    • Benjamin C. Lee, David M. Brooks, Bronis R. de Supinski, Martin Schulz, Karan Singh, and Sally A. McKee. Methods of inference and learning for performance modeling of parallel applications. In PPoPP, 2007.
    • (2007) PPoPP
    • Lee, B.C.; Brooks, D.M.; De Supinski, B.R.; Schulz, M.; Singh, K.; McKee, S.A.
  • 15
    • GROPHECY: GPU performance projection from CPU code skeletons
    • Jiayuan Meng, Vitali A. Morozov, Kalyan Kumaran, Venkatram Vishwanath, and Thomas D. Uram. GROPHECY: GPU performance projection from CPU code skeletons. In SC, 2011.
    • (2011) SC
    • Meng, J.; Morozov, V.A.; Kumaran, K.; Vishwanath, V.; Uram, T.D.
  • 17
    • MDR: Performance model driven runtime for heterogeneous parallel platforms
    • J. A. Pienaar, A. Raghunathan, and S. Chakradhar. MDR: Performance model driven runtime for heterogeneous parallel platforms. In ICS, 2011.
    • (2011) ICS
    • Pienaar, J.A.; Raghunathan, A.; Chakradhar, S.
  • 18
    • Quantum Monte Carlo calculations of A=9,10 nuclei
    • S. C. Pieper, K. Varga, and R. B. Wiringa. Quantum Monte Carlo calculations of A=9,10 nuclei. Phys. Rev. C 66, 044310-1:14, 2002.
    • (2002) Phys. Rev. C, vol.66, art. 044310
    • Pieper, S.C.; Varga, K.; Wiringa, R.B.
  • 20
    • Cross-platform performance prediction of parallel applications using partial execution
    • L. T. Yang, X. Ma, and F. Mueller. Cross-platform performance prediction of parallel applications using partial execution. In SC, 2005.
    • (2005) SC
    • Yang, L.T.; Ma, X.; Mueller, F.
  • 21
    • A quantitative performance analysis model for GPU architectures
    • Y. Zhang and J. D. Owens. A quantitative performance analysis model for GPU architectures. In HPCA, 2011.
    • (2011) HPCA
    • Zhang, Y.; Owens, J.D.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.