Volume 2016-December, 2016

Efficient kernel synthesis for performance portable programming

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER SOFTWARE PORTABILITY; ENERGY EFFICIENCY;

EID: 85008936147     PISSN: 10724451     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/MICRO.2016.7783715     Document Type: Conference Paper
Times cited: 15

References (43)
  • 7
    • 78649824847
    • Exploiting memory access patterns to improve memory performance in data-parallel architectures
    • B. Jang, D. Schaa, P. Mistry, and D. Kaeli, "Exploiting memory access patterns to improve memory performance in data-parallel architectures," IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 1, pp. 105-118, 2011.
    • (2011) IEEE Trans. Parallel Distrib. Syst , vol.22 , Issue.1 , pp. 105-118
    • Jang, B.1    Schaa, D.2    Mistry, P.3    Kaeli, D.4
  • 8
    • 0343462141
    • Automated empirical optimizations of software and the ATLAS project
    • R. C. Whaley, A. Petitet, and J. J. Dongarra, "Automated empirical optimizations of software and the ATLAS project," Parallel Computing, vol. 27, no. 1, pp. 3-35, 2001.
    • (2001) Parallel Computing , vol.27 , Issue.1 , pp. 3-35
    • Whaley, R.C.1    Petitet, A.2    Dongarra, J.J.3
  • 10
    • 84870725376
    • Policy-based tuning for performance portability and library co-optimization
    • D. Merrill, M. Garland, and A. Grimshaw, "Policy-based tuning for performance portability and library co-optimization," in Innovative Parallel Computing, pp. 1-10, 2012.
    • (2012) Innovative Parallel Computing , pp. 1-10
    • Merrill, D.1    Garland, M.2    Grimshaw, A.3
  • 11
    • 84883116448
    • Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
    • J. Ragan-Kelley, C. Barnes, A. Adams, S. Paris, F. Durand, and S. Amarasinghe, "Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines," ACM SIGPLAN Notices, vol. 48, no. 6, pp. 519-530, 2013.
    • (2013) ACM SIGPLAN Notices , vol.48 , Issue.6 , pp. 519-530
    • Ragan-Kelley, J.1    Barnes, C.2    Adams, A.3    Paris, S.4    Durand, F.5    Amarasinghe, S.6
  • 15
    • 80053955412
    • Accelerating CUDA graph algorithms at maximum warp
    • S. Hong, S. K. Kim, T. Oguntebi, and K. Olukotun, "Accelerating CUDA graph algorithms at maximum warp," in ACM SIGPLAN Notices, vol. 46, pp. 267-276, 2011.
    • (2011) ACM SIGPLAN Notices , vol.46 , pp. 267-276
    • Hong, S.1    Kim, S.K.2    Oguntebi, T.3    Olukotun, K.4
  • 17
    • 85009366731
    • NVIDIA, CUDA C best practices guide, v. 7.0
    • NVIDIA, "CUDA C best practices guide v. 7.0," 2015.
    • (2015)
  • 19
  • 22
    • 85009381347
    • Intel Math Kernel Library
    • "Intel Math Kernel Library." http://software.intel.com/enus/articles/intel-mkl/.
  • 23
    • 84977938542
    • CUBLAS Library User Guide
    • NVIDIA, CUBLAS Library User Guide. NVIDIA, v7.0 ed., Oct. 2015.
    • (2015) CUBLAS Library User Guide
  • 38
    • 70449959487
    • CHiLL: A framework for composing high-level loop transformations
    • C. Chen, J. Chame, and M. Hall, "CHiLL: A framework for composing high-level loop transformations," tech. rep., 2008.
    • (2008) Tech. Rep
    • Chen, C.1    Chame, J.2    Hall, M.3
  • 39
    • 44249094647
    • Anatomy of high-performance matrix multiplication
    • K. Goto and R. A. v. d. Geijn, "Anatomy of high-performance matrix multiplication," ACM Transactions on Mathematical Software, vol. 34, pp. 12:1-12:25, May 2008.
    • (2008) ACM Transactions on Mathematical Software , vol.34 , pp. 12:1-12:25
    • Goto, K.1    van de Geijn, R.A.2
  • 41
    • 84905980170
    • Delite: A compiler architecture for performance-oriented embedded domain-specific languages
    • A. K. Sujeeth, K. J. Brown, H. Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun, "Delite: A compiler architecture for performance-oriented embedded domain-specific languages," ACM Trans. Embed. Comput. Syst., vol. 13, no. 4s, pp. 134:1-134:25, 2014.
    • (2014) ACM Trans. Embed. Comput. Syst , vol.13 , Issue.4s , pp. 134:1-134:25
    • Sujeeth, A.K.1    Brown, K.J.2    Lee, H.3    Rompf, T.4    Chafi, H.5    Odersky, M.6    Olukotun, K.7


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.