Volume 2016-December, 2016

KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism

Author keywords

[No Author keywords available]

Indexed keywords

PROGRAM COMPILERS; PROGRAM PROCESSORS;

EID: 85009382810     PISSN: 10724451     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/MICRO.2016.7783716     Document Type: Conference Paper
Times cited: 33

References (36)
  • 4
    • 84923668340
    • Efficient GPU implementation of adaptive mesh refinement for the shallow-water equations
    • M. L. Sætra, A. R. Brodtkorb, and K.-A. Lie, "Efficient GPU implementation of adaptive mesh refinement for the shallow-water equations," Journal of Scientific Computing, vol. 63, no. 1, pp. 23-48, 2015.
    • (2015) Journal of Scientific Computing , vol.63 , Issue.1 , pp. 23-48
    • Sætra, M.L.1    Brodtkorb, A.R.2    Lie, K.-A.3
  • 7
    • 84896893237
    • CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications
    • ACM
    • Y. Yang and H. Zhou, "CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications," in ACM SIGPLAN Notices, vol. 49, pp. 93-106, ACM, 2014.
    • (2014) ACM SIGPLAN Notices , vol.49 , pp. 93-106
    • Yang, Y.1    Zhou, H.2
  • 8
    • 85009354023
    • A CUDA dynamic parallelism case study: PANDA (accessed 2016-04-01)
    • "A CUDA dynamic parallelism case study: PANDA." https://devblogs.nvidia.com/parallelforall/a-CUDA-dynamic-parallelismcase-study-panda/. Accessed: 2016-04-01.
  • 16
    • 85049937265
    • The OpenCL specification, version 2.0
    • L. Howes and A. Munshi, "The OpenCL specification, version 2.0," Khronos Group, 2015.
    • (2015) Khronos Group
    • Howes, L.1    Munshi, A.2
  • 20
    • 85009348622
    • NVIDIA, CUDA samples v. 7.5
    • NVIDIA, "CUDA samples v. 7.5," 2015.
    • (2015)
  • 23
    • 20344394051
    • Accessed 2016-04-01
    • "Matrix market." http://math.nist.gov/MatrixMarket/. Accessed: 2016-04-01.
    • Matrix Market
  • 25
    • 85009348666
    • GPU pro tip: CUDA 7 streams simplify concurrency (accessed 2016-04-01)
    • "GPU pro tip: CUDA 7 streams simplify concurrency." http://devblogs.nvidia.com/parallelforall/GPU-pro-Tip-CUDA-7-streamssimplify-concurrency/. Accessed: 2016-04-01.
  • 26
    • 84940066769
    • Accessed 2016-04-10
    • "CUDA dynamic parallelism API and principles." https://devblogs.nvidia.com/parallelforall/CUDA-dynamic-parallelism-Api-principles/. Accessed: 2016-04-10.
    • CUDA Dynamic Parallelism API and Principles
  • 27
    • 85009416339
    • NVIDIA, Profiler users guide v. 7.5
    • NVIDIA, "Profiler users guide v. 7.5," 2015.
    • (2015)
  • 30
  • 31
    • 85009416419
    • Private communication
    • G. Chen and X. Shen, private communication.
    • Chen, G.1    Shen, X.2
  • 33
    • 84976510144
    • Nested parallelism on GPU: Exploring parallelization templates for irregular loops and recursive computations
    • IEEE
    • D. Li, H. Wu, and M. Becchi, "Nested parallelism on GPU: Exploring parallelization templates for irregular loops and recursive computations," in Parallel Processing (ICPP), 2015 44th International Conference on, pp. 979-988, IEEE, 2015.
    • (2015) Parallel Processing (ICPP) 2015 44th International Conference on , pp. 979-988
    • Li, D.1    Wu, H.2    Becchi, M.3


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.