Volume , Issue , 2012, Pages

Policy-based tuning for performance portability and library co-optimization

Author keywords

auto tuning; library design; metaprogramming; Performance; performance portability; policy; software reuse

Indexed keywords

AUTOTUNING; LIBRARY DESIGNS; META PROGRAMMING; PERFORMANCE; PERFORMANCE PORTABILITY;

EID: 84870725376     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/InPar.2012.6339597     Document Type: Conference Paper
Times cited: 17

References (32)
  • 5
    • 84870721658
    • Accessed: 2011-08-25
    • CUDA: http://www.nvidia.com/object/cuda-home-new.html. Accessed: 2011-08-25.
    • CUDA
  • 6
    • 70449710961
    • Google Project Hosting: Accessed: 2011-07-12
    • cudpp - CUDA Data Parallel Primitives Library - Google Project Hosting: http://code.google.com/p/cudpp/. Accessed: 2011-07-12.
    • Cudpp - CUDA Data Parallel Primitives Library
  • 7
    • 0002806690
    • OpenMP: An industry standard API for shared-memory programming
    • Mar. 1998
    • Dagum, L. and Menon, R. 1998. OpenMP: an industry standard API for shared-memory programming. IEEE Computational Science and Engineering. 5, (Mar. 1998), 46-55.
    • (1998) IEEE Computational Science and Engineering , vol.5 , pp. 46-55
    • Dagum, L.1    Menon, R.2
  • 8
    • 37549003336
    • MapReduce: Simplified data processing on large clusters
    • Jan. 2008
    • Dean, J. and Ghemawat, S. 2008. MapReduce: simplified data processing on large clusters. Commun. ACM. 51, 1 (Jan. 2008), 107-113.
    • (2008) Commun. ACM. , vol.51 , Issue.1 , pp. 107-113
    • Dean, J.1    Ghemawat, S.2
  • 9
    • 20744452904
    • Self-Adapting Linear Algebra Algorithms and Software
    • Feb. 2005
    • Demmel, J. et al. 2005. Self-Adapting Linear Algebra Algorithms and Software. Proceedings of the IEEE. 93, 2 (Feb. 2005), 293-312.
    • (2005) Proceedings of the IEEE , vol.93 , Issue.2 , pp. 293-312
    • Demmel, J.1
  • 13
    • 84976721284
    • MULTILISP: A language for concurrent symbolic computation
    • Oct. 1985
    • Halstead,Jr., R.H. 1985. MULTILISP: a language for concurrent symbolic computation. ACM Trans. Program. Lang. Syst. 7, 4 (Oct. 1985), 501-538.
    • (1985) ACM Trans. Program. Lang. Syst. , vol.7 , Issue.4 , pp. 501-538
    • Halstead Jr., R.H.1
  • 14
    • 0003568839
    • IEEE Computer Society 2009. IEEE Std 1076-2008 (Revision of IEEE Std 1076-2002)
    • IEEE Computer Society 2009. IEEE Standard VHDL Language Reference Manual. IEEE Std 1076-2008 (Revision of IEEE Std 1076-2002). (2009), c1-626.
    • (2009) IEEE Standard VHDL Language Reference Manual
  • 15
    • 84870669933
    • PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation
    • Sep. 2011
    • Klöckner, A. et al. 2011. PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation. Parallel Computing. (Sep. 2011).
    • (2011) Parallel Computing
    • Klöckner, A.1
  • 16
    • 84866532295
    • Technical Report #245. LAPACK Working Note
    • Kurzak, J. et al. 2011. Autotuning GEMMs for Fermi. Technical Report #245. LAPACK Working Note.
    • (2011) Autotuning GEMMs for Fermi
    • Kurzak, J.1
  • 19
    • 79959718248
    • High Performance and Scalable Radix Sorting: A case study of implementing dynamic parallelism for GPU computing
    • 2011
    • Merrill, D. and Grimshaw, A. 2011. High Performance and Scalable Radix Sorting: A case study of implementing dynamic parallelism for GPU computing. Parallel Processing Letters. 21, 02 (2011), 245-272.
    • (2011) Parallel Processing Letters , vol.21 , Issue.2 , pp. 245-272
    • Merrill, D.1    Grimshaw, A.2
  • 20
    • 78149268496
    • Technical Report #CS2009-14. Department of Computer Science, University of Virginia
    • Merrill, D. and Grimshaw, A. 2009. Parallel Scan for Stream Architectures. Technical Report #CS2009-14. Department of Computer Science, University of Virginia.
    • (2009) Parallel Scan for Stream Architectures
    • Merrill, D.1    Grimshaw, A.2
  • 21
    • 67650661447
    • Accessed: 2009-12-12
    • Optimizing parallel reduction in CUDA: 2007. http://developer.download.nvidia.com/compute/cuda/1-1/Website/projects/reduction/doc/reduction.pdf. Accessed: 2009-12-12.
    • (2007) Optimizing Parallel Reduction in CUDA
  • 22
    • 49049088756
    • GPU Computing
    • May 2008
    • Owens, J.D. et al. 2008. GPU Computing. Proceedings of the IEEE. 96, 5 (May 2008), 879-899.
    • (2008) Proceedings of the IEEE. , vol.96 , Issue.5 , pp. 879-899
    • Owens, J.D.1
  • 23
    • 19344368072
    • SPIRAL: Code Generation for DSP Transforms
    • Feb. 2005
    • Puschel, M. et al. 2005. SPIRAL: Code Generation for DSP Transforms. Proceedings of the IEEE. 93, 2 (Feb. 2005), 232-275.
    • (2005) Proceedings of the IEEE , vol.93 , Issue.2 , pp. 232-275
    • Puschel, M.1
  • 24
    • 33749908081
    • Classes of Recursively Enumerable Sets and Their Decision Problems
    • 1953
    • Rice, H.G. 1953. Classes of Recursively Enumerable Sets and Their Decision Problems. Transactions of the American Mathematical Society. 74, 2 (1953), pp. 358-366.
    • (1953) Transactions of the American Mathematical Society , vol.74 , Issue.2 , pp. 358-366
    • Rice, H.G.1
  • 27
    • 84870714547
    • Google Project Hosting: Accessed: 2011-08-25
    • Thrust - Code at the speed of light - Google Project Hosting: http://code.google.com/p/thrust/. Accessed: 2011-08-25.
    • Thrust - Code at the Speed of Light
  • 30
    • 0025467711
    • A bridging model for parallel computation
    • Aug. 1990
    • Valiant, L.G. 1990. A bridging model for parallel computation. Commun. ACM. 33, 8 (Aug. 1990), 103-111.
    • (1990) Commun. ACM. , vol.33 , Issue.8 , pp. 103-111
    • Valiant, L.G.1
  • 32
    • 24344485098
    • OSKI: A library of automatically tuned sparse matrix kernels
    • Jan. 2005
    • Vuduc, R. et al. 2005. OSKI: A library of automatically tuned sparse matrix kernels. Journal of Physics: Conference Series. 16, (Jan. 2005), 521-530.
    • (2005) Journal of Physics: Conference Series , vol.16 , pp. 521-530
    • Vuduc, R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.