2011, Pages 467-478

Automated architecture-aware mapping of streaming applications onto GPUs

Author keywords

GPU; stream processing; StreamIt

Indexed keywords

ARCHITECTURAL FEATURES; DATA MOVEMENTS; GENERAL PURPOSE; GPU; GPU PROGRAMMING; GRAPHIC PROCESSING UNITS; MEMORY ACCESS; MEMORY FOOTPRINT; MEMORY HIERARCHY; NUMBER OF THREADS; OFF-CHIP; POOR PERFORMANCE; PROCESSING CORE; PROGRAMMING LANGUAGE; SHARED MEMORIES; STREAM PROCESSING; STREAMING APPLICATIONS; STREAMIT; THREAD GROUPS

EID: 80053240142     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPS.2011.52     Document Type: Conference Paper
Times cited: 25

References (19)
  • 1
    • 0036959649
    • A stream compiler for communication-exposed architectures
    • M. I. Gordon et al., "A stream compiler for communication-exposed architectures," in ASPLOS '02, Oct 2002.
    • ASPLOS '02, Oct 2002
    • Gordon, M.I.
  • 2
    • 77951154340
    • The GPU computing era
    • J. Nickolls and W. J. Dally, "The GPU computing era," IEEE Micro, vol. 30, pp. 56-69, 2010.
    • (2010) IEEE Micro, vol. 30, pp. 56-69
    • Nickolls, J.; Dally, W.J.
  • 3
    • 33947588048
    • A survey of general-purpose computation on graphics hardware
    • J. D. Owens et al., "A survey of general-purpose computation on graphics hardware," Computer Graphics Forum, vol. 26, no. 1, pp. 80-113, 2007.
    • (2007) Computer Graphics Forum, vol. 26, no. 1, pp. 80-113
    • Owens, J.D.
  • 4
    • 70349100958
    • The OpenCL Specification, version 1.0.29, 8 December 2008
    • Khronos OpenCL Working Group, The OpenCL Specification, version 1.0.29, 8 December 2008.
    • (2008) The OpenCL Specification
  • 5
    • 84870629709
    • NVIDIA CUDA. http://www.nvidia.com/object/cuda
    • NVIDIA CUDA
  • 6
    • 84877609547
    • Brook for GPUs: Stream computing on graphics hardware
    • New York, NY, USA: ACM
    • I. Buck et al., "Brook for GPUs: stream computing on graphics hardware," in SIGGRAPH '04. New York, NY, USA: ACM, 2004, pp. 777-786.
    • (2004) SIGGRAPH '04, pp. 777-786
    • Buck, I.
  • 7
    • 34547423880
    • Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
    • New York, NY, USA: ACM
    • M. I. Gordon, W. Thies, and S. Amarasinghe, "Exploiting coarse-grained task, data, and pipeline parallelism in stream programs," in ASPLOS '06. New York, NY, USA: ACM, 2006, pp. 151-162.
    • (2006) ASPLOS '06, pp. 151-162
    • Gordon, M.I.; Thies, W.; Amarasinghe, S.
  • 8
    • 57349172999
    • Orchestrating the execution of stream programs on multicore platforms
    • M. Kudlur and S. Mahlke, "Orchestrating the execution of stream programs on multicore platforms," in PLDI '08, 2008, pp. 114-124.
    • (2008) PLDI '08, pp. 114-124
    • Kudlur, M.; Mahlke, S.
  • 9
    • 67650563116
    • Software pipelined execution of stream programs on GPUs
    • A. Udupa, R. Govindarajan, and M. J. Thazhuthaveetil, "Software pipelined execution of stream programs on GPUs," in CGO '09, 2009, pp. 200-209.
    • (2009) CGO '09, pp. 200-209
    • Udupa, A.; Govindarajan, R.; Thazhuthaveetil, M.J.
  • 10
    • 0023138886
    • Static scheduling of synchronous data flow programs for digital signal processing
    • E. A. Lee and D. G. Messerschmitt, "Static scheduling of synchronous data flow programs for digital signal processing," IEEE Trans. Comput., vol. 36, no. 1, pp. 24-35, 1987.
    • (1987) IEEE Trans. Comput., vol. 36, no. 1, pp. 24-35
    • Lee, E.A.; Messerschmitt, D.G.
  • 11
    • 79959466764
    • Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
    • S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W.-m. W. Hwu, "Optimization principles and application performance evaluation of a multithreaded GPU using CUDA," in PPoPP '08, 2008, pp. 73-82.
    • (2008) PPoPP '08, pp. 73-82
    • Ryoo, S.; Rodrigues, C.I.; Baghsorkhi, S.S.; Stone, S.S.; Kirk, D.B.; Hwu, W.-M.W.
  • 13
    • 80053263954
    • Par4All
    • HPC Project, Par4All.
    • HPC Project
  • 14
    • 74549119494
    • Heterogeneous multicore parallel programming for graphics processing units
    • F. Bodin and S. Bihan, "Heterogeneous multicore parallel programming for graphics processing units," Sci. Program., vol. 17, no. 4, pp. 325-336, 2009.
    • (2009) Sci. Program., vol. 17, no. 4, pp. 325-336
    • Bodin, F.; Bihan, S.
  • 15
    • 77952281697
    • Implementing the PGI accelerator model
    • M. Wolfe, "Implementing the PGI accelerator model," in GPGPU '10, 2010, pp. 43-50.
    • (2010) GPGPU '10, pp. 43-50
    • Wolfe, M.
  • 16
    • 31844444712
    • Cache aware optimization of stream programs
    • J. Sermulins, W. Thies, R. Rabbah, and S. Amarasinghe, "Cache aware optimization of stream programs," SIGPLAN Not., vol. 40, no. 7, pp. 115-126, 2005.
    • (2005) SIGPLAN Not., vol. 40, no. 7, pp. 115-126
    • Sermulins, J.; Thies, W.; Rabbah, R.; Amarasinghe, S.
  • 17
    • 70350121632
    • Compiler-directed scratchpad memory management via graph coloring
    • L. Li, H. Feng, and J. Xue, "Compiler-directed scratchpad memory management via graph coloring," ACM Trans. Archit. Code Optim., vol. 6, no. 3, pp. 1-17, 2009.
    • (2009) ACM Trans. Archit. Code Optim., vol. 6, no. 3, pp. 1-17
    • Li, L.; Feng, H.; Xue, J.
  • 18
    • 77953977802
    • StreamIt benchmarks. http://groups.csail.mit.edu/cag/streamit/shtml/benchmarks.shtml
    • StreamIt Benchmarks


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.