



Volume 2015-June, 2015, Pages 509-520

Efficient execution of recursive programs on commodity vector hardware

Author keywords

Recursive programs; Task parallelism; Vectorization

Indexed keywords

COMPUTATIONAL EFFICIENCY; COMPUTATIONAL LINGUISTICS; COMPUTER PROGRAMMING LANGUAGES; COSINE TRANSFORMS; HARDWARE; PROGRAM PROCESSORS; VECTORS

EID: 84951798257     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2737924.2738004     Document Type: Conference Paper
Times cited: 15

References (37)
  • 1
    • 78651324376
    • Understanding the efficiency of ray traversal on GPUs
    • T. Aila and S. Laine. Understanding the Efficiency of Ray Traversal on GPUs. In HPG'09, pages 145-149, 2009.
    • (2009) HPG'09 , pp. 145-149
    • Aila, T.1    Laine, S.2
  • 3
    • 84875205533
    • From relational verification to SIMD loop synthesis
    • G. Barthe, J. M. Crespo, S. Gulwani, C. Kunz, and M. Marron. From Relational Verification to SIMD Loop Synthesis. In PPoPP'13, pages 123-134, 2013.
    • (2013) PPoPP'13 , pp. 123-134
    • Barthe, G.1    Crespo, J.M.2    Gulwani, S.3    Kunz, C.4    Marron, M.5
  • 5
    • 84877715459
    • Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems
    • J. Chhugani, C. Kim, H. Shukla, J. Park, P. Dubey, J. Shalf, and H. D. Simon. Billion-particle SIMD-friendly Two-point Correlation on Large-scale HPC Cluster Systems. In SC'12, pages 1:1-1:11, 2012.
    • (2012) SC'12 , pp. 1:1-1:11
    • Chhugani, J.1    Kim, C.2    Shukla, H.3    Park, J.4    Dubey, P.5    Shalf, J.6    Simon, H.D.7
  • 6
    • 84951776710
    • Cilk. Cilk. http://supertech.csail.mit.edu/cilk/.
    • Cilk
    • Cilk1
  • 7
    • 51549087961
    • Shallow bounding volume hierarchies for fast SIMD ray tracing of incoherent rays
    • H. Dammertz, J. Hanika, and A. Keller. Shallow Bounding Volume Hierarchies for Fast SIMD Ray Tracing of Incoherent Rays. In EGSR'08, pages 1225-1233, 2008.
    • (2008) EGSR'08 , pp. 1225-1233
    • Dammertz, H.1    Hanika, J.2    Keller, A.3
  • 9
    • 0000011164
    • A fast computer method for matrix transposing
    • July
    • J. O. Eklundh. A Fast Computer Method for Matrix Transposing. IEEE Trans. Comput., 21(7):801-803, July 1972.
    • (1972) IEEE Trans. Comput. , vol.21 , Issue.7 , pp. 801-803
    • Eklundh, J.O.1
  • 10
    • 0347507496
    • The implementation of the Cilk-5 multithreaded language
    • M. Frigo, C. E. Leiserson, and K. H. Randall. The Implementation of the Cilk-5 Multithreaded Language. In PLDI'98, pages 212-223, 1998.
    • (1998) PLDI'98 , pp. 212-223
    • Frigo, M.1    Leiserson, C.E.2    Randall, K.H.3
  • 12
    • 84865327496
    • Can GPGPU programming be liberated from the data-parallel bottleneck?
    • August
    • B. Gaster and L. Howes. Can GPGPU Programming Be Liberated from the Data-Parallel Bottleneck? Computer, 45(8):42-52, August 2012.
    • (2012) Computer , vol.45 , Issue.8 , pp. 42-52
    • Gaster, B.1    Howes, L.2
  • 13
    • 70450029262
    • Work-first and help-first scheduling policies for async-finish task parallelism
    • Y. Guo, R. Barik, R. Raman, and V. Sarkar. Work-first and Help-first Scheduling Policies for Async-finish Task Parallelism. In IPDPS'09, pages 1-12, 2009.
    • (2009) IPDPS'09 , pp. 1-12
    • Guo, Y.1    Barik, R.2    Raman, R.3    Sarkar, V.4
  • 15
    • 0141427127
    • Vectorization of tree traversals
    • Mar.
    • L. Hernquist. Vectorization of Tree Traversals. J. Comput. Phys., 87(1):137-147, Mar. 1990.
    • (1990) J. Comput. Phys. , vol.87 , Issue.1 , pp. 137-147
    • Hernquist, L.1
  • 16
    • 84976759390
    • Graphinators and the Duality of SIMD and MIMD
    • P. Hudak and E. Mohr. Graphinators and the Duality of SIMD and MIMD. In LFP'88, pages 224-234, 1988.
    • (1988) LFP'88 , pp. 224-234
    • Hudak, P.1    Mohr, E.2
  • 17
    • 84879836252
    • Efficient scheduling of recursive control flow on GPUs
    • X. Huo, S. Krishnamoorthy, and G. Agrawal. Efficient Scheduling of Recursive Control Flow on GPUs. In ICS'13, pages 409-420, 2013.
    • (2013) ICS'13 , pp. 409-420
    • Huo, X.1    Krishnamoorthy, S.2    Agrawal, G.3
  • 18
    • 84858310773
    • Enhancing locality for recursive traversals of recursive structures
    • Y. Jo and M. Kulkarni. Enhancing Locality for Recursive Traversals of Recursive Structures. In OOPSLA'11, pages 463-482, 2011.
    • (2011) OOPSLA'11 , pp. 463-482
    • Jo, Y.1    Kulkarni, M.2
  • 19
    • 84887467173
    • Automatic vectorization of tree traversals
    • Y. Jo, M. Goldfarb, and M. Kulkarni. Automatic Vectorization of Tree Traversals. In PACT'13, pages 363-374, 2013.
    • (2013) PACT'13 , pp. 363-374
    • Jo, Y.1    Goldfarb, M.2    Kulkarni, M.3
  • 21
    • 84878542156
    • Efficient SIMD code generation for irregular kernels
    • S. Kim and H. Han. Efficient SIMD Code Generation for Irregular Kernels. In PPoPP'12, pages 55-64, 2012.
    • (2012) PPoPP'12 , pp. 55-64
    • Kim, S.1    Han, H.2
  • 24
    • 84897807567
    • Data-parallel Finite-state Machines
    • T. Mytkowicz, M. Musuvathi, and W. Schulte. Data-parallel Finite-state Machines. In ASPLOS'14, pages 529-542, 2014.
    • (2014) ASPLOS'14 , pp. 529-542
    • Mytkowicz, T.1    Musuvathi, M.2    Schulte, W.3
  • 25
    • 63549093768
    • Outer-loop Vectorization: Revisited for Short SIMD Architectures
    • D. Nuzman and A. Zaks. Outer-loop Vectorization: Revisited for Short SIMD Architectures. In PACT'08, pages 2-11, 2008.
    • (2008) PACT'08 , pp. 2-11
    • Nuzman, D.1    Zaks, A.2
  • 26
    • 84922773010
    • NVIDIA. CUDA. http://www.nvidia.com/object/cuda-home-new.html.
    • CUDA
    • NVIDIA1
  • 29
    • 84905454859
    • Fine-grain task aggregation and coordination on GPUs
    • M. S. Orr, B. M. Beckmann, S. K. Reinhardt, and D. A. Wood. Fine-grain Task Aggregation and Coordination on GPUs. In ISCA'14, pages 181-192, 2014.
    • (2014) ISCA'14 , pp. 181-192
    • Orr, M.S.1    Beckmann, B.M.2    Reinhardt, S.K.3    Wood, D.A.4
  • 32
    • 84876909157
    • SIMD parallelization of applications that traverse irregular data structures
    • B. Ren, G. Agrawal, J. R. Larus, T. Mytkowicz, T. Poutanen, and W. Schulte. SIMD Parallelization of Applications that Traverse Irregular Data Structures. In CGO'13, pages 1-10, 2013.
    • (2013) CGO'13 , pp. 1-10
    • Ren, B.1    Agrawal, G.2    Larus, J.R.3    Mytkowicz, T.4    Poutanen, T.5    Schulte, W.6
  • 33
    • 79951700098
    • Improving SIMT efficiency of global rendering algorithms with architectural support for dynamic micro-kernels
    • M. Steffen and J. Zambreno. Improving SIMT Efficiency of Global Rendering Algorithms with Architectural Support for Dynamic Micro-Kernels. In MICRO'43, pages 237-248, 2010.
    • (2010) MICRO'43 , pp. 237-248
    • Steffen, M.1    Zambreno, J.2
  • 34
    • 77952162137
    • OpenCL: A parallel programming standard for heterogeneous computing systems
    • May
    • J. E. Stone, D. Gohara, and G. Shi. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems. IEEE Des. Test, 12(3):66-73, May 2010.
    • (2010) IEEE Des. Test , vol.12 , Issue.3 , pp. 66-73
    • Stone, J.E.1    Gohara, D.2    Shi, G.3
  • 35
    • 84951008030
    • Oct.
    • TPL. The Task Parallel Library. http://msdn.microsoft.com/en-us/magazine/cc163340.aspx, Oct. 2007.
    • (2007) The Task Parallel Library
    • TPL1
  • 36
    • 84934313374
    • Task management for irregular-parallel workloads on the GPU
    • S. Tzeng, A. Patney, and J. D. Owens. Task Management for Irregular-parallel Workloads on the GPU. In HPG'10, pages 29-37, 2010.
    • (2010) HPG'10 , pp. 29-37
    • Tzeng, S.1    Patney, A.2    Owens, J.D.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.