Volume 7210 LNCS, 2012, Pages 1-20

Improving performance of OpenCL on CPUs

Author keywords

Code Generation; Data Parallelism; Divergent Control Flow; OpenCL; SIMD; Synchronization; Vectorization

Indexed keywords

CODE GENERATION; DATA PARALLELISM; DIVERGENT CONTROL FLOW; OPENCL; SIMD; VECTORIZATION;

EID: 84859143447     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-28652-0_1     Document Type: Conference Paper
Times cited: 55

References (27)
  • 1
    • 0020915645
    • Conversion of control dependence to data dependence
    • ACM
    • Allen, J.R., Kennedy, K., Porterfield, C., Warren, J.: Conversion of control dependence to data dependence. In: POPL, pp. 177-189. ACM (1983)
    • (1983) POPL , pp. 177-189
    • Allen, J.R.1    Kennedy, K.2    Porterfield, C.3    Warren, J.4
  • 2
    • 0023438847
    • Automatic translation of FORTRAN programs to vector form
    • Allen, R., Kennedy, K.: Automatic translation of FORTRAN programs to vector form. ACM Trans. Program. Lang. Syst. 9(4), 491-542 (1987)
    • (1987) ACM Trans. Program. Lang. Syst. , vol.9 , Issue.4 , pp. 491-542
    • Allen, R.1    Kennedy, K.2
  • 3
    • 84859153841
    • AMD APP SDK v2.5
    • AMD: AMD APP SDK v2.5 (March 2011)
    • (2011) AMD APP SDK
  • 7
    • 84962494055
    • CGiS, a New Language for Data-Parallel GPU Programming
    • Fritz, N., Lucas, P., Slusallek, P.: CGiS, a New Language for Data-Parallel GPU Programming. In: VMV, pp. 241-248 (2004)
    • (2004) VMV , pp. 241-248
    • Fritz, N.1    Lucas, P.2    Slusallek, P.3
  • 8
    • 78149276036
    • Twin peaks: A software platform for heterogeneous computing on general-purpose and graphics processors
    • ACM, New York
    • Gummaraju, J., Morichetti, L., Houston, M., Sander, B., Gaster, B.R., Zheng, B.: Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. In: PACT, pp. 205-216. ACM, New York (2010)
    • (2010) PACT , pp. 205-216
    • Gummaraju, J.1    Morichetti, L.2    Houston, M.3    Sander, B.4    Gaster, B.R.5    Zheng, B.6
  • 11
    • 78650928512
    • OpenCL-based design methodology for application-specific processors
    • July
    • Jaskelainen, P.O., de La Lama, C.S., Huerta, P., Takala, J.: OpenCL-based design methodology for application-specific processors. In: SAMOS 2010, pp. 223-230 (July 2010)
    • (2010) SAMOS 2010 , pp. 223-230
    • Jaskelainen, P.O.1    De La Lama, C.S.2    Huerta, P.3    Takala, J.4
  • 12
    • 79957502935
    • Whole Function Vectorization
    • Karrenberg, R., Hack, S.: Whole Function Vectorization. In: CGO, pp. 141-150 (2011)
    • (2011) CGO , pp. 141-150
    • Karrenberg, R.1    Hack, S.2
  • 14
    • 3042658703
    • LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
    • March
    • Lattner, C., Adve, V.: LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In: CGO (March 2004)
    • (2004) CGO
    • Lattner, C.1    Adve, V.2
  • 17
    • 79953275887
    • Multi-platform auto-vectorization
    • Nuzman, D., Henderson, R.: Multi-platform auto-vectorization. In: CGO, pp. 281-294 (2006)
    • (2006) CGO , pp. 281-294
    • Nuzman, D.1    Henderson, R.2
  • 18
    • 63549093768
    • Outer-loop vectorization: Revisited for short SIMD architectures
    • ACM
    • Nuzman, D., Zaks, A.: Outer-loop vectorization: revisited for short simd architectures. In: PACT, pp. 2-11. ACM (2008)
    • (2008) PACT , pp. 2-11
    • Nuzman, D.1    Zaks, A.2
  • 22
    • 47849103500
    • Introducing Control Flow into Vectorized Code
    • IEEE Computer Society
    • Shin, J.: Introducing Control Flow into Vectorized Code. In: PACT, pp. 280-291. IEEE Computer Society (2007)
    • (2007) PACT , pp. 280-291
    • Shin, J.1
  • 23
    • 0034249157
    • A vectorizing compiler for multimedia extensions
    • Sreraman, N., Govindarajan, R.: A vectorizing compiler for multimedia extensions. Int. J. Parallel Program. 28(4), 363-400 (2000)
    • (2000) Int. J. Parallel Program. , vol.28 , Issue.4 , pp. 363-400
    • Sreraman, N.1    Govindarajan, R.2
  • 25
    • 58449109179
    • MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs
    • Amaral, J.N. (ed.) LCPC 2008. Springer, Heidelberg
    • Stratton, J.A., Stone, S.S., Hwu, W.-m.W.: MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs. In: Amaral, J.N. (ed.) LCPC 2008. LNCS, vol. 5335, pp. 16-30. Springer, Heidelberg (2008)
    • (2008) LNCS , vol.5335 , pp. 16-30
    • Stratton, J.A.1    Stone, S.S.2    Hwu, W.-M.W.3
  • 26
    • 84859153840
    • PGI CUDA-x86
    • The Portland Group, Inc.: PGI CUDA-x86 (June 2011)
    • (2011) PGI CUDA-x86


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.