-
1
-
-
0020915645
-
Conversion of control dependence to data dependence
-
ACM
-
Allen, J.R., Kennedy, K., Porterfield, C., Warren, J.: Conversion of control dependence to data dependence. In: POPL, pp. 177-189. ACM (1983)
-
(1983)
POPL
, pp. 177-189
-
-
Allen, J.R.1
Kennedy, K.2
Porterfield, C.3
Warren, J.4
-
2
-
-
0023438847
-
Automatic translation of FORTRAN programs to vector form
-
Allen, R., Kennedy, K.: Automatic translation of FORTRAN programs to vector form. ACM Trans. Program. Lang. Syst. 9(4), 491-542 (1987)
-
(1987)
ACM Trans. Program. Lang. Syst.
, vol.9
, Issue.4
, pp. 491-542
-
-
Allen, R.1
Kennedy, K.2
-
3
-
-
84859153841
-
-
AMD: v2.5 March
-
AMD: AMD APP SDK v2.5 (March 2011)
-
(2011)
AMD APP SDK
-
-
-
6
-
-
0004129492
-
-
Birkhäuser, Boston
-
Darte, A., Robert, Y., Vivien, F.: Scheduling and Automatic Parallelization. Birkhäuser, Boston (2000)
-
(2000)
Scheduling and Automatic Parallelization
-
-
Darte, A.1
Robert, Y.2
Vivien, F.3
-
7
-
-
84962494055
-
CGiS, a New Language for Data-Parallel GPU Programming
-
Fritz, N., Lucas, P., Slusallek, P.: CGiS, a New Language for Data-Parallel GPU Programming. In: VMV, pp. 241-248 (2004)
-
(2004)
VMV
, pp. 241-248
-
-
Fritz, N.1
Lucas, P.2
Slusallek, P.3
-
8
-
-
78149276036
-
Twin peaks: A software platform for heterogeneous computing on general-purpose and graphics processors
-
ACM, New York
-
Gummaraju, J., Morichetti, L., Houston, M., Sander, B., Gaster, B.R., Zheng, B.: Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. In: PACT, pp. 205-216. ACM, New York (2010)
-
(2010)
PACT
, pp. 205-216
-
-
Gummaraju, J.1
Morichetti, L.2
Houston, M.3
Sander, B.4
Gaster, B.R.5
Zheng, B.6
-
9
-
-
77952252026
-
Macross: Macro-simdization of streaming applications
-
ACM, New York
-
Hormati, A.H., Choi, Y., Woh, M., Kudlur, M., Rabbah, R., Mudge, T., Mahlke, S.: Macross: macro-simdization of streaming applications. In: ASPLOS, pp. 285-296. ACM, New York (2010)
-
(2010)
ASPLOS
, pp. 285-296
-
-
Hormati, A.H.1
Choi, Y.2
Woh, M.3
Kudlur, M.4
Rabbah, R.5
Mudge, T.6
Mahlke, S.7
-
11
-
-
78650928512
-
OpenCL-based design methodology for application-specific processors
-
July
-
Jaskelainen, P.O., de La Lama, C.S., Huerta, P., Takala, J.: OpenCL-based design methodology for application-specific processors. In: SAMOS 2010, pp. 223-230 (July 2010)
-
(2010)
SAMOS 2010
, pp. 223-230
-
-
Jaskelainen, P.O.1
de La Lama, C.S.2
Huerta, P.3
Takala, J.4
-
12
-
-
79957502935
-
Whole Function Vectorization
-
Karrenberg, R., Hack, S.: Whole Function Vectorization. In: CGO, pp. 141-150 (2011)
-
(2011)
CGO
, pp. 141-150
-
-
Karrenberg, R.1
Hack, S.2
-
14
-
-
3042658703
-
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
-
March
-
Lattner, C., Adve, V.: LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In: CGO (March 2004)
-
(2004)
CGO
-
-
Lattner, C.1
Adve, V.2
-
15
-
-
79957475280
-
Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language
-
Newburn, C.J., So, B., Liu, Z., McCool, M.D., Ghuloum, A.M., Toit, S.D., Wang, Z.G., Du, Z., Chen, Y., Wu, G., Guo, P., Liu, Z., Zhang, D.: Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language. In: CGO, pp. 224-235 (2011)
-
(2011)
CGO
, pp. 224-235
-
-
Newburn, C.J.1
So, B.2
Liu, Z.3
McCool, M.D.4
Ghuloum, A.M.5
Toit, S.D.6
Wang, Z.G.7
Du, Z.8
Chen, Y.9
Wu, G.10
Guo, P.11
Liu, Z.12
Zhang, D.13
-
17
-
-
79953275887
-
Multi-platform auto-vectorization
-
Nuzman, D., Henderson, R.: Multi-platform auto-vectorization. In: CGO, pp. 281-294 (2006)
-
(2006)
CGO
, pp. 281-294
-
-
Nuzman, D.1
Henderson, R.2
-
18
-
-
63549093768
-
Outer-loop vectorization: Revisited for short SIMD architectures
-
ACM
-
Nuzman, D., Zaks, A.: Outer-loop vectorization: revisited for short SIMD architectures. In: PACT, pp. 2-11. ACM (2008)
-
(2008)
PACT
, pp. 2-11
-
-
Nuzman, D.1
Zaks, A.2
-
22
-
-
47849103500
-
Introducing Control Flow into Vectorized Code
-
IEEE Computer Society
-
Shin, J.: Introducing Control Flow into Vectorized Code. In: PACT, pp. 280-291. IEEE Computer Society (2007)
-
(2007)
PACT
, pp. 280-291
-
-
Shin, J.1
-
23
-
-
0034249157
-
A vectorizing compiler for multimedia extensions
-
Sreraman, N., Govindarajan, R.: A vectorizing compiler for multimedia extensions. Int. J. Parallel Program. 28(4), 363-400 (2000)
-
(2000)
Int. J. Parallel Program.
, vol.28
, Issue.4
, pp. 363-400
-
-
Sreraman, N.1
Govindarajan, R.2
-
25
-
-
58449109179
-
MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs
-
Amaral, J.N. (ed.) LCPC 2008. Springer, Heidelberg
-
Stratton, J.A., Stone, S.S., Hwu, W.-m.W.: MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs. In: Amaral, J.N. (ed.) LCPC 2008. LNCS, vol. 5335, pp. 16-30. Springer, Heidelberg (2008)
-
(2008)
LNCS
, vol.5335
, pp. 16-30
-
-
Stratton, J.A.1
Stone, S.S.2
Hwu, W.-m.W.3
-
26
-
-
84859153840
-
-
The Portland Group, Inc.: June
-
The Portland Group, Inc.: PGI CUDA-x86 (June 2011)
-
(2011)
PGI CUDA-x86
-
-