-
2
-
-
0023438847
-
Automatic translation of Fortran programs to vector form
-
R. Allen and K. Kennedy. Automatic translation of Fortran programs to vector form. ACM TOPLAS, 9(4):491-542, 1987.
-
(1987)
ACM TOPLAS
, vol.9
, Issue.4
, pp. 491-542
-
-
Allen, R.1
Kennedy, K.2
-
4
-
-
77949727098
-
-
ARM Ltd. ARM Neon, 2009. http://www.arm.com/miscPDFs/6629.pdf.
-
(2009)
ARM Neon
-
-
-
5
-
-
10644248153
-
Brook for GPUs: Stream computing on graphics hardware
-
Aug.
-
I. Buck et al. Brook for GPUs: Stream computing on graphics hardware. ACM Trans. Gr., 23(3):777-786, Aug. 2004.
-
(2004)
ACM Trans. Gr.
, vol.23
, Issue.3
, pp. 777-786
-
-
Buck, I.1
-
6
-
-
31844442168
-
Shangrila: Achieving high performance from compiled network applications while enabling ease of programming
-
June
-
M. Chen, X. Li, R. Lian, J. Lin, L. Liu, T. Liu, and R. Ju. Shangrila: Achieving high performance from compiled network applications while enabling ease of programming. In Proc. '05 PLDI, pages 224-236, June 2005.
-
(2005)
Proc. '05 PLDI
, pp. 224-236
-
-
Chen, M.1
Li, X.2
Lian, R.3
Lin, J.4
Liu, L.5
Liu, T.6
Ju, R.7
-
7
-
-
8344245462
-
Vectorization for SIMD architectures with alignment constraints
-
A. E. Eichenberger, P. Wu, and K. O'Brien. Vectorization for SIMD architectures with alignment constraints. In Proc. '04 PLDI, pages 82-93, 2004.
-
(2004)
Proc. '04 PLDI
, pp. 82-93
-
-
Eichenberger, A.E.1
Wu, P.2
O'Brien, K.3
-
8
-
-
77952265561
-
-
Gcc 4.3.2
-
GNU Compiler Collection. Gcc 4.3.2, 2008. http://gcc.gnu.org/gcc-4.3/.
-
(2008)
-
-
-
9
-
-
33845384390
-
A stream compiler for communication-exposed architectures
-
Oct.
-
M. Gordon, W. Thies, M. Karczmarek, J. Lin, A. Meli, A. Lamb, C. Leger, J. Wong, H. Hoffmann, D. Maze, and S. Amarasinghe. A stream compiler for communication-exposed architectures. In 10th ASPLOS, pages 291-303, Oct. 2002.
-
(2002)
10th ASPLOS
, pp. 291-303
-
-
Gordon, M.1
Thies, W.2
Karczmarek, M.3
Lin, J.4
Meli, A.5
Lamb, A.6
Leger, C.7
Wong, J.8
Hoffmann, H.9
Maze, D.10
Amarasinghe, S.11
-
10
-
-
33846471996
-
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
-
M. I. Gordon, W. Thies, and S. Amarasinghe. Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. In 12th ASPLOS, pages 151-162, 2006. (Pubitemid 46160720)
-
(2006)
ACM SIGPLAN Notices
, vol.41
, Issue.11
, pp. 151-162
-
-
Gordon, M.I.1
Thies, W.2
Amarasinghe, S.3
-
12
-
-
77952253263
-
-
Intel. Intel SSE4, 2006. http://download.intel.com/technology/architecture/new-instructions-paper.pdf.
-
(2006)
Intel SSE4
-
-
-
13
-
-
77949730233
-
-
Intel. Intel Core i7, 2008. http://www.intel.com/products/processor/corei7/index.htm.
-
(2008)
Intel Core I7
-
-
-
14
-
-
77949698424
-
-
software.intel.com/en-us/intel-compilers
-
Intel. Intel compiler, 2009. software.intel.com/en-us/intel-compilers/.
-
(2009)
Intel Compiler
-
-
-
15
-
-
57349172999
-
Orchestrating the execution of stream programs on multicore platforms
-
June
-
M. Kudlur and S. Mahlke. Orchestrating the execution of stream programs on multicore platforms. In Proc. '08 PLDI, pages 114-124, June 2008.
-
(2008)
Proc. '08 PLDI
, pp. 114-124
-
-
Kudlur, M.1
Mahlke, S.2
-
16
-
-
0034446825
-
Exploiting superword level parallelism with multimedia instruction sets
-
June
-
S. Larsen and S. Amarasinghe. Exploiting superword level parallelism with multimedia instruction sets. In Proc. '00 PLDI, pages 145-156, June 2000.
-
(2000)
Proc. '00 PLDI
, pp. 145-156
-
-
Larsen, S.1
Amarasinghe, S.2
-
17
-
-
84939698077
-
Synchronous data flow
-
E. Lee and D. Messerschmitt. Synchronous data flow. Proc. IEEE, 75(9):1235-1245, 1987.
-
(1987)
Proc. IEEE
, vol.75
, Issue.9
, pp. 1235-1245
-
-
Lee, E.1
Messerschmitt, D.2
-
18
-
-
0037809797
-
An innovative low-power high-performance programmable signal processor for digital communications
-
J. H. Moreno, V. Zyuban, U. Shvadron, F. D. Neeser, J. H. Derby, M. S. Ware, K. Kailas, A. Zaks, A. Geva, S. Ben-David, S. W. Asaad, T. W. Fox, D. Littrell, M. Biberstein, D. Naishlos, and H. Hunter. An innovative low-power high-performance programmable signal processor for digital communications. IBM Jrn. of Research and Development, 47(2-3):299-326, 2003.
-
(2003)
IBM Jrn. of Research and Development
, vol.47
, Issue.2-3
, pp. 299-326
-
-
Moreno, J.H.1
Zyuban, V.2
Shvadron, U.3
Neeser, F.D.4
Derby, J.H.5
Ware, M.S.6
Kailas, K.7
Zaks, A.8
Geva, A.9
Ben-David, S.10
Asaad, S.W.11
Fox, T.W.12
Littrell, D.13
Biberstein, M.14
Naishlos, D.15
Hunter, H.16
-
20
-
-
77949723490
-
Generating permutation instructions from a high-level description
-
M. Narayanan and K. A. Yelick. Generating permutation instructions from a high-level description. In Proc. MSP'04, 2004.
-
(2004)
Proc. MSP'04
-
-
Narayanan, M.1
Yelick, K.A.2
-
21
-
-
79953275887
-
Multi-platform auto-vectorization
-
D. Nuzman and R. Henderson. Multi-platform auto-vectorization. In Proc. 2006 CGO, pages 281-294, 2006.
-
(2006)
Proc. 2006 CGO
, pp. 281-294
-
-
Nuzman, D.1
Henderson, R.2
-
22
-
-
33746034953
-
Auto-vectorization of interleaved data for SIMD
-
D. Nuzman, I. Rosen, and A. Zaks. Auto-vectorization of interleaved data for SIMD. In Proc. '06 PLDI, pages 132-142, 2006.
-
(2006)
Proc. '06 PLDI
, pp. 132-142
-
-
Nuzman, D.1
Rosen, I.2
Zaks, A.3
-
24
-
-
34547309668
-
-
June
-
Nvidia. CUDA Programming Guide, June 2007. http://developer.download.nvidia.com/compute/cuda.
-
(2007)
CUDA Programming Guide
-
-
-
25
-
-
33745222449
-
Optimizing data permutations for SIMD devices
-
DOI 10.1145/1133255.1133996, PLDI 2006 - Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation
-
G. Ren, P. Wu, and D. Padua. Optimizing data permutations for SIMD devices. In Proc. '06 PLDI, pages 118-131, 2006. (Pubitemid 44074926)
-
(2006)
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
, vol.2006
, pp. 118-131
-
-
Ren, G.1
Wu, P.2
Padua, D.3
-
27
-
-
49249086142
-
Larrabee: A many-core x86 architecture for visual computing
-
L. Seiler et al. Larrabee: a many-core x86 architecture for visual computing. ACM Trans. Gr., 27(3):1-15, 2008.
-
(2008)
ACM Trans. Gr.
, vol.27
, Issue.3
, pp. 1-15
-
-
Seiler, L.1
-
28
-
-
77952250713
-
-
F. Semiconductor. Altivec, 2009. www.freescale.com/altivec.
-
(2009)
Altivec
-
-
-
29
-
-
84959045524
-
StreamIt: A language for streaming applications
-
W. Thies, M. Karczmarek, and S. P. Amarasinghe. StreamIt: A language for streaming applications. In Proc. 02 CC, pages 179-196, 2002.
-
(2002)
Proc. 02 CC
, pp. 179-196
-
-
Thies, W.1
Karczmarek, M.2
Amarasinghe, S.P.3
-
31
-
-
33646833599
-
Efficient SIMD code generation for runtime alignment and length conversion
-
P. Wu, A. E. Eichenberger, and A. Wang. Efficient SIMD code generation for runtime alignment and length conversion. In Proc. 2005 CGO, pages 153-164, 2005.
-
(2005)
Proc. 2005 CGO
, pp. 153-164
-
-
Wu, P.1
Eichenberger, A.E.2
Wang, A.3