-
1
-
-
0020915645
-
Conversion of control dependence to data dependence
-
New York, NY, USA. ACM
-
J. R. Allen, K. Kennedy, C. Porterfield, and J. Warren. Conversion of control dependence to data dependence. In Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '83, pages 177-189, New York, NY, USA, 1983. ACM.
-
(1983)
Proceedings of the 10th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL '83
, pp. 177-189
-
-
Allen, J.R.1
Kennedy, K.2
Porterfield, C.3
Warren, J.4
-
4
-
-
70349169075
-
Analyzing CUDA workloads using a detailed GPU simulator
-
April
-
A. Bakhoda, G. Yuan, W. Fung, H. Wong, and T. Aamodt. Analyzing CUDA workloads using a detailed GPU simulator. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS-2009), April 2009.
-
(2009)
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS-2009)
-
-
Bakhoda, A.1
Yuan, G.2
Fung, W.3
Wong, H.4
Aamodt, T.5
-
5
-
-
0015330108
-
The Illiac IV system
-
April
-
W. Bouknight, S. Denenberg, D. McIntyre, J. Randall, A. Sameh, and D. Slotnick. The Illiac IV System. In Proceedings of the IEEE, volume 60, pages 369-388, April 1972.
-
(1972)
Proceedings of the IEEE
, vol.60
, pp. 369-388
-
-
Bouknight, W.1
Denenberg, S.2
McIntyre, D.3
Randall, J.4
Sameh, A.5
Slotnick, D.6
-
6
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
October
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. Sheaffer, S.-H. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In IEEE International Symposium on Workload Characterization (IISWC-2009), October 2009.
-
(2009)
IEEE International Symposium on Workload Characterization (IISWC-2009)
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.5
Lee, S.-H.6
Skadron, K.7
-
7
-
-
84863351470
-
SIMD re-convergence at thread frontiers
-
December
-
G. Diamos, B. Ashbaugh, S. Maiyuran, A. Kerr, H. Wu, and S. Yalamanchili. SIMD Re-Convergence At Thread Frontiers. In 44th International Symposium on Microarchitecture (MICRO-44), December 2011.
-
(2011)
44th International Symposium on Microarchitecture (MICRO-44)
-
-
Diamos, G.1
Ashbaugh, B.2
Maiyuran, S.3
Kerr, A.4
Wu, H.5
Yalamanchili, S.6
-
13
-
-
38349041620
-
Accelerating large graph algorithms on the GPU using CUDA
-
P. Harish and P. Narayanan. Accelerating Large Graph Algorithms on the GPU Using CUDA. In High Performance Computing HiPC 2007, volume 4873, pages 197-208. 2007.
-
(2007)
High Performance Computing HiPC 2007
, vol.4873
, pp. 197-208
-
-
Harish, P.1
Narayanan, P.2
-
14
-
-
84864836708
-
-
IMPACT Research Group. The Parboil Benchmark Suite, 2007
-
IMPACT Research Group. The Parboil Benchmark Suite, 2007.
-
-
-
-
17
-
-
0034459255
-
Efficient conditional operations for data-parallel architectures
-
December
-
U. Kapasi, W. Dally, S. Rixner, P. Mattson, J. Owens, and B. Khailany. Efficient conditional operations for data-parallel architectures. In 33rd International Symposium on Microarchitecture (MICRO-33), December 2000.
-
(2000)
33rd International Symposium on Microarchitecture (MICRO-33)
-
-
Kapasi, U.1
Dally, W.2
Rixner, S.3
Mattson, P.4
Owens, J.5
Khailany, B.6
-
19
-
-
84863342255
-
Improving GPU performance via large warps and two-level warp scheduling
-
December
-
V. Narasiman, C. Lee, M. Shebanow, R. Miftakhutdinov, O. Mutlu, and Y. Patt. Improving GPU Performance via Large Warps and Two-Level Warp Scheduling. In 44th International Symposium on Microarchitecture (MICRO-44), December 2011.
-
(2011)
44th International Symposium on Microarchitecture (MICRO-44)
-
-
Narasiman, V.1
Lee, C.2
Shebanow, M.3
Miftakhutdinov, R.4
Mutlu, O.5
Patt, Y.6
-
24
-
-
0017922490
-
The CRAY-1 computer system
-
January
-
R. M. Russell. The CRAY-1 computer system. Commun. ACM, 21:63-72, January 1978.
-
(1978)
Commun. ACM
, vol.21
, pp. 63-72
-
-
Russell, R.M.1
-
25
-
-
38849131252
-
High-throughput sequence alignment using graphics processing units
-
M. Schatz, C. Trapnell, A. Delcher, and A. Varshney. High-throughput sequence alignment using graphics processing units. BMC Bioinformatics, 8(1):474, 2007.
-
(2007)
BMC Bioinformatics
, vol.8
, Issue.1
, pp. 474
-
-
Schatz, M.1
Trapnell, C.2
Delcher, A.3
Varshney, A.4
-
26
-
-
49249086142
-
Larrabee: A many-core x86 architecture for visual computing
-
August
-
L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan. Larrabee: A Many-core x86 Architecture for Visual Computing. ACM Trans. Graph., 27:18:1-18:15, August 2008.
-
(2008)
ACM Trans. Graph.
, vol.27
, pp. 18:1-18:15
-
-
Seiler, L.1
Carmean, D.2
Sprangle, E.3
Forsyth, T.4
Abrash, M.5
Dubey, P.6
Junkins, S.7
Lake, A.8
Sugerman, J.9
Cavin, R.10
Espasa, R.11
Grochowski, E.12
Juan, T.13
Hanrahan, P.14
-
30
-
-
30744459395
-
RPU: A programmable ray processing unit for realtime ray tracing
-
July
-
S. Woop, J. Schmittler, and P. Slusallek. RPU: a programmable ray processing unit for realtime ray tracing. ACM Trans. Graph., 24:434-444, July 2005.
-
(2005)
ACM Trans. Graph.
, vol.24
, pp. 434-444
-
-
Woop, S.1
Schmittler, J.2
Slusallek, P.3