-
1
-
-
84869687636
-
-
ATI Mobility Radeon™ HD4850/4870 Graphics-Overview
-
ATI Mobility Radeon™ HD4850/4870 Graphics-Overview. http://ati.amd.com/products/radeonhd4800.
-
-
-
-
5
-
-
84869664151
-
-
Advanced Micro Devices, Inc
-
Advanced Micro Devices, Inc. AMD Brook+. http://ati.amd.com/technology/streamcomputing/AMD-Brookplus.pdf.
-
AMD Brook+
-
-
-
6
-
-
70450275084
-
Analyzing CUDA workloads using a detailed GPU simulator
-
April
-
A. Bakhoda, G. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt. Analyzing CUDA workloads using a detailed GPU simulator. In IEEE ISPASS, April 2009.
-
(2009)
IEEE ISPASS
-
-
Bakhoda, A.1
Yuan, G.2
Fung, W.W.L.3
Wong, H.4
Aamodt, T.M.5
-
7
-
-
64949101685
-
A first-order fine-grained multithreaded throughput model
-
X. E. Chen and T. M. Aamodt. A first-order fine-grained multithreaded throughput model. In HPCA, 2009.
-
(2009)
HPCA
-
-
Chen, X.E.1
Aamodt, T.M.2
-
8
-
-
44849137198
-
NVIDIA Tesla: A Unified Graphics and Computing Architecture
-
March-April
-
E. Lindholm, J. Nickolls, S. Oberman and J. Montrym. NVIDIA Tesla: A Unified Graphics and Computing Architecture. IEEE Micro, 28(2):39-55, March-April 2008.
-
(2008)
IEEE Micro
, vol.28
, Issue.2
, pp. 39-55
-
-
Lindholm, E.1
Nickolls, J.2
Oberman, S.3
Montrym, J.4
-
9
-
-
70450275082
-
-
M. Fatica, P. LeGresley, I. Buck, J. Stone, J. Phillips, S. Morton, and P. Micikevicius. High Performance Computing with CUDA, SC08, 2008.
-
(2008)
High Performance Computing with CUDA
-
-
Fatica, M.1
LeGresley, P.2
Buck, I.3
Stone, J.4
Phillips, J.5
Morton, S.6
Micikevicius, P.7
-
12
-
-
70450274279
-
An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
-
Technical Report TR-2009-003, Atlanta, GA, USA
-
S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. Technical Report TR-2009-003, Atlanta, GA, USA, 2009.
-
(2009)
-
-
Hong, S.1
Kim, H.2
-
14
-
-
70450275951
-
-
Intel SSE/MMX2/KNI documentation. http://www.intel80386.com/simd/mmx2-doc.html.
-
-
-
-
17
-
-
68149168035
-
Merge: A programming model for heterogeneous multi-core systems
-
M. D. Linderman, J. D. Collins, H. Wang, and T. H. Meng. Merge: a programming model for heterogeneous multi-core systems. In ASPLOS XIII, 2008.
-
(2008)
ASPLOS
, vol.13
-
-
Linderman, M.D.1
Collins, J.D.2
Wang, H.3
Meng, T.H.4
-
18
-
-
0034824085
-
Data-flow prescheduling for large instruction windows in out-of-order processors
-
P. Michaud and A. Seznec. Data-flow prescheduling for large instruction windows in out-of-order processors. In HPCA, 2001.
-
(2001)
HPCA
-
-
Michaud, P.1
Seznec, A.2
-
19
-
-
0033365427
-
Exploring instruction-fetch bandwidth requirement in wide-issue superscalar processors
-
P. Michaud, A. Seznec, and S. Jourdan. Exploring instruction-fetch bandwidth requirement in wide-issue superscalar processors. In PACT, 1999.
-
(1999)
PACT
-
-
Michaud, P.1
Seznec, A.2
Jourdan, S.3
-
20
-
-
78651550268
-
Scalable Parallel Programming with CUDA
-
March-April
-
J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable Parallel Programming with CUDA. ACM Queue, 6(2):40-53, March-April 2008.
-
(2008)
ACM Queue
, vol.6
, Issue.2
, pp. 40-53
-
-
Nickolls, J.1
Buck, I.2
Garland, M.3
Skadron, K.4
-
21
-
-
85016676932
-
Theoretical modeling of superscalar processor performance
-
D. B. Noonburg and J. P. Shen. Theoretical modeling of superscalar processor performance. In MICRO-27, 1994.
-
(1994)
MICRO-27
-
-
Noonburg, D.B.1
Shen, J.P.2
-
24
-
-
43449094719
-
-
S. Ryoo, C. Rodrigues, S. Stone, S. Baghsorkhi, S. Ueng, J. Stratton, and W. Hwu. Program optimization space pruning for a multithreaded GPU. In CGO, 2008.
-
-
-
-
25
-
-
0342373102
-
An analytical solution for a Markov chain modeling multithreaded
-
Technical report, Berkeley, CA, USA
-
R. H. Saavedra-Barrera and D. E. Culler. An analytical solution for a Markov chain modeling multithreaded. Technical report, Berkeley, CA, USA, 1991.
-
(1991)
-
-
Saavedra-Barrera, R.H.1
Culler, D.E.2
-
26
-
-
49249086142
-
Larrabee: A many-core x86 architecture for visual computing
-
L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan. Larrabee: a many-core x86 architecture for visual computing. ACM Trans. Graph., 2008.
-
(2008)
ACM Trans. Graph
-
-
Seiler, L.1
Carmean, D.2
Sprangle, E.3
Forsyth, T.4
Abrash, M.5
Dubey, P.6
Junkins, S.7
Lake, A.8
Sugerman, J.9
Cavin, R.10
Espasa, R.11
Grochowski, E.12
Juan, T.13
Hanrahan, P.14
-
27
-
-
0031593993
-
Analytic evaluation of shared-memory systems with ILP processors
-
D. J. Sorin, V. S. Pai, S. V. Adve, M. K. Vernon, and D. A. Wood. Analytic evaluation of shared-memory systems with ILP processors. In ISCA, 1998.
-
(1998)
ISCA
-
-
Sorin, D.J.1
Pai, V.S.2
Adve, S.V.3
Vernon, M.K.4
Wood, D.A.5
-
28
-
-
20444381978
-
Face detection using spectral histograms and SVMs
-
June
-
C. A. Waring and X. Liu. Face detection using spectral histograms and SVMs. Systems, Man, and Cybernetics, Part B, IEEE Transactions on, 35(3):467-476, June 2005.
-
(2005)
Systems, Man, and Cybernetics, Part B, IEEE Transactions on
, vol.35
, Issue.3
, pp. 467-476
-
-
Waring, C.A.1
Liu, X.2