-
1
-
-
76749123978
-
Complexity effective memory access scheduling for many-core accelerator architectures
-
G. Yuan, A. Bakhoda, and T. Aamodt, Complexity Effective Memory Access Scheduling for Many-Core Accelerator Architectures, MICRO, 2009.
-
(2009)
MICRO
-
-
Yuan, G.1
Bakhoda, A.2
Aamodt, T.3
-
2
-
-
47349104432
-
Dynamic warp formation and scheduling for efficient gpu control flow
-
W. Fung, I. Sham, G. Yuan, and T. Aamodt, Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow, MICRO, 2007.
-
(2007)
MICRO
-
-
Fung, W.1
Sham, I.2
Yuan, G.3
Aamodt, T.4
-
6
-
-
70349169075
-
Analyzing CUDA workloads using a detailed GPU simulator
-
A. Bakhoda, G. Yuan, W. Fung, H. Wong, and T. Aamodt, Analyzing CUDA Workloads using a Detailed GPU Simulator, ISPASS, 2009.
-
(2009)
ISPASS
-
-
Bakhoda, A.1
Yuan, G.2
Fung, W.3
Wong, H.4
Aamodt, T.5
-
9
-
-
33646486530
-
Measuring benchmark similarity using inherent program characteristics
-
A. Joshi, A. Phansalkar, L. Eeckhout, and L. John, Measuring Benchmark Similarity using Inherent Program Characteristics, IEEE Transactions on Computers, Vol.55, No.6, 2006.
-
(2006)
IEEE Transactions on Computers
, vol.55
, Issue.6
-
-
Joshi, A.1
Phansalkar, A.2
Eeckhout, L.3
John, L.4
-
10
-
-
70649095417
-
On the (Dis)similarity of transactional memory workloads
-
C. Hughes, J. Poe, A. Qouneh, and T. Li, On the (Dis)similarity of Transactional Memory Workloads, IISWC, 2009.
-
(2009)
IISWC
-
-
Hughes, C.1
Poe, J.2
Qouneh, A.3
Li, T.4
-
11
-
-
0037325558
-
Designing computer architecture research workloads
-
L. Eeckhout, H. Vandierendonck, and K. De Bosschere, Designing Computer Architecture Research Workloads, Computer, Vol. 36, No. 2, 2003.
-
(2003)
Computer
, vol.36
, Issue.2
-
-
Eeckhout, L.1
Vandierendonck, H.2
De Bosschere, K.3
-
13
-
-
0034226001
-
SPEC CPU2000: Measuring CPU performance in the new millennium
-
July
-
J. Henning, SPEC CPU2000: Measuring CPU Performance in the New Millennium, IEEE Computer, pp. 28-35, July 2000.
-
(2000)
IEEE Computer
, pp. 28-35
-
-
Henning, J.1
-
14
-
-
0031339427
-
MediaBench: A tool for evaluating and synthesizing multimedia and communication systems
-
C. Lee, M. Potkonjak, and W. Smith, MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communication Systems, MICRO, 1997.
-
(1997)
MICRO
-
-
Lee, C.1
Potkonjak, M.2
Smith, W.3
-
15
-
-
84962779213
-
MiBench: A free, commercially representative embedded benchmark suite
-
M. Guthaus, J. Ringenberg, D. Ernst, T. Austin, T. Mudge, and R. Brown, MiBench: A Free, Commercially Representative Embedded Benchmark Suite, Workshop on Workload Characterization, 2001.
-
(2001)
Workshop on Workload Characterization
-
-
Guthaus, M.1
Ringenberg, J.2
Ernst, D.3
Austin, T.4
Mudge, T.5
Brown, R.6
-
16
-
-
0029194459
-
The SPLASH-2 programs: Characterization and methodological considerations
-
S. Woo, M. Ohara, E. Torrie, J. Singh, and A. Gupta, The SPLASH-2 Programs: Characterization and Methodological Considerations, ISCA, 1995.
-
(1995)
ISCA
-
-
Woo, S.1
Ohara, M.2
Torrie, E.3
Singh, J.4
Gupta, A.5
-
17
-
-
70649115330
-
STAMP: Stanford transactional memory applications for multi-processing
-
C. Minh, J. Chung, C. Kozyrakis, and K. Olukotun, STAMP: Stanford Transactional Memory Applications for Multi-Processing, IISWC, 2008.
-
(2008)
IISWC
-
-
Minh, C.1
Chung, J.2
Kozyrakis, C.3
Olukotun, K.4
-
18
-
-
51549095074
-
The PARSEC benchmark suite: Characterization and architectural implications
-
C. Bienia, S. Kumar, J. Singh, and K. Li, The PARSEC Benchmark Suite: Characterization and Architectural Implications, Princeton University Technical Report, 2008.
-
(2008)
Princeton University Technical Report
-
-
Bienia, C.1
Kumar, S.2
Singh, J.3
Li, K.4
-
20
-
-
78651550268
-
Scalable parallel programming with CUDA
-
Mar.
-
J. Nickolls, I. Buck, M. Garland, and K. Skadron, Scalable Parallel Programming with CUDA, Queue 6, 2 (Mar. 2008), 40-53.
-
(2008)
Queue
, vol.6
, Issue.2
, pp. 40-53
-
-
Nickolls, J.1
Buck, I.2
Garland, M.3
Skadron, K.4
-
21
-
-
44849137198
-
NVIDIA tesla: A unified graphics and computing architecture
-
E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, NVIDIA Tesla: A Unified Graphics and Computing Architecture, Micro, vol.28, no.2, 2008.
-
(2008)
Micro
, vol.28
, Issue.2
-
-
Lindholm, E.1
Nickolls, J.2
Oberman, S.3
Montrym, J.4
-
23
-
-
78751492924
-
Technical overview
-
AMD Inc
-
Technical Overview, ATI Stream Computing, AMD Inc, 2009.
-
(2009)
ATI Stream Computing
-
-
-
25
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. Sheaffer, S. Lee, and K. Skadron, Rodinia: A Benchmark Suite for Heterogeneous Computing, IISWC, 2009.
-
(2009)
IISWC
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.5
Lee, S.6
Skadron, K.7
-
26
-
-
78751510969
-
-
Parboil Benchmark suite. URL: http://impact.crhc.illinois.edu/parboil.php.
-
-
-
-
28
-
-
78751471889
-
-
http://www.nvidia.com/object/cuda-sdks.html
-
-
-
-
31
-
-
33746400169
-
Hotspot: A compact thermal modeling methodology for early-stage VLSI design
-
W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, Hotspot: A Compact Thermal Modeling Methodology for Early-Stage VLSI Design, IEEE Transactions on VLSI Systems 14 (5) (2006).
-
(2006)
IEEE Transactions on VLSI Systems
, vol.14
, Issue.5
-
-
Huang, W.1
Ghosh, S.2
Velusamy, S.3
Sankaranarayanan, K.4
Skadron, K.5
Stan, M.R.6
-
33
-
-
63549097654
-
Mars: A mapreduce framework on graphics processors
-
B. He, W. Fang, Q. Luo, N. Govindaraju, and T. Wang, Mars: a MapReduce Framework on Graphics Processors, PACT, 2008.
-
(2008)
PACT
-
-
He, B.1
Fang, W.2
Luo, Q.3
Govindaraju, N.4
Wang, T.5
-
34
-
-
78751498561
-
-
Billconan and Kavinguy
-
Billconan and Kavinguy, A Neural Network on GPU. http://www.codeproject.com/KB/graphics/GPUNN.aspx.
-
A Neural Network on GPU
-
-
-
35
-
-
78751541238
-
-
Pcchen
-
Pcchen. N-Queens Solver, http://forums.nvidia.com/index.php?showtopic=76893, 2008.
-
(2008)
N-Queens Solver
-
-
-
36
-
-
85015171905
-
-
Maxime
-
Maxime. Ray tracing. http://www.nvidia.com/cuda.
-
Ray Tracing
-
-
-
37
-
-
57349130987
-
StoreGPU: Exploiting graphics processing units to accelerate distributed storage systems
-
S. Al-Kiswany, A. Gharaibeh, E. Santos-Neto, G. Yuan, and M. Ripeanu, StoreGPU: Exploiting Graphics Processing Units to accelerate Distributed Storage Systems, HPDC, 2008.
-
(2008)
HPDC
-
-
Al-Kiswany, S.1
Gharaibeh, A.2
Santos-Neto, E.3
Yuan, G.4
Ripeanu, M.5
-
39
-
-
47349098275
-
MineBench: A benchmark suite for data mining workloads
-
R. Narayanan, B. Ozisikyilmaz, J. Zambreno, J. Pisharath, G. Memik, and A. Choudhary, MineBench: A Benchmark Suite for Data Mining Workloads, IISWC, 2006.
-
(2006)
IISWC
-
-
Narayanan, R.1
Ozisikyilmaz, B.2
Zambreno, J.3
Pisharath, J.4
Memik, G.5
Choudhary, A.6
-
40
-
-
38849131252
-
High-throughput sequence alignment using graphics processing units
-
M. Schatz, C. Trapnell, A. Delcher, and A. Varshney, High-throughput Sequence Alignment using Graphics Processing Units, BMC Bioinformatics, 8(1): 474, 2007.
-
(2007)
BMC Bioinformatics
, vol.8
, Issue.1
, pp. 474
-
-
Schatz, M.1
Trapnell, C.2
Delcher, A.3
Varshney, A.4
-
42
-
-
51049111938
-
CUDA compatible GPU as an efficient hardware accelerator for AES cryptography
-
S. Manavski, CUDA compatible GPU as an Efficient Hardware Accelerator for AES Cryptography, ICSPC, 2007.
-
(2007)
ICSPC
-
-
Manavski, S.1
-
43
-
-
51049099597
-
GPU acceleration of numerical weather prediction
-
J. Michalakes and M. Vachharajani, GPU Acceleration of Numerical Weather Prediction, IPDPS, 2008.
-
(2008)
IPDPS
-
-
Michalakes, J.1
Vachharajani, M.2
-
44
-
-
56749139615
-
Accelerating advanced MRI reconstructions on GPUs
-
S. Stone, J. Haldar, S. Tsao, W. Hwu, Z. Liang, and B. Sutton, Accelerating Advanced MRI Reconstructions on GPUs, Computing Frontiers, 2008.
-
(2008)
Computing Frontiers
-
-
Stone, S.1
Haldar, J.2
Tsao, S.3
Hwu, W.4
Liang, Z.5
Sutton, B.6
-
45
-
-
78751510968
-
-
StatSoft, Inc. STATISTICA, http://www.statsoft.com/.
-
Statistica
-
-
-
47
-
-
78751548339
-
Analysis of benchmark characteristics and benchmark performance prediction
-
R. H. Saavedra and A. J. Smith, Analysis of Benchmark Characteristics and Benchmark Performance Prediction, ACM Trans. Computer Systems, 1998.
-
(1998)
ACM Trans. Computer Systems
-
-
Saavedra, R.H.1
Smith, A.J.2
-
48
-
-
34548329985
-
Microarchitecture-independent workload characterization
-
K. Hoste and L. Eeckhout. Microarchitecture-independent Workload Characterization. IEEE Micro, 27(3):63-72, 2007.
-
(2007)
IEEE Micro
, vol.27
, Issue.3
, pp. 63-72
-
-
Hoste, K.1
Eeckhout, L.2
-
49
-
-
78751541744
-
-
http://www.khronos.org/opencl/
-
-
-
-
50
-
-
49249086142
-
Larrabee: A many-core X86 architecture for visual computing
-
L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan, Larrabee: A Many-core X86 Architecture for Visual Computing, ACM Trans. Graph. 27-3, 2008.
-
(2008)
ACM Trans. Graph.
, vol.27
, Issue.3
-
-
Seiler, L.1
Carmean, D.2
Sprangle, E.3
Forsyth, T.4
Abrash, M.5
Dubey, P.6
Junkins, S.7
Lake, A.8
Sugerman, J.9
Cavin, R.10
Espasa, R.11
Grochowski, E.12
Juan, T.13
Hanrahan, P.14
-
51
-
-
35348913704
-
Analysis of redundancy and application balance in the SPEC CPU2006 benchmark suite
-
A. Phansalkar, A. Joshi, and L. John, Analysis of Redundancy and Application Balance in the SPEC CPU2006 Benchmark Suite, ISCA, 2007.
-
(2007)
ISCA
-
-
Phansalkar, A.1
Joshi, A.2
John, L.3