[1] C. Lomont, "Introduction to Intel Advanced Vector Extensions," in ASCI '11, pp. 132-137.
[2] E. Lindholm et al., "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro '08, vol. 28, no. 2, pp. 39-55.
[3] J. E. Smith et al., "Vector Instruction Set Support for Conditional Operations," in ISCA '00, pp. 260-269.
[4] G. Diamos et al., "SIMD Re-Convergence at Thread Frontiers," in MICRO '11, pp. 477-488.
[5] W. W. L. Fung et al., "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," in MICRO '07, pp. 407-420.
[6] H. Wu et al., "Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications," IJHPCA '12, vol. 26, no. 2, pp. 170-185.
[7] V. Narasiman et al., "Improving GPU Performance via Large Warps and Two-Level Warp Scheduling," in MICRO '11, pp. 308-317.
[8] W. W. L. Fung and T. M. Aamodt, "Thread Block Compaction for Efficient SIMT Control Flow," in HPCA '11, pp. 25-36.
[10] H. Esmaeilzadeh et al., "Neural Acceleration for General-Purpose Approximate Programs," in MICRO '12, pp. 449-460.
[11] S. Che et al., "A Performance Study of General-Purpose Applications on Graphics Processors Using CUDA," JPDC '08, vol. 68, no. 10, pp. 1370-1380.
[12] J. Cong et al., "Architecture Support for Accelerator-Rich CMPs," in DAC '12, pp. 843-849.
[13] J. Cong et al., "CHARM: A Composable Heterogeneous Accelerator-Rich Microprocessor," in ISLPED '12, pp. 379-384.
[14] H. Cho et al., "ERSA: Error Resilient System Architecture for Probabilistic Applications," TCAD '12, vol. 31, no. 4, pp. 546-558.
[15] S. Sidiroglou-Douskos et al., "Managing Performance vs. Accuracy Trade-Offs with Loop Perforation," in SIGSOFT/FSE '11, pp. 124-134.
[16] A. Sampson et al., "EnerJ: Approximate Data Types for Safe and General Low-Power Computation," in PLDI '11, pp. 164-174.
[17] M. Carbin, S. Misailovic, and M. C. Rinard, "Verifying Quantitative Reliability for Programs That Execute on Unreliable Hardware," in OOPSLA, 2013, pp. 33-52.
[18] D. Quinlan, "Rose: Compiler Support for Object-Oriented Frameworks," Parallel Processing Letters '00, vol. 10, pp. 215-226.
[19] D. E. Rumelhart et al., "Learning Representations by Back-Propagating Errors," Nature '86, vol. 323, no. 6088, pp. 533-536.
[20] K. Hornik, "Approximation Capabilities of Multilayer Feedforward Networks," Neural Networks '91, vol. 4, no. 2, pp. 251-257.
[21] C. Bienia et al., "The PARSEC Benchmark Suite: Characterization and Architectural Implications," in PACT '08, pp. 72-81.
[27] J. Sartori and R. Kumar, "Branch and Data Herding: Reducing Control and Memory Divergence for Error-Tolerant GPU Applications," IEEE TMM, vol. 15, no. 2, pp. 279-290, Feb. 2013.
[28] B. Grigorian and G. Reinman, "Improving Coverage and Reliability in Approximate Computing Using Application-Specific, Light-Weight Checks," in WACAS '14.
[29] J. Meng et al., "Dynamic Warp Subdivision for Integrated Branch and Memory Divergence Tolerance," in ISCA '10, pp. 235-246.
[30] T. Chen et al., "BenchNN: On the Broad Potential Application Scope of Hardware Neural Network Accelerators," in IISWC '12, pp. 36-45.