SCOPUS Information Search Platform

2014 32nd IEEE International Conference on Computer Design, ICCD 2014

Volume , Issue , 2014, Pages 317-323

Accelerating divergent applications on SIMD architectures using neural networks

(2) Grigorian, Beayna (a); Reinman, Glenn (a)

(a) UNIVERSITY OF CALIFORNIA (United States)

Author keywords

Approximate Computing; Branch Divergence; Hardware Acceleration; Neural Networks; SIMD

Indexed keywords

APPLICATION PROGRAMS; CODES (SYMBOLS); ENERGY CONSERVATION; NETWORK ARCHITECTURE;

APPROXIMATE COMPUTING; BRANCH DIVERGENCE; HARDWARE ACCELERATION; NETWORK-BASED SOLUTIONS; NEURAL NETWORKS (NNS); PERFORMANCE DEGRADATION; SIMD; SINGLE-INSTRUCTION MULTIPLE-DATA ARCHITECTURES;

NEURAL NETWORKS;

EID: 84919678129
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ICCD.2014.6974700
Document Type: Conference Paper

Times cited : (22)

References (30)

1
- 84919692085
- Introduction to Intel advanced vector extensions
- C. Lomont, "Introduction to Intel Advanced Vector Extensions," in ASCI '11, pp. 132-137.
- ASCI '11 , pp. 132-137
- Lomont, C.¹

2
- 44849137198
- NVIDIA Tesla: A unified graphics and computing architecture
- E. Lindholm et al., "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro '08, vol. 28, no. 2, pp. 39-55.
- IEEE Micro '08 , vol.28 , Issue.2 , pp. 39-55
- Lindholm, E.¹

3
- 0033727057
- Vector instruction set support for conditional operations
- J. E. Smith et al., "Vector Instruction Set Support for Conditional Operations," in ISCA '00, pp. 260-269.
- ISCA '00 , pp. 260-269
- Smith, J.E.¹

4
- 84863351470
- SIMD re-convergence at thread frontiers
- G. Diamos et al., "SIMD Re-Convergence at Thread Frontiers," in MICRO '11, pp. 477-488.
- MICRO '11 , pp. 477-488
- Diamos, G.¹

5
- 47349104432
- Dynamic warp formation and scheduling for efficient GPU control flow
- W. W. L. Fung et al., "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," in MICRO '07, pp. 407-420.
- MICRO '07 , pp. 407-420
- Fung, W.W.L.¹

6
- 84861808638
- Characterization and transformation of unstructured control flow in bulk synchronous GPU applications
- H. Wu et al., "Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications," IJHPCA '12, vol. 26, no. 2, pp. 170-185.
- IJHPCA '12 , vol.26 , Issue.2 , pp. 170-185
- Wu, H.¹

7
- 84863342255
- Improving GPU performance via large warps and two-level warp scheduling
- V. Narasiman et al., "Improving GPU Performance via Large Warps and Two-Level Warp Scheduling," in MICRO '11, pp. 308-317.
- MICRO '11 , pp. 308-317
- Narasiman, V.¹

8
- 79955923056
- Thread block compaction for efficient SIMT control flow
- W. W. L. Fung and T. M. Aamodt, "Thread Block Compaction for Efficient SIMT Control Flow," in HPCA '11, pp. 25-36.
- HPCA '11 , pp. 25-36
- Fung, W.W.L.¹ Aamodt, T.M.²

9
- 0003413187
- 2nd ed. Prentice Hall PTR
- S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice Hall PTR, 1998.
- (1998) Neural Networks: A Comprehensive Foundation
- Haykin, S.¹

10
- 84876591853
- Neural acceleration for general-purpose approximate programs
- H. Esmaeilzadeh et al., "Neural Acceleration for General-Purpose Approximate Programs," in MICRO '12, pp. 449-460.
- MICRO '12 , pp. 449-460
- Esmaeilzadeh, H.¹

11
- 51449118065
- A performance study of general-purpose applications on graphics processors using CUDA
- S. Che et al., "A Performance Study of General-Purpose Applications on Graphics Processors Using CUDA," JPDC '08, vol. 68, no. 10, pp. 1370-1380.
- JPDC '08 , vol.68 , Issue.10 , pp. 1370-1380
- Che, S.¹

12
- 84863543742
- Architecture support for accelerator-rich CMPs
- J. Cong et al., "Architecture Support for Accelerator-Rich CMPs," in DAC '12, pp. 843-849.
- DAC '12 , pp. 843-849
- Cong, J.¹

13
- 84865554555
- CHARM: A composable heterogeneous accelerator-rich microprocessor
- J. Cong et al., "CHARM: A Composable Heterogeneous Accelerator-Rich Microprocessor," in ISLPED '12, pp. 379-384.
- ISLPED '12 , pp. 379-384
- Cong, J.¹

14
- 84859059850
- ERSA: Error resilient system architecture for probabilistic applications
- H. Cho et al., "ERSA: Error Resilient System Architecture for Probabilistic Applications," TCAD '12, vol. 31, no. 4, pp. 546-558.
- TCAD '12 , vol.31 , Issue.4 , pp. 546-558
- Cho, H.¹

15
- 80053213080
- Managing performance vs. accuracy trade-offs with loop perforation
- S. Sidiroglou-Douskos et al., "Managing Performance vs. Accuracy Trade-Offs with Loop Perforation," in SIGSOFT/FSE '11, pp. 124-134.
- SIGSOFT/FSE '11 , pp. 124-134
- Sidiroglou-Douskos, S.¹

16
- 79959878920
- EnerJ: Approximate data types for safe and general low-power computation
- A. Sampson et al., "EnerJ: Approximate Data Types for Safe and General Low-Power Computation," in PLDI '11, pp. 164-174.
- PLDI '11 , pp. 164-174
- Sampson, A.¹

17
- 84888167548
- Verifying quantitative reliability for programs that execute on unreliable hardware
- M. Carbin, S. Misailovic, and M. C. Rinard, "Verifying Quantitative Reliability for Programs That Execute on Unreliable Hardware," in OOPSLA, 2013, pp. 33-52.
- (2013) OOPSLA , pp. 33-52
- Carbin, M.¹ Misailovic, S.² Rinard, M.C.³

18
- 0034197208
- Rose: Compiler support for object-oriented frameworks
- D. Quinlan, "Rose: Compiler support for object-oriented frameworks," Parallel Processing Letters '00, vol. 10, pp. 215-226.
- Parallel Processing Letters '00 , vol.10 , pp. 215-226
- Quinlan, D.¹

19
- 0022471098
- Learning representations by back-propagating errors
- D. E. Rumelhart et al., "Learning Representations by Back-Propagating Errors," Nature '86, vol. 323, no. 6088, pp. 533-536.
- Nature '86 , vol.323 , Issue.6088 , pp. 533-536
- Rumelhart, D.E.¹

20
- 0025751820
- Approximation capabilities of multilayer feedforward networks
- K. Hornik, "Approximation Capabilities of Multilayer Feedforward Networks," Neural Networks '91, vol. 4, no. 2, pp. 251-257.
- Neural Networks '91 , vol.4 , Issue.2 , pp. 251-257
- Hornik, K.¹

21
- 63549095070
- The PARSEC benchmark suite: Characterization and architectural implications
- C. Bienia et al., "The PARSEC Benchmark Suite: Characterization and Architectural Implications," in PACT '08, pp. 72-81.
- PACT '08 , pp. 72-81
- Bienia, C.¹

22
- 0003602606
- Academic Press
- J. M. Ortega et al., Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, 1970.
- (1970) Iterative Solution of Nonlinear Equations in Several Variables.
- Ortega, J.M.¹

23
- 77957836774
- Addison-Wesley Pro.
- J. Sanders and E. Kandrot, CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley Pro., 2010.
- (2010) CUDA by Example: An Introduction to General-Purpose GPU Programming.
- Sanders, J.¹ Kandrot, E.²

24
- 33745858913
- DIKU, Report
- S. Nissen, "Implementation of a Fast Artificial Neural Network Library (FANN)," DIKU, Report, 2003.
- (2003) Implementation of A Fast Artificial Neural Network Library (FANN)
- Nissen, S.¹

25
- 84919692082
- "CUDA 5.5 Production Release," http://developer.nvidia.com/cuda-downloads, Nvidia.
- CUDA 5.5 Production Release

26
- 84919692081
- "Kill A Watt," http://www.p3international.com/products/p4400.html, P3 International.
- P3 International

27
- 84872693395
- Branch and data herding: Reducing control and memory divergence for error-tolerant GPU applications
- Feb
- J. Sartori and R. Kumar, "Branch and Data Herding: Reducing Control and Memory Divergence for Error-Tolerant GPU Applications," IEEE TMM, vol. 15, no. 2, pp. 279-290, Feb 2013.
- (2013) IEEE TMM , vol.15 , Issue.2 , pp. 279-290
- Sartori, J.¹ Kumar, R.²

28
- 84919692079
- Improving coverage and reliability in approximate computing using application-specific, light-weight checks
- B. Grigorian and G. Reinman, "Improving Coverage and Reliability in Approximate Computing Using Application-Specific, Light-Weight Checks," in WACAS '14.
- WACAS '14
- Grigorian, B.¹ Reinman, G.²

29
- 77954976292
- Dynamic warp subdivision for integrated branch and memory divergence tolerance
- J. Meng et al., "Dynamic Warp Subdivision for Integrated Branch and Memory Divergence Tolerance," in ISCA '10, pp. 235-246.
- ISCA '10 , pp. 235-246
- Meng, J.¹

30
- 84873463816
- BenchNN: On the broad potential application scope of hardware neural network accelerators
- T. Chen et al., "BenchNN: On the Broad Potential Application Scope of Hardware Neural Network Accelerators," in IISWC '12, pp. 36-45.
- IISWC '12 , pp. 36-45
- Chen, T.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.