2014, Pages 317-323

Accelerating divergent applications on SIMD architectures using neural networks

Author keywords

Approximate Computing; Branch Divergence; Hardware Acceleration; Neural Networks; SIMD

Indexed keywords

APPLICATION PROGRAMS; CODES (SYMBOLS); ENERGY CONSERVATION; NETWORK ARCHITECTURE;

EID: 84919678129     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICCD.2014.6974700     Document Type: Conference Paper
Times cited: 22

References (30)
  • 1
    • EID: 84919692085
    • C. Lomont, "Introduction to Intel Advanced Vector Extensions," in ASCI '11, pp. 132-137.
  • 2
    • EID: 44849137198
    • E. Lindholm et al., "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro '08, vol. 28, no. 2, pp. 39-55.
  • 3
    • EID: 0033727057
    • J. E. Smith et al., "Vector Instruction Set Support for Conditional Operations," in ISCA '00, pp. 260-269.
  • 4
    • EID: 84863351470
    • G. Diamos et al., "SIMD Re-Convergence at Thread Frontiers," in MICRO '11, pp. 477-488.
  • 5
    • EID: 47349104432
    • W. W. L. Fung et al., "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," in MICRO '07, pp. 407-420.
  • 6
    • EID: 84861808638
    • H. Wu et al., "Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications," IJHPCA '12, vol. 26, no. 2, pp. 170-185.
  • 7
    • EID: 84863342255
    • V. Narasiman et al., "Improving GPU Performance via Large Warps and Two-Level Warp Scheduling," in MICRO '11, pp. 308-317.
  • 8
    • EID: 79955923056
    • W. W. L. Fung and T. M. Aamodt, "Thread Block Compaction for Efficient SIMT Control Flow," in HPCA '11, pp. 25-36.
  • 10
    • EID: 84876591853
    • H. Esmaeilzadeh et al., "Neural Acceleration for General-Purpose Approximate Programs," in MICRO '12, pp. 449-460.
  • 11
    • EID: 51449118065
    • S. Che et al., "A Performance Study of General-Purpose Applications on Graphics Processors Using CUDA," JPDC '08, vol. 68, no. 10, pp. 1370-1380.
  • 12
    • EID: 84863543742
    • J. Cong et al., "Architecture Support for Accelerator-Rich CMPs," in DAC '12, pp. 843-849.
  • 13
    • EID: 84865554555
    • J. Cong et al., "CHARM: A Composable Heterogeneous Accelerator-Rich Microprocessor," in ISLPED '12, pp. 379-384.
  • 14
    • EID: 84859059850
    • H. Cho et al., "ERSA: Error Resilient System Architecture for Probabilistic Applications," TCAD '12, vol. 31, no. 4, pp. 546-558.
  • 15
    • EID: 80053213080
    • S. Sidiroglou-Douskos et al., "Managing Performance vs. Accuracy Trade-Offs with Loop Perforation," in SIGSOFT/FSE '11, pp. 124-134.
  • 16
    • EID: 79959878920
    • A. Sampson et al., "EnerJ: Approximate Data Types for Safe and General Low-Power Computation," in PLDI '11, pp. 164-174.
  • 17
    • EID: 84888167548
    • M. Carbin, S. Misailovic, and M. C. Rinard, "Verifying Quantitative Reliability for Programs That Execute on Unreliable Hardware," in OOPSLA, 2013, pp. 33-52.
  • 18
    • EID: 0034197208
    • D. Quinlan, "Rose: Compiler Support for Object-Oriented Frameworks," Parallel Processing Letters '00, vol. 10, pp. 215-226.
  • 19
    • EID: 0022471098
    • D. E. Rumelhart et al., "Learning Representations by Back-Propagating Errors," Nature '86, vol. 323, no. 6088, pp. 533-536.
  • 20
    • EID: 0025751820
    • K. Hornik, "Approximation Capabilities of Multilayer Feedforward Networks," Neural Networks '91, vol. 4, no. 2, pp. 251-257.
  • 21
    • EID: 63549095070
    • C. Bienia et al., "The PARSEC Benchmark Suite: Characterization and Architectural Implications," in PACT '08, pp. 72-81.
  • 27
    • EID: 84872693395
    • J. Sartori and R. Kumar, "Branch and Data Herding: Reducing Control and Memory Divergence for Error-Tolerant GPU Applications," IEEE TMM, vol. 15, no. 2, pp. 279-290, Feb 2013.
  • 28
    • EID: 84919692079
    • B. Grigorian and G. Reinman, "Improving Coverage and Reliability in Approximate Computing Using Application-Specific, Light-Weight Checks," in WACAS '14.
  • 29
    • EID: 77954976292
    • J. Meng et al., "Dynamic Warp Subdivision for Integrated Branch and Memory Divergence Tolerance," in ISCA '10, pp. 235-246.
  • 30
    • EID: 84873463816
    • T. Chen et al., "BenchNN: On the Broad Potential Application Scope of Hardware Neural Network Accelerators," in IISWC '12, pp. 36-45.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.