Volume 05-09-December-2015, 2015, Pages 482-493

Neural acceleration for GPU throughput processors

Author keywords

approximate computing; GPU; neural processing unit

Indexed keywords

ACCELERATION; BENCHMARKING; COMPUTER ARCHITECTURE; COMPUTER GRAPHICS; EMBEDDED SYSTEMS; IMAGE CODING; PROGRAM PROCESSORS; QUALITY CONTROL;

EID: 84959896262     PISSN: 10724451     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2830772.2830810     Document Type: Conference Paper
Times cited : (104)

References (59)
  • 5
    • 85026956356
    • "GeForce 400 series." http://en.wikipedia.org, 2015.
    • (2015) GeForce 400 Series
  • 7
    • 84897771889
    • Paraprox: Pattern-based approximation for data parallel applications
    • M. Samadi, D. A. Jamshidi, J. Lee, and S. Mahlke, "Paraprox: Pattern-based approximation for data parallel applications," in ASPLOS, 2014.
    • (2014) ASPLOS
    • Samadi, M.1    Jamshidi, D.A.2    Lee, J.3    Mahlke, S.4
  • 8
    • 84905460431
    • Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization
    • J.-M. Arnau, J.-M. Parcerisa, and P. Xekalakis, "Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization," ISCA, 2014.
    • (2014) ISCA
    • Arnau, J.-M.1    Parcerisa, J.-M.2    Xekalakis, P.3
  • 9
    • 84872693395
    • Branch and data herding: Reducing control and memory divergence for error-tolerant GPU applications
    • J. Sartori and R. Kumar, "Branch and data herding: Reducing control and memory divergence for error-tolerant GPU applications," Multimedia, IEEE Transactions on, 2013.
    • (2013) Multimedia, IEEE Transactions on
    • Sartori, J.1    Kumar, R.2
  • 10
    • 84876591853
    • Neural acceleration for general-purpose approximate programs
    • H. Esmaeilzadeh, A. Sampson, L. Ceze, and D. Burger, "Neural acceleration for general-purpose approximate programs," in MICRO, 2012.
    • (2012) MICRO
    • Esmaeilzadeh, H.1    Sampson, A.2    Ceze, L.3    Burger, D.4
  • 12
    • 84934325706
    • BRAINIAC: Bringing reliable accuracy into neurally-implemented approximate computing
    • B. Grigorian, N. Farahpour, and G. Reinman, "BRAINIAC: Bringing reliable accuracy into neurally-implemented approximate computing," in HPCA, 2015.
    • (2015) HPCA
    • Grigorian, B.1    Farahpour, N.2    Reinman, G.3
  • 14
    • 84926041511
    • EMEURO: A framework for generating multi-purpose accelerators via deep learning
    • L. McAfee and K. Olukotun, "EMEURO: A framework for generating multi-purpose accelerators via deep learning," in CGO, 2015.
    • (2015) CGO
    • McAfee, L.1    Olukotun, K.2
  • 15
    • 84919678129
    • Accelerating divergent applications on SIMD architectures using neural networks
    • B. Grigorian and G. Reinman, "Accelerating divergent applications on SIMD architectures using neural networks," in ICCD, 2014.
    • (2014) ICCD
    • Grigorian, B.1    Reinman, G.2
  • 17
    • 84888167548
    • Verifying quantitative reliability for programs that execute on unreliable hardware
    • M. Carbin, S. Misailovic, and M. C. Rinard, "Verifying quantitative reliability for programs that execute on unreliable hardware," in OOPSLA, 2013.
    • (2013) OOPSLA
    • Carbin, M.1    Misailovic, S.2    Rinard, M.C.3
  • 18
    • 84960395601
    • Flexjava: Language support for safe and modular approximate programming
    • J. Park, H. Esmaeilzadeh, X. Zhang, M. Naik, and W. Harris, "Flexjava: Language support for safe and modular approximate programming," in FSE, 2015.
    • (2015) FSE
    • Park, J.1    Esmaeilzadeh, H.2    Zhang, X.3    Naik, M.4    Harris, W.5
  • 20
    • 84987170701
    • An efficient way to find the side effects of procedure calls and the aliases of variables
    • J. P. Banning, "An efficient way to find the side effects of procedure calls and the aliases of variables," in POPL, 1979.
    • (1979) POPL
    • Banning, J.P.1
  • 21
    • 0000646059
    • Learning internal representations by error propagation
    • D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," in PDP, 1986.
    • (1986) PDP
    • Rumelhart, D.E.1    Hinton, G.E.2    Williams, R.J.3
  • 26
    • 82555191201
    • Inverse kinematics solution for robotic manipulators using a CUDA-based parallel genetic algorithm
    • O. A. Aguilar and J. C. Huegel, "Inverse kinematics solution for robotic manipulators using a CUDA-based parallel genetic algorithm," AAI, 2011.
    • (2011) AAI
    • Aguilar, O.A.1    Huegel, J.C.2
  • 27
    • 85026960058
    • A high performance implementation of likelihood estimators on GPUs
    • M. Creel and M. Zubair, "A high performance implementation of likelihood estimators on GPUs," in CES, 2013.
    • (2013) CES
    • Creel, M.1    Zubair, M.2
  • 28
    • 84858790858
    • Architecture support for disciplined approximate programming
    • H. Esmaeilzadeh, A. Sampson, L. Ceze, and D. Burger, "Architecture support for disciplined approximate programming," in ASPLOS, 2012.
    • (2012) ASPLOS
    • Esmaeilzadeh, H.1    Sampson, A.2    Ceze, L.3    Burger, D.4
  • 29
    • 77954707631
    • Green: A framework for supporting energy-conscious programming using controlled approximation
    • W. Baek and T. M. Chilimbi, "Green: A framework for supporting energy-conscious programming using controlled approximation," in PLDI, 2010.
    • (2010) PLDI
    • Baek, W.1    Chilimbi, T.M.2
  • 31
    • 70349169075
    • Analyzing CUDA workloads using a detailed GPU simulator
    • A. Bakhoda, G. Yuan, W. Fung, H. Wong, and T. Aamodt, "Analyzing CUDA workloads using a detailed GPU simulator," in ISPASS, 2009.
    • (2009) ISPASS
    • Bakhoda, A.1    Yuan, G.2    Fung, W.3    Wong, H.4    Aamodt, T.5
  • 33
    • 76749146060
    • McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures
    • S. Li, J. H. Ahn, R. D. Strong, J. B. Brockman, D. M. Tullsen, and N. P. Jouppi, "McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures," in MICRO, 2009.
    • (2009) MICRO
    • Li, S.1    Ahn, J.H.2    Strong, R.D.3    Brockman, J.B.4    Tullsen, D.M.5    Jouppi, N.P.6
  • 34
    • 47349084021
    • Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0
    • N. Muralimanohar, R. Balasubramonian, and N. Jouppi, "Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0," in MICRO, 2007.
    • (2007) MICRO
    • Muralimanohar, N.1    Balasubramonian, R.2    Jouppi, N.3
  • 38
    • 79959885067
    • Flikker: Saving refresh-power in mobile devices through critical data partitioning
    • S. Liu, K. Pattabiraman, T. Moscibroda, and B. G. Zorn, "Flikker: Saving refresh-power in mobile devices through critical data partitioning," in ASPLOS, 2011.
    • (2011) ASPLOS
    • Liu, S.1    Pattabiraman, K.2    Moscibroda, T.3    Zorn, B.G.4
  • 41
    • 77953110390
    • ERSA: Error resilient system architecture for probabilistic applications
    • L. Leem, H. Cho, J. Bau, Q. A. Jacobson, and S. Mitra, "ERSA: Error resilient system architecture for probabilistic applications," in DATE, 2010.
    • (2010) DATE
    • Leem, L.1    Cho, H.2    Bau, J.3    Jacobson, Q.A.4    Mitra, S.5
  • 43
    • 78650166825
    • Patterns and statistical analysis for understanding reduced resource computing
    • M. Rinard, H. Hoffmann, S. Misailovic, and S. Sidiroglou, "Patterns and statistical analysis for understanding reduced resource computing," in Onward!, 2010.
    • (2010) Onward!
    • Rinard, M.1    Hoffmann, H.2    Misailovic, S.3    Sidiroglou, S.4
  • 45
    • 85008028657
    • Fuzzy memoization for floating-point multimedia applications
    • C. Alvarez, J. Corbal, and M. Valero, "Fuzzy memoization for floating-point multimedia applications," IEEE Trans. Comput., 2005.
    • (2005) IEEE Trans. Comput.
    • Alvarez, C.1    Corbal, J.2    Valero, M.3
  • 46
    • 77954968857
    • Relax: An architectural framework for software recovery of hardware faults
    • M. de Kruijf, S. Nomura, and K. Sankaralingam, "Relax: An architectural framework for software recovery of hardware faults," in ISCA, 2010.
    • (2010) ISCA
    • De Kruijf, M.1    Nomura, S.2    Sankaralingam, K.3
  • 47
    • 34547697289
    • Application-level correctness and its impact on fault tolerance
    • X. Li and D. Yeung, "Application-level correctness and its impact on fault tolerance," in HPCA, 2007.
    • (2007) HPCA
    • Li, X.1    Yeung, D.2
  • 48
    • 70350059816
    • Exploiting application-level correctness for low-cost fault tolerance
    • X. Li and D. Yeung, "Exploiting application-level correctness for low-cost fault tolerance," J. Instruction-Level Parallelism, 2008.
    • (2008) J. Instruction-Level Parallelism
    • Li, X.1    Yeung, D.2
  • 49
    • 79959860111
    • Exploring the synergy of emerging workloads and silicon reliability trends
    • M. de Kruijf and K. Sankaralingam, "Exploring the synergy of emerging workloads and silicon reliability trends," in SELSE, 2009.
    • (2009) SELSE
    • De Kruijf, M.1    Sankaralingam, K.2
  • 50
    • 84862943500
    • A fault criticality evaluation framework of digital systems for error tolerant video applications
    • Y. Fang, H. Li, and X. Li, "A fault criticality evaluation framework of digital systems for error tolerant video applications," in ATS, 2011.
    • (2011) ATS
    • Fang, Y.1    Li, H.2    Li, X.3
  • 54
    • 84893368533
    • Approximate logic synthesis under general error magnitude and frequency constraints
    • J. Miao, A. Gerstlauer, and M. Orshansky, "Approximate logic synthesis under general error magnitude and frequency constraints," in ICCAD, 2013.
    • (2013) ICCAD
    • Miao, J.1    Gerstlauer, A.2    Orshansky, M.3
  • 55
    • 84903831997
    • ABACUS: A technique for automated behavioral synthesis of approximate computing circuits
    • K. Nepal, Y. Li, R. I. Bahar, and S. Reda, "ABACUS: A technique for automated behavioral synthesis of approximate computing circuits," in DATE, 2014.
    • (2014) DATE
    • Nepal, K.1    Li, Y.2    Bahar, R.I.3    Reda, S.4
  • 57
    • 84862690555
    • Algorithmic methodologies for ultra-efficient inexact architectures for sustaining technology scaling
    • A. Lingamneni, K. K. Muntimadugu, C. Enz, R. M. Karp, K. V. Palem, and C. Piguet, "Algorithmic methodologies for ultra-efficient inexact architectures for sustaining technology scaling," in CF, 2012.
    • (2012) CF
    • Lingamneni, A.1    Muntimadugu, K.K.2    Enz, C.3    Karp, R.M.4    Palem, K.V.5    Piguet, C.6
  • 58
    • 84881175680
    • Continuous real-world inputs can open up alternative accelerator designs
    • B. Belhadj, A. Joubert, Z. Li, R. Heliot, and O. Temam, "Continuous real-world inputs can open up alternative accelerator designs," in ISCA, 2013.
    • (2013) ISCA
    • Belhadj, B.1    Joubert, A.2    Li, Z.3    Heliot, R.4    Temam, O.5
  • 59
    • 84897884384
    • Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators
    • Z. Du, A. Lingamneni, Y. Chen, K. Palem, O. Temam, and C. Wu, "Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators," in ASP-DAC, 2014.
    • (2014) ASP-DAC
    • Du, Z.1    Lingamneni, A.2    Chen, Y.3    Palem, K.4    Temam, O.5    Wu, C.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.