Volume 27, Issue 2, 2012, Pages 240-255

PartialRC: A partial recomputing method for efficient fault recovery on GPGPUs

Author keywords

Checkpointing; CUDA; Fault tolerance; GPGPU; Partial recomputing

Indexed keywords

CHECK POINTING; COMPUTING POWER; CUDA; DEFENSE TECHNOLOGIES; FAULT RECOVERY; FAULT-TOLERANT MECHANISM; GPGPU; HETEROGENEOUS COMPUTING SYSTEM; HIGH PERFORMANCE COMPUTING; RE-COMPUTING; RELIABILITY GUARANTEE;

EID: 84861597079     PISSN: 10009000     EISSN: None     Source Type: Journal    
DOI: 10.1007/s11390-012-1220-5     Document Type: Article
Times cited: 6

References (25)
  • 4
    • 84861622664
    • AMD. Brook+. http://developer.amd.com/gpu_assets/AMDBrookplus.pdf.
  • 5
    • 34547309668
    • NVIDIA Corporation. CUDA programming guide, 2008. http://www.nvidia.com/object/cuda_develop.html.
    • (2008) CUDA Programming Guide
  • 6
    • 70350583252
    • OpenMP to GPGPU: A compiler framework for automatic translation and optimization
    • April
    • Lee S, Min S J, Eigenmann R. OpenMP to GPGPU: A compiler framework for automatic translation and optimization. ACM SIGPLAN Notices, April 2009, 44(4): 101-110.
    • (2009) ACM SIGPLAN Notices , vol.44 , Issue.4 , pp. 101-110
    • Lee, S.1    Min, S.J.2    Eigenmann, R.3
  • 7
    • 84861645603
    • Top500 Supercomputer Site. http://www.top500.org/lists/2010/11.
  • 9
    • 51549113195
    • Comparison of accelerated DRAM soft error rates measured at component and system level
    • Phoenix, USA, April 27-May 1
    • Borucki L, Schindlbeck G, Slayman C. Comparison of accelerated DRAM soft error rates measured at component and system level. In Proc. the Int. Reliability Physics Symposium, Phoenix, USA, April 27-May 1, 2008, pp.482-487.
    • (2008) Proc. The Int. Reliability Physics Symposium , pp. 482-487
    • Borucki, L.1    Schindlbeck, G.2    Slayman, C.3
  • 14
    • 56349149338
    • A hardware redundancy and recovery mechanism for reliable scientific computation on graphics processors
    • San Diego, California, USA, August 4-5
    • Sheaffer J W, Luebke D P, Skadron K. A hardware redundancy and recovery mechanism for reliable scientific computation on graphics processors. In Proc. the 22nd ACM SIGGRAPH/EUROGRAPHICS Symposium on Graphics Hardware, San Diego, California, USA, August 4-5, 2007, pp.55-64.
    • (2007) Proc. The 22nd ACM SIGGRAPH/EUROGRAPHICS Symposium on Graphics Hardware , pp. 55-64
    • Sheaffer, J.W.1    Luebke, D.P.2    Skadron, K.3
  • 17
    • 34848824452
    • A survey of checkpoint/restart implementations
    • July
    • Roman E. A survey of checkpoint/restart implementations. Berkeley Lab Technical Report, July 2002, https://ftg.lbl.gov/assets/projects/CheckpointRestart/Pubs/checkpointSurvey-020724b.pdf.
    • (2002) Berkeley Lab Technical Report
    • Roman, E.1
  • 18
    • 84941514592
    • Rollback and recovery strategies for computer programs
    • June
    • Chandy K M, Ramamoorthy C V. Rollback and recovery strategies for computer programs. IEEE Transactions on Computers, June 1972, 21(6): 546-556.
    • (1972) IEEE Transactions on Computers , vol.21 , Issue.6 , pp. 546-556
    • Chandy, K.M.1    Ramamoorthy, C.V.2
  • 25
    • 67649645285
    • KTH Royal Institute of Technology, Stockholm, Sweden
    • Dubrova E. Fault-Tolerant Design: An Introduction. KTH Royal Institute of Technology, Stockholm, Sweden, 2008, http://web.it.kth.se/dubrova/draft.pdf.
    • (2008) Fault-Tolerant Design: An Introduction
    • Dubrova, E.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.