



Volume , Issue , 2006, Pages 14-23

Cooperative checkpointing: A robust approach to large-scale systems reliability

Author keywords

Cooperative checkpointing; High performance computing; Parallel computing; RAS; Simulations; Supercomputing

Indexed keywords

COMPUTER SIMULATION; COMPUTER SYSTEM RECOVERY; LARGE SCALE SYSTEMS; ROBUSTNESS (CONTROL SYSTEMS); SCHEDULING;

EID: 34547424386     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1183401.1183406     Document Type: Conference Paper
Times cited : (48)

References (24)
  • 1
    • 34547478654
    • N. Adiga and T. B. Team. An overview of the BlueGene/L supercomputer. In Supercomputing, Technical Papers, Nov. 2002.
  • 4
    • 0022012278
    • Discovering patterns in sequences of events
    • T. Dietterich and R. Michalski. Discovering patterns in sequences of events. In Artificial Intelligence, volume 25, pages 187-232, 1985.
    • (1985) Artificial Intelligence , vol.25 , pp. 187-232
    • Dietterich, T.1    Michalski, R.2
  • 6
    • 9144223280
    • Checkpointing for peta-scale systems: A look into the future of practical rollback-recovery
    • E. N. Elnozahy and J. S. Plank. Checkpointing for peta-scale systems: A look into the future of practical rollback-recovery. IEEE Trans. Dependable Secur. Comput., 1(2):97-108, 2004.
    • (2004) IEEE Trans. Dependable Secur. Comput , vol.1 , Issue.2 , pp. 97-108
    • Elnozahy, E.N.1    Plank, J.S.2
  • 18, 19
    • 34547444432, 34547405648
    • J. S. Plank and W. R. Elwasif. Experimental assessment of workstation failures and their impact on checkpointing systems. In Proceedings of the 28th Intl. Symposium on Fault-tolerant Computing, June 1998.
  • 20
    • 0035201417
    • Processor allocation and checkpoint interval selection in cluster computing systems
    • J. S. Plank and M. G. Thomason. Processor allocation and checkpoint interval selection in cluster computing systems. Journal of Parallel and Distributed Computing, 61(11):1570-1590, November 2001.
    • (2001) Journal of Parallel and Distributed Computing , vol.61 , Issue.11 , pp. 1570-1590
    • Plank, J.S.1    Thomason, M.G.2
  • 23
    • 84976696875
    • Performance analysis of checkpointing strategies
    • A. N. Tantawi and M. Ruschitzka. Performance analysis of checkpointing strategies. ACM Transactions on Computer Systems, 2(2):123-144, May 1984.
    • (1984) ACM Transactions on Computer Systems , vol.2 , Issue.2 , pp. 123-144
    • Tantawi, A.N.1    Ruschitzka, M.2
  • 24
    • 84976846528
    • A first order approximation to the optimum checkpoint interval
    • J. W. Young. A first order approximation to the optimum checkpoint interval. Commun. ACM, 17(9):530-531, 1974.
    • (1974) Commun. ACM , vol.17 , Issue.9 , pp. 530-531
    • Young, J.W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.