



Volume 6067 LNCS, Issue PART 1, 2010, Pages 206-215

A flexible checkpoint/restart model in distributed systems

Author keywords

Checkpointing; Fault tolerance; Reliability modeling

Indexed keywords

CHECKPOINT/RESTART; CHECKPOINTING; COMPLETION TIME; COMPUTATIONAL RESOURCES; COMPUTING PLATFORM; COORDINATED CHECKPOINTING; DISTRIBUTED SYSTEMS; FAULT TOLERANCE MECHANISMS; GLOBAL CONSISTENT STATE; LARGE-SCALE APPLICATIONS; MATHEMATICAL ANALYSIS; NEW MODEL; PROCESS FAILURE; RANDOM FAILURES; RELIABILITY MODELING; RELIABILITY PROBLEMS; RELIABLE EXECUTION; SINGLE PROCESSORS; WEIBULL;

EID: 77955097389     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-14390-8_22     Document Type: Conference Paper
Times cited: 35

References (14)
  • 3
    • 67349271621
    • An analysis of clustered failures on large supercomputing systems
    • Hacker, T.J., Romero, F., Carothers, C.D.: An analysis of clustered failures on large supercomputing systems. J. Parallel Distrib. Comput. 69(7), 652-665 (2009)
    • (2009) J. Parallel Distrib. Comput. , vol.69 , Issue.7 , pp. 652-665
    • Hacker, T.J.1    Romero, F.2    Carothers, C.D.3
  • 4
    • 28044460018
    • A higher order estimate of the optimum checkpoint interval for restart dumps
    • Daly, J.T.: A higher order estimate of the optimum checkpoint interval for restart dumps. Future Generation Computer Systems 22(3), 303-312 (2006)
    • (2006) Future Generation Computer Systems , vol.22 , Issue.3 , pp. 303-312
    • Daly, J.T.1
  • 5
    • 9144223280
    • Checkpointing for peta-scale systems: A look into the future of practical rollback-recovery
    • Elnozahy, E.N., Plank, J.S.: Checkpointing for peta-scale systems: A look into the future of practical rollback-recovery. IEEE Trans. Dependable Secur. Comput. 1(2), 97-108 (2004)
    • (2004) IEEE Trans. Dependable Secur. Comput. , vol.1 , Issue.2 , pp. 97-108
    • Elnozahy, E.N.1    Plank, J.S.2
  • 8
    • 84976846528
    • A first order approximation to the optimum checkpoint interval
    • Young, J.W.: A first order approximation to the optimum checkpoint interval. ACM Commun. 17(9), 530-531 (1974)
    • (1974) ACM Commun. , vol.17 , Issue.9 , pp. 530-531
    • Young, J.W.1
  • 9
    • 0022020346
    • Distributed snapshots: Determining global states of distributed systems
    • Chandy, K.M., Lamport, L.: Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst. 3(1), 63-75 (1985)
    • (1985) ACM Trans. Comput. Syst. , vol.3 , Issue.1 , pp. 63-75
    • Chandy, K.M.1    Lamport, L.2
  • 11
    • 0000652719
    • Selection of a checkpoint interval in a critical-task environment
    • Geist, R., Reynolds, R., Westall, J.: Selection of a checkpoint interval in a critical-task environment. IEEE Transactions on Reliability 37, 395-400 (1988)
    • (1988) IEEE Transactions on Reliability , vol.37 , pp. 395-400
    • Geist, R.1    Reynolds, R.2    Westall, J.3
  • 12
    • 0032597646
    • The average availability of parallel checkpointing systems and its importance in selecting runtime parameters
    • Plank, J.S., Thomason, M.G.: The average availability of parallel checkpointing systems and its importance in selecting runtime parameters. In: 29th International Symposium on Fault-Tolerant Computing, pp. 250-259 (1999)
    • (1999) 29th International Symposium on Fault-Tolerant Computing , pp. 250-259
    • Plank, J.S.1    Thomason, M.G.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.