2007, Pages 30-43

Modeling the impact of checkpoints on next-generation systems

Author keywords

[No Author keywords available]

Indexed keywords

ALTERNATIVE APPROACH; APPLICATION EXECUTION; APPLICATION PERFORMANCE; APPLICATION SCALABILITY; CHECKPOINT (CO); CHECKPOINT/RESTART; GENERATION SYSTEMS; LOWER BOUNDS; MASS STORAGE SYSTEMS; MASSIVELY PARALLEL PROCESSING; MATHEMATICAL MODELLING; NEW APPROACHES; OVERLAY NETWORKS (ON); PETAFLOP SYSTEMS; STORAGE SYSTEMS

EID: 47249142074     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/MSST.2007.4367962     Document Type: Conference Paper
Times cited: 83

References (39)
  • 2
    • 47249097978
    • An analysis of the consequences of a reduction in checkpoint latency for periodic checkpointing systems
    • Technical Report SAND2007-xxxx, Sandia National Laboratories
    • S. Arunagiri, S. Seelam, R. A. Oldfield, M. R. Varela, P. J. Teller, and R. Riesen. An analysis of the consequences of a reduction in checkpoint latency for periodic checkpointing systems. Technical Report SAND2007-xxxx, Sandia National Laboratories, 2007.
    • (2007)
    • Arunagiri, S.1    Seelam, S.2    Oldfield, R.A.3    Varela, M.R.4    Teller, P.J.5    Riesen, R.6
  • 6
    • 46049083336
    • The red storm computer architecture and its implementation
    • Salishan Lodge, Gleneden Beach, Oregon, April
    • W. J. Camp and J. L. Tomkins. The red storm computer architecture and its implementation. In The Conference on High-Speed Computing: LANL/LLNL/SNL, Salishan Lodge, Gleneden Beach, Oregon, April 2003.
    • (2003) The Conference on High-Speed Computing: LANL/LLNL/SNL
    • Camp, W.J.1    Tomkins, J.L.2
  • 7
    • 0029715009
    • Evaluation of checkpoint mechanisms for massively parallel machines
    • Sendai, Japan, June, IEEE Computer Society Press
    • T.-C. Chiueh and P. Deng. Evaluation of checkpoint mechanisms for massively parallel machines. In Proceedings of the Annual Symposium on Fault Tolerant Computing, pages 370-379, Sendai, Japan, June 1996. IEEE Computer Society Press.
    • (1996) Proceedings of the Annual Symposium on Fault Tolerant Computing , pp. 370-379
    • Chiueh, T.-C.1    Deng, P.2
  • 8
    • 28044438299
    • A model for predicting the optimum checkpoint interval for restart dumps
    • August
    • J. Daly. A model for predicting the optimum checkpoint interval for restart dumps. Lecture Notes in Computer Science, 2660:3-12, August 2003.
    • (2003) Lecture Notes in Computer Science , vol.2660 , pp. 3-12
    • Daly, J.1
  • 9
    • 29344435659
    • A strategy for running large scale applications based on a model that optimizes the checkpoint interval for restart dumps
    • Edinburgh, Scotland, UK, May
    • J. Daly. A strategy for running large scale applications based on a model that optimizes the checkpoint interval for restart dumps. In Proceedings of the 26th International Conference on Software Engineering, pages 70-74, Edinburgh, Scotland, UK, May 2004.
    • (2004) Proceedings of the 26th International Conference on Software Engineering , pp. 70-74
    • Daly, J.1
  • 10
    • 28044460018
    • A higher order estimate of the optimum checkpoint interval for restart dumps
    • J. Daly. A higher order estimate of the optimum checkpoint interval for restart dumps. Future Generation Computer Systems, 22:303-312, 2006.
    • (2006) Future Generation Computer Systems , vol.22 , pp. 303-312
    • Daly, J.1
  • 11
    • 0042078549
    • A survey of rollback-recovery protocols in message-passing systems
    • September
    • E. N. Elnozahy, L. Alvisi, Y. M. Wang, and D. B. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys, 34(3):375-408, September 2002.
    • (2002) ACM Computing Surveys , vol.34 , Issue.3 , pp. 375-408
    • Elnozahy, E.N.1    Alvisi, L.2    Wang, Y.M.3    Johnson, D.B.4
  • 13
    • 9144223280
    • Checkpointing for petascale systems: A look into the future of practical rollback-recovery
    • April-June
    • E. N. Elnozahy and J. S. Plank. Checkpointing for petascale systems: A look into the future of practical rollback-recovery. IEEE Transactions on Dependable and Secure Computing, 1(2):97-108, April-June 2004.
    • (2004) IEEE Transactions on Dependable and Secure Computing , vol.1 , Issue.2 , pp. 97-108
    • Elnozahy, E.N.1    Plank, J.S.2
  • 16
  • 25
    • 0030392072
    • Improving the performance of coordinated checkpointers on networks of workstations using RAID techniques
    • J. S. Plank. Improving the performance of coordinated checkpointers on networks of workstations using RAID techniques. In Proceedings of the Symposium on Reliable Distributed Systems, pages 76-85, 1996.
    • (1996) Proceedings of the Symposium on Reliable Distributed Systems , pp. 76-85
    • Plank, J.S.1
  • 26
    • 0031570636
    • Fault-tolerant matrix operations for networks of workstations using diskless checkpointing
    • June
    • J. S. Plank, Y. Kim, and J. J. Dongarra. Fault-tolerant matrix operations for networks of workstations using diskless checkpointing. Journal of Parallel and Distributed Computing, 43(2):125-138, June 1997.
    • (1997) Journal of Parallel and Distributed Computing , vol.43 , Issue.2 , pp. 125-138
    • Plank, J.S.1    Kim, Y.2    Dongarra, J.J.3
  • 30
    • 33845593340
    • A large-scale study of failures in high-performance computing systems
    • Philadelphia, PA, June, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
    • B. Schroeder and G. A. Gibson. A large-scale study of failures in high-performance computing systems. In Proceedings of the International Conference on Dependable Systems and Networks (DSN2006), Philadelphia, PA, June 2006. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
    • (2006) Proceedings of the International Conference on Dependable Systems and Networks (DSN2006)
    • Schroeder, B.1    Gibson, G.A.2
  • 31
    • 84864756973
    • An experimental study about diskless checkpointing
    • Vasteras, Sweden, August, IEEE Computer Society Press
    • L. M. Silva and J. G. Silva. An experimental study about diskless checkpointing. In Proceedings of the 24th EUROMICRO Conference, pages 395-402, Vasteras, Sweden, August 1998. IEEE Computer Society Press.
    • (1998) Proceedings of the 24th EUROMICRO Conference , pp. 395-402
    • Silva, L.M.1    Silva, J.G.2
  • 33
    • 47249146088
    • T. B. Team. An overview of the BlueGene/L supercomputer. In Proceedings of SC2002: High Performance Networking and Computing, Baltimore, MD, November 2002.
  • 34
    • 47249142897
    • A conservative path to petaflop computing: The Red Storm architecture scaled to a petaflop and beyond
    • October
    • J. Tomkins. A conservative path to petaflop computing: The Red Storm architecture scaled to a petaflop and beyond. 4th Annual Workshop on Linux Clusters for Supercomputing, October 2003.
    • (2003) 4th Annual Workshop on Linux Clusters for Supercomputing
    • Tomkins, J.1
  • 35
    • 84877699694
    • A case for two-level distributed recovery schemes
    • N. H. Vaidya. A case for two-level distributed recovery schemes. SIGMETRICS Perform. Eval. Rev., 23(1):64-73, 1995.
    • (1995) SIGMETRICS Perform. Eval. Rev , vol.23 , Issue.1 , pp. 64-73
    • Vaidya, N.H.1
  • 36
    • 0031388399
    • Impact of checkpoint latency on overhead ratio of a checkpointing scheme
    • N. H. Vaidya. Impact of checkpoint latency on overhead ratio of a checkpointing scheme. IEEE Transactions on Computers, 46(8):942-947, 1997.
    • (1997) IEEE Transactions on Computers , vol.46 , Issue.8 , pp. 942-947
    • Vaidya, N.H.1
  • 39
    • 84976846528
    • A first order approximation to the optimum checkpoint interval
    • J. W. Young. A first order approximation to the optimum checkpoint interval. Communications of the ACM, 17(9):530-531, 1974.
    • (1974) Communications of the ACM , vol.17 , Issue.9 , pp. 530-531
    • Young, J.W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.