



Volume , Issue , 2007, Pages

The design and implementation of checkpoint/restart process fault tolerance for Open MPI

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER SOFTWARE; FAULT TOLERANCE; INTERFACES (COMPUTER); MESSAGE PASSING; OPEN SYSTEMS; SCALABILITY

EID: 34548789748     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPS.2007.370605     Document Type: Conference Paper
Times cited : (133)

References (22)
  • 2
    • 0032000230
    • Message logging: Pessimistic, optimistic, causal, and optimal
    • L. Alvisi and K. Marzullo. Message logging: Pessimistic, optimistic, causal, and optimal. IEEE Trans. Softw. Eng., 24(2): 149-159, 1998.
    • (1998) IEEE Trans. Softw. Eng , vol.24 , Issue.2 , pp. 149-159
    • Alvisi, L.1    Marzullo, K.2
  • 4
    • 0022020346
    • Distributed snapshots: Determining global states of distributed systems
    • K. M. Chandy and L. Lamport. Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst., 3(1):63-75, 1985.
    • (1985) ACM Trans. Comput. Syst , vol.3 , Issue.1 , pp. 63-75
    • Chandy, K.M.1    Lamport, L.2
  • 5
    • 12344277946
    • The design and implementation of Berkeley Lab's Linux checkpoint/restart
    • Technical Report LBNL-54941, Lawrence Berkeley National Lab, 2003
    • J. Duell, P. Hargrove, and E. Roman. The design and implementation of Berkeley Lab's Linux checkpoint/restart. Technical Report LBNL-54941, Lawrence Berkeley National Lab, 2003.
    • Duell, J.1    Hargrove, P.2    Roman, E.3
  • 6
    • 0042078549
    • A survey of rollback-recovery protocols in message-passing systems
    • E. N. M. Elnozahy, L. Alvisi, Y.-M. Wang, and D. B. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Comput. Surv., 34(3):375-408, 2002.
    • (2002) ACM Comput. Surv , vol.34 , Issue.3 , pp. 375-408
    • Elnozahy, E.N.M.1    Alvisi, L.2    Wang, Y.-M.3    Johnson, D.B.4
  • 11
    • 34548755483
    • A checkpoint and restart service specification for Open MPI
    • Technical Report TR635, Indiana University, Bloomington, Indiana, USA, July
    • J. Hursey, J. M. Squyres, and A. Lumsdaine. A checkpoint and restart service specification for Open MPI. Technical Report TR635, Indiana University, Bloomington, Indiana, USA, July 2006.
    • (2006)
    • Hursey, J.1    Squyres, J.M.2    Lumsdaine, A.3
  • 12
    • 0003912256
    • Checkpoint and migration of UNIX processes in the Condor distributed processing system
    • Technical Report CS-TR-199701346, University of Wisconsin, Madison
    • M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny. Checkpoint and migration of UNIX processes in the Condor distributed processing system. Technical Report CS-TR-199701346, University of Wisconsin, Madison, 1997.
    • (1997)
    • Litzkow, M.1    Tannenbaum, T.2    Basney, J.3    Livny, M.4
  • 13
  • 14
    • 85143038582
    • Message Passing Interface Forum. MPI: A Message Passing Interface. In Proc. of Supercomputing '93, pages 878-883. IEEE Computer Society Press, November 1993.
  • 16
    • 34548792745
    • J. S. Plank, M. Beck, G. Kingsley, and K. Li. Libckpt: Transparent checkpointing under Unix. Technical report, Knoxville, TN, USA, 1994.
  • 21
    • 0031124071
    • Consistent global checkpoints that contain a given set of local checkpoints
    • Y.-M. Wang. Consistent global checkpoints that contain a given set of local checkpoints. IEEE Trans. Comput., 46(4):456-468, 1997.
    • (1997) IEEE Trans. Comput , vol.46 , Issue.4 , pp. 456-468
    • Wang, Y.-M.1
  • 22
    • 33750234379
    • T. S. Woodall, G. M. Shipman, G. Bosilca, R. L. Graham, and A. B. Maccabe. High performance RDMA protocols in HPC. In Proceedings of EuroPVM-MPI 2006, volume 4192/2006 of Lecture Notes in Computer Science, pages 76-85. Springer Berlin/Heidelberg, September 2006.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.