Volume 20, Issue 4, 2004, Pages 523-538

The development of an efficient checkpointing facility exploiting operating systems services of the GENESIS cluster operating system

Author keywords

Checkpointing; Cluster computing; Fault tolerance; Operating system services

Indexed keywords

COMPUTATION THEORY; COMPUTER NETWORKS; FAULT TOLERANT COMPUTER SYSTEMS; PARALLEL PROCESSING SYSTEMS;

EID: 1842536500     PISSN: 0167739X     EISSN: None     Source Type: Journal    
DOI: 10.1016/S0167-739X(03)00171-7     Document Type: Conference Paper
Times cited: 11

References (17)
  • 1
    • 0034317011
    • Towards an operating system managing parallelism of computers on clusters
    • Goscinski A.M., Towards an operating system managing parallelism of computers on clusters. Future Gener. Comput. Syst. 17:2000;293-314.
    • (2000) Future Gener. Comput. Syst. , vol.17 , pp. 293-314
    • Goscinski, A.M.1
  • 3
    • 0008802172
    • A cluster operating system supporting parallel computing
    • Goscinski A.M., Hobbs M.J., Silcock J., A cluster operating system supporting parallel computing. Cluster Comput. 4:2001;145-156.
    • (2001) Cluster Comput. , vol.4 , pp. 145-156
    • Goscinski, A.M.1    Hobbs, M.J.2    Silcock, J.3
  • 5
    • 0004096191
    • A survey of rollback-recovery protocols in message-passing systems
    • Carnegie Mellon University, a revision of CMU-CS-96-181, June
    • E.N. Elnozahy, L. Alvisi, Y.-M. Wang, D.B. Johnson, A survey of rollback-recovery protocols in message-passing systems, Technical Report CMU-CS-99-148, Carnegie Mellon University, a revision of CMU-CS-96-181, June 1999.
    • (1999) Technical Report , vol.CMU-CS-99-148
    • Elnozahy, E.N.1    Alvisi, L.2    Wang, Y.-M.3    Johnson, D.B.4
  • 6
    • 0033360051
    • Quasi-synchronous checkpointing: Models, characterization, and classification
    • Manivannan D., Singhal M., Quasi-synchronous checkpointing: models, characterization, and classification. IEEE Trans. Parallel Distrib. Syst. 10(7):1999;703-713.
    • (1999) IEEE Trans. Parallel Distrib. Syst. , vol.10 , Issue.7 , pp. 703-713
    • Manivannan, D.1    Singhal, M.2
  • 8
    • 0026867749
    • Manetho: Transparent rollback-recovery with low overhead, limited rollback, and fast output commit
    • Elnozahy E.N., Zwaenepoel W., Manetho: transparent rollback-recovery with low overhead, limited rollback, and fast output commit. IEEE Trans. Comput. 41(5):1992;526-531.
    • (1992) IEEE Trans. Comput. , vol.41 , Issue.5 , pp. 526-531
    • Elnozahy, E.N.1    Zwaenepoel, W.2
  • 10
    • 0003912256
    • Checkpoint and migration of UNIX processes in the Condor distributed processing system
    • University of Wisconsin-Madison, April
    • M. Litzkow, T. Tannenbaum, J. Basney, M. Livny, Checkpoint and migration of UNIX processes in the Condor distributed processing system, Technical Report 1346, University of Wisconsin-Madison, April 1997.
    • (1997) Technical Report , vol.1346
    • Litzkow, M.1    Tannenbaum, T.2    Basney, J.3    Livny, M.4
  • 15
    • 22644450661
    • Remote and concurrent process duplication for SPMD based parallel processing cows
    • Proceedings of the High-performance Computing and Networking (HPCN'99), Springer, Berlin
    • M. Hobbs, A. Goscinski, Remote and concurrent process duplication for SPMD based parallel processing cows, in: Proceedings of the High-performance Computing and Networking (HPCN'99), Lecture Notes in Computer Science 1593, Springer, Berlin, 1999, pp. 603-612.
    • (1999) Lecture Notes in Computer Science , vol.1593 , pp. 603-612
    • Hobbs, M.1    Goscinski, A.2
  • 16
    • 84944940986
    • A group communications facility for reliable computing on clusters
    • The International Society for Computers and Their Applications, Dallas, TX, USA
    • J. Rough, A.M. Goscinski, A group communications facility for reliable computing on clusters, in: Proceedings of the Parallel and Distributed Computing Systems (PDCS), The International Society for Computers and Their Applications, Dallas, TX, USA, 2001, pp. 19-24.
    • (2001) Proceedings of the Parallel and Distributed Computing Systems (PDCS) , pp. 19-24
    • Rough, J.1    Goscinski, A.M.2
  • 17
    • 0032686588
    • Finding, expressing and managing parallelism in programs executed on clusters of workstations
    • Goscinski A.M., Finding, expressing and managing parallelism in programs executed on clusters of workstations. Comput. Commun. 22:1999;998-1016.
    • (1999) Comput. Commun. , vol.22 , pp. 998-1016
    • Goscinski, A.M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.