



2004, Pages 277-286

Adaptive incremental checkpointing for massively parallel systems

Author keywords

Fault Tolerance; Incremental Checkpoint; Large Scale Systems; Probabilistic Checkpoint

Indexed keywords

ADAPTIVE ALGORITHMS; BENCHMARKING; COMPUTER HARDWARE; COMPUTER OPERATING SYSTEMS; LARGE SCALE SYSTEMS; MATHEMATICAL MODELS; PROBABILISTIC LOGICS

EID: 8344232253     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1006209.1006248     Document Type: Conference Paper
Times cited: 132

References (32)
  • 1
    • 0031388399
    • Impact of checkpoint latency on overhead ratio of a checkpointing scheme
    • Aug.
    • N. H. Vaidya, "Impact of checkpoint latency on overhead ratio of a checkpointing scheme," IEEE Transactions on Computers, vol. 46, Aug. 1997.
    • (1997) IEEE Transactions on Computers , vol.46
    • Vaidya, N.H.1
  • 3
    • 0004097019
    • Compressed differences: An algorithm for fast incremental checkpointing
    • University of Tennessee at Knoxville, Aug.
    • J. S. Plank, J. Xu, and R. H. Netzer, "Compressed differences: An algorithm for fast incremental checkpointing," Tech. Rep. CS-95-302, University of Tennessee at Knoxville, Aug. 1995.
    • (1995) Tech. Rep. , vol.CS-95-302
    • Plank, J.S.1    Xu, J.2    Netzer, R.H.3
  • 5
    • 0003912256
    • Checkpoint and migration of UNIX processes in the Condor distributed processing system
    • University of Wisconsin - Madison Computer Sciences Department, April
    • M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny, "Checkpoint and migration of UNIX processes in the Condor distributed processing system," Tech. Rep. UW-CS-TR-1346, University of Wisconsin - Madison Computer Sciences Department, April 1997.
    • (1997) Tech. Rep. , vol.UW-CS-TR-1346
    • Litzkow, M.1    Tannenbaum, T.2    Basney, J.3    Livny, M.4
  • 8
    • 0004096191
    • A survey of rollback-recovery protocols in message passing systems
    • School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA, Oct.
    • M. Elnozahy, L. Alvisi, Y. Wang, and D. Johnson, "A survey of rollback-recovery protocols in message passing systems," Tech. Rep. CMU-CS-96-181, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA, Oct. 1996.
    • (1996) Tech. Rep. , vol.CMU-CS-96-181
    • Elnozahy, M.1    Alvisi, L.2    Wang, Y.3    Johnson, D.4
  • 10
    • 0031570635
    • Application level fault tolerance in heterogeneous networks of workstations
    • May
    • A. Beguelin, E. Seligman, and P. Stephan, "Application level fault tolerance in heterogeneous networks of workstations," Journal of Parallel and Distributed Computing, vol. 43, pp. 147-155, May 1997.
    • (1997) Journal of Parallel and Distributed Computing , vol.43 , pp. 147-155
    • Beguelin, A.1    Seligman, E.2    Stephan, P.3
  • 11
    • 8344260303
    • Quasi-asynchronous migration: A novel migration protocol for PVM tasks
    • Apr.
    • D. Pei, D. Wang, and Y. Zhang, "Quasi-asynchronous migration: A novel migration protocol for PVM tasks," ACM Operating Systems Review, vol. 33, Apr. 1999.
    • (1999) ACM Operating Systems Review , vol.33
    • Pei, D.1    Wang, D.2    Zhang, Y.3
  • 16
    • 0002991145
    • ickp: A consistent checkpointer for multicomputers
    • June
    • J. S. Plank and K. Li, "ickp: A consistent checkpointer for multicomputers," IEEE Parallel and Distributed Technologies, vol. 2, pp. 62-67, June 1994.
    • (1994) IEEE Parallel and Distributed Technologies , vol.2 , pp. 62-67
    • Plank, J.S.1    Li, K.2
  • 20
    • 8344283205
    • CRAK: Linux checkpoint/restart as a kernel module
    • Department of Computer Science, Columbia University, Nov.
    • H. Zhong and J. Nieh, "CRAK: Linux checkpoint/restart as a kernel module," Tech. Rep. CUCS-014-01, Department of Computer Science, Columbia University, Nov. 2001.
    • (2001) Tech. Rep. , vol.CUCS-014-01
    • Zhong, H.1    Nieh, J.2
  • 29
    • 11144287593
    • An overview of the BlueGene/L Supercomputer
    • Nov.
    • N. Adiga et al., "An overview of the BlueGene/L Supercomputer," in Proceedings of Supercomputing, Nov. 2002.
    • (2002) Proceedings of Supercomputing
    • Adiga, N.1
  • 32
    • 84862455395
    • "ASCI blue benchmarks." http://www.llnl.gov/asci_benchmarks/asci/asci_code.list.html
    • ASCI Blue Benchmarks


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.