



Volume 2005, 2005, Pages 320-327

A Checkpointing/Recovery system for MPI applications on cluster of IA-64 computers

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER PROGRAMMING; FAULT TOLERANT COMPUTER SYSTEMS; PARALLEL PROCESSING SYSTEMS; PERFORMANCE; RELIABILITY;

EID: 33745225196     PISSN: 15302016     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICPPW.2005.5     Document Type: Conference Paper
Times cited : (2)

References (19)
  • 1
    • 0027757472
    • MPI: A message passing interface
    • IEEE Computer Society Press, November
    • Message Passing Interface Forum. MPI: A Message Passing Interface. In Proc. of Supercomputing '93, pages 878-883. IEEE Computer Society Press, November 1993.
    • (1993) Proc. of Supercomputing '93 , pp. 878-883
  • 2
    • 0004096191
    • A survey of rollback recovery protocols in message-passing system
    • Pittsburgh, PA: CMU-CS-96-181. Carnegie Mellon University, Oct
    • Elnozahy E N, Johnson D B, Wang Y M. A Survey of Rollback Recovery Protocols in Message-Passing System. Technical Report. Pittsburgh, PA: CMU-CS-96-181. Carnegie Mellon University, Oct 1996.
    • (1996) Technical Report
    • Elnozahy, E.N.1    Johnson, D.B.2    Wang, Y.M.3
  • 6
    • 0002067202
    • Libckpt: Transparent checkpointing under Unix
    • New Orleans, LA, January
    • James S. Plank, Micah Beck, Gerry Kingsley, and Kai Li. Libckpt: Transparent Checkpointing under Unix. In Conference Proceedings, Usenix Winter 1995 Technical Conference, New Orleans, LA, January 1995, pp. 213-223.
    • (1995) Usenix Winter 1995 Technical Conference , pp. 213-223
    • Plank, J.S.1    Beck, M.2    Kingsley, G.3    Li, K.4
  • 8
    • 0022020346
    • Distributed snapshots: Determining global states of distributed systems
    • Chandy K M, Lamport L. Distributed snapshots: Determining global states of distributed systems. ACM Trans on Computer Systems. 1985, 3(1): 63-75.
    • (1985) ACM Trans on Computer Systems , vol.3 , Issue.1 , pp. 63-75
    • Chandy, K.M.1    Lamport, L.2
  • 10
    • 8344283205
    • CRAK: Linux checkpoint / restart as a kernel module
    • Department of Computer Science, Columbia University
    • H. Zhong and J. Nieh. CRAK: Linux checkpoint / restart as a kernel module. Technical Report CUCS-014-01, Department of Computer Science, Columbia University, 2001.
    • (2001) Technical Report , vol.CUCS-014-01
    • Zhong, H.1    Nieh, J.2
  • 17
    • 84940567900
    • Fault-tolerant MPI: Supporting dynamic applications in a dynamic world
    • Jack Dongarra, Peter Kacsuk, and Norbert Podhorszki, editors, Recent Advances in Parallel Virtual Machine and Message Passing Interface. 7th European PVM/MPI Users' Group Meeting
    • Graham Fagg and Jack Dongarra. Fault-tolerant MPI: Supporting dynamic applications in a dynamic world. In Jack Dongarra, Peter Kacsuk, and Norbert Podhorszki, editors, Recent Advances in Parallel Virtual Machine and Message Passing Interface, number 1908 in Springer Lecture Notes in Computer Science, pages 346-353, 2000. 7th European PVM/MPI Users' Group Meeting.
    • (2000) Springer Lecture Notes in Computer Science , vol.1908 , pp. 346-353
    • Fagg, G.1    Dongarra, J.2
  • 19
    • 0042625580
    • Monitors, messages, and clusters: The p4 parallel programming system
    • Argonne National Laboratory
    • Ralph Butler and Ewing Lusk. "Monitors, Messages, and Clusters: the p4 Parallel Programming System". Technical report, Argonne National Laboratory, 1993.
    • (1993) Technical Report
    • Butler, R.1    Lusk, E.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.