2009, Pages 237-245

CIFTS: A coordinated infrastructure for fault-tolerant systems

Author keywords

[No Author keywords available]

Indexed keywords

BACKPLANES; FAULT NOTIFICATION; FAULT-TOLERANCE CAPABILITY; FAULT-TOLERANT; FAULT-TOLERANT SYSTEMS; HIGH-END COMPUTING; HOLISTIC MANNER; INTERFACE SPECIFICATION; NON-INTRUSIVE; PERFORMANCE DEGRADATION; SOFTWARE COMPONENT; SOFTWARE PROGRAM; SOFTWARE STACKS; SYSTEM SOFTWARES

EID: 77951481809     PISSN: 01903918     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICPP.2009.20     Document Type: Conference Paper
Times cited: 48

References (30)
  • 1
    • 34548748822
    • Automatic path migration over infiniband: Early experiences
    • A. Vishnu, A.R. Mamidala, S. Narravula, and D.K. Panda. Automatic Path Migration over InfiniBand: Early Experiences. In IPDPS, 2007.
    • (2007) IPDPS
    • Vishnu, A.1    Mamidala, A.R.2    Narravula, S.3    Panda, D.K.4
  • 2
  • 4
    • 12344277946
    • The design and implementation of Berkeley Lab's Linux checkpoint/restart
    • J. Duell, P. Hargrove, and E. Roman. The Design and Implementation of Berkeley Lab's Linux Checkpoint/Restart. Technical Report LBNL-54941, 2002. Available at https://ftg.lbl.gov/CheckpointRestart/Pubs/blcr.pdf.
    • Technical Report LBNL-54941
    • Duell, J.1    Hargrove, P.2    Roman, E.3
  • 5
    • 34848824452
    • A survey of checkpoint/restart implementations
    • E. Roman. A Survey of Checkpoint/Restart Implementations. Technical Report LBNL-54942, Lawrence Berkeley National Laboratory, 2002. Available at https://ftg.lbl.gov/CheckpointRestart/CheckpointPapers.shtml.
    • (2002) Technical Report LBNL-54942
    • Roman, E.1
  • 6
    • 0032317368
    • System-level versus user-defined checkpointing
    • J.G. Silva and L.M. Silva. System-level versus user-defined checkpointing. In SRDS, 1998.
    • (1998) SRDS
    • Silva, J.G.1    Silva, L.M.2
  • 7
    • 34948863388
    • Migol: A fault-tolerant service framework for MPI applications in the grid
    • A. Luckow and B. Schnor. Migol: A Fault-Tolerant Service Framework for MPI Applications in the Grid. In Journal of Future Generation Computer Systems '08, volume 24, pages 142-152, 2008.
    • (2008) Journal of Future Generation Computer Systems '08 , vol.24 , pp. 142-152
    • Luckow, A.1    Schnor, B.2
  • 8
    • 34548789748
    • The design and implementation of checkpoint/restart process fault tolerance for open MPI
    • J. Hursey, J. Squyres, T. Mattox, and A. Lumsdaine. The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI. In IPDPS, 2007.
    • (2007) IPDPS
    • Hursey, J.1    Squyres, J.2    Mattox, T.3    Lumsdaine, A.4
  • 9
    • 34547940148
    • FEMPI: A lightweight fault-tolerant MPI for embedded cluster systems
    • R. Subramaniyan, V. Aggarwal, A. Jacobs, and A. George. FEMPI: A Lightweight Fault-Tolerant MPI for Embedded Cluster Systems. In ESA, 2006.
    • (2006) ESA
    • Subramaniyan, R.1    Aggarwal, V.2    Jacobs, A.3    George, A.4
  • 10
    • 34547424834
    • Application-transparent checkpoint/restart for MPI programs over infiniband
    • Q. Gao, W. Yu, W. Huang, and D.K. Panda. Application-Transparent Checkpoint/Restart for MPI Programs over InfiniBand. In ICPP, 2006.
    • (2006) ICPP
    • Gao, Q.1    Yu, W.2    Huang, W.3    Panda, D.K.4
  • 14
    • 34548046749
    • Proactive fault tolerance for HPC with Xen virtualization
    • A. Nagarajan, F. Mueller, C. Engelmann, and S.L. Scott. Proactive Fault Tolerance for HPC with Xen Virtualization. In ICS, 2007.
    • (2007) ICS
    • Nagarajan, A.1    Mueller, F.2    Engelmann, C.3    Scott, S.L.4
  • 15
    • 51049095700
    • Enhancing application robustness through adaptive fault tolerance
    • Z. Zheng, P. Gujrati, Z. Lan, and Y. Li. Enhancing Application Robustness through Adaptive Fault Tolerance. In IPDPS, 2008.
    • (2008) IPDPS
    • Zheng, Z.1    Gujrati, P.2    Lan, Z.3    Li, Y.4
  • 16
    • 67650091156
    • A tunable holistic resiliency approach for high-performance computing systems
    • S. Scott, C. Engelmann, G. Vallee, and T. Naughton et al. A tunable holistic resiliency approach for high-performance computing systems. In PPoPP, 2009.
    • (2009) PPoPP
    • Scott, S.1    Engelmann, C.2    Vallee, G.3    Naughton, T.4
  • 19
    • 0036534708
    • Implementing the JMS publish/subscribe API
    • P. Rousselle. Implementing the JMS publish/subscribe API. In Dr. Dobb's Journal, volume 27, pages 28-32, 2002.
    • (2002) Dr. Dobb's Journal , vol.27 , pp. 28-32
    • Rousselle, P.1
  • 23
  • 29
    • 41349108025
    • From pull-down data to protein interaction networks and complexes with biological relevance
    • B. Zhang, B.H. Park, T. Karpinets, and N. Samatova. From pull-down data to protein interaction networks and complexes with biological relevance. Journal of Bioinformatics, (24), 2008.
    • (2008) Journal of Bioinformatics , Issue.24
    • Zhang, B.1    Park, B.H.2    Karpinets, T.3    Samatova, N.4
  • 30
    • 77951469575
    • CIFTS website: http://www.mcs.anl.gov/research/cifts/.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.