Volume 5168 LNCS, 2008, Pages 58-67

Providing non-stop service for message-passing based parallel applications with RADIC

Author keywords

[No Author keywords available]

Indexed keywords

ERRORS; FAULT TOLERANCE; MAINTAINABILITY; MAINTENANCE; MESSAGE PASSING; QUALITY ASSURANCE; RELIABILITY; SUPERCOMPUTERS

EID: 51849121653     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-540-85451-7_7     Document Type: Conference Paper
Times cited: 6

References (9)
  • 1
    • 36148941068
    • Schroeder, B., Gibson, G.A.: Understanding failures in petascale computers. Journal of Physics: Conference Series 78, 012022, 11 (2007)
  • 3
    • 0042078549
    • Elnozahy, E.N.M., Alvisi, L., Wang, Y.M., Johnson, D.B.: A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys 34(3), 375-408 (2002)
  • 4
    • 51849162159
    • Gropp, W., Lusk, E., Skjellum, A.: Using MPI: Portable Parallel Programming with the Message-Passing Interface. MIT Press, Cambridge (1999); LCCN: QA76.642 G76 1999
  • 5
    • 49049087407
    • Jalote, P.: Reliable, Atomic and Causal Broadcast. In: Fault Tolerance in Distributed Systems, vol. 1, p. 142. P T R Prentice Hall, USA (1994)
  • 6
    • 33750255136
    • Duarte, A., Rexachs, D., Luque, E.: An intelligent management of fault tolerance in cluster using radicmpi. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) PVM/MPI 2006. LNCS, vol. 4192, pp. 150-157. Springer, Heidelberg (2006)
  • 8
    • 33751082401
    • Li, Y., Lan, Z.: Exploit failure prediction for adaptive fault-tolerance in cluster computing. In: Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID 2006), May 16-19, 2006, vol. 1, pp. 531-538 (2006)


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.