Volume 18, 2004, Pages 2903-2910

System-level fault-tolerance in large-scale parallel machines with buffered coscheduling

Author keywords

Checkpointing; Communication protocols; Failure characterization; Fault tolerance; Large scale parallel computers; Operating systems

Indexed keywords

COMPUTER HARDWARE; COMPUTER OPERATING SYSTEMS; COMPUTER SYSTEM RECOVERY; DYNAMIC RANDOM ACCESS STORAGE; FAULT TOLERANT COMPUTER SYSTEMS; NETWORK PROTOCOLS; RELIABILITY; SCHEDULING; SUPERCOMPUTERS

EID: 12444268325     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 21

References (21)
  • 1
    • 18844416337
    • Quadrics QsNet II: A network for supercomputing applications
    • Stanford University, California, August 18-20
    • D. Addison, J. Beecroft, D. Hewson, M. McLaren, and F. Petrini. Quadrics QsNet II: A network for Supercomputing Applications. In Proceedings of the Hot Chips 14, Stanford University, California, August 18-20, 2003. Available from http://www.c3.lanl.gov/~fabrizio/talks/hot03.ppt.
    • (2003) Proceedings of the Hot Chips , vol.14
    • Addison, D.1    Beecroft, J.2    Hewson, D.3    McLaren, M.4    Petrini, F.5
  • 3
    • 12444289471
    • ASCI Blue Mountain. Available from http://www.lanl.gov/asci/bluemtn
  • 4
    • 12444321957
    • ASCI Q machine. Available from http://www.lanl.gov/asci/.
  • 6
    • 12444318552
    • CPLANT. Available from http://www.cs.sandia.gov/cplant/.
  • 8
    • 12444268956
    • Architectural considerations in delivering a balanced linux cluster
    • Gleneden Beach, Oregon, April 22-25
    • D. Doerfler. Architectural considerations in delivering a balanced linux cluster. In Proceedings from the Conference on High Speed Computing, Gleneden Beach, Oregon, April 22-25, 2002. Available from http://www.ccs.lanl.gov/salihaslan02/doerfle.pdf.
    • (2002) Proceedings from the Conference on High Speed Computing
    • Doerfler, D.1
  • 9
    • 0042078549
    • A survey of rollback-recovery protocols in message-passing systems
    • September
    • E. N. Elnozahy, L. Alvisi, D. B. Johnson, and Y. M. Wang. A Survey of Rollback-Recovery Protocols in Message-Passing Systems. ACM Computing Surveys, 34(3):375-408, September 2002. Available from ftp://ftp.cs.emu.edu/user/mootaz/papers/S.ps.
    • (2002) ACM Computing Surveys , vol.34 , Issue.3 , pp. 375-408
    • Elnozahy, E.N.1    Alvisi, L.2    Johnson, D.B.3    Wang, Y.M.4
  • 10
    • 33645243002
    • BCS MPI: A new approach in the system software design for large-scale parallel computers
    • Phoenix, Arizona, November 10-16
    • J. Fernández, E. Frachtenberg, and F. Petrini. BCS MPI: A New Approach in the System Software Design for Large-Scale Parallel Computers. In Proceedings of SC2003, Phoenix, Arizona, November 10-16, 2003. Available from http://www.c3.lanl.gov/~fabrizio/papers/sc03_bcs.pdf.
    • (2003) Proceedings of SC2003
    • Fernández, J.1    Frachtenberg, E.2    Petrini, F.3
  • 11
    • 12444327177
    • Challenges in developing scalable software for BlueGene/L
    • Pittsburgh, PA, May
    • Manish Gupta. Challenges in developing scalable software for BlueGene/L. In Scaling to New Heights Workshop, Pittsburgh, PA, May 2002.
    • (2002) Scaling to New Heights Workshop
    • Gupta, M.1
  • 17
    • 0036170241
    • The quadrics network (QsNet): High-performance clustering technology
    • January-February
    • F. Petrini, W. Feng, A. Hoisie, S. Coll, and E. Frachtenberg. The Quadrics Network (QsNet): High-Performance Clustering Technology. IEEE Micro, 22(1):46-57, January-February 2002. Available from http://www.c3.lanl.gov/~fabrizio/papers/ieeemicro.pdf.
    • (2002) IEEE Micro , vol.22 , Issue.1 , pp. 46-57
    • Petrini, F.1    Feng, W.2    Hoisie, A.3    Coll, S.4    Frachtenberg, E.5
  • 18


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.