



Volume 2005, 2005

Transparent, incremental checkpointing at kernel level: A foundation for fault tolerance for parallel computers

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER ARCHITECTURE; COMPUTER OPERATING SYSTEMS; COMPUTER SOFTWARE; DIGITAL LIBRARIES; FAULT TOLERANT COMPUTER SYSTEMS; USER INTERFACES;

EID: 33845434226     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/SC.2005.76     Document Type: Conference Paper
Times cited: 122

References (35)
  • 1
    • 18844416337
    • Quadrics QsNet II: A network for supercomputing applications
    • Stanford University, California, August 18-20
    • D. Addison, J. Beecroft, D. Hewson, M. McLaren, and F. Petrini. Quadrics QsNet II: A Network for Supercomputing Applications. In Hot Chips 14, Stanford University, California, August 18-20, 2003.
    • (2003) Hot Chips , vol.14
    • Addison, D.1    Beecroft, J.2    Hewson, D.3    McLaren, M.4    Petrini, F.5
  • 4
    • 0032021963
    • The MOSIX multicomputer operating system for high performance cluster computing
    • March
    • A. Barak and O. La'adan. The MOSIX Multicomputer Operating System for High Performance Cluster Computing. Journal of Future Generation Computer Systems, 13(4-5):361-372, March 1998.
    • (1998) Journal of Future Generation Computer Systems , vol.13 , Issue.4-5 , pp. 361-372
    • Barak, A.1    La'adan, O.2
  • 6
    • 0036680081
    • Checkpointing of multithreaded programs
    • August
    • C. Carothers and B. Szymanski. Checkpointing of Multithreaded Programs. Dr. Dobbs Journal, 15(8), August 2002.
    • (2002) Dr. Dobbs Journal , vol.15 , Issue.8
    • Carothers, C.1    Szymanski, B.2
  • 13
    • 33645243002
    • BCS MPI: A new approach in the system software design for large-scale parallel computers
    • Phoenix, Arizona, November 10-16
    • Juan Fernández, Eitan Frachtenberg, and Fabrizio Petrini. BCS MPI: A New Approach in the System Software Design for Large-Scale Parallel Computers. In Proceedings of SC2003, Phoenix, Arizona, November 10-16, 2003.
    • (2003) Proceedings of SC2003
    • Fernández, J.1    Frachtenberg, E.2    Petrini, F.3
  • 16
    • 33845396625
    • Designing parallel operating systems via parallel programming
    • Pisa, Italy, August
    • Eitan Frachtenberg, Kei Davis, Fabrizio Petrini, Juan Fernández, and José Carlos Sancho. Designing Parallel Operating Systems via Parallel Programming. In Euro-Par 2004, Pisa, Italy, August 2004.
    • (2004) Euro-par 2004
    • Frachtenberg, E.1    Davis, K.2    Petrini, F.3    Fernández, J.4    Sancho, J.C.5
  • 19
    • 84867482607
    • E. Hendriks. VMADump. Available from http://cvs.sourceforge.net/viewcvs.py/bproc/vmadump.
    • VMADump
    • Hendriks, E.1
  • 22
    • 33845380230
    • Lightning Linux Cluster. Available from http://www.lanl.gov/worldview/news/releases/archive/03-107.shtml.
    • Lightning Linux Cluster
  • 23
    • 33746293114
    • User and kernel level checkpointing
    • Phoenix, Arizona, November 15-17
    • N. Meyer. User and Kernel Level Checkpointing. In Proceedings of the Sun Microsystems HPC Consortium Meeting, Phoenix, Arizona, November 15-17, 2003. Available from http://checkpointing.psnc.pl/Progress/sat_nmeyer.pdf.
    • (2003) Proceedings of the Sun Microsystems HPC Consortium Meeting
    • Meyer, N.1
  • 26
    • 33845436998
    • EPCKPT
    • E. Pinheiro. EPCKPT. Available from http://www.research.rutgers.edu/~edpin/epckpt.
    • Pinheiro, E.1
  • 32
    • 84934312471
    • Implementation and evaluation of a scalable application-level checkpoint-recovery scheme for MPI programs
    • Pittsburgh, PA, November 10-16
    • Martin Schulz, Greg Bronevetsky, Rohit Fernandes, Daniel Marques, Keshav Pingali, and Paul Stodghill. Implementation and Evaluation of a Scalable Application-level Checkpoint-Recovery Scheme for MPI Programs. In ACM/IEEE SC2004, Pittsburgh, PA, November 10-16, 2004.
    • (2004) ACM/IEEE SC2004
    • Schulz, M.1    Bronevetsky, G.2    Fernandes, R.3    Marques, D.4    Pingali, K.5    Stodghill, P.6
  • 34
    • 0003595929
    • The ASCI Sweep3D Benchmark. Available from http://www.llnl.gov/asci_benchmarks/asci/limited/sweep3d/.
    • The ASCI Sweep3D Benchmark
  • 35
    • 8344283205
    • Technical Report CUCS-014-01, Department of Computer Science, Columbia University, New York, November
    • H. Zhong and J. Nieh. CRAK: Linux Checkpoint/Restart as a Kernel Module. Technical Report CUCS-014-01, Department of Computer Science, Columbia University, New York, November 2001.
    • (2001) CRAK: Linux Checkpoint/Restart as a Kernel Module
    • Zhong, H.1    Nieh, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.