Volume 6, Issue 1, 2009, Pages 32-44

Flexible rollback recovery in dynamic heterogeneous grid computing

Author keywords

Checkpointing; Event logging; Grid computing; Rollback recovery

Indexed keywords

DATA FLOW ANALYSIS; FAULT TOLERANCE; RECOVERY

EID: 60449088092     PISSN: 15455971     EISSN: None     Source Type: Journal    
DOI: 10.1109/TDSC.2008.17     Document Type: Article
Times cited : (23)

References (33)
  • 1
    • 0032000230
    • Message Logging: Pessimistic, Optimistic, Causal and Optimal
    • L. Alvisi and K. Marzullo, "Message Logging: Pessimistic, Optimistic, Causal and Optimal," IEEE Trans. Software Eng., vol. 24, no. 2, pp. 149-159, Feb. 1998.
    • (1998) IEEE Trans. Software Eng , vol.24 , Issue.2 , pp. 149-159
    • Alvisi, L.1    Marzullo, K.2
  • 2
    • 0038823138
    • Solving Large Quadratic Assignment Problems on Computational Grids
    • K. Anstreicher, N. Brixius, J.-P. Goux, and J. Linderoth, "Solving Large Quadratic Assignment Problems on Computational Grids," Math. Programming, vol. 91, no. 3, 2002.
    • (2002) Math. Programming , vol.91 , Issue.3
    • Anstreicher, K.1    Brixius, N.2    Goux, J.-P.3    Linderoth, J.4
  • 3
    • 84866225421
    • A Communication-Induced Checkpointing Protocol That Ensures Rollback-Dependency Trackability
    • R. Baldoni, "A Communication-Induced Checkpointing Protocol That Ensures Rollback-Dependency Trackability," Proc. 27th Int'l Symp. Fault-Tolerant Computing (FTCS '97), p. 68, 1997.
    • (1997) Proc. 27th Int'l Symp. Fault-Tolerant Computing (FTCS '97) , p. 68
    • Baldoni, R.1
  • 5
    • 84884662651
    • MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes
    • G. Bosilca et al., "MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes," Proc. ACM/IEEE Conf. Supercomputing (SC '02), Nov. 2002.
    • (2002) Proc. ACM/IEEE Conf. Supercomputing (SC '02)
    • Bosilca, G.1
  • 6
    • 60449096682
    • MPICH-V2: A Fault Tolerant MPI for Volatile Nodes Based on the Pessimistic Sender Based Message Logging
    • A. Bouteiller et al., "MPICH-V2: A Fault Tolerant MPI for Volatile Nodes Based on the Pessimistic Sender Based Message Logging," Proc. ACM/IEEE Conf. Supercomputing (SC '03), pp. 1-17, 2003.
    • (2003) Proc. ACM/IEEE Conf. Supercomputing (SC '03) , pp. 1-17
    • Bouteiller, A.1
  • 9
    • 0022020346
    • Distributed Snapshots: Determining Global States of Distributed Systems
    • K.M. Chandy and L. Lamport, "Distributed Snapshots: Determining Global States of Distributed Systems," ACM Trans. Computer Systems, vol. 3, no. 1, pp. 63-75, 1985.
    • (1985) ACM Trans. Computer Systems , vol.3 , Issue.1 , pp. 63-75
    • Chandy, K.M.1    Lamport, L.2
  • 10
    • 0042078549
    • A Survey of Rollback-Recovery Protocols in Message-Passing Systems
    • E.N. Elnozahy, L. Alvisi, Y.-M. Wang, and D.B. Johnson, "A Survey of Rollback-Recovery Protocols in Message-Passing Systems," ACM Computing Surveys, vol. 34, no. 3, pp. 375-408, Sept. 2002.
    • (2002) ACM Computing Surveys , vol.34 , Issue.3 , pp. 375-408
    • Elnozahy, E.N.1    Alvisi, L.2    Wang, Y.-M.3    Johnson, D.B.4
  • 13
    • 70350117203
    • A Large Scale Nation-Wide Infrastructure for Grid Research
    • A Large Scale Nation-Wide Infrastructure for Grid Research, Grid5000, https://www.grid5000.fr, 2006.
    • (2006) Grid5000
  • 16
    • 60449089144
    • A Probabilistic Approach for Task and Result Certification of Large-Scale Distributed Applications in Hostile Environments
    • A.W. Krings, J.-L. Roch, S. Jafar, and S. Varrette, "A Probabilistic Approach for Task and Result Certification of Large-Scale Distributed Applications in Hostile Environments," Proc. European Grid Conf (EGC '05), P. Sloot et al., eds., Feb. 2005.
    • (2005) Proc. European Grid Conf (EGC '05)
    • Krings, A.W.1    Roch, J.-L.2    Jafar, S.3    Varrette, S.4
  • 19
    • 0003912256
    • Checkpoint and Migration of UNIX Processes in the Condor Distributed Processing System
    • M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny, "Checkpoint and Migration of UNIX Processes in the Condor Distributed Processing System," Technical Report CS-TR-97-1346, Univ. of Wisconsin, Madison, 1997.
    • (1997) Technical Report CS-TR-97-1346, Univ. of Wisconsin, Madison
    • Litzkow, M.1    Tannenbaum, T.2    Basney, J.3    Livny, M.4
  • 21
    • 84944041103
    • A Case for Redundant Arrays of Inexpensive Disks (RAID)
    • D.A. Patterson, G. Gibson, and R.H. Katz, "A Case for Redundant Arrays of Inexpensive Disks (RAID)," Proc. ACM SIGMOD '88, pp. 109-116, 1988.
    • (1988) Proc. ACM SIGMOD '88 , pp. 109-116
    • Patterson, D.A.1    Gibson, G.2    Katz, R.H.3
  • 23
    • 0016829070
    • System Structure for Software Fault Tolerance
    • B. Randell, "System Structure for Software Fault Tolerance," Proc. Int'l Conf. Reliable Software, pp. 437-449, 1975.
    • (1975) Proc. Int'l Conf. Reliable Software , pp. 437-449
    • Randell, B.1
  • 24
    • 0036499242
    • Sabotage-Tolerance Mechanisms for Volunteer Computing Systems
    • L. Sarmenta, "Sabotage-Tolerance Mechanisms for Volunteer Computing Systems," Future Generation Computer Systems, vol. 18, no. 4, 2002.
    • (2002) Future Generation Computer Systems , vol.18 , Issue.4
    • Sarmenta, L.1
  • 25
    • 0039285280
    • Asynchrony in Parallel Computing: From Dataflow to Multithreading
    • J. Silc, B. Robic, and T. Ungerer, "Asynchrony in Parallel Computing: from Dataflow to Multithreading," Progress in Computer Research, pp. 1-33, 2001.
    • (2001) Progress in Computer Research , pp. 1-33
    • Silc, J.1    Robic, B.2    Ungerer, T.3
  • 26
    • 0029713612
    • CoCheck: Checkpointing and Process Migration for MPI
    • G. Stellner, "CoCheck: Checkpointing and Process Migration for MPI," Proc. 10th Int'l Parallel Processing Symp. (IPPS '96), pp. 526-531, Apr. 1996.
    • (1996) Proc. 10th Int'l Parallel Processing Symp. (IPPS '96) , pp. 526-531
    • Stellner, G.1
  • 27
    • 0022112420
    • Optimistic Recovery in Distributed Systems
    • R. Strom and S. Yemini, "Optimistic Recovery in Distributed Systems," ACM Trans. Computer Systems, vol. 3, no. 3, pp. 204-226, 1985.
    • (1985) ACM Trans. Computer Systems , vol.3 , Issue.3 , pp. 204-226
    • Strom, R.1    Yemini, S.2
  • 28
    • 0032155082
    • Portable and Fault-Tolerant Software Systems
    • V. Strumpen, "Portable and Fault-Tolerant Software Systems," IEEE Micro, vol. 18, no. 5, pp. 22-32, Sept./Oct. 1998.
    • (1998) IEEE Micro , vol.18 , Issue.5 , pp. 22-32
    • Strumpen, V.1
  • 32
    • 0142066947
    • Selecting the Right Data Distribution Scheme for a Survivable Storage System
    • J.J. Wylie et al., "Selecting the Right Data Distribution Scheme for a Survivable Storage System," Technical Report CMU-CS-01-120, Carnegie Mellon Univ., May 2001.
    • (2001) Technical Report CMU-CS-01-120, Carnegie Mellon Univ.
    • Wylie, J.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.