-
1
-
-
0032597670
-
An analysis of communication-induced checkpointing
-
June
-
Lorenzo Alvisi, Elmootazbellah Elnozahy, Sriram Rao, Syed Amir Husain, and Asanka De Mel. An Analysis of Communication-Induced Checkpointing. In Proceedings of the 1999 International Symposium on Fault-Tolerant Computing (FTCS), June 1999.
-
(1999)
Proceedings of the 1999 International Symposium on Fault-Tolerant Computing (FTCS)
-
-
Alvisi, L.1
Elnozahy, E.2
Rao, S.3
Husain, S.A.4
Mel, A.D.5
-
4
-
-
0024606852
-
Fault tolerance under Unix
-
February
-
Anita Borg, Wolfgang Blau, Wolfgang Graetsch, Ferdinand Herrmann, and Wolfgang Oberle. Fault Tolerance Under UNIX. ACM Transactions on Computer Systems, 7(1):1–24, February 1989.
-
(1989)
ACM Transactions on Computer Systems
, vol.7
, Issue.1
, pp. 1-24
-
-
Borg, A.1
Blau, W.2
Graetsch, W.3
Herrman, F.4
Oberle, W.5
-
8
-
-
0022020346
-
Distributed snapshots: Determining global states in distributed systems
-
February
-
K. Mani Chandy and Leslie Lamport. Distributed Snapshots: Determining Global States in Distributed Systems. ACM Transactions on Computer Systems, 3(1):63–75, February 1985.
-
(1985)
ACM Transactions on Computer Systems
, vol.3
, Issue.1
, pp. 63-75
-
-
Mani Chandy, K.1
Lamport, L.2
-
9
-
-
17044422113
-
The Rio file cache: Surviving operating system crashes
-
October
-
Peter M. Chen, Wee Teck Ng, Subhachandra Chandra, Christopher M. Aycock, Gurushankar Rajamani, and David Lowell. The Rio File Cache: Surviving Operating System Crashes. In Proceedings of the 1996 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 1996.
-
(1996)
Proceedings of the 1996 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
-
-
Chen, P.M.1
Ng, W.T.2
Chandra, S.3
Aycock, C.M.4
Rajamani, G.5
Lowell, D.6
-
10
-
-
0004096191
-
-
Technical Report CMU-CS-99-148, Carnegie Mellon University, June
-
Elmootazbellah N. Elnozahy, Lorenzo Alvisi, Yi-Min Wang, and David B. Johnson. A Survey of Rollback-Recovery Protocols in Message-Passing Systems. Technical Report CMU-CS-99-148, Carnegie Mellon University, June 1999.
-
(1999)
A Survey of Rollback-Recovery Protocols in Message-Passing Systems
-
-
Elnozahy, E.N.1
Alvisi, L.2
Wang, Y.-M.3
Johnson, D.B.4
-
11
-
-
0026867749
-
Manetho: Transparent rollback-recovery with low overhead, limited rollback, and fast output commit
-
May
-
Elmootazbellah N. Elnozahy and Willy Zwaenepoel. Manetho: Transparent Rollback-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit. IEEE Transactions on Computers, C-41(5):526–531, May 1992.
-
(1992)
IEEE Transactions on Computers
, vol.41
, Issue.5
, pp. 526-531
-
-
Elnozahy, E.N.1
Zwaenepoel, W.2
-
14
-
-
85094877129
-
NT-SwiFT: Software-implemented Fault Tolerance for Windows-NT
-
August
-
Y. Huang, P. Y. Chung, C. M. R. Kintala, D. Liang, and C. Wang. NT-SwiFT: Software-implemented Fault Tolerance for Windows-NT. In Proceedings of the 1998 USENIX WindowsNT Symposium, August 1998.
-
(1998)
Proceedings of the 1998 USENIX WindowsNT Symposium
-
-
Huang, Y.1
Chung, P.Y.2
Kintala, C.M.R.3
Liang, D.4
Wang, C.5
-
17
-
-
0032686475
-
Chameleon: A software infrastructure for adaptive fault tolerance
-
June
-
Zbigniew T. Kalbarczyk, Saurabh Bagchi, Keith Whisnant, and Ravishankar K. Iyer. Chameleon: A Software Infrastructure for Adaptive Fault Tolerance. IEEE Transactions on Parallel and Distributed Systems, 10(6):560–579, June 1999.
-
(1999)
IEEE Transactions on Parallel and Distributed Systems
, vol.10
, Issue.6
, pp. 560-579
-
-
Kalbarczyk, Z.T.1
Bagchi, S.2
Whisnant, K.3
Iyer, R.K.4
-
18
-
-
0023090161
-
Checkpointing and rollback-recovery for distributed systems
-
January
-
Richard Koo and Sam Toueg. Checkpointing and Rollback-Recovery for Distributed Systems. IEEE Transactions on Software Engineering, SE-13(1):23–31, January 1987.
-
(1987)
IEEE Transactions on Software Engineering
, vol.13
, Issue.1
, pp. 23-31
-
-
Koo, R.1
Toueg, S.2
-
19
-
-
0017996760
-
Time, clocks, and the ordering of events in a distributed system
-
July
-
Leslie Lamport. Time, Clocks, and the Ordering of Events in a Distributed System. Communications of the ACM, 21(7):558–565, July 1978.
-
(1978)
Communications of the ACM
, vol.21
, Issue.7
, pp. 558-565
-
-
Lamport, L.1
-
27
-
-
84976724324
-
Byzantine generals in action: Implementing fail-stop processors
-
May
-
Fred B. Schneider. Byzantine Generals in Action: Implementing Fail-Stop Processors. ACM Transactions on Computer Systems, 2(2):145–154, May 1984.
-
(1984)
ACM Transactions on Computer Systems
, vol.2
, Issue.2
, pp. 145-154
-
-
Schneider, F.B.1
-
28
-
-
0022112420
-
Optimistic recovery in distributed systems
-
August
-
Robert E. Strom and Shaula Yemini. Optimistic Recovery in Distributed Systems. ACM Transactions on Computer Systems, 3(3):204–226, August 1985.
-
(1985)
ACM Transactions on Computer Systems
, vol.3
, Issue.3
, pp. 204-226
-
-
Strom, R.E.1
Yemini, S.2