-
2
-
-
4544241556
-
Common mechanisms for supporting fault tolerance in DSM and message passing systems
-
July
-
R. Badrinath and C. Morin. Common mechanisms for supporting fault tolerance in DSM and message passing systems. Technical report, July 2003.
-
(2003)
Technical Report
-
-
Badrinath, R.1
Morin, C.2
-
3
-
-
84884662651
-
MPICH-V: Toward a scalable fault tolerant MPI for volatile nodes
-
Baltimore, Maryland, November
-
G. Bosilca, A. Bouteiller, F. Cappello, S. Djilali, G. Fedak, C. Germain, T. Herault, P. Lemarinier, O. Lodygensky, F. Magniette, V. Neri, and A. Selikhov. MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes. In Proceedings of the IEEE/ACM SC2002 Conference, pages 29-47, Baltimore, Maryland, November 2002.
-
(2002)
Proceedings of the IEEE/ACM SC2002 Conference
, pp. 29-47
-
-
Bosilca, G.1
Bouteiller, A.2
Cappello, F.3
Djilali, S.4
Fedak, G.5
Germain, C.6
Herault, T.7
Lemarinier, P.8
Lodygensky, O.9
Magniette, F.10
Neri, V.11
Selikhov, A.12
-
4
-
-
0022020346
-
Distributed snapshots: Determining global states of distributed systems
-
February
-
K. Chandy and L. Lamport. Distributed Snapshots: Determining Global States of Distributed Systems. ACM Trans. Computer Systems, 3(1):63-75, February 1985.
-
(1985)
ACM Trans. Computer Systems
, vol.3
, Issue.1
, pp. 63-75
-
-
Chandy, K.1
Lamport, L.2
-
5
-
-
0005029744
-
Lightweight logging for lazy release consistent distributed shared memory
-
October
-
M. Costa, P. Guedes, M. Sequeira, N. Neves, and M. Castro. Lightweight Logging for Lazy Release Consistent Distributed Shared Memory. In Operating Systems Design and Implementation, pages 59-73, October 1996.
-
(1996)
Operating Systems Design and Implementation
, pp. 59-73
-
-
Costa, M.1
Guedes, P.2
Sequeira, M.3
Neves, N.4
Castro, M.5
-
6
-
-
0042078549
-
A survey of rollback-recovery protocols in message-passing systems
-
September
-
M. Elnozahy, L. Alvisi, Y.-M. Wang, and D. Johnson. A Survey of Rollback-Recovery Protocols in Message-Passing Systems. ACM Computing Surveys (CSUR), 34:375-408, September 2002.
-
(2002)
ACM Computing Surveys (CSUR)
, vol.34
, pp. 375-408
-
-
Elnozahy, M.1
Alvisi, L.2
Wang, Y.-M.3
Johnson, D.4
-
7
-
-
84862438692
-
Conception et évaluation d'un protocole de reprise d'applications parallèles dans une fédération de grappes de calculateurs
-
IFSIC, Université de Rennes 1, France, June. In French
-
S. Monnet. Conception et évaluation d'un protocole de reprise d'applications parallèles dans une fédération de grappes de calculateurs. Rapport de stage de DEA, IFSIC, Université de Rennes 1, France, June 2003. In French.
-
(2003)
Rapport de stage de DEA
-
-
Monnet, S.1
-
8
-
-
0034188020
-
An efficient and scalable approach for implementing fault tolerant DSM architectures
-
May
-
C. Morin, A.-M. Kermarrec, M. Banâtre, and A. Gefflaut. An Efficient and Scalable Approach for Implementing Fault Tolerant DSM Architectures. IEEE Transactions on Computers, 49(5):414-430, May 2000.
-
(2000)
IEEE Transactions on Computers
, vol.49
, Issue.5
, pp. 414-430
-
-
Morin, C.1
Kermarrec, A.-M.2
Banâtre, M.3
Gefflaut, A.4
-
9
-
-
0010539744
-
-
PhD thesis, Faculty of the School of Engineering and Applied Science at the University of Virginia, August
-
A. Nguyen-Tuong. Integrating Fault-Tolerance Techniques in Grid Applications. PhD thesis, Faculty of the School of Engineering and Applied Science at the University of Virginia, August 2000.
-
(2000)
Integrating Fault-Tolerance Techniques in Grid Applications
-
-
Nguyen-Tuong, A.1
-
12
-
-
84862438618
-
-
C++SIM. http://cxxsim.ncl.ac.uk.
-
C++SIM
-
-