[2] M. Beck, J. S. Plank, and G. Kingsley. Compiler-assisted checkpointing. Technical Report UT-CS-94-269, Dept. of Computer Science, University of Tennessee, 1994.
[3] A. Beguelin, E. Seligman, and P. Stephan. Application level fault tolerance in heterogeneous networks of workstations. Journal of Parallel and Distributed Computing, 43(2):147-155, 1997.
[4] G. Bronevetsky, D. Marques, K. Pingali, and P. Stodghill. Collective operations in an application-level fault tolerant MPI system. In International Conference on Supercomputing (ICS) 2003, San Francisco, CA, June 23-26, 2003.
[5] K. M. Chandy and L. Lamport. Distributed snapshots: Determining global states of distributed systems. ACM Transactions on Computer Systems, 3(1):63-75, 1985.
[6] E. N. Elnozahy and W. Zwaenepoel. Manetho: Transparent rollback-recovery with low overhead, limited rollback and fast output. IEEE Transactions on Computers, 41(5), May 1992.
[7] M. Elnozahy, L. Alvisi, Y. M. Wang, and D. B. Johnson. A survey of rollback-recovery protocols in message passing systems. Technical Report CMU-CS-96-181, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA, Oct. 1996.
[8] Message Passing Interface Forum. MPI: A message-passing interface standard. Technical Report UT-CS-94-230, University of Tennessee, 1994.
[10] R. Graham, S.-E. Choi, D. Daniel, N. Desai, R. Minnich, C. Rasmussen, D. Risinger, and M. Sukalski. A network-failure-tolerant message-passing system for tera-scale clusters. In Proceedings of the International Conference on Supercomputing 2002, 2002.
[12] IBM Research. Blue Gene project overview. Online at http://www.research.ibm.com/bluegene/, 2002.
[13] D. B. Johnson and W. Zwaenepoel. Transparent optimistic rollback recovery. Operating Systems Review, 25(2):99-102, 1991.
[14] N. Lynch. Distributed Algorithms. Morgan Kaufmann, San Francisco, California, first edition, 1996.
[15] M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny. Checkpoint and migration of UNIX processes in the Condor distributed processing system. Technical Report 1346, University of Wisconsin-Madison, 1997.
[16] National Nuclear Security Administration. ASCI home. Online at http://www.nnsa.doe.gov/asc/, 2002.
[17] J. S. Plank, M. Beck, G. Kingsley, and K. Li. Libckpt: Transparent checkpointing under UNIX. Technical Report UT-CS-94-242, Dept. of Computer Science, University of Tennessee, 1994.
[20] T. Tabe and Q. F. Stout. The use of the MPI communication library in the NAS parallel benchmarks. Technical Report CSE-TR-386-99, Advanced Computer Architecture Laboratory, Dept. of Electrical Engineering and Computer Science, University of Michigan, 17, 1999.