-
1
-
-
0001924983
-
A survey of analytic models of roll-back and recovery strategies
-
May
-
K.M. Chandy, "A survey of analytic models of roll-back and recovery strategies," Computer 8,5 (May 1975), 40-47
-
(1975)
Computer
, vol.8
, Issue.5
, pp. 40-47
-
-
Chandy, K.M.1
-
2
-
-
0016487291
-
Analytic models for rollback and recovery strategies in data base systems
-
March
-
K.M. Chandy, J.C. Browne, C.W. Dissly, and W.R. Uhrig, "Analytic models for rollback and recovery strategies in data base systems," IEEE Trans. Software Eng. SE-1 (March 1975), 100-110
-
(1975)
IEEE Trans Software Eng
, vol.SE-1
, pp. 100-110
-
-
Chandy, K.M.1
Browne, J.C.2
Dissly, C.W.3
Uhrig, W.R.4
-
3
-
-
28044438299
-
A Model for Predicting the Optimum Checkpoint Interval for Restart Dumps
-
J.T. Daly, "A Model for Predicting the Optimum Checkpoint Interval for Restart Dumps," ICCS 2003, LNCS 2660, Proceedings 4 (2003) 3-12
-
(2003)
ICCS 2003, LNCS 2660, Proceedings
, vol.4
, pp. 3-12
-
-
Daly, J.T.1
-
4
-
-
51049113966
-
A Higher Order Estimate of the Optimum Checkpoint Interval for Restart Dumps
-
Elsevier, Amsterdam
-
J.T. Daly, "A Higher Order Estimate of the Optimum Checkpoint Interval for Restart Dumps," Future Generation Computer Systems (Elsevier, Amsterdam, 2004)
-
(2004)
Future Generation Computer Systems
-
-
Daly, J.T.1
-
5
-
-
9144223280
-
Checkpointing for Peta-Scale Systems: A Look into the Future of Practical Rollback-Recovery
-
IEEE Trans. Dependable Sec. Comput
-
E. Elnozahy, J. Plank, "Checkpointing for Peta-Scale Systems: A Look into the Future of Practical Rollback-Recovery," IEEE Trans. Dependable Sec. Comput. 1(2): 97-108 (2004)
-
(2004)
, vol.1
, Issue.2
, pp. 97-108
-
-
Elnozahy, E.1
Plank, J.2
-
6
-
-
0000652719
-
Selection of a checkpoint interval in a critical-task environment
-
R. Geist, R. Reynolds, and J. Westall, "Selection of a checkpoint interval in a critical-task environment," IEEE Trans. Reliability, 37(4), 395-400 (1988)
-
(1988)
IEEE Trans. Reliability
, vol.37
, Issue.4
, pp. 395-400
-
-
Geist, R.1
Reynolds, R.2
Westall, J.3
-
7
-
-
50649103655
-
-
Louisiana Tech University, Ruston, LA, USA, May
-
Y. Liu, "Reliability-Aware Optimal Checkpoint/Restart Model In High Performance Computing," PhD thesis, Louisiana Tech University, Ruston, LA, USA, May 2007.
-
(2007)
Reliability-Aware Optimal Checkpoint/Restart Model In High Performance Computing
-
-
Liu, Y.1
-
8
-
-
0004244684
-
Checkpointing and the Modeling of Program Execution Time
-
M.R. Lyu, ed., John Wiley & Sons
-
V.F. Nicola, "Checkpointing and the Modeling of Program Execution Time," Software Fault Tolerance, M.R. Lyu, ed., pp. 167-188, John Wiley & Sons, 1995.
-
(1995)
Software Fault Tolerance
, pp. 167-188
-
-
Nicola, V.F.1
-
9
-
-
34547424386
-
Cooperative Checkpointing: A Robust Approach to Large-scale Systems Reliability
-
Cairns, Australia, June
-
A.J. Oliner, L. Rudolph, and R.K. Sahoo, "Cooperative Checkpointing: A Robust Approach to Large-scale Systems Reliability," In Proceedings of the 20th Annual International Conference on Supercomputing (ICS), Cairns, Australia, June 2006.
-
(2006)
Proceedings of the 20th Annual International Conference on Supercomputing (ICS)
-
-
Oliner, A.J.1
Rudolph, L.2
Sahoo, R.K.3
-
10
-
-
0032597646
-
The Average Availability of Parallel Checkpointing Systems and Its Importance in Selecting Runtime Parameters
-
J.S. Plank, M.A. Thomason, "The Average Availability of Parallel Checkpointing Systems and Its Importance in Selecting Runtime Parameters," IEEE Proc. Int'l Symp. on Fault-Tolerant Computing, 1999.
-
(1999)
IEEE Proc. Int'l Symp. on Fault-Tolerant Computing
-
-
Plank, J.S.1
Thomason, M.A.2
-
11
-
-
0003778293
-
-
Wiley; 2nd edition January, ISBN-10: 0471120626
-
S.M. Ross, "Stochastic Processes," Wiley; 2nd edition (January 1995), ISBN-10: 0471120626
-
(1995)
Stochastic Processes
-
-
Ross, S.M.1
-
12
-
-
33746286070
-
Performance implications of periodic checkpointing on large-scale cluster systems
-
IEEE International
-
A. J. Oliner "Performance implications of periodic checkpointing on large-scale cluster systems", Parallel and Distributed Processing Symposium, 2005. Proc. 19th IEEE International (2005), pp. 299b-299b.
-
(2005)
Parallel and Distributed Processing Symposium, 2005. Proc. 19th
-
-
Oliner, A.J.1
-
13
-
-
20444444457
-
The LAM/MPI Checkpoint/Restart Framework: System-Initiated Checkpoint
-
Santa Fe, NM. October
-
S. Sankaran, J.M. Squyres, B. Barrett, A. Lumsdaine, J. Duell, P. Hargrove, and E. Roman, "The LAM/MPI Checkpoint/Restart Framework: System-Initiated Checkpoint," The 2003 Los Alamos Computer Science Institute Symposium, Santa Fe, NM. October 2003.
-
(2003)
The 2003 Los Alamos Computer Science Institute Symposium
-
-
Sankaran, S.1
Squyres, J.M.2
Barrett, B.3
Lumsdaine, A.4
Duell, J.5
Hargrove, P.6
Roman, E.7
-
15
-
-
53349101430
-
-
M. Treaster, "A survey of fault-tolerance and fault-recovery techniques in parallel systems," Technical Report cs.DC/0501002, ACM Computing Research Repository (CoRR), January 2005.
-
-
-
-
16
-
-
84866903812
-
Distributed Computing Systems and Checkpointing
-
K. F. Wong, M.A. Franklin, "Distributed Computing Systems and Checkpointing," HPDC 1993: 224-233
-
(1993)
HPDC
, pp. 224-233
-
-
Wong, K.F.1
Franklin, M.A.2
-
18
-
-
16244423775
-
An overview of the BlueGene/L supercomputer
-
A.R. Adiga, G. Almasi, et al., "An overview of the BlueGene/L supercomputer," In Proceedings of Supercomputing, IEEE/ACM 2002 Conference, 60-60
-
Proceedings of Supercomputing, IEEE/ACM 2002 Conference
, pp. 60-60
-
-
Adiga, A.R.1
Almasi, G.2
et al.3
-
20
-
-
0035390088
-
A Variational Calculus Approach to Optimal Checkpoint Placement
-
July
-
Y. Ling, J. Mi, and X. Lin, "A Variational Calculus Approach to Optimal Checkpoint Placement," IEEE Trans. Computers, vol. 50, no. 7, 699-707, July 2001.
-
(2001)
IEEE Trans. Computers
, vol.50
, Issue.7
, pp. 699-707
-
-
Ling, Y.1
Mi, J.2
Lin, X.3
-
21
-
-
33646721605
-
Distribution-Free Checkpoint Placement Algorithms Based on Min-Max Principle
-
April
-
T. Ozaki, T. Dohi, and H. Okamura, "Distribution-Free Checkpoint Placement Algorithms Based on Min-Max Principle," IEEE Transactions on Dependable and Secure Computing, Volume 3, Issue 2 (April 2006), 130-140
-
(2006)
IEEE Transactions on Dependable and Secure Computing
, vol.3
, Issue.2
, pp. 130-140
-
-
Ozaki, T.1
Dohi, T.2
Okamura, H.3