-
1
-
-
0031388399
-
Impact of checkpoint latency on overhead ratio of a checkpointing scheme
-
Aug.
-
N. H. Vaidya, "Impact of checkpoint latency on overhead ratio of a checkpointing scheme," IEEE Transactions on Computers, vol. 46, Aug. 1997.
-
(1997)
IEEE Transactions on Computers
, vol.46
-
-
Vaidya, N.H.1
-
3
-
-
0004097019
-
Compressed differences: An algorithm for fast incremental checkpointing
-
University of Tennessee at Knoxville, Aug.
-
J. S. Plank, J. Xu, and R. H. Netzer, "Compressed differences: An algorithm for fast incremental checkpointing," Tech. Rep. CS-95-302, University of Tennessee at Knoxville, Aug. 1995.
-
(1995)
Tech. Rep.
, vol.CS-95-302
-
-
Plank, J.S.1
Xu, J.2
Netzer, R.H.3
-
4
-
-
0028485392
-
Low-latency, concurrent checkpointing for parallel programs
-
Aug.
-
K. Li, J. F. Naughton, and J. S. Plank, "Low-latency, concurrent checkpointing for parallel programs," IEEE Transactions on Parallel and Distributed Systems, vol. 5, Aug. 1994.
-
(1994)
IEEE Transactions on Parallel and Distributed Systems
, vol.5
-
-
Li, K.1
Naughton, J.F.2
Plank, J.S.3
-
5
-
-
0003912256
-
Checkpoint and migration of UNIX processes in the Condor distributed processing system
-
University of Wisconsin - Madison Computer Sciences Department, April
-
M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny, "Checkpoint and migration of UNIX processes in the Condor distributed processing system," Tech. Rep. UW-CS-TR-1346, University of Wisconsin - Madison Computer Sciences Department, April 1997.
-
(1997)
Tech. Rep.
, vol.UW-CS-TR-1346
-
-
Litzkow, M.1
Tannenbaum, T.2
Basney, J.3
Livny, M.4
-
7
-
-
85084159983
-
Libckpt: Transparent checkpointing under Unix
-
Jan.
-
J. S. Plank, M. Beck, G. Kingsley, and K. Li, "Libckpt: Transparent checkpointing under Unix," in Usenix Winter Technical Conference, pp. 213-223, Jan. 1995.
-
(1995)
Usenix Winter Technical Conference
, pp. 213-223
-
-
Plank, J.S.1
Beck, M.2
Kingsley, G.3
Li, K.4
-
8
-
-
0004096191
-
A survey of rollback-recovery protocols in message passing systems
-
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA, Oct.
-
M. Elnozahy, L. Alvisi, Y. Wang, and D. Johnson, "A survey of rollback-recovery protocols in message passing systems," Tech. Rep. CMU-CS-96-181, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA, Oct. 1996.
-
(1996)
Tech. Rep.
, vol.CMU-CS-96-181
-
-
Elnozahy, M.1
Alvisi, L.2
Wang, Y.3
Johnson, D.4
-
10
-
-
0031570635
-
Application level fault tolerance in heterogeneous networks of workstations
-
May
-
A. Beguelin, E. Seligman, and P. Stephan, "Application level fault tolerance in heterogeneous networks of workstations," Journal of Parallel and Distributed Computing, vol. 43, pp. 147-155, May 1997.
-
(1997)
Journal of Parallel and Distributed Computing
, vol.43
, pp. 147-155
-
-
Beguelin, A.1
Seligman, E.2
Stephan, P.3
-
11
-
-
8344260303
-
Quasi-asynchronous migration: A novel migration protocol for PVM tasks
-
Apr.
-
D. Pei, D. Wang, and Y. Zhang, "Quasi-asynchronous migration: A novel migration protocol for PVM tasks," ACM Operating Systems Review, vol. 33, Apr. 1999.
-
(1999)
ACM Operating Systems Review
, vol.33
-
-
Pei, D.1
Wang, D.2
Zhang, Y.3
-
12
-
-
0038040085
-
Automated application-level checkpointing of MPI programs
-
June
-
G. Bronevetsky, D. Marques, K. Pingali, and P. Stodghill, "Automated application-level checkpointing of MPI programs," in Principles and Practice of Parallel Programming, June 2003.
-
(2003)
Principles and Practice of Parallel Programming
-
-
Bronevetsky, G.1
Marques, D.2
Pingali, K.3
Stodghill, P.4
-
15
-
-
0032179680
-
Diskless checkpointing
-
Oct.
-
J. S. Plank, Y. Kim, and J. J. Dongarra, "Diskless checkpointing," IEEE Transactions on Parallel and Distributed Systems, vol. 9, pp. 972-986, Oct. 1998.
-
(1998)
IEEE Transactions on Parallel and Distributed Systems
, vol.9
, pp. 972-986
-
-
Plank, J.S.1
Kim, Y.2
Dongarra, J.J.3
-
16
-
-
0002991145
-
Ickp: A consistent checkpointer for multicomputers
-
June
-
J. S. Plank and K. Li, "ickp: A consistent checkpointer for multicomputers," IEEE Parallel and Distributed Technologies, vol. 2, pp. 62-67, June 1994.
-
(1994)
IEEE Parallel and Distributed Technologies
, vol.2
, pp. 62-67
-
-
Plank, J.S.1
Li, K.2
-
19
-
-
8344269036
-
The design and implementation of Berkeley lab's Linux checkpoint/restart
-
Oct.
-
S. Sankaran, J. Squyres, B. Barrett, A. Lumsdaine, J. Duell, P. Hargrove, and E. Roman, "The design and implementation of Berkeley lab's Linux checkpoint/restart," in Los Alamos Computer Science Institute (LACSI) Symposium, Oct. 2003.
-
(2003)
Los Alamos Computer Science Institute (LACSI) Symposium
-
-
Sankaran, S.1
Squyres, J.2
Barrett, B.3
Lumsdaine, A.4
Duell, J.5
Hargrove, P.6
Roman, E.7
-
20
-
-
8344283205
-
CRAK: Linux checkpoint/restart as a kernel module
-
Department of Computer Science, Columbia University, Nov.
-
H. Zhong and J. Nieh, "CRAK: Linux checkpoint/restart as a kernel module," Tech. Rep. CUCS-014-01, Department of Computer Science, Columbia University, Nov. 2001.
-
(2001)
Tech. Rep.
, vol.CUCS-014-01
-
-
Zhong, H.1
Nieh, J.2
-
22
-
-
0036292677
-
SafetyNet: Improving the availability of shared memory multiprocessors with global checkpoint/recovery
-
May
-
D. J. Sorin, M. K. Martin, M. D. Hill, and D. A. Wood, "SafetyNet: Improving the availability of shared memory multiprocessors with global checkpoint/recovery," in Proceedings of the International Symposium on Computer Architecture, May 2002.
-
(2002)
Proceedings of the International Symposium on Computer Architecture
-
-
Sorin, D.J.1
Martin, M.K.2
Hill, M.D.3
Wood, D.A.4
-
23
-
-
12444268355
-
On the feasibility of incremental checkpointing for scientific computing
-
(Santa Fe, NM, USA), April
-
J. C. Sancho, F. Petrini, G. Johnson, J. Fernandez, and E. Frachtenberg, "On the feasibility of incremental checkpointing for scientific computing," in International Parallel and Distributed Processing Symposium, (Santa Fe, NM, USA), April 2004.
-
(2004)
International Parallel and Distributed Processing Symposium
-
-
Sancho, J.C.1
Petrini, F.2
Johnson, G.3
Fernandez, J.4
Frachtenberg, E.5
-
24
-
-
0036630606
-
Probabilistic checkpointing
-
July
-
H. Nam, J. Kim, S. J. Hong, and S. Lee, "Probabilistic checkpointing," IEICE Transactions, Information and Systems, vol. E85-D, July 2002.
-
(2002)
IEICE Transactions, Information and Systems
, vol.E85-D
-
-
Nam, H.1
Kim, J.2
Hong, S.J.3
Lee, S.4
-
26
-
-
0037370246
-
A secure checkpointing system
-
H. Nam, J. Kim, S. J. Hong, and S. Lee, "A secure checkpointing system," Journal of Systems Architecture, vol. 48, pp. 237-254, 2003.
-
(2003)
Journal of Systems Architecture
, vol.48
, pp. 237-254
-
-
Nam, H.1
Kim, J.2
Hong, S.J.3
Lee, S.4
-
29
-
-
11144287593
-
An overview of the BlueGene/L Supercomputer
-
Nov.
-
N. Adiga et al., "An overview of the BlueGene/L Supercomputer," in Proceedings of Supercomputing, Nov. 2002.
-
(2002)
Proceedings of Supercomputing
-
-
Adiga, N.1
-
30
-
-
10744227260
-
An overview of the BlueGene/L system software organization
-
Aug.
-
G. Almasi, R. Bellofatto, J. Brunheroto, C. Cascaval, J. G. Castaños, L. Ceze, P. Crumley, C. C. Erway, J. Gagliano, D. Lieber, X. Martorell, J. E. Moreira, A. Sanomiya, and K. Strauss, "An Overview of the BlueGene/L System Software Organization," in Euro-Par: 9th International European Conference on Parallel Processing, Aug. 2003.
-
(2003)
Euro-Par: 9th International European Conference on Parallel Processing
-
-
Almasi, G.1
Bellofatto, R.2
Brunheroto, J.3
Cascaval, C.4
Castaños, J.G.5
Ceze, L.6
Crumley, P.7
Erway, C.C.8
Gagliano, J.9
Lieber, D.10
Martorell, X.11
Moreira, J.E.12
Sanomiya, A.13
Strauss, K.14
-
31
-
-
0003605998
-
The NAS parallel benchmarks 2.0
-
NAS Systems Division, Dec.
-
D. Bailey, T. Harris, W. Saphir, R. van der Wijngaart, A. Woo, and M. Yarrow, "The NAS parallel benchmarks 2.0," Tech. Rep. NAS-95-020, NAS Systems Division, Dec. 1995.
-
(1995)
Tech. Rep.
, vol.NAS-95-020
-
-
Bailey, D.1
Harris, T.2
Saphir, W.3
Van der Wijngaart, R.4
Woo, A.5
Yarrow, M.6
-
32
-
-
84862455395
-
-
"ASCI blue benchmarks." http://www.llnl.gov/asci_benchmarks/asci/asci_code.list.html.
-
ASCI Blue Benchmarks
-
-