-
1
-
-
0003660984
-
-
ANL-95/11 - Revision 3.1, Argonne National Laboratory
-
Balay, S., et al.: PETSc Users Manual. ANL-95/11 - Revision 3.1, Argonne National Laboratory (2010)
-
(2010)
PETSc Users Manual
-
-
Balay, S.1
-
2
-
-
31844451082
-
Building Fault Survivable MPI Programs with FT MPI Using Diskless Checkpointing
-
Chen, Z., Fagg, G.E., Gabriel, E., Langou, J., Angskun, T., Bosilca, G., Dongarra, J.: Building Fault Survivable MPI Programs with FT MPI Using Diskless Checkpointing. In: Proceedings for ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 213-223 (2005)
-
(2005)
Proceedings for ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 213-223
-
-
Chen, Z.1
Fagg, G.E.2
Gabriel, E.3
Langou, J.4
Angskun, T.5
Bosilca, G.6
Dongarra, J.7
-
3
-
-
61449223447
-
Algorithmic Based Fault Tolerance Applied to High Performance Computing
-
Dongarra, J., Bosilca, G., Delmas, R., Langou, J.: Algorithmic Based Fault Tolerance Applied to High Performance Computing. Journal of Parallel and Distributed Computing 69, 410-416 (2009)
-
(2009)
Journal of Parallel and Distributed Computing
, vol.69
, pp. 410-416
-
-
Dongarra, J.1
Bosilca, G.2
Delmas, R.3
Langou, J.4
-
4
-
-
35048884271
-
Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation
-
Kranzlmüller, D., Kacsuk, P., Dongarra, J. (eds.) EuroPVM/MPI 2004. Springer, Heidelberg
-
Gabriel, E., Fagg, G.E., Bosilca, G., Angskun, T., Dongarra, J., Squyres, J.M., Sahay, V., Kambadur, P., Barrett, B.W., Lumsdaine, A., Castain, R.H., Daniel, D.J., Graham, R.L., Woodall, T.S.: Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation. In: Kranzlmüller, D., Kacsuk, P., Dongarra, J. (eds.) EuroPVM/MPI 2004. LNCS, vol. 3241, pp. 97-104. Springer, Heidelberg (2004)
-
(2004)
LNCS
, vol.3241
, pp. 97-104
-
-
Gabriel, E.1
Fagg, G.E.2
Bosilca, G.3
Angskun, T.4
Dongarra, J.5
Squyres, J.M.6
Sahay, V.7
Kambadur, P.8
Barrett, B.W.9
Lumsdaine, A.10
Castain, R.H.11
Daniel, D.J.12
Graham, R.L.13
Woodall, T.S.14
-
6
-
-
25144486687
-
Super-Scalable Algorithms for Computing on 100,000 Processors
-
Sunderam, V.S., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2005. Springer, Heidelberg
-
Engelmann, C., Geist, A.: Super-Scalable Algorithms for Computing on 100,000 Processors. In: Sunderam, V.S., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2005. LNCS, vol. 3514, pp. 313-321. Springer, Heidelberg (2005)
-
(2005)
LNCS
, vol.3514
, pp. 313-321
-
-
Engelmann, C.1
Geist, A.2
-
8
-
-
77954948567
-
On Disk-based and Diskless Check-pointing for Parallel and Distributed Systems: An Empirical Analysis
-
Kofahi, N.A., Al-Bokhitan, S., Journal, A.A.: On Disk-based and Diskless Check-pointing for Parallel and Distributed Systems: An Empirical Analysis. Information Technology Journal 4, 367-376 (2005)
-
(2005)
Information Technology Journal
, vol.4
, pp. 367-376
-
-
Kofahi, N.A.1
Al-Bokhitan, S.2
Journal, A.A.3
-
9
-
-
24944565453
-
Process resurrection: A fast recovery mechanism for real-time embedded systems
-
IEEE
-
Lee, K., Sha, L.: Process resurrection: A fast recovery mechanism for real-time embedded systems. In: Real-Time and Embedded Technology and Applications Symposium, pp. 292-301. IEEE (2005)
-
(2005)
Real-Time and Embedded Technology and Applications Symposium
, pp. 292-301
-
-
Lee, K.1
Sha, L.2
-
10
-
-
36448932746
-
Monitoring and Migration of a PETSc-based Parallel Application for Medical Imaging in a Grid computing PSE
-
Springer
-
Murli, A., Boccia, V., Carracciuolo, L., D'Amore, L., Lapegna, M.: Monitoring and Migration of a PETSc-based Parallel Application for Medical Imaging in a Grid computing PSE. In: Proceedings of IFIP 2.5 WoCo9, vol. 239, pp. 421-432. Springer (2007)
-
(2007)
Proceedings of IFIP 2.5 WoCo9
, vol.239
, pp. 421-432
-
-
Murli, A.1
Boccia, V.2
Carracciuolo, L.3
D'Amore, L.4
Lapegna, M.5
-
11
-
-
33746167122
-
-
Technical Report CS-97-380, University of Tennessee, December
-
Plank, J.S., Li, K., Puening, M.A.: Diskless Checkpointing. Technical Report CS-97-380, University of Tennessee (December 1997)
-
(1997)
Diskless Checkpointing
-
-
Plank, J.S.1
Li, K.2
Puening, M.A.3
-
12
-
-
0032673296
-
The Performance of Coordinated and Independent Checkpointing
-
IEEE Computer Society, Washington, DC
-
Silva, L.M., Silva, G.J.: The Performance of Coordinated and Independent Checkpointing. In: Proceedings of the 13th International Symposium on Parallel Processing, pp. 280-284. IEEE Computer Society, Washington, DC (1999)
-
(1999)
Proceedings of the 13th International Symposium on Parallel Processing
, pp. 280-284
-
-
Silva, L.M.1
Silva, G.J.2
-
13
-
-
84865271923
-
-
ch. 11, SIAM Press
-
Simon, H.D., Heroux, M.A., Raghavan, P.: Fault Tolerance in Large Scale Scientific Computing, ch. 11, pp. 203-220. SIAM Press (2006)
-
(2006)
Fault Tolerance in Large Scale Scientific Computing
, pp. 203-220
-
-
Simon, H.D.1
Heroux, M.A.2
Raghavan, P.3
-
14
-
-
33750936415
-
Availability Modeling and Analysis on High Performance Cluster Computing Systems
-
Song, H., Leangsuksun, C., Nassar, R.: Availability Modeling and Analysis on High Performance Cluster Computing Systems. In: First International Conference on Availability, Reliability and Security, pp. 305-313 (2006)
-
(2006)
First International Conference on Availability, Reliability and Security
, pp. 305-313
-
-
Song, H.1
Leangsuksun, C.2
Nassar, R.3
-
15
-
-
0141682129
-
SRS - A Framework for Developing Malleable and Migratable Parallel Applications for Distributed Systems
-
Vadhiyar, S.S., Dongarra, J.: SRS - A Framework for Developing Malleable and Migratable Parallel Applications for Distributed Systems. In: Parallel Processing Letters, pp. 291-312 (2002)
-
(2002)
Parallel Processing Letters
, pp. 291-312
-
-
Vadhiyar, S.S.1
Dongarra, J.2
-
16
-
-
34548768671
-
A Job Pause Service under LAM/MPI+BLCR for Transparent Fault Tolerance
-
Wang, C., Mueller, F., Engelmann, C., Scott, S.L.: A Job Pause Service under LAM/MPI+BLCR for Transparent Fault Tolerance. In: Parallel and Distributed Processing Symposium (2007)
-
Parallel and Distributed Processing Symposium (2007)
-
-
Wang, C.1
Mueller, F.2
Engelmann, C.3
Scott, S.L.4