1. Bland, W., Bosilca, G., Bouteiller, A., Herault, T., Dongarra, J.: A proposal for User-Level Failure Mitigation in the MPI-3 standard. Tech. rep., Department of Electrical Engineering and Computer Science, University of Tennessee (2012)
2. Bland, W., Bouteiller, A., Herault, T., Hursey, J., Bosilca, G., Dongarra, J.J.: An Evaluation of User-Level Failure Mitigation Support in MPI. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds.) EuroMPI 2012. LNCS, vol. 7490, pp. 193-203. Springer, Heidelberg (2012)
3. Bosilca, G., Bouteiller, A., Brunet, É., Cappello, F., Dongarra, J., Guermouche, A., Hérault, T., Robert, Y., Vivien, F., Zaidouni, D.: Unified Model for Assessing Checkpointing Protocols at Extreme-Scale. Tech. Rep. RR-7950, INRIA (2012)
4. Bougeret, M., Casanova, H., Robert, Y., Vivien, F., Zaidouni, D.: Using group replication for resilience on exascale systems. Tech. Rep. 265, LAWNs (2012)
5
-
-
70450206305
-
Toward exascale resilience
-
Cappello, F., Geist, A., Gropp, B., Kalé, L.V., Kramer, B., Snir, M.: Toward exascale resilience. IJHPCA 23(4), 374-388 (2009)
-
(2009)
IJHPCA
, vol.23
, Issue.4
, pp. 374-388
-
-
Cappello, F.1
Geist, A.2
Gropp, B.3
Kalé, L.V.4
Kramer, B.5
Snir, M.6
6. Gropp, W., Lusk, E.: Fault tolerance in Message Passing Interface programs. IJHPCA 18, 363-372 (2004)
7
-
-
0021439162
-
Algorithm-based fault tolerance for matrix operations
-
Huang, K., Abraham, J.: Algorithm-based fault tolerance for matrix operations. IEEE Transactions on Computers 100(6), 518-528 (1984)
-
(1984)
IEEE Transactions on Computers
, vol.100
, Issue.6
, pp. 518-528
-
-
Huang, K.1
Abraham, J.2
-
8
-
-
80052974659
-
A Log-Scaling Fault Tolerant Agreement Algorithm for a Fault Tolerant MPI
-
Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds.) EuroMPI 2011. Springer, Heidelberg
-
Hursey, J., Naughton, T., Vallee, G., Graham, R.L.: A Log-Scaling Fault Tolerant Agreement Algorithm for a Fault Tolerant MPI. In: Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds.) EuroMPI 2011. LNCS, vol. 6960, pp. 255-263. Springer, Heidelberg (2011)
-
(2011)
LNCS
, vol.6960
, pp. 255-263
-
-
Hursey, J.1
Naughton, T.2
Vallee, G.3
Graham, R.L.4
-
9
-
-
0022048209
-
EFFICIENT COMMIT PROTOCOLS for the TREE of PROCESSES MODEL of DISTRIBUTED TRANSACTIONS
-
Mohan, C., Lindsay, B.: Efficient commit protocols for the tree of processes model of distributed transactions. In: SIGOPS OSR, vol. 19, pp. 40-52. ACM (1985) (Pubitemid 15580239)
-
(1985)
Operating Systems Review (ACM)
, vol.19
, Issue.2
, pp. 40-52
-
-
Mohan, C.1
Lindsay, B.2
|