-
1
-
-
84866852589
-
Hydee: Failure containment without event logging for large scale send-deterministic mpi applications
-
May
-
Guermouche, A., Ropars, T., Snir, M., Cappello, F.: Hydee: Failure containment without event logging for large scale send-deterministic mpi applications. In: 2012 IEEE 26th International Parallel and Distributed Processing Symposium (IPDPS), pp. 1216-1227 (May 2012)
-
(2012)
2012 IEEE 26th International Parallel and Distributed Processing Symposium (IPDPS)
, pp. 1216-1227
-
-
Guermouche, A.1
Ropars, T.2
Snir, M.3
Cappello, F.4
-
3
-
-
84877693592
-
Fault prediction under the microscope: A closer look into hpc systems
-
IEEE Computer Society Press, Los Alamitos
-
Gainaru, A., Cappello, F., Snir, M., Kramer, W.: Fault prediction under the microscope: a closer look into hpc systems. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC 2012, pp. 77:1-77:11. IEEE Computer Society Press, Los Alamitos (2012)
-
(2012)
Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC 2012
-
-
Gainaru, A.1
Cappello, F.2
Snir, M.3
Kramer, W.4
-
4
-
-
84866712387
-
Assessing time coalescence techniques for the analysis of supercomputer logs
-
Di Martino, C., Cinque, M., Cotroneo, D.: Assessing time coalescence techniques for the analysis of supercomputer logs. In: IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012), pp. 1-12 (2012)
-
(2012)
IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012)
, pp. 1-12
-
-
Di Martino, C.1
Cinque, M.2
Cotroneo, D.3
-
5
-
-
0029703899
-
A comparative analysis of event tupling schemes
-
IEEE Computer Society, Washington, DC
-
Buckley, M.F., Siewiorek, D.P.: A comparative analysis of event tupling schemes. In: FTCS 1996: Proc. of the Twenty-Sixth Annual Int. Symp. on Fault-Tolerant Computing (FTCS 1996), p. 294. IEEE Computer Society, Washington, DC (1996)
-
(1996)
FTCS 1996: Proc. of the Twenty-Sixth Annual Int. Symp. on Fault-Tolerant Computing (FTCS 1996)
, p. 294
-
-
Buckley, M.F.1
Siewiorek, D.P.2
-
6
-
-
33845589803
-
Bluegene/l failure analysis and prediction models
-
Liang, Y., Zhang, Y., Sivasubramaniam, A., Jette, M., Sahoo, R.: Bluegene/l failure analysis and prediction models. In: Int. Conference on Dependable Systems and Networks, DSN 2006, pp. 425-434 (2006)
-
(2006)
Int. Conference on Dependable Systems and Networks, DSN 2006
, pp. 425-434
-
-
Liang, Y.1
Zhang, Y.2
Sivasubramaniam, A.3
Jette, M.4
Sahoo, R.5
-
7
-
-
80052167311
-
Models for time coalescence in event logs
-
July
-
Hansen, J., Siewiorek, D.: Models for time coalescence in event logs. In: Twenty-Second Int. Symp. on Fault-Tolerant Computing, FTCS-22, Digest of Papers, pp. 221-227 (July 1992)
-
(1992)
Twenty-Second Int. Symp. on Fault-Tolerant Computing, FTCS-22, Digest of Papers
, pp. 221-227
-
-
Hansen, J.1
Siewiorek, D.2
-
9
-
-
4544382099
-
Failure data analysis of a large-scale heterogeneous server environment
-
IEEE Computer Society, Washington, DC
-
Sahoo, R.K., Sivasubramaniam, A., Squillante, M.S., Zhang, Y.: Failure data analysis of a large-scale heterogeneous server environment. In: DSN 2004: Proc. of the 2004 Int. Conference on Dependable Systems and Networks, p. 772. IEEE Computer Society, Washington, DC (2004)
-
(2004)
DSN 2004: Proc. of the 2004 Int. Conference on Dependable Systems and Networks
, p. 772
-
-
Sahoo, R.K.1
Sivasubramaniam, A.2
Squillante, M.S.3
Zhang, Y.4
-
10
-
-
27544497222
-
Filtering failure logs for a bluegene/l prototype
-
IEEE Computer Society, Washington, DC
-
Liang, Y., Sivasubramaniam, A., Moreira, J.: Filtering failure logs for a bluegene/l prototype. In: DSN 2005: Proc. of the 2005 Int. Conference on Dependable Systems and Networks, pp. 476-485. IEEE Computer Society, Washington, DC (2005)
-
(2005)
DSN 2005: Proc. of the 2005 Int. Conference on Dependable Systems and Networks
, pp. 476-485
-
-
Liang, Y.1
Sivasubramaniam, A.2
Moreira, J.3
-
11
-
-
78651588409
-
Event log based dependability analysis of windows nt and 2k systems
-
IEEE Computer Society, Washington, DC
-
Simache, C., Kaâniche, M., Saidane, A.: Event log based dependability analysis of windows nt and 2k systems. In: PRDC 2002: Proc. of the 2002 Pacific Rim Int. Symp. on Dependable Computing, p. 311. IEEE Computer Society, Washington, DC (2002)
-
(2002)
PRDC 2002: Proc. of the 2002 Pacific Rim Int. Symp. on Dependable Computing
, p. 311
-
-
Simache, C.1
Kaâniche, M.2
Saidane, A.3
-
12
-
-
80051683646
-
Security and performance trade-off in perfcloud
-
Guarracino, M.R., Vivien, F., Träff, J.L., Cannatoro, M., Danelutto, M., Hast, A., Perla, F., Knüpfer, A., Di Martino, B., Alexander, M. (eds.) Euro-Par-Workshop 2010. Springer, Heidelberg
-
Casola, V., Cuomo, A., Rak, M., Villano, U.: Security and performance trade-off in perfcloud. In: Guarracino, M.R., Vivien, F., Träff, J.L., Cannatoro, M., Danelutto, M., Hast, A., Perla, F., Knüpfer, A., Di Martino, B., Alexander, M. (eds.) Euro-Par-Workshop 2010. LNCS, vol. 6586, pp. 633-640. Springer, Heidelberg (2011)
-
(2011)
LNCS
, vol.6586
, pp. 633-640
-
-
Casola, V.1
Cuomo, A.2
Rak, M.3
Villano, U.4
-
13
-
-
78650855128
-
A fault avoidance strategy improving the reliability of the EGI production grid infrastructure
-
Lu, C., Masuzawa, T., Mosbah, M. (eds.) OPODIS 2010. Springer, Heidelberg
-
Palmieri, F., Pardi, S., Veronesi, P.: A fault avoidance strategy improving the reliability of the EGI production grid infrastructure. In: Lu, C., Masuzawa, T., Mosbah, M. (eds.) OPODIS 2010. LNCS, vol. 6490, pp. 159-172. Springer, Heidelberg (2010)
-
(2010)
LNCS
, vol.6490
, pp. 159-172
-
-
Palmieri, F.1
Pardi, S.2
Veronesi, P.3
-
14
-
-
84866614504
-
Modelling the behaviour of an adaptive scheduling controller
-
Barone, G.B., Boccia, V., Bottalico, D., Carracciuolo, L., Doria, A., Laccetti, G.: Modelling the behaviour of an adaptive scheduling controller. In: 2012 Sixth International Conference on Complex, Intelligent and Software Intensive Systems (CISIS), pp. 438-442 (2012)
-
(2012)
2012 Sixth International Conference on Complex, Intelligent and Software Intensive Systems (CISIS)
, pp. 438-442
-
-
Barone, G.B.1
Boccia, V.2
Bottalico, D.3
Carracciuolo, L.4
Doria, A.5
Laccetti, G.6
-
16
-
-
84866681807
-
A framework for assessing the dependability of supercomputers via automated log analysis
-
Di Martino, C., Cotroneo, D., Kalbarczyk, Z., Iyer, R.K.: A framework for assessing the dependability of supercomputers via automated log analysis. In: DSN 2008: Sup. Volume of Proc. of the Int. Conference on Dependable Systems and Networks, Anchorage, AK, pp. 383-384 (2008)
-
(2008)
DSN 2008: Sup. Volume of Proc. of the Int. Conference on Dependable Systems and Networks, Anchorage, AK
, pp. 383-384
-
-
Di Martino, C.1
Cotroneo, D.2
Kalbarczyk, Z.3
Iyer, R.K.4
-
17
-
-
80051915968
-
Improving log-based field failure data analysis of multi-node computing systems
-
June
-
Pecchia, A., Cotroneo, D., Kalbarczyk, Z., Iyer, R.K.: Improving log-based field failure data analysis of multi-node computing systems. In: 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks (DSN), pp. 97-108 (June 2011)
-
(2011)
2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks (DSN)
, pp. 97-108
-
-
Pecchia, A.1
Cotroneo, D.2
Kalbarczyk, Z.3
Iyer, R.K.4
-
18
-
-
0030379933
-
Analyze-now-an environment for collection and analysis of failures in a network of workstations
-
Thakur, A., Iyer, R.K.: Analyze-now-an environment for collection and analysis of failures in a network of workstations. IEEE Transactions on Reliability 45(4), 561-570 (1996)
-
(1996)
IEEE Transactions on Reliability
, vol.45
, Issue.4
, pp. 561-570
-
-
Thakur, A.1
Iyer, R.K.2
-
19
-
-
0033344278
-
Failure data analysis of a lan of windows nt based computers
-
Kalyanakrishnam, M., Kalbarczyk, Z., Iyer, R.: Failure data analysis of a lan of windows nt based computers. In: Proc. of the 18th IEEE Symp. on Reliable Distributed Systems, pp. 178-187 (1999)
-
(1999)
Proc. of the 18th IEEE Symp. on Reliable Distributed Systems
, pp. 178-187
-
-
Kalyanakrishnam, M.1
Kalbarczyk, Z.2
Iyer, R.3
-
22
-
-
70449794134
-
System log pre-processing to improve failure prediction
-
Zheng, Z., Lan, Z., Park, B., Geist, A.: System log pre-processing to improve failure prediction. In: IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2009, June 29-July 2, pp. 572-577 (2009)
-
(2009)
IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2009, June 29-July 2
, pp. 572-577
-
-
Zheng, Z.1
Lan, Z.2
Park, B.3
Geist, A.4
-
24
-
-
36049002858
-
How do mobile phones fail? A failure data analysis of symbian os smart phones
-
IEEE Computer Society, Washington, DC
-
Cinque, M., Cotroneo, D., Kalbarczyk, Z., Iyer, R.K.: How do mobile phones fail? a failure data analysis of symbian os smart phones. In: Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2007, pp. 585-594. IEEE Computer Society, Washington, DC (2007)
-
(2007)
Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2007
, pp. 585-594
-
-
Cinque, M.1
Cotroneo, D.2
Kalbarczyk, Z.3
Iyer, R.K.4