-
1
-
-
83455261683
-
Experimental assessment of workstation failures and their impact on checkpointing systems
-
J. S. Plank and W. R. Elwasif. Experimental assessment of workstation failures and their impact on checkpointing systems. In Proceedings of FTCS-98.
-
Proceedings of FTCS-98
-
-
Plank, J.S.1
Elwasif, W.R.2
-
3
-
-
83455247703
-
Failure data-driven selective node-level duplication to improve MTTF in high performance computing systems
-
June
-
N. Nakka, A. Choudhary, "Failure data-driven selective node-level duplication to improve MTTF in High Performance Computing Systems", In Proceedings of HPCS 2009, June 2009.
-
(2009)
Proceedings of HPCS 2009
-
-
Nakka, N.1
Choudhary, A.2
-
6
-
-
45749113088
-
Modeling machine availability in enterprise and wide-area distributed computing environments
-
D. Nurmi, J. Brevik, and R. Wolski. Modeling machine availability in enterprise and wide-area distributed computing environments. In Euro-Par'05, 2005.
-
(2005)
Euro-Par'05
-
-
Nurmi, D.1
Brevik, J.2
Wolski, R.3
-
9
-
-
84958782417
-
Networked Windows NT system field failure data analysis
-
J. Xu, Z. Kalbarczyk, and R. K. Iyer. Networked Windows NT system field failure data analysis. In Proc. of the PRDC, 1999.
-
(1999)
Proc. of the PRDC
-
-
Xu, J.1
Kalbarczyk, Z.2
Iyer, R.K.3
-
10
-
-
33845593340
-
A large-scale study of failures in high-performance-computing systems
-
June
-
B. Schroeder and G. Gibson. A large-scale study of failures in high-performance-computing systems. In Proceedings of the DSN, June 2006.
-
(2006)
Proceedings of the DSN
-
-
Schroeder, B.1
Gibson, G.2
-
11
-
-
84976815079
-
Measurement and modeling of computer reliability as affected by system activity
-
R. K. Iyer, D. J. Rossetti, and M. C. Hsueh. Measurement and modeling of computer reliability as affected by system activity. ACM Transactions on Computer Systems, Vol. 4, No. 3, 1986.
-
(1986)
ACM Transactions on Computer Systems
, vol.4
, Issue.3
-
-
Iyer, R.K.1
Rossetti, D.J.2
Hsueh, M.C.3
-
13
-
-
36049013419
-
What supercomputers say: A study of five system logs
-
UK, June
-
Adam J. Oliner, Jon Stearley: What Supercomputers Say: A Study of Five System Logs. In Proceedings of the DSN, Edinburgh, UK, June 2007, pp. 575-584.
-
(2007)
Proceedings of the DSN, Edinburgh
, pp. 575-584
-
-
Oliner, A.J.1
Stearley, J.2
-
14
-
-
55849103487
-
A fault diagnosis and prognosis service for TeraGrid clusters
-
Z. Lan, Y. Li, P. Gujrati, Z. Zheng, R. Thakur, and J. White, "A Fault Diagnosis and Prognosis Service for TeraGrid Clusters", In Proceedings of TeraGrid'07, 2007.
-
(2007)
Proceedings of TeraGrid'07
-
-
Lan, Z.1
Li, Y.2
Gujrati, P.3
Zheng, Z.4
Thakur, R.5
White, J.6
-
15
-
-
47249123819
-
Exploring meta-learning to improve failure prediction in supercomputing clusters
-
P. Gujrati, Y. Li, Z. Lan, R. Thakur, and J. White, "Exploring Meta-learning to Improve Failure Prediction in Supercomputing Clusters", In Proceedings of ICPP, 2007.
-
(2007)
Proceedings of ICPP
-
-
Gujrati, P.1
Li, Y.2
Lan, Z.3
Thakur, R.4
White, J.5
-
16
-
-
79952168926
-
Using adaptive fault tolerance to improve application robustness on the TeraGrid
-
Y. Li and Z. Lan, "Using Adaptive Fault Tolerance to Improve Application Robustness on the TeraGrid", In Proceedings of TeraGrid'07, 2007.
-
(2007)
Proceedings of TeraGrid'07
-
-
Li, Y.1
Lan, Z.2
-
17
-
-
57049111494
-
Adaptive fault management of parallel applications for high performance computing
-
Z. Lan and Y. Li, "Adaptive Fault Management of Parallel Applications for High Performance Computing", IEEE Transactions on Computers, Vol. 57, No. 12, pp. 1647-1660, 2008.
-
(2008)
IEEE Transactions on Computers
, vol.57
, Issue.12
, pp. 1647-1660
-
-
Lan, Z.1
Li, Y.2
-
18
-
-
12444257746
-
Fault-aware job scheduling for BlueGene/L systems
-
A. J. Oliner, R. K. Sahoo, J. E. Moreira, M. Gupta, and A. Sivasubramaniam. Fault-aware job scheduling for BlueGene/L systems. In Proceedings of the 18th IPDPS, 2004.
-
(2004)
Proceedings of the 18th IPDPS
-
-
Oliner, A.J.1
Sahoo, R.K.2
Moreira, J.E.3
Gupta, M.4
Sivasubramaniam, A.5
-
20
-
-
34249832377
-
A Bayesian method for the induction of probabilistic networks from data
-
G. Cooper, E. Herskovits (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning. 9(4):309-347.
-
(1992)
Machine Learning
, vol.9
, Issue.4
, pp. 309-347
-
-
Cooper, G.1
Herskovits, E.2
-
23
-
-
0006452367
-
The alternating decision tree learning algorithm
-
Bled, Slovenia
-
Freund, Y., Mason, L.: The alternating decision tree learning algorithm. In: Proceedings of the Sixteenth International Conference on Machine Learning, Bled, Slovenia, 124-133, 1999.
-
(1999)
Proceedings of the Sixteenth International Conference on Machine Learning
, pp. 124-133
-
-
Freund, Y.1
Mason, L.2
-
24
-
-
0035478854
-
Random forests
-
Leo Breiman (2001). Random Forests. Machine Learning. 45(1):5-32.
-
(2001)
Machine Learning
, vol.45
, Issue.1
, pp. 5-32
-
-
Breiman, L.1