[3] R. K. Sahoo et al.: Critical Event Prediction for Proactive Management in Large-Scale Computer Clusters. International Conference on Knowledge Discovery and Data Mining, pp. 426-435, 2003.
[4] N. Yigitbasi et al.: Analysis and Modeling of Time-Correlated Failures in Large-Scale Distributed Systems. IEEE/ACM International Conference on Grid Computing, pp. 65-72, 2010.
[5] J. G. Lou et al.: Mining Dependency in Distributed Systems through Unstructured Logs Analysis. ACM SIGOPS, vol. 44, no. 1, January 2010.
[6] W. Xu et al.: Online System Problem Detection by Mining Patterns of Console Logs. IEEE International Conference on Data Mining, pp. 588-597, 2009.
[8] J. Gu et al.: Dynamic Meta-Learning for Failure Prediction in Large-Scale Systems: A Case Study. International Conference on Parallel Processing, pp. 157-164, 2008.
[12] A. Gainaru, F. Cappello et al.: Event Log Mining Tool for Large Scale HPC Systems. Euro-Par Conference, September 2011.
[14] The TOP500 Supercomputer Sites. www.top500.org
[17] E. Elnozahy and J. Plank: Checkpointing for Peta-Scale Systems: A Look into the Future of Practical Rollback-Recovery. Dependable and Secure Computing, pp. 97-108, April 2004.
[19] E. Heien, D. Kondo, A. Gainaru, F. Cappello: Modeling and Tolerating Heterogeneous Failures in Large Parallel Systems. Supercomputing 2011, November 2011.
[21] A. Gara et al.: Overview of the Blue Gene/L System Architecture. IBM Journal of Research and Development, 49(2):195-212, March 2005.
[22] National Center for Supercomputing Applications at the University of Illinois. www.ncsa.illinois.edu. Accessed 2010.