-
1
-
-
40749160036
-
Overview of the IBM Blue Gene/P project
-
Blue Gene Team
-
Blue Gene Team, "Overview of the IBM Blue Gene/P project," IBM Journal of Research and Development, 2008.
-
(2008)
IBM Journal of Research and Development
-
-
-
3
-
-
55849147399
-
Dynamic meta-learning for failure prediction in large-scale systems: A case study
-
J. Gu, Z. Zheng, Z. Lan, J. White, and B. Park. Dynamic meta-learning for failure prediction in large-scale systems: A case study. Proc. of ICPP, 2008.
-
(2008)
Proc. of ICPP
-
-
Gu, J.
Zheng, Z.
Lan, Z.
White, J.
Park, B.
-
4
-
-
77951481809
-
CiFTS: A coordinated infrastructure for fault-tolerant systems
-
R. Gupta, P. Beckman, B.-H. Park, E. Lusk, and P. Hargrove. CiFTS: A coordinated infrastructure for fault-tolerant systems. Proc. of ICPP, 2009.
-
(2009)
Proc. of ICPP
-
-
Gupta, R.
Beckman, P.
Park, B.-H.
Lusk, E.
Hargrove, P.
-
5
-
-
57049111494
-
Adaptive fault management of parallel applications for high performance computing
-
Z. Lan and Y. Li. Adaptive fault management of parallel applications for high performance computing. IEEE Trans. on Computers, 57(12):1647-1660, 2008.
-
(2008)
IEEE Trans. on Computers
, vol.57
, Issue.12
, pp. 1647-1660
-
-
Lan, Z.
Li, Y.
-
7
-
-
33845589803
-
Blue Gene/L failure analysis and prediction models
-
Y. Liang, Y. Zhang, M. Jette, A. Sivasubramaniam, and R. Sahoo. Blue Gene/L failure analysis and prediction models. Proc. of DSN, 2006.
-
(2006)
Proc. of DSN
-
-
Liang, Y.
Zhang, Y.
Jette, M.
Sivasubramaniam, A.
Sahoo, R.
-
8
-
-
70450055295
-
An adaptive semantic filter for Blue Gene/L failure log analysis systems
-
Y. Liang, Y. Zhang, H. Xiong, and R. Sahoo. An adaptive semantic filter for Blue Gene/L failure log analysis systems. Workshop on SMTPS, 2007.
-
(2007)
Workshop on SMTPS
-
-
Liang, Y.
Zhang, Y.
Xiong, H.
Sahoo, R.
-
10
-
-
53349174366
-
A log mining approach to failure analysis of enterprise telephony systems
-
C. Lim, N. Singh, and S. Yajnik. A log mining approach to failure analysis of enterprise telephony systems. Proc. of DSN, 2008.
-
(2008)
Proc. of DSN
-
-
Lim, C.
Singh, N.
Yajnik, S.
-
13
-
-
47249142074
-
Modeling the impact of checkpoints on next generation systems
-
R. Oldfield, S. Arunagiri, P. Teller, S. Seelam, and M. Varela. Modeling the impact of checkpoints on next generation systems. Proc. of MSST, 2007.
-
(2007)
Proc. of MSST
-
-
Oldfield, R.
Arunagiri, S.
Teller, P.
Seelam, S.
Varela, M.
-
14
-
-
34547424386
-
Cooperative checkpointing: A robust approach to large-scale systems reliability
-
A. Oliner, L. Rudolph, and R. Sahoo. Cooperative checkpointing: A robust approach to large-scale systems reliability. Proc. of ICS, 2006.
-
(2006)
Proc. of ICS
-
-
Oliner, A.
Rudolph, L.
Sahoo, R.
-
15
-
-
12444257746
-
Fault-aware job scheduling for Blue Gene/L systems
-
A. Oliner, R. Sahoo, J. Moreira, M. Gupta, and A. Sivasubramaniam. Fault-aware job scheduling for Blue Gene/L systems. Proc. of IPDPS, 2004.
-
(2004)
Proc. of IPDPS
-
-
Oliner, A.
Sahoo, R.
Moreira, J.
Gupta, M.
Sivasubramaniam, A.
-
16
-
-
36049013419
-
What supercomputers say: A study of five system logs
-
A. Oliner and J. Stearley. What supercomputers say: A study of five system logs. Proc. of DSN, 2007.
-
(2007)
Proc. of DSN
-
-
Oliner, A.
Stearley, J.
-
17
-
-
15744384822
-
Optimization of association rule mining using improved genetic algorithms
-
M. Saggar, A. Agrawal, and A. Lad. Optimization of association rule mining using improved genetic algorithms. Proc. of SMC, 2004.
-
(2004)
Proc. of SMC
-
-
Saggar, M.
Agrawal, A.
Lad, A.
-
18
-
-
12444270465
-
Critical event prediction for proactive management in large-scale computer clusters
-
R. Sahoo and A. Oliner et al. Critical event prediction for proactive management in large-scale computer clusters. Proc. of SIGKDD, 2003.
-
(2003)
Proc. of SIGKDD
-
-
Sahoo, R.
Oliner, A.
-
19
-
-
47249121233
-
Using hidden semi-Markov models for effective online failure prediction
-
F. Salfner and M. Malek. Using hidden semi-Markov models for effective online failure prediction. Proc. of SRDS, 2007.
-
(2007)
Proc. of SRDS
-
-
Salfner, F.
Malek, M.
-
20
-
-
4444380999
-
A survey of fault localization techniques in computer networks
-
M. Steinder and A. Sethi. A survey of fault localization techniques in computer networks. Science of Computer Programming, 53(2), 2004.
-
(2004)
Science of Computer Programming
, vol.53
, Issue.2
-
-
Steinder, M.
Sethi, A.
-
23
-
-
33847141517
-
Timeweaver: A genetic algorithm for identifying predictive patterns in sequences of events
-
G. Weiss. Timeweaver: A genetic algorithm for identifying predictive patterns in sequences of events. Genetic and Evolutionary Computation Conference, 1999.
-
(1999)
Genetic and Evolutionary Computation Conference
-
-
Weiss, G.