-
1
-
-
84860056236
-
-
BlueGene/L
-
BlueGene/L. http://www.llnl.gov/asci/platforms/bluegenel.
-
-
-
-
2
-
-
56749120377
-
-
Data Lifeguard. http://www.wdc.com/en/library/2579-850105.pdf.
-
Data Lifeguard
-
-
-
3
-
-
84860043723
-
-
SIGuardian. http://www.siguardian.com.
-
-
-
-
4
-
-
84860043722
-
-
The UC Berkeley/Stanford Recovery-Oriented Computing (ROC) Project. http://roc.cs.berkeley.edu/.
-
-
-
-
5
-
-
84860056238
-
-
11
-
TOP500 List 11/2004. http://www.top500.org/lists/2005/11/basic.
-
(2004)
-
-
-
8
-
-
4043157227
-
Reliability, Availability, and Serviceability (RAS) of the IBM eServer z990
-
M. L. Fair, C. R. Conklin, S. B. Swaney, P. J. Meaney, W. J. Clarke, L. C. Alves, I. N. Modi, F. Freier, W. Fischer, and N. E. Weber. Reliability, Availability, and Serviceability (RAS) of the IBM eServer z990. IBM Journal of Research and Development, 48(3/4), 2004.
-
(2004)
IBM Journal of Research and Development
, vol.48
, Issue.3-4
-
-
Fair, M.L.1
Conklin, C.R.2
Swaney, S.B.3
Meaney, P.J.4
Clarke, W.J.5
Alves, L.C.6
Modi, I.N.7
Freier, F.8
Fischer, W.9
Weber, N.E.10
-
9
-
-
0030147013
-
Minimizing completion time of a program by checkpointing and rejuvenation
-
May
-
S. Garg, Y. Huang, C. Kintala, and K. S. Trivedi. Minimizing Completion Time of a Program by Checkpointing and Rejuvenation. In Proceedings of the ACM SIGMETRICS 1996 Conference on Measurement and Modeling of Computer Systems, pages 252-261, May 1996.
-
(1996)
Proceedings of the ACM SIGMETRICS 1996 Conference on Measurement and Modeling of Computer Systems
, pp. 252-261
-
-
Garg, S.1
Huang, Y.2
Kintala, C.3
Trivedi, K.S.4
-
11
-
-
0036734883
-
Improved disk-drive failure warnings
-
G. F. Hughes, J. F. Murray, K. Kreutz-Delgado, and C. Elkan. Improved disk-drive failure warnings. IEEE Transactions on Reliability, 51(3):350-357, 2002.
-
(2002)
IEEE Transactions on Reliability
, vol.51
, Issue.3
, pp. 350-357
-
-
Hughes, G.F.1
Murray, J.F.2
Kreutz-Delgado, K.3
Elkan, C.4
-
12
-
-
84860048593
-
-
Autonomic computing initiative
-
IBM. Autonomic computing initiative, 2002. http://www.research.ibm.com/autonomic/index_nf.html.
-
(2002)
-
-
-
14
-
-
27544497222
-
Filtering failure logs for a BlueGene/L prototype
-
Y. Liang, Y. Zhang, A. Sivasubramaniam, R. Sahoo, J. Moreira, and M. Gupta. Filtering failure logs for a BlueGene/L prototype. In Proceedings of the International Conference on Dependable Systems and Networks (DSN), 2005.
-
(2005)
Proceedings of the International Conference on Dependable Systems and Networks (DSN)
-
-
Liang, Y.1
Zhang, Y.2
Sivasubramaniam, A.3
Sahoo, R.4
Moreira, J.5
Gupta, M.6
-
17
-
-
84944403418
-
A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor
-
S. Mukherjee, C. Weaver, J. Emer, S. Reinhardt, and T. Austin. A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor. In Proceedings of the International Symposium on Microarchitecture (MICRO), pages 29-40, 2003.
-
(2003)
Proceedings of the International Symposium on Microarchitecture (MICRO)
, pp. 29-40
-
-
Mukherjee, S.1
Weaver, C.2
Emer, J.3
Reinhardt, S.4
Austin, T.5
-
18
-
-
12444257746
-
Fault-aware job scheduling for BlueGene/L systems
-
A. J. Oliner, R. Sahoo, J. E. Moreira, M. Gupta, and A. Sivasubramaniam. Fault-aware Job Scheduling for BlueGene/L Systems. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 2004.
-
(2004)
Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS)
-
-
Oliner, A.J.1
Sahoo, R.2
Moreira, J.E.3
Gupta, M.4
Sivasubramaniam, A.5
-
20
-
-
77952378080
-
Critical event prediction for proactive management in large-scale computer clusters
-
August
-
R. K. Sahoo, A. J. Oliner, I. Rish, M. Gupta, J. E. Moreira, S. Ma, R. Vilalta, and A. Sivasubramaniam. Critical Event Prediction for Proactive Management in Large-scale Computer Clusters. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2003.
-
(2003)
Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
-
-
Sahoo, R.K.1
Oliner, A.J.2
Rish, I.3
Gupta, M.4
Moreira, J.E.5
Ma, S.6
Vilalta, R.7
Sivasubramaniam, A.8
-
21
-
-
0038684860
-
Temperature-aware microarchitecture
-
June
-
K. Skadron, M. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan. Temperature-Aware Microarchitecture. In Proceedings of the International Symposium on Computer Architecture (ISCA), pages 1-13, June 2003.
-
(2003)
Proceedings of the International Symposium on Computer Architecture (ISCA)
, pp. 1-13
-
-
Skadron, K.1
Stan, M.2
Huang, W.3
Velusamy, S.4
Sankaranarayanan, K.5
Tarjan, D.6
-
22
-
-
77956226483
-
A reliability odometer - Lemon check your processor!
-
J. Srinivasan, S. V. Adve, P. Bose, and J. A. Rivers. A reliability odometer - lemon check your processor! In Proceedings of the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XI), The Wild and Crazy Idea Session IV, 2004.
-
(2004)
Proceedings of the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XI), the Wild and Crazy Idea Session IV
-
-
Srinivasan, J.1
Adve, S.V.2
Bose, P.3
Rivers, J.A.4
-
25
-
-
0026869241
-
Analysis and modeling of correlated failures in multicomputer systems
-
D. Tang and R. K. Iyer. Analysis and modeling of correlated failures in multicomputer systems. IEEE Transactions on Computers, 41(5):567-577, 1992.
-
(1992)
IEEE Transactions on Computers
, vol.41
, Issue.5
, pp. 567-577
-
-
Tang, D.1
Iyer, R.K.2
-
27
-
-
0034832697
-
Analysis and implementation of software rejuvenation in cluster systems
-
June
-
K. Vaidyanathan, R. E. Harper, S. W. Hunter, and K. S. Trivedi. Analysis and Implementation of Software Rejuvenation in Cluster Systems. In Proceedings of the ACM SIGMETRICS 2001 Conference on Measurement and Modeling of Computer Systems, pages 62-71, June 2001.
-
(2001)
Proceedings of the ACM SIGMETRICS 2001 Conference on Measurement and Modeling of Computer Systems
, pp. 62-71
-
-
Vaidyanathan, K.1
Harper, R.E.2
Hunter, S.W.3
Trivedi, K.S.4