2011, Pages 15-22

Establishing hypothesis for recurrent system failures from cluster log files

Author keywords

Failure diagnosis; Hypothesis testing; Large cluster systems; Reliability; Syslogs

Indexed keywords

CAUSAL RELATIONSHIPS; CORRELATION ANALYSIS; DIAGNOSE SYSTEM; EVENT SEQUENCE; FAILURE DIAGNOSIS; FAILURE DIAGNOSTICS; FILE SYSTEMS; HIGH CONFIDENCE; HYPOTHESIS TESTING; LARGE CLUSTERS; LOG ANALYSIS; LOG FILE; OPEN SOURCE SOFTWARE; SECOND GENERATION; SYSLOGS; SYSTEM FAILURES; SYSTEMS ADMINISTRATOR; UNIVERSITY OF TEXAS

EID: 84856109383     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/DASC.2011.27     Document Type: Conference Paper
Times cited: 9

References (30)
  • 1
    • 0025502686
    • Error log analysis: Statistical modeling and heuristic trend analysis
    • T.-T. Y. Lin and D. P. Siewiorek, "Error log analysis: Statistical modeling and heuristic trend analysis," IEEE Transactions on Reliability, vol. 39, no. 4, 1990.
    • (1990) IEEE Transactions on Reliability , vol.39 , Issue.4
    • Lin, T.-T.Y.1    Siewiorek, D.P.2
  • 3
    • 33847328785
    • Availability assessment of SunOS/Solaris Unix systems based on syslogd and wtmpx log files: A case study
    • C. Simache and M. Kaaniche, "Availability assessment of SunOS/Solaris Unix systems based on syslogd and wtmpx log files: A case study," in Proceedings of IEEE PRDC, Dec 2005.
    • Proceedings of IEEE PRDC, Dec 2005
    • Simache, C.1    Kaaniche, M.2
  • 5
    • 33845593340
    • A large-scale study of failures in high-performance computing systems
    • B. Schroeder and G. Gibson, "A large-scale study of failures in high-performance computing systems," in Proceedings of IEEE/IFIP DSN, 2006, pp. 249-258.
    • Proceedings of IEEE/IFIP DSN, 2006 , pp. 249-258
    • Schroeder, B.1    Gibson, G.2
  • 9
    • 84856079819
    • One graph is worth a thousand logs: Uncovering hidden structures in massive system event logs
    • M. Aharon, G. Barash, I. Cohen, and E. Mordechai, "One graph is worth a thousand logs: Uncovering hidden structures in massive system event logs," in Proceedings of ECML PKDD, 2009.
    • Proceedings of ECML PKDD, 2009
    • Aharon, M.1    Barash, G.2    Cohen, I.3    Mordechai, E.4
  • 12
    • 4243934975
    • Trend analysis and fault prediction
    • M. M. Tsao, "Trend analysis and fault prediction," PhD Thesis, Department of Electrical and Computer Engineering, Carnegie Mellon University, 1983.
    • (1983) PhD Thesis, Department of Electrical and Computer Engineering, Carnegie Mellon University
    • Tsao, M.M.1
  • 15
    • 56749178938
    • Exploring event correlation for failure prediction in coalitions of clusters
    • S. Fu and C.-Z. Xu, "Exploring event correlation for failure prediction in coalitions of clusters," in Proceedings of ACM/IEEE Supercomputing, no. 41, 2007.
    • (2007) Proceedings of ACM/IEEE Supercomputing , Issue.41
    • Fu, S.1    Xu, C.-Z.2
  • 20
    • 77956291503
    • End-to-end framework for fault management for open source clusters: Ranger
    • J. L. Hammond, T. Minyard, and J. Browne, "End-to-end framework for fault management for open source clusters: Ranger," in Proceedings of ACM TeraGrid, no. 9, 2010.
    • (2010) Proceedings of ACM TeraGrid , Issue.9
    • Hammond, J.L.1    Minyard, T.2    Browne, J.3
  • 24
    • 49049104267
    • Automated system monitoring and notification with swatch
    • S. E. Hansen and E. T. Atkins, "Automated system monitoring and notification with swatch," in USENIX LISA, 1993.
    • (1993) USENIX LISA
    • Hansen, S.E.1    Atkins, E.T.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.