



Volume , Issue , 2013, Pages 111-120

Linking resource usage anomalies with system failures from cluster log data

Author keywords

Cluster log data; Large clusters; Linux OS; Lustre file system; Resource Anomalies and Failures

Indexed keywords

FILESYSTEM; LARGE CLUSTER SYSTEMS; LARGE CLUSTERS; LINKING RESOURCES; LINUX O/S; LOG DATA; MULTIPLE SOURCE; RESOURCE-MONITORING TOOLS

EID: 84891501027     PISSN: 10609857     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/SRDS.2013.20     Document Type: Conference Paper
Times cited: 39

References (41)
  • 1
    • 33845593340
    • A large-scale study of failures in high-performance computing systems
    • B. Schroeder and G. A. Gibson, "A large-scale study of failures in high-performance computing systems," in Proceedings of IEEE/IFIP DSN, 2006, pp. 249-258.
    • (2006) Proceedings of IEEE/IFIP DSN, pp. 249-258
    • Schroeder, B.; Gibson, G.A.
  • 4
    • 84966284395
    • Probabilistic diagnosis of performance faults in large scale parallel applications
    • I. Laguna, D. H. Anh, B. R. de Supinski, S. Bagchi, and T. Gamblin, "Probabilistic diagnosis of performance faults in large scale parallel applications," in Proceedings of PACT, 2012, pp. 1-10.
    • (2012) Proceedings of PACT, pp. 1-10
    • Laguna, I.; Anh, D.H.; De Supinski, B.R.; Bagchi, S.; Gamblin, T.
  • 5
  • 6
    • 79952786041
    • Anomaly detection in large-scale coalition clusters for dependability assurance
    • Q. Guan, D. Smith, and S. Fu, "Anomaly detection in large-scale coalition clusters for dependability assurance," in Proceedings of IEEE HiPC, 2010, pp. 1-10.
    • (2010) Proceedings of IEEE HiPC, pp. 1-10
    • Guan, Q.; Smith, D.; Fu, S.
  • 9
    • 84867695274
    • 3-dimensional root cause diagnosis via co-analysis
    • Z. Zheng, L. Yu, Z. Lan, and T. Jones, "3-dimensional root cause diagnosis via co-analysis," in Proceedings of ACM ICAC, 2012, pp. 181-190.
    • (2012) Proceedings of ACM ICAC, pp. 181-190
    • Zheng, Z.; Yu, L.; Lan, Z.; Jones, T.
  • 12
  • 13
    • 84874306678
    • Logmaster: Mining event correlations in logs of large-scale cluster systems
    • X. Fu, R. Ren, J. Zhan, W. Zhou, Z. Jia, and G. Lu, "Logmaster: Mining event correlations in logs of large-scale cluster systems," in Proceedings of IEEE SRDS, 2012, pp. 1-10.
    • (2012) Proceedings of IEEE SRDS, pp. 1-10
    • Fu, X.; Ren, R.; Zhan, J.; Zhou, W.; Jia, Z.; Lu, G.
  • 14
    • 84891541902
    • TACC Stats: I/O performance monitoring for the intransigent
    • J. Hammond, "TACC Stats: I/O performance monitoring for the intransigent," in Invited Keynote for the 3rd IASDS Workshop, 2011, pp. 1-29.
    • (2011) Invited Keynote for the 3rd IASDS Workshop, pp. 1-29
    • Hammond, J.
  • 16
    • 77956291503
    • End-to-end framework for fault management for open source clusters: Ranger
    • J. L. Hammond, T. Minyard, and J. Browne, "End-to-end framework for fault management for open source clusters: Ranger," in Proceedings of ACM TeraGrid, no. 9, 2010.
    • (2010) Proceedings of ACM TeraGrid, Issue 9
    • Hammond, J.L.; Minyard, T.; Browne, J.
  • 18
    • 84891524202
    • T. T. Project
    • T. T. Project, http://sebastien.godard.pagesperso-orange.fr/.
  • 19
    • 80053278089
    • Co-analysis of RAS log and job log on Blue Gene/P
    • Z. Zheng, L. Yu, W. Tang, and Z. Lan, "Co-analysis of RAS log and job log on Blue Gene/P," in Proceedings of IEEE IPDPS, 2011, pp. 840-851.
    • (2011) Proceedings of IEEE IPDPS, pp. 840-851
    • Zheng, Z.; Yu, L.; Tang, W.; Lan, Z.
  • 20
    • 36049013419
    • What supercomputers say: A study of five system logs
    • A. Oliner and J. Stearley, "What supercomputers say: A study of five system logs," in Proceedings of IEEE/IFIP DSN, June 2007, pp. 575-584.
    • (2007) Proceedings of IEEE/IFIP DSN, pp. 575-584
    • Oliner, A.; Stearley, J.
  • 23
    • 0042826822
    • Independent component analysis: Algorithms and applications
    • A. Hyvarinen and E. Oja, "Independent component analysis: Algorithms and applications," Neural Networks, vol. 13, no. 4-5, pp. 411-430, 2000.
    • (2000) Neural Networks, vol. 13, Issue 4-5, pp. 411-430
    • Hyvarinen, A.; Oja, E.
  • 25
    • 0000178613
    • On the reciprocal of the general algebraic matrix
    • E. H. Moore, "On the reciprocal of the general algebraic matrix," Bulletin of the AMS, vol. 26, no. 9, pp. 394-395, 1920.
    • (1920) Bulletin of the AMS, vol. 26, Issue 9, pp. 394-395
    • Moore, E.H.
  • 27
    • 80051926966
    • Online detection of multi-component interactions in production systems
    • A. J. Oliner and A. Aiken, "Online detection of multi-component interactions in production systems," in Proceedings of IEEE/IFIP DSN, 2011, pp. 49-60.
    • (2011) Proceedings of IEEE/IFIP DSN, pp. 49-60
    • Oliner, A.J.; Aiken, A.
  • 29
    • 84891522525
    • Lustre
    • Lustre, http://wiki.lustre.org/manual/LustreManual18_HTML/index.html.
  • 34
    • 74949101845
    • A framework for distributed monitoring and root cause analysis for large IP networks
    • D. Banerjee, V. Madduri, and M. Srivatsa, "A framework for distributed monitoring and root cause analysis for large IP networks," in Proceedings of IEEE SRDS, 2009, pp. 246-255.
    • (2009) Proceedings of IEEE SRDS, pp. 246-255
    • Banerjee, D.; Madduri, V.; Srivatsa, M.
  • 37
    • 84866677413
    • Adaptive algorithms for diagnosing large-scale failures in computer networks
    • S. Tati, B. J. Ko, G. Cao, A. Swami, and T. L. Porta, "Adaptive algorithms for diagnosing large-scale failures in computer networks," in Proceedings of IEEE/IFIP DSN, 2012, pp. 1-12.
    • (2012) Proceedings of IEEE/IFIP DSN, pp. 1-12
    • Tati, S.; Ko, B.J.; Cao, G.; Swami, A.; Porta, T.L.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.