2012, Pages 1168-1179

Taming of the shrew: Modeling the normal and faulty behaviour of large-scale HPC systems

Author keywords

fault detection; fault tolerance; large scale HPC systems; signal analysis

Indexed keywords

COMPLEX MACHINES; ENTIRE SYSTEM; EVENT ANALYSIS; EVENT MINING; FAILURE RATE; FAULT PREDICTION; FILTERING ALGORITHM; HARDWARE AND SOFTWARE COMPONENTS; NORMAL FLOW; PROACTIVE FAULT; SIGNAL ANALYZERS; SYSTEM ADMINISTRATION; SYSTEM OUTPUT; SYSTEM STATE

EID: 84866885057     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPS.2012.107     Document Type: Conference Paper
Times cited: 61

References (29)
  • 3
    • 77952378080
    • Critical Event Prediction for Proactive Management in Large-scale Computer Clusters
    • R. K. Sahoo et al: Critical Event Prediction for Proactive Management in Large-scale Computer Clusters. International Conference on Knowledge Discovery and Data Mining, pp 426-435, 2003
    • (2003) International Conference on Knowledge Discovery and Data Mining , pp. 426-435
    • Sahoo, R.K.1
  • 4
    • 79951644113
    • Analysis and Modeling of Time-Correlated Failures in Large-Scale Distributed Systems
    • N. Yigitbasi et al: Analysis and Modeling of Time-Correlated Failures in Large-Scale Distributed Systems. IEEE/ACM International Conference on Grid Computing, pp 65-72, 2010
    • (2010) IEEE/ACM International Conference on Grid Computing , pp. 65-72
    • Yigitbasi, N.1
  • 5
    • 77958132122
    • Mining Dependency in Distributed Systems through Unstructured Logs Analysis
    • J. G. Lou et al: Mining Dependency in Distributed Systems through Unstructured Logs Analysis. ACM SIGOPS, Volume 44, Issue 1, January 2010
    • (2010) ACM SIGOPS , vol.44 , Issue.1
    • Lou, J.G.1
  • 6
    • 77951145583
    • Online System Problem Detection by Mining Patterns of Console Logs
    • W. Xu et al: Online System Problem Detection by Mining Patterns of Console Logs. IEEE International Conference on Data Mining, pp 588-597, 2009
    • (2009) IEEE International Conference on Data Mining , pp. 588-597
    • Xu, W.1
  • 8
    • 55849147399
    • Dynamic Meta-Learning for Failure Prediction in Large-Scale Systems: A Case Study
    • J. Gu et al: Dynamic Meta-Learning for Failure Prediction in Large-Scale Systems: A Case Study. International Conference on Parallel Processing, pp 157-164, 2008
    • (2008) International Conference on Parallel Processing , pp. 157-164
    • Gu, J.1
  • 14
    • 0025416073
    • Automatic recognition of intermittent failures: An experimental study of field data
    • R. Iyer et al: Automatic recognition of intermittent failures: An experimental study of field data. IEEE Transactions on Computers, 39:525-537, 1990.
    • (1990) IEEE Transactions on Computers , vol.39 , pp. 525-537
    • Iyer, R.1
  • 20
    • 33845593340
    • A large-scale study of failures in high-performance computing systems
    • B. Schroeder et al: A large-scale study of failures in high-performance computing systems. IEEE DSN, pages 249-258, June 2006
    • (2006) IEEE DSN , pp. 249-258
    • Schroeder, B.1
  • 21
    • 84954739936
    • The robustness of the two-sample t-test over the Pearson system
    • H. Posten: The robustness of the two-sample t-test over the Pearson system. Journal of Statistical Computation and Simulation, Volume 6, Issue 3-4, 1978
    • (1978) Journal of Statistical Computation and Simulation , vol.6 , Issue.3-4
    • Posten, H.1
  • 22
    • 84866869754
    • National Center for Supercomputing Applications. Accessed in 2010
    • National Center for Supercomputing Applications at the University of Illinois. www.ncsa.illinois.edu. Accessed in 2010.
  • 23
  • 24
    • 67349122907
    • Removal of Correlated Noise by Modeling the Signal of Interest in the Wavelet Domain
    • B. Goossens et al: Removal of Correlated Noise by Modeling the Signal of Interest in the Wavelet Domain. IEEE Trans Image Process. 2009
    • (2009) IEEE Trans Image Process.
    • Goossens, B.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.