2013, Pages 167-176

Online-ABFT: An online algorithm based fault tolerance scheme for soft error detection in iterative methods

Author keywords

algorithm based fault tolerance (abft); checkpoint; iterative methods; online error detection; soft error

Indexed keywords

ALGORITHM BASED FAULT TOLERANCE; CHECKPOINT; COMPUTATION EFFICIENCY; KRYLOV SUBSPACE ITERATIVE METHODS; NUMBER OF COMPONENTS; ON-LINE ERROR DETECTION; SOFT ERROR; SOFT ERROR DETECTION

EID: 84875168534     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2442516.2442533     Document Type: Conference Paper
Times cited: 80

References (30)
  • 3
    • 84875193661
    • MPICH-V. http://mpich-v.lri.fr.
  • 9
    • 25144456004
    • Numerically stable real number codes based on random matrices
    • Proceeding of the 5th International Conference on Computational Science (ICCS2005), Atlanta, Georgia, USA, May 22-25, 2005
    • Z. Chen and J. Dongarra. Numerically stable real number codes based on random matrices. In Proceeding of the 5th International Conference on Computational Science (ICCS2005), Atlanta, Georgia, USA, May 22-25, 2005. LNCS 3514,
    • LNCS , vol.3514
    • Chen, Z.1    Dongarra, J.2
  • 10
    • 33746136466
    • Condition numbers of Gaussian random matrices
    • DOI 10.1137/040616413
    • Z. Chen and J. Dongarra. Condition Numbers of Gaussian Random Matrices. SIAM Journal on Matrix Analysis and Applications, Volume 27, Number 3, Page 603-620, 2005. (Pubitemid 44085054)
    • (2005) SIAM Journal on Matrix Analysis and Applications , vol.27 , Issue.3 , pp. 603-620
    • Chen, Z.1    Dongarra, J.J.2
  • 13
    • 75449102762
    • Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing
    • July
    • Z. Chen and J. Dongarra. Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing. IEEE Transactions on Computers, July 2009.
    • (2009) IEEE Transactions on Computers
    • Chen, Z.1    Dongarra, J.2
  • 19
    • 28044460018
    • A higher order estimate of the optimum checkpoint interval for restart dumps
    • J. Daly. A higher order estimate of the optimum checkpoint interval for restart dumps. Future Generation Comp. Syst., 22(3): 303-312 (2006).
    • (2006) Future Generation Comp. Syst. , vol.22 , Issue.3 , pp. 303-312
    • Daly, J.1
  • 21
    • 84875140305
    • Open MPI: www.open-mpi.org/.
  • 22
    • 58349092078
    • Failure Tolerance in Petascale Computers
    • November
    • G. A. Gibson, B. Schroeder, and J. Digney. Failure Tolerance in Petascale Computers. CTWatch Quarterly, Volume 3, Number 4, November 2007.
    • (2007) CTWatch Quarterly , vol.3 , Issue.4
    • Gibson, G.A.1    Schroeder, B.2    Digney, J.3
  • 24
    • 0021439162
    • Algorithm-based fault tolerance for matrix operations
    • K.-H. Huang and J. A. Abraham. Algorithm-based fault tolerance for matrix operations. IEEE Transactions on Computers, vol. C-33:518-528, 1984.
    • (1984) IEEE Transactions on Computers , vol.C-33 , pp. 518-528
    • Huang, K.-H.1    Abraham, J.A.2
  • 28
    • 84990637885
    • PVM: A framework for parallel distributed computing
    • V. S. Sunderam. PVM: a framework for parallel distributed computing. Concurrency: Pract. Exper., 2(4):315-339, 1990.
    • (1990) Concurrency: Pract. Exper. , vol.2 , Issue.4 , pp. 315-339
    • Sunderam, V.S.1
  • 29
    • 1842829625
    • Iterative Methods for Sparse Linear Systems
    • Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics. Second Edition. April 30, 2003.
    • (2003) Iterative Methods for Sparse Linear Systems
    • Saad, Y.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.