Volume 68, Issue 5, 2008, Pages 663-677

Fault tolerant algorithms for heat transfer problems

Author keywords

Parabolic problems; Parallel numerical algorithms; Process fault tolerance

Indexed keywords

COMPUTER HARDWARE; COMPUTER SIMULATION; COMPUTER SYSTEM RECOVERY; FAULT TOLERANT COMPUTER SYSTEMS; LINEAR SYSTEMS; PROBLEM SOLVING;

EID: 41449089800     PISSN: 07437315     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.jpdc.2007.09.004     Document Type: Article
Times cited : (28)

References (29)
  • 1
    • 0033359224
    • A. Agbaria, R. Friedman, Starfish: fault-tolerant dynamic MPI programs on clusters of workstations, in: HPDC '99: Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, Washington, DC, USA, IEEE Computer Society, Silver Spring, MD, 1999, p. 31.
  • 2
    • 77954003885
    • R. Batchu, A. Skjellum, Z. Cui, M. Beddhu, J.P. Neelamegam, Y. Dandass, M. Apte, MPI/FT™: architecture and taxonomies for fault-tolerant, message-passing middleware for performance-portable parallel computing, in: CCGRID '01: Proceedings of the 1st International Symposium on Cluster Computing and the Grid, Washington, DC, USA, IEEE Computer Society, Silver Spring, MD, 2001, p. 26.
  • 3
    • 41449106385
    • J.V. Beck, The mollification method and the numerical solution of ill-posed problems (Diego A. Murio), SIAM Rev. 36 (3) (1994) 502-503.
  • 5
    • 41449112246
    • G. Bosilca, A. Bouteiller, F. Cappello, S. Djilali, G. Fedak, C. Germain, T. Herault, P. Lemarinier, O. Lodygensky, F. Magniette, V. Neri, A. Selikhov, MPICH-V: toward a scalable fault tolerant MPI for volatile nodes, in: Supercomputing '02: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, Los Alamitos, CA, USA, IEEE Computer Society Press, Silver Spring, MD, 2002, pp. 1-18.
  • 6
    • 85140868634
    • G. Bosilca, Z. Chen, J. Dongarra, J. Langou, Recovery patterns for iterative methods in a parallel unstable environment, SIAM J. Sci. Comput., May, 2007.
  • 7
    • 31844451082
    • Z. Chen, G.E. Fagg, E. Gabriel, J. Langou, T. Angskun, G. Bosilca, J. Dongarra, Fault tolerant high performance computing by a coding approach, in: PPoPP '05: Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, New York, NY, USA, ACM Press, New York, 2005, pp. 213-223.
  • 8
    • 41449103072
    • J.J. Dongarra, Z. Chen, G. Bosilca, J. Langou, Disaster survival guide in petascale computing: an algorithmic approach, in: Petascale Computing: Algorithms and Applications, Chapman & Hall, CRC Press, London, Boca Raton, FL, 2007.
  • 9
    • 41449113981
    • J. Duell, The design and implementation of Berkeley Lab's Linux checkpoint/restart, Berkeley Lab Technical Report (Publication LBNL-54941) 〈http://www.osti.gov/servlets/purl/891617-2L2UJc/〉, September 25, 2006.
  • 11
    • 0004767479
    • W. Eckhaus, M. Garbey, Asymptotic analysis on large timescales for singular perturbations of hyperbolic type, SIAM J. Math. Anal. 21 (4) (1990) 867-883.
  • 12
    • 25144486687
    • C. Engelmann, A. Geist, Super-scalable algorithms for computing on 100,000 processors, in: V.S. Sunderam, G. Dick van Albada, P.M.A. Sloot, J. Dongarra (Eds.), Proceedings of the International Conference on Computational Science (ICCS) 2005, Part I, Lecture Notes in Computer Science, vol. 3514, Springer, Berlin, 2005, pp. 313-321.
  • 14
    • 34548773868
    • E. Gabriel, S. Huang, Runtime optimization of application level communication patterns, in: Proceedings of the 21st IEEE International Parallel and Distributed Processing Symposium, 12th International Workshop on High-Level Parallel Programming Models and Supportive Environments, Long Beach, CA, March 26, IEEE Computer Society, Silver Spring, MD, 2007, p. 185.
  • 15
    • 41449112852
    • M. Garbey, H. Ltaief, Fault tolerant domain decomposition for parabolic problems, New York University, Lecture Notes in Computational Science and Engineering, Springer, Berlin, January 2005, pp. 565-572.
  • 16
    • 41449110416
    • M. Garbey, C. Picard, A least square extrapolation method for the a priori error estimate of CFD and heat transfer problem, Structural Dynamic Eurodyn (2005) 871-876.
  • 17
    • 0030243005
    • W. Gropp, E. Lusk, N. Doss, A. Skjellum, A high-performance, portable implementation of the MPI message passing interface standard, Parallel Comput. 22 (6) (1996) 789-828.
  • 18
    • 41449087256
    • P. Hough, V. Howle, Fault tolerance in large scale scientific computing, in: M.A. Heroux, P. Raghavan, H.D. Simon (Eds.), Parallel Processing for Scientific Computing, SIAM Press, Philadelphia, PA, 2006.
  • 19
    • 0035007397
    • K. Ingols, I. Keidar, Availability study of dynamic voting algorithms, in: Proceedings of the 21st IEEE International Conference on Distributed Computing Systems (ICDCS), 2001, pp. 247-254.
  • 20
    • 41449096456
    • MPI Forum, Special Issue: MPI2: A Message-Passing Interface Standard. Internat. J. Supercomputer Appl. and High Performance Comput. 12(1-2) (1998) 1-299.
  • 21
    • 41449111112
    • I.R. Philp, Software failures and the road to a petaflop machine, in: First Workshop on High Performance Computing Reliability Issues (HPCRI), Los Alamos National Laboratory, February 2005.
  • 23
    • 22144498897
    • A. Srinivasan, N. Chandra, Latency tolerance through parallelization of time in scientific applications, Parallel Comput. 31 (7) (2005) 777-796.
  • 24
    • 0029713612
    • G. Stellner, CoCheck: checkpointing and process migration for MPI, in: IPPS '96: Proceedings of the 10th International Parallel Processing Symposium, Washington, DC, USA, IEEE Computer Society, Silver Spring, MD, 1996, pp. 526-531.
  • 26
    • 85143037324
    • The MPI Forum, MPI: a message passing interface, in: Supercomputing '93: Proceedings of the 1993 ACM/IEEE Conference on Supercomputing, New York, NY, USA, ACM Press, New York, 1993, pp. 878-883.
  • 27
    • 34548768671
    • C. Wang, F. Mueller, C. Engelmann, S.L. Scott, A job pause service under LAM/MPI+BLCR for transparent fault tolerance, in: Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS), Long Beach, CA, USA, 2007.
  • 28
    • 0028994273
    • Y.-M. Wang, Y. Huang, K.-P. Vo, P.-Y. Chung, C. Kintala, Checkpointing and its applications, in: Proceedings of the International Symposium on Fault-Tolerant Computing, June 1995, pp. 22-31.
  • 29
    • 41449097278
    • Y. Zhuang, X.-H. Sun, Stable, globally non-iterative, non-overlapping domain decomposition parallel solvers for parabolic problems, in: Supercomputing '01: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing (CDROM), New York, NY, USA, ACM Press, New York, 2001, p. 19.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.