Volume 16, Issue 3, 2011, Pages 213-222

An optimistic checkpoint mechanism based on job characteristics and resource availability for dynamic grids

Author keywords

checkpoint; fault tolerance; grid computing; Markov


EID: 79957984912     PISSN: 10071202     EISSN: None     Source Type: Journal    
DOI: 10.1007/s11859-011-0739-6     Document Type: Article
Times cited : (2)

References (28)
  • 2
    • 0036468389
    • A taxonomy and survey of grid resource management systems for distributed computing [J]
    • Krauter K, Buyya R, Maheswaran M. A taxonomy and survey of grid resource management systems for distributed computing [J]. Software Practice and Experience, 2002, 32(2): 135-164.
    • (2002) Software Practice and Experience , vol.32 , Issue.2 , pp. 135-164
    • Krauter, K.1    Buyya, R.2    Maheswaran, M.3
  • 6
    • 0035390088
    • A variational calculus approach to optimal checkpoint placement [J]
    • Ling Y, Mi J, Lin X. A variational calculus approach to optimal checkpoint placement [J]. IEEE Transactions on Computers, 2001, 50(7): 699-708.
    • (2001) IEEE Transactions on Computers , vol.50 , Issue.7 , pp. 699-708
    • Ling, Y.1    Mi, J.2    Lin, X.3
  • 7
    • 50149088903
    • Minimizing the network overhead of checkpointing in cycle-harvesting cluster environments [C]
    • Boston: IEEE Press
    • Nurmi D, Brevik J, Wolski R. Minimizing the network overhead of checkpointing in cycle-harvesting cluster environments [C]//IEEE International Conference on Cluster Computing, Boston: IEEE Press, 2005: 1-10.
    • (2005) IEEE International Conference on Cluster Computing , pp. 1-10
    • Nurmi, D.1    Brevik, J.2    Wolski, R.3
  • 9
    • 0036041277
    • Improving cluster availability using workstation validation [C]
    • Marina Del Rey: ACM Press
    • Heath T, Martin R, Nguyen T D. Improving cluster availability using workstation validation [C]//Proceedings of the ACM SIGMETRICS, Marina Del Rey: ACM Press, 2002: 217-227.
    • (2002) Proceedings of the ACM SIGMETRICS , pp. 217-227
    • Heath, T.1    Martin, R.2    Nguyen, T.D.3
  • 10
    • 4544337911
    • Automatic methods for predicting machine availability in desktop grid and peer-to-peer systems [C]
    • Washington: IEEE Press
    • Brevik J, Nurmi D, Wolski R. Automatic methods for predicting machine availability in desktop grid and peer-to-peer systems [C]//Proceedings of the Cluster Computing and the Grid, Washington: IEEE Press, 2004: 190-199.
    • (2004) Proceedings of the Cluster Computing and the Grid , pp. 190-199
    • Brevik, J.1    Nurmi, D.2    Wolski, R.3
  • 11
    • 77957960970
    • Reducing costs of spot instances via checkpointing in the Amazon elastic compute cloud [C]
    • Florida: IEEE Press
    • Yi S, Kondo D, Andrzejak A. Reducing costs of spot instances via checkpointing in the Amazon elastic compute cloud [C]//IEEE International Conference on Cloud Computing, Florida: IEEE Press, 2010: 236-243.
    • (2010) IEEE International Conference on Cloud Computing , pp. 236-243
    • Yi, S.1    Kondo, D.2    Andrzejak, A.3
  • 12
    • 77951447133
    • Accelerating checkpoint operation by node-level write aggregation on multicore systems [C]
    • Vienna: IEEE Press
    • Ouyang X, Gopalakrishnan K, Panda D K. Accelerating checkpoint operation by node-level write aggregation on multicore systems [C]//International Conference on Parallel Processing, Vienna: IEEE Press, 2009: 34-41.
    • (2009) International Conference on Parallel Processing , pp. 34-41
    • Ouyang, X.1    Gopalakrishnan, K.2    Panda, D.K.3
  • 15
    • 10644223387
    • Computing on large-scale distributed systems: XtremWeb architecture, programming models, security, tests and convergence with grid [J]
    • Cappello F, Djilali S, Fedak G, et al. Computing on large-scale distributed systems: XtremWeb architecture, programming models, security, tests and convergence with grid [J]. Future Generation Computer Systems, 2005, 21(3): 417-437.
    • (2005) Future Generation Computer Systems , vol.21 , Issue.3 , pp. 417-437
    • Cappello, F.1    Djilali, S.2    Fedak, G.3
  • 17
    • 84960378385
    • Nimrod/G: An architecture of a resource management and scheduling system in a global computational grid [C]
    • Los Alamitos: IEEE Press
    • Buyya R, Abramson D, Giddy J. Nimrod/G: An architecture of a resource management and scheduling system in a global computational grid [C]//Proceedings of High Performance Computing in the Asia-Pacific Region, Los Alamitos: IEEE Press, 2000: 283-289.
    • (2000) Proceedings of High Performance Computing in the Asia-Pacific Region , pp. 283-289
    • Buyya, R.1    Abramson, D.2    Giddy, J.3
  • 18
    • 0035466298
    • An optimal checkpointing-strategy for real-time control systems under transient faults [J]
    • Kwak S W, Choi B J, Kim B K. An optimal checkpointing-strategy for real-time control systems under transient faults [J]. IEEE Transactions on Reliability, 2001, 50: 293-301.
    • (2001) IEEE Transactions on Reliability , vol.50 , pp. 293-301
    • Kwak, S.W.1    Choi, B.J.2    Kim, B.K.3
  • 20
    • 0031388399
    • Impact of checkpoint latency on overhead ratio of a checkpointing scheme [J]
    • Vaidya N H. Impact of checkpoint latency on overhead ratio of a checkpointing scheme [J]. IEEE Transactions on Computers, 1997, 46(8): 942-947.
    • (1997) IEEE Transactions on Computers , vol.46 , Issue.8 , pp. 942-947
    • Vaidya, N.H.1
  • 22
    • 33845593340
    • A large-scale study of failures in high-performance computing systems [C]
    • Washington D C: IEEE Press
    • Schroeder B, Gibson G. A large-scale study of failures in high-performance computing systems [C]//IEEE Conference on Dependable Systems and Networks (DSN), Washington D C: IEEE Press, 2006: 249-258.
    • (2006) IEEE Conference on Dependable Systems and Networks (DSN) , pp. 249-258
    • Schroeder, B.1    Gibson, G.2
  • 24
    • 84976846528
    • A first order approximation to the optimum checkpoint interval [J]
    • Young J W. A first order approximation to the optimum checkpoint interval [J]. Communications of the ACM, 1974, 17: 530-531.
    • (1974) Communications of the ACM , vol.17 , pp. 530-531
    • Young, J.W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.