Volume 13, Issue 4, 2015, Pages 507-525

Fault-Tolerant Dynamic Rescheduling for Heterogeneous Computing Systems

Author keywords

Directed acyclic graph; Fault tolerance; Heterogeneous computing system; Task reschedule

Indexed keywords

DIRECTED GRAPHS; DISTRIBUTED COMPUTER SYSTEMS; FAULT TOLERANCE; SCHEDULING; SCHEDULING ALGORITHMS;

EID: 84958239215     PISSN: 15707873     EISSN: 15729184     Source Type: Journal    
DOI: 10.1007/s10723-015-9331-1     Document Type: Article
Times cited: 28

References (43)
  • 1
    • 0021529549
    • Practical multiprocessor scheduling algorithms for efficient parallel processing
    • Kasahara, H., Narita, S.: Practical multiprocessor scheduling algorithms for efficient parallel processing. IEEE Trans. Comput. 33(11), 1023–1029 (1984)
    • (1984) IEEE Trans. Comput. , vol.33 , Issue.11 , pp. 1023-1029
    • Kasahara, H.1    Narita, S.2
  • 2
    • 0036504666
    • Performance-effective and low-complexity task scheduling for heterogeneous computing
    • Topcuoglu, H., Hariri, S., Wu, M.-Y.: Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans. Parallel Distrib. Syst. 13(3), 260–274 (2002)
    • (2002) IEEE Trans. Parallel Distrib. Syst. , vol.13 , Issue.3 , pp. 260-274
    • Topcuoglu, H.1    Hariri, S.2    Wu, M.-Y.3
  • 3
    • 39749157730
    • A high performance algorithm for static task scheduling in heterogeneous distributed computing systems
    • Daoud, M.I., Kharma, N.: A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 68(4), 399–409 (2008)
    • (2008) J. Parallel Distrib. Comput. , vol.68 , Issue.4 , pp. 399-409
    • Daoud, M.I.1    Kharma, N.2
  • 4
    • 84887424532
    • Energy-aware scheduling on multicore heterogeneous grid computing systems
    • Nesmachnow, S., Dorronsoro, B., Pecero, J., Bouvry, P.: Energy-aware scheduling on multicore heterogeneous grid computing systems. J. Grid Comput. 11(4), 653–680 (2013)
    • (2013) J. Grid Comput. , vol.11 , Issue.4 , pp. 653-680
    • Nesmachnow, S.1    Dorronsoro, B.2    Pecero, J.3    Bouvry, P.4
  • 5
    • 84920710092
    • A budget constrained scheduling algorithm for workflow applications
    • Arabnejad, H., Barbosa, J.: A budget constrained scheduling algorithm for workflow applications. J. Grid Comput. 12(4), 665–679 (2014)
    • (2014) J. Grid Comput. , vol.12 , Issue.4 , pp. 665-679
    • Arabnejad, H.1    Barbosa, J.2
  • 7
    • 0041848306
    • An improved duplication strategy for scheduling precedence constrained graphs in multiprocessor systems
    • Bansal, S., Kumar, P., Singh, K.: An improved duplication strategy for scheduling precedence constrained graphs in multiprocessor systems. IEEE Trans. Parallel Distrib. Syst. 14(6), 533–544 (2003)
    • (2003) IEEE Trans. Parallel Distrib. Syst. , vol.14 , Issue.6 , pp. 533-544
    • Bansal, S.1    Kumar, P.2    Singh, K.3
  • 8
    • 46149101204
    • Task scheduling algorithm using minimized duplications in homogeneous systems
    • Shin, K., Cha, M., Jang, M., Jung, J., Yoon, W., Choi, S.: Task scheduling algorithm using minimized duplications in homogeneous systems. J. Parallel Distrib. Comput. 68(8), 1146–1156 (2008)
    • (2008) J. Parallel Distrib. Comput. , vol.68 , Issue.8 , pp. 1146-1156
    • Shin, K.1    Cha, M.2    Jang, M.3    Jung, J.4    Yoon, W.5    Choi, S.6
  • 9
    • 76849102536
    • List scheduling with duplication for heterogeneous computing systems
    • Tang, X., Li, K., Liao, G., Li, R.: List scheduling with duplication for heterogeneous computing systems. J. Parallel Distrib. Comput. 70(4), 323–329 (2010)
    • (2010) J. Parallel Distrib. Comput. , vol.70 , Issue.4 , pp. 323-329
    • Tang, X.1    Li, K.2    Liao, G.3    Li, R.4
  • 10
    • 83755181508
    • Task scheduling algorithm with minimal redundant duplications in homogeneous multiprocessor system
    • Song, I., Yoon, W., Jang, E., Choi, S.: Task scheduling algorithm with minimal redundant duplications in homogeneous multiprocessor system. In: Grid and Distributed Computing, pp. 238–245. Springer (2011)
    • (2011) Springer
    • Song, I.1    Yoon, W.2    Jang, E.3    Choi, S.4
  • 11
    • 0041848306
    • An improved duplication strategy for scheduling precedence constrained graphs in multiprocessor systems
    • Bansal, S., Kumar, P., Singh, K.: An improved duplication strategy for scheduling precedence constrained graphs in multiprocessor systems. IEEE Trans. Parallel Distrib. Syst. 14(6), 533–544 (2003)
    • (2003) IEEE Trans. Parallel Distrib. Syst. , vol.14 , Issue.6 , pp. 533-544
    • Bansal, S.1    Kumar, P.2    Singh, K.3
  • 12
    • 22144471943
    • A high performance, low complexity algorithm for compile-time task scheduling in heterogeneous systems
    • Hagras, T., Janeček, J.: A high performance, low complexity algorithm for compile-time task scheduling in heterogeneous systems. Parallel Comput. 31(7), 653–670 (2005)
    • (2005) Parallel Comput. , vol.31 , Issue.7 , pp. 653-670
    • Hagras, T.1    Janeček, J.2
  • 15
    • 84863421985
    • Scheduling for heterogeneous systems using constrained critical paths
    • Khan, M.A.: Scheduling for heterogeneous systems using constrained critical paths. Parallel Comput. 38(4), 175–193 (2012)
    • (2012) Parallel Comput. , vol.38 , Issue.4 , pp. 175-193
    • Khan, M.A.1
  • 16
    • 36049028957
    • Defining and measuring supercomputer reliability, availability, and serviceability (RAS)
    • Stearley, J.: Defining and measuring supercomputer reliability, availability, and serviceability (RAS). In: Proceedings of the Linux Clusters Institute Conference (2005)
    • (2005) Proceedings of the Linux Clusters Institute Conference
    • Stearley, J.1
  • 17
    • 38949104192
    • Replica placement strategies in data grid
    • Rahman, R.M., Barker, K., Alhajj, R.: Replica placement strategies in data grid. J. Grid Comput. 6(1), 103–123 (2008)
    • (2008) J. Grid Comput. , vol.6 , Issue.1 , pp. 103-123
    • Rahman, R.M.1    Barker, K.2    Alhajj, R.3
  • 18
    • 84860890249
    • MapReduce workload modeling with statistical approach
    • Yang, H., Luan, Z., Li, W., Qian, D.: MapReduce workload modeling with statistical approach. J. Grid Comput. 10(2), 279–310 (2012)
    • (2012) J. Grid Comput. , vol.10 , Issue.2 , pp. 279-310
    • Yang, H.1    Luan, Z.2    Li, W.3    Qian, D.4
  • 19
    • 0023090161
    • Checkpointing and rollback-recovery for distributed systems
    • Koo, R., Toueg, S.: Checkpointing and rollback-recovery for distributed systems. IEEE Trans. Softw. Eng. 1, 23–31 (1987)
    • (1987) IEEE Trans. Softw. Eng. , vol.1 , pp. 23-31
    • Koo, R.1    Toueg, S.2
  • 20
    • 84969366799
    • A fault tolerance protocol for fast recovery
    • Chakravorty, S.: A fault tolerance protocol for fast recovery. ProQuest (2008)
    • (2008) ProQuest
    • Chakravorty, S.1
  • 21
    • 84860593469
    • The reliability wall for exascale supercomputing
    • Yang, X., Wang, Z., Xue, J., Zhou, Y.: The reliability wall for exascale supercomputing. IEEE Trans. Comput. 61(6), 767–779 (2012)
    • (2012) IEEE Trans. Comput. , vol.61 , Issue.6 , pp. 767-779
    • Yang, X.1    Wang, Z.2    Xue, J.3    Zhou, Y.4
  • 24
    • 0026923304
    • Task allocation for maximizing reliability of distributed computer systems
    • Shatz, S.M., Wang, J.-P., Goto, M.: Task allocation for maximizing reliability of distributed computer systems. IEEE Trans. Comput. 41(9), 1156–1168 (1992)
    • (1992) IEEE Trans. Comput. , vol.41 , Issue.9 , pp. 1156-1168
    • Shatz, S.M.1    Wang, J.-P.2    Goto, M.3
  • 25
    • 20444463471
    • A dynamic and reliability-driven scheduling algorithm for parallel real-time jobs executing on heterogeneous clusters
    • Qin, X., Jiang, H.: A dynamic and reliability-driven scheduling algorithm for parallel real-time jobs executing on heterogeneous clusters. J. Parallel Distrib. Comput. 65(8), 885–900 (2005)
    • (2005) J. Parallel Distrib. Comput. , vol.65 , Issue.8 , pp. 885-900
    • Qin, X.1    Jiang, H.2
  • 28
    • 59149105005
    • Reliability versus performance for critical applications
    • Girault, A., Saule, E., Trystram, D.: Reliability versus performance for critical applications. J. Parallel Distrib. Comput. 69(3), 326–336 (2009)
    • (2009) J. Parallel Distrib. Comput. , vol.69 , Issue.3 , pp. 326-336
    • Girault, A.1    Saule, E.2    Trystram, D.3
  • 29
    • 77955509553
    • Reliability-aware scheduling strategy for heterogeneous distributed computing systems
    • Tang, X., Li, K., Li, R., Veeravalli, B.: Reliability-aware scheduling strategy for heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 70(9), 941–952 (2010)
    • (2010) J. Parallel Distrib. Comput. , vol.70 , Issue.9 , pp. 941-952
    • Tang, X.1    Li, K.2    Li, R.3    Veeravalli, B.4
  • 30
    • 79960559123
    • An efficient weighted bi-objective scheduling algorithm for heterogeneous systems
    • Boeres, C., Sardiña, I. M., Drummond, L.: An efficient weighted bi-objective scheduling algorithm for heterogeneous systems. Parallel Comput. 37(8), 349–364 (2011)
    • (2011) Parallel Comput. , vol.37 , Issue.8 , pp. 349-364
    • Boeres, C.1    Sardiña, I.M.2    Drummond, L.3
  • 31
    • 84855339817
    • Optimizing performance and reliability on heterogeneous parallel systems: Approximation algorithms and heuristics
    • Jeannot, E., Saule, E., Trystram, D.: Optimizing performance and reliability on heterogeneous parallel systems: Approximation algorithms and heuristics. J. Parallel Distrib. Comput. 72(2), 268–280 (2012)
    • (2012) J. Parallel Distrib. Comput. , vol.72 , Issue.2 , pp. 268-280
    • Jeannot, E.1    Saule, E.2    Trystram, D.3
  • 32
    • 84874664055
    • Dependable grid workflow scheduling based on resource availability
    • Tao, Y., Jin, H., Wu, S., Shi, X., Shi, L.: Dependable grid workflow scheduling based on resource availability. J. Grid Comput. 11(1), 47–61 (2013)
    • (2013) J. Grid Comput. , vol.11 , Issue.1 , pp. 47-61
    • Tao, Y.1    Jin, H.2    Wu, S.3    Shi, X.4    Shi, L.5
  • 34
    • 33747806256
    • A novel fault-tolerant scheduling algorithm for precedence constrained tasks in real-time heterogeneous systems
    • Qin, X., Jiang, H.: A novel fault-tolerant scheduling algorithm for precedence constrained tasks in real-time heterogeneous systems. Parallel Comput. 32(5), 331–356 (2006)
    • (2006) Parallel Comput. , vol.32 , Issue.5 , pp. 331-356
    • Qin, X.1    Jiang, H.2
  • 35
    • 59149085680
    • On the design of communication-aware fault-tolerant scheduling algorithms for precedence constrained tasks in grid computing systems with dedicated communication devices
    • Zheng, Q., Veeravalli, B.: On the design of communication-aware fault-tolerant scheduling algorithms for precedence constrained tasks in grid computing systems with dedicated communication devices. J. Parallel Distrib. Comput. 69(3), 282–294 (2009)
    • (2009) J. Parallel Distrib. Comput. , vol.69 , Issue.3 , pp. 282-294
    • Zheng, Q.1    Veeravalli, B.2
  • 36
    • 64049088350
    • On the design of fault-tolerant scheduling strategies using primary-backup approach for computational grids with low replication costs
    • Zheng, Q., Veeravalli, B., Tham, C.-K.: On the design of fault-tolerant scheduling strategies using primary-backup approach for computational grids with low replication costs. IEEE Trans. Comput. 58(3), 380–393 (2009)
    • (2009) IEEE Trans. Comput. , vol.58 , Issue.3 , pp. 380-393
    • Zheng, Q.1    Veeravalli, B.2    Tham, C.-K.3
  • 38
    • 0002366826
    • Heterogeneous computing: challenges and opportunities
    • Khokhar, A., Prasanna, V., Shaaban, M., Wang, C.-L.: Heterogeneous computing: challenges and opportunities. Computer 26(6), 18–27 (1993)
    • (1993) Computer , vol.26 , Issue.6 , pp. 18-27
    • Khokhar, A.1    Prasanna, V.2    Shaaban, M.3    Wang, C.-L.4
  • 41
    • 84976846528
    • A first order approximation to the optimum checkpoint interval
    • Young, J.W.: A first order approximation to the optimum checkpoint interval. Commun. ACM 17(9), 530–531
  • 43
    • 39749157730
    • A high performance algorithm for static task scheduling in heterogeneous distributed computing systems
    • Daoud, M.I., Kharma, N.: A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. J. Parallel Distrib. Comput. 68(4), 399–409 (2008)
    • (2008) J. Parallel Distrib. Comput. , vol.68 , Issue.4 , pp. 399-409
    • Daoud, M.I.1    Kharma, N.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.