



Volume , Issue , 2007, Pages 43-54

Using queue structures to improve job reliability

Author keywords

Cluster design and architecture; Reliability

Indexed keywords

COMPUTER ARCHITECTURE; PROGRAM PROCESSORS; RELIABILITY ANALYSIS; SERVERS

EID: 34548092060     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1272366.1272373     Document Type: Conference Paper
Times cited : (12)

References (27)
  • 3
    • 12444292740
    • Computation-at-risk: Assessing job portfolio management risk on clusters
    • IEEE Computer Society
    • S. D. Kleban and S. H. Clearwater, "Computation-at-risk: Assessing job portfolio management risk on clusters," in IPDPS. IEEE Computer Society, 2004.
    • (2004) IPDPS
    • Kleban, S.D.1    Clearwater, S.H.2
  • 5
    • 0036041277
    • Improving cluster availability using workstation validation
    • ACM
    • T. Heath, R. P. Martin, and T. D. Nguyen, "Improving cluster availability using workstation validation," in SIGMETRICS. ACM, 2002, pp. 217-227.
    • (2002) SIGMETRICS , pp. 217-227
    • Heath, T.1    Martin, R.P.2    Nguyen, T.D.3
  • 6
    • 34548056878
    • D. Nurmi, J. Brevik, and R. Wolski, "Quantifying machine availability in networked and desktop grid systems," University of California, Santa Barbara, Computer Science, Tech. Rep. ucsb.cs:TR-2003-37, Nov. 2003.
  • 7
    • 4544337911
    • Automatic methods for predicting machine availability in desktop grid and peer-to-peer systems
    • IEEE Computer Society
    • J. Brevik, D. Nurmi, and R. Wolski, "Automatic methods for predicting machine availability in desktop grid and peer-to-peer systems," in CCGRID. IEEE Computer Society, 2004, pp. 190-199.
    • (2004) CCGRID , pp. 190-199
    • Brevik, J.1    Nurmi, D.2    Wolski, R.3
  • 8
    • 27144534020
    • Modeling machine availability in enterprise and wide-area distributed computing environments
    • Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30 - September 2, 2005, Proceedings, Springer
    • D. Nurmi, J. Brevik, and R. Wolski, "Modeling machine availability in enterprise and wide-area distributed computing environments," in Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30 - September 2, 2005, Proceedings, ser. Lecture Notes in Computer Science, vol. 3648. Springer, 2005, pp. 432-441.
    • (2005) ser. Lecture Notes in Computer Science , vol.3648 , pp. 432-441
    • Nurmi, D.1    Brevik, J.2    Wolski, R.3
  • 13
    • 34548108278
    • Raw operational data on system failures
    • Los Alamos National Laboratory. (2006) Raw operational data on system failures. [Online]. Available: http://www.lanl.gov/projects/computerscience/data/
    • (2006) Raw operational data on system failures
  • 21
    • 84898046897
    • Scaling to Thousands of Processors with Buffer Coscheduling
    • Pittsburgh, PA, Aug
    • F. Petrini, "Scaling to Thousands of Processors with Buffer Coscheduling," in Scaling to New Heights Workshop, Pittsburgh, PA, Aug 2002.
    • (2002) Scaling to New Heights Workshop
    • Petrini, F.1
  • 22
    • 0345446547
    • The workload on parallel supercomputers: Modeling the characteristics of rigid jobs
    • Lublin and Feitelson, "The workload on parallel supercomputers: Modeling the characteristics of rigid jobs," JPDC: Journal of Parallel and Distributed Computing, vol. 63, 2003.
    • (2003) JPDC: Journal of Parallel and Distributed Computing , vol.63
    • Lublin1    Feitelson2
  • 23
    • 0031388399
    • Impact of checkpoint latency on overhead ratio of a checkpointing scheme
    • N. H. Vaidya, "Impact of checkpoint latency on overhead ratio of a checkpointing scheme," IEEE Trans. Computers, vol. 46, no. 8, pp. 942-947, 1997.
    • (1997) IEEE Trans. Computers , vol.46 , Issue.8 , pp. 942-947
    • Vaidya, N.H.1
  • 25
    • 34548105831
    • N. Stone, J. Kochmar, R. Reddy, J. R. Scott, J. Sommerfield, and C. Vizinok, "A checkpoint and recovery system for the Pittsburgh supercomputing center terascale computing system," Pittsburgh Supercomputer Center, Tech. Rep. CMU-PSC-TR-2001-0002, 2001.
  • 26
    • 84978437474
    • Pastiche: Making backup cheap and easy
    • Proceedings of the 5th ACM Symposium on Operating System Design and Implementation (OSDI-02), New York: ACM Press, Dec. 9-11
    • L. P. Cox, C. D. Murray, and B. Noble, "Pastiche: Making backup cheap and easy," in Proceedings of the 5th ACM Symposium on Operating System Design and Implementation (OSDI-02), ser. Operating Systems Review. New York: ACM Press, Dec. 9-11, 2002, pp. 285-298.
    • (2002) ser. Operating Systems Review , pp. 285-298
    • Cox, L.P.1    Murray, C.D.2    Noble, B.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.