2012, Pages 3-13

Workload characterization on a production Hadoop cluster: A case study on Taobao

Author keywords

Hadoop; MapReduce; Workload characterization

Indexed keywords

COMPUTING PARADIGM; DATA PLATFORM; GAIN INSIGHT; HADOOP; JOB CHARACTERISTICS; LARGE CLUSTERS; LARGE-SCALE DATASETS; LARGE-SCALE PRODUCTION; MAP-REDUCE; PERFORMANCE OPTIMIZATIONS; PRODUCTION ENVIRONMENTS; SCIENTIFIC COMPUTATION; SOCIAL NETWORKS; SYSTEM THROUGHPUT; WEB SEARCHES; WORKLOAD CHARACTERIZATION;

EID: 84873482265     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IISWC.2012.6402895     Document Type: Conference Paper
Times cited : (98)

References (33)
  • 1
    • 85030321143
    • Mapreduce: Simplified data processing on large clusters
    • J. Dean and S. Ghemawat, "Mapreduce: Simplified data processing on large clusters," in OSDI, 2004, pp. 137-150.
    • (2004) OSDI , pp. 137-150
    • Dean, J.1    Ghemawat, S.2
  • 3
    • 21644437974
    • The google file system
    • S. Ghemawat, H. Gobioff, and S.-T. Leung, "The google file system," in ACM SOSP, vol. 37, no. 5, 2003, pp. 29-43.
    • (2003) ACM SOSP , vol.37 , Issue.5 , pp. 29-43
    • Ghemawat, S.1    Gobioff, H.2    Leung, S.-T.3
  • 4
    • 84873465043
    • Ganglia
    • Ganglia. [Online]. Available: ganglia.sourceforge.net
  • 5
    • 84860581420
    • Energy efficiency for large-scale mapreduce workloads with significant interactive analysis
    • Y. Chen, S. Alspaugh, D. Borthakur, and R. H. Katz, "Energy efficiency for large-scale mapreduce workloads with significant interactive analysis," in EuroSys. ACM, 2012, pp. 43-56.
    • (2012) EuroSys. ACM , pp. 43-56
    • Chen, Y.1    Alspaugh, S.2    Borthakur, D.3    Katz, R.H.4
  • 6
    • 0000861722
    • A proof for the queuing formula L = λW
    • J. D. C. Little, "A proof for the queuing formula L = λW," Operations Research, vol. 9, 1961.
    • (1961) Operations Research , vol.9
    • Little, J.D.C.1
  • 7
    • 72049096630
    • Implementing webgis on hadoop: A case study of improving small file i/o performance on hdfs
    • X. Liu, J. Han, Y. Zhong, C. Han, and X. He, "Implementing webGIS on hadoop: A case study of improving small file I/O performance on HDFS," in CLUSTER, 2009, pp. 1-8.
    • (2009) CLUSTER , pp. 1-8
    • Liu, X.1    Han, J.2    Zhong, Y.3    Han, C.4    He, X.5
  • 8
    • 72049093234
    • Improving metadata management for small files in hdfs
    • G. Mackey, S. Sehrish, and J. Wang, "Improving metadata management for small files in HDFS," in CLUSTER, 2009, pp. 1-4.
    • (2009) CLUSTER , pp. 1-4
    • Mackey, G.1    Sehrish, S.2    Wang, J.3
  • 9
    • 77954901315
    • An analysis of traces from a production mapreduce cluster
    • S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan, "An analysis of traces from a production mapreduce cluster," in CCGRID, 2010, pp. 94-103.
    • (2010) CCGRID , pp. 94-103
    • Kavulya, S.1    Tan, J.2    Gandhi, R.3    Narasimhan, P.4
  • 10
    • 77954636142
    • Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling
    • M. Zaharia, D. Borthakur, J. S. Sarma, K. Elmeleegy, S. Shenker, and I. Stoica, "Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling," in EuroSys, 2010, pp. 265-278.
    • (2010) EuroSys , pp. 265-278
    • Zaharia, M.1    Borthakur, D.2    Sarma, J.S.3    Elmeleegy, K.4    Shenker, S.5    Stoica, I.6
  • 11
    • 84863180724
    • Matchmaking: A new mapreduce scheduling technique
    • C. He, Y. Lu, and D. Swanson, "Matchmaking: A new mapreduce scheduling technique," in CloudCom, 2011, pp. 40-47.
    • (2011) CloudCom , pp. 40-47
    • He, C.1    Lu, Y.2    Swanson, D.3
  • 13
    • 43949158104
    • A scalable distributed shared memory architecture
    • S. Krishnamoorthy and A. Choudhary, "A scalable distributed shared memory architecture," JPDC, vol. 22, no. 3, pp. 547-554, 1994.
    • (1994) JPDC , vol.22 , Issue.3 , pp. 547-554
    • Krishnamoorthy, S.1    Choudhary, A.2
  • 14
    • 33845593340
    • A large-scale study of failures in high-performance computing systems
    • B. Schroeder and G. A. Gibson, "A large-scale study of failures in high-performance computing systems," in DSN, 2006, pp. 249-258.
    • (2006) DSN , pp. 249-258
    • Schroeder, B.1    Gibson, G.A.2
  • 15
    • 4544382099
    • Failure data analysis of a large-scale heterogeneous server environment
    • R. K. Sahoo, A. Sivasubramaniam, M. S. Squillante, and Y. Zhang, "Failure data analysis of a large-scale heterogeneous server environment," in DSN, 2004, p. 772.
    • (2004) DSN , p. 772
    • Sahoo, R.K.1    Sivasubramaniam, A.2    Squillante, M.S.3    Zhang, Y.4
  • 16
    • 12444339816
    • Pfair scheduling of periodic tasks with allocation constraints on multiple processors
    • D. Liu and Y.-H. Lee, "Pfair scheduling of periodic tasks with allocation constraints on multiple processors," in IPDPS, 2004.
    • (2004) IPDPS
    • Liu, D.1    Lee, Y.-H.2
  • 17
  • 18
    • 34848855114
    • Characterizing network traffic in a cluster-based, multi-tier data center
    • D. Ersoz, M. S. Yousif, and C. R. Das, "Characterizing network traffic in a cluster-based, multi-tier data center," in ICDCS, 2007, p. 59.
    • (2007) ICDCS , p. 59
    • Ersoz, D.1    Yousif, M.S.2    Das, C.R.3
  • 19
    • 0028491368
    • Empirically derived analytic models of wide-area tcp connections
    • V. Paxson, "Empirically derived analytic models of wide-area TCP connections," IEEE/ACM Trans. Netw, vol. 2, no. 4, pp. 316-336, 1994.
    • (1994) IEEE/ACM Trans. Netw , vol.2 , Issue.4 , pp. 316-336
    • Paxson, V.1
  • 21
    • 79955977868
    • Sierra: Practical power-proportionality for data center storage
    • E. Thereska, A. Donnelly, and D. Narayanan, "Sierra: practical power-proportionality for data center storage," in EuroSys, 2011, pp. 169-182.
    • (2011) EuroSys , pp. 169-182
    • Thereska, E.1    Donnelly, A.2    Narayanan, D.3
  • 22
    • 23944524302
    • File system workload analysis for large scale scientific computing applications
    • F. Wang, Q. Xin, B. Hong, S. A. Brandt, E. L. Miller, D. D. E. Long, and T. T. McLarty, "File system workload analysis for large scale scientific computing applications," in MSST, 2004, pp. 139-152.
    • (2004) MSST , pp. 139-152
    • Wang, F.1    Xin, Q.2    Hong, B.3    Brandt, S.A.4    Miller, E.L.5    Long, D.D.E.6    McLarty, T.T.7
  • 23
    • 68349104301
    • Web server performance analysis using histogram workload models
    • E. Hernández-Orallo and J. Vila-Carbó, "Web server performance analysis using histogram workload models," Computer Networks, vol. 53, no. 15, pp. 2727-2739, 2009.
    • (2009) Computer Networks , vol.53 , Issue.15 , pp. 2727-2739
    • Hernández-Orallo, E.1    Vila-Carbó, J.2
  • 24
    • 4243930084
    • Workload characterization of a personalized web site and its implications for dynamic content caching
    • W. Shi, Y. Wright, E. Collins, and V. Karamcheti, "Workload characterization of a personalized web site and its implications for dynamic content caching," in WCW, 2002, pp. 1-16.
    • (2002) WCW , pp. 1-16
    • Shi, W.1    Wright, Y.2    Collins, E.3    Karamcheti, V.4
  • 25
    • 68949197600
    • Workload characterization in a high-energy data grid and impact on resource management
    • A. Iamnitchi, S. Doraimani, and G. Garzoglio, "Workload characterization in a high-energy data grid and impact on resource management," Cluster Computing, vol. 12, no. 2, pp. 153-173, 2009.
    • (2009) Cluster Computing , vol.12 , Issue.2 , pp. 153-173
    • Iamnitchi, A.1    Doraimani, S.2    Garzoglio, G.3
  • 26
    • 38949131272
    • Statistical analysis and modeling of jobs in a grid environment
    • K. Christodoulopoulos, V. Gkamas, and E. A. Varvarigos, "Statistical analysis and modeling of jobs in a grid environment," J. Grid Comput, vol. 6, no. 1, 2008.
    • (2008) J. Grid Comput , vol.6 , Issue.1
    • Christodoulopoulos, K.1    Gkamas, V.2    Varvarigos, E.A.3
  • 28
    • 33845308246
    • User group-based workload analysis and modelling
    • B. Song, C. Ernemann, and R. Yahyapour, "User group-based workload analysis and modelling," in CCGRID, 2005, pp. 953-961.
    • (2005) CCGRID , pp. 953-961
    • Song, B.1    Ernemann, C.2    Yahyapour, R.3
  • 30
    • 85031898917
    • Towards characterizing cloud backend workloads: Insights from google compute clusters
    • A. K. Mishra, J. L. Hellerstein, W. Cirne, and C. R. Das, "Towards characterizing cloud backend workloads: insights from google compute clusters," SIGMETRICS Performance Evaluation Review, vol. 37, no. 4, pp. 34-41, 2010.
    • (2010) SIGMETRICS Performance Evaluation Review , vol.37 , Issue.4 , pp. 34-41
    • Mishra, A.K.1    Hellerstein, J.L.2    Cirne, W.3    Das, C.R.4
  • 31
    • 77954889082
    • Benchmarking cloud serving systems with ycsb
    • B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, "Benchmarking cloud serving systems with YCSB," in SoCC, J. M. Hellerstein, S. Chaudhuri, and M. Rosenblum, Eds., 2010, pp. 143-154.
    • (2010) SoCC , pp. 143-154
    • Cooper, B.F.1    Silberstein, A.2    Tam, E.3    Ramakrishnan, R.4    Sears, R.5
  • 32
    • 80053019024
    • The case for evaluating mapreduce performance using workload suites
    • Y. Chen, A. Ganapathi, R. Griffith, and R. H. Katz, "The case for evaluating mapreduce performance using workload suites," in MASCOTS, 2011, pp. 390-399.
    • (2011) MASCOTS , pp. 390-399
    • Chen, Y.1    Ganapathi, A.2    Griffith, R.3    Katz, R.H.4


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.