-
1
-
-
84912092256
-
-
Apache Cassandra. http://cassandra.apache.org/.
-
-
-
-
2
-
-
84912092255
-
-
Apache Hadoop. http://hadoop.apache.org/.
-
-
-
-
3
-
-
84912069839
-
-
Apache HBase. http://hbase.apache.org/.
-
-
-
-
4
-
-
84912069838
-
-
Apache Oozie. http://incubator.apache.org/oozie/.
-
-
-
-
5
-
-
84912135231
-
-
Apache Crunch. http://crunch.apache.org/.
-
-
-
-
6
-
-
84912092254
-
-
Dell. http://www.dell.com/us/business/p/servers.
-
-
-
-
7
-
-
84912135230
-
-
Luigi. https://github.com/spotify/luigi.
-
-
-
-
8
-
-
84912135228
-
-
Apache Mahout. http://mahout.apache.org/.
-
-
-
-
9
-
-
0032000230
-
Message logging: Pessimistic, optimistic, causal, and optimal
-
L. Alvisi and K. Marzullo. Message logging: Pessimistic, optimistic, causal, and optimal. IEEE Transactions on Software Engineering, 24(2):149-159, 1998.
-
(1998)
IEEE Transactions on Software Engineering
, vol.24
, Issue.2
, pp. 149-159
-
-
Alvisi, L.1
Marzullo, K.2
-
10
-
-
0036267101
-
Causality tracking in causal message-logging protocols
-
L. Alvisi, K. Bhatia, and K. Marzullo. Causality tracking in causal message-logging protocols. Distributed Computing, 15(1):1-15, 2002.
-
(2002)
Distributed Computing
, vol.15
, Issue.1
, pp. 1-15
-
-
Alvisi, L.1
Bhatia, K.2
Marzullo, K.3
-
12
-
-
84919827070
-
PACMan: Coordinated memory caching for parallel jobs
-
G. Ananthanarayanan, A. Ghodsi, A. Wang, D. Borthakur, S. Kandula, S. Shenker, and I. Stoica. PACMan: Coordinated Memory Caching for Parallel Jobs. In NSDI 2012.
-
NSDI 2012
-
-
Ananthanarayanan, G.1
Ghodsi, A.2
Wang, A.3
Borthakur, D.4
Kandula, S.5
Shenker, S.6
Stoica, I.7
-
13
-
-
72249085354
-
FAWN: A fast array of wimpy nodes
-
ACM
-
D. G. Andersen, J. Franklin, M. Kaminsky, A. Phanishayee, L. Tan, and V. Vasudevan. FAWN: A fast array of wimpy nodes. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pages 1-14. ACM, 2009.
-
(2009)
Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles
, pp. 1-14
-
-
Andersen, D.G.1
Franklin, J.2
Kaminsky, M.3
Phanishayee, A.4
Tan, L.5
Vasudevan, V.6
-
14
-
-
84899979017
-
Flat datacenter storage
-
E. B. Nightingale, J. Elson, J. Fan, O. Hofmann, J. Howell, and Y. Suzue. Flat Datacenter Storage. In OSDI 2012.
-
OSDI 2012
-
-
Nightingale, E.B.1
Elson, J.2
Fan, J.3
Hofmann, O.4
Howell, J.5
Suzue, Y.6
-
15
-
-
79955085235
-
Megastore: Providing scalable, highly available storage for interactive services
-
J. Baker, C. Bond, J. Corbett, J. Furman, A. Khorlin, J. Larson, J.-M. Léon, Y. Li, A. Lloyd, and V. Yushprakh. Megastore: Providing scalable, highly available storage for interactive services. In CIDR, Volume 11, pages 223-234, 2011.
-
(2011)
CIDR
, vol.11
, pp. 223-234
-
-
Baker, J.1
Bond, C.2
Corbett, J.3
Furman, J.4
Khorlin, A.5
Larson, J.6
Léon, J.-M.7
Li, Y.8
Lloyd, A.9
Yushprakh, V.10
-
16
-
-
4544343377
-
Explicit control in the batch-aware distributed file system
-
J. Bent, D. Thain, A. C. Arpaci-Dusseau, R. H. Arpaci-Dusseau, and M. Livny. Explicit control in the batch-aware distributed file system. In NSDI, Volume 4, pages 365-378, 2004.
-
(2004)
NSDI
, vol.4
, pp. 365-378
-
-
Bent, J.1
Thain, D.2
Arpaci-Dusseau, A.C.3
Arpaci-Dusseau, R.H.4
Livny, M.5
-
17
-
-
24344453002
-
Lineage retrieval for scientific data processing: A survey
-
R. Bose and J. Frew. Lineage Retrieval for Scientific Data Processing: A Survey. In ACM Computing Surveys 2005.
-
ACM Computing Surveys 2005
-
-
Bose, R.1
Frew, J.2
-
18
-
-
24344453002
-
Lineage retrieval for scientific data processing: A survey
-
R. Bose and J. Frew. Lineage retrieval for scientific data processing: a survey. ACM Computing Surveys (CSUR), 37(1):1-28, 2005.
-
(2005)
ACM Computing Surveys (CSUR)
, vol.37
, Issue.1
, pp. 1-28
-
-
Bose, R.1
Frew, J.2
-
19
-
-
77954727236
-
FlumeJava: Easy, efficient data-parallel pipelines
-
C. Chambers et al. FlumeJava: easy, efficient data-parallel pipelines. In PLDI 2010.
-
PLDI 2010
-
-
Chambers, C.1
-
20
-
-
84873134968
-
Interactive analytical processing in big data systems: A cross-industry study of MapReduce workloads
-
Y. Chen, S. Alspaugh, and R. Katz. Interactive analytical processing in big data systems: A cross-industry study of MapReduce workloads. Proceedings of the VLDB Endowment, 5(12):1802-1813, 2012.
-
(2012)
Proceedings of the VLDB Endowment
, vol.5
, Issue.12
, pp. 1802-1813
-
-
Chen, Y.1
Alspaugh, S.2
Katz, R.3
-
23
-
-
85030321143
-
MapReduce: Simplified data processing on large clusters
-
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In OSDI 2004.
-
OSDI 2004
-
-
Dean, J.1
Ghemawat, S.2
-
25
-
-
84894579876
-
HyperDex: A distributed, searchable key-value store
-
R. Escriva, B. Wong, and E. G. Sirer. HyperDex: A distributed, searchable key-value store. ACM SIGCOMM Computer Communication Review, 42(4):25-36, 2012.
-
(2012)
ACM SIGCOMM Computer Communication Review
, vol.42
, Issue.4
, pp. 25-36
-
-
Escriva, R.1
Wong, B.2
Sirer, E.G.3
-
28
-
-
85077058426
-
CDE: Using system call interposition to automatically create portable software packages
-
P. J. Guo and D. Engler. CDE: Using system call interposition to automatically create portable software packages. In Proceedings of the 2011 USENIX Annual Technical Conference, pages 247-252, 2011.
-
(2011)
Proceedings of the 2011 USENIX Annual Technical Conference
, pp. 247-252
-
-
Guo, P.J.1
Engler, D.2
-
30
-
-
34548041192
-
Dryad: Distributed data-parallel programs from sequential building blocks
-
M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. ACM SIGOPS Operating Systems Review, 41(3):59-72, 2007.
-
(2007)
ACM SIGOPS Operating Systems Review
, vol.41
, Issue.3
, pp. 59-72
-
-
Isard, M.1
Budiu, M.2
Yu, Y.3
Birrell, A.4
Fetterly, D.5
-
31
-
-
72249118633
-
Quincy: Fair scheduling for distributed computing clusters
-
November
-
M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg. Quincy: Fair scheduling for distributed computing clusters. In SOSP, November 2009.
-
(2009)
SOSP
-
-
Isard, M.1
Prabhakaran, V.2
Currey, J.3
Wieder, U.4
Talwar, K.5
Goldberg, A.6
-
32
-
-
84940830917
-
Reliable, memory speed storage for cluster computing frameworks
-
University of California, Berkeley, Jun
-
H. Li, A. Ghodsi, M. Zaharia, S. Shenker, and I. Stoica. Reliable, memory speed storage for cluster computing frameworks. Technical Report UCB/EECS-2014-135, EECS Department, University of California, Berkeley, Jun 2014.
-
(2014)
Technical Report UCB/EECS-2014-135, EECS Department
-
-
Li, H.1
Ghodsi, A.2
Zaharia, M.3
Shenker, S.4
Stoica, I.5
-
33
-
-
79956021016
-
Priority inversion and its control: An experimental investigation
-
ACM
-
D. Locke, L. Sha, R. Rajkumar, J. Lehoczky, and G. Burns. Priority inversion and its control: An experimental investigation. In ACM SIGAda Ada Letters, Volume 8, pages 39-42. ACM, 1988.
-
(1988)
ACM SIGAda Ada Letters
, vol.8
, pp. 39-42
-
-
Locke, D.1
Sha, L.2
Rajkumar, R.3
Lehoczky, J.4
Burns, G.5
-
34
-
-
84863735533
-
Distributed GraphLab: A framework for machine learning and data mining in the cloud
-
Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed GraphLab: A framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5(8):716-727, 2012.
-
(2012)
Proceedings of the VLDB Endowment
, vol.5
, Issue.8
, pp. 716-727
-
-
Low, Y.1
Bickson, D.2
Gonzalez, J.3
Guestrin, C.4
Kyrola, A.5
Hellerstein, J.M.6
-
35
-
-
77954723629
-
Pregel: A system for large-scale graph processing
-
ACM
-
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pages 135-146. ACM, 2010.
-
(2010)
Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data
, pp. 135-146
-
-
Malewicz, G.1
Austern, M.H.2
Bik, A.J.3
Dehnert, J.C.4
Horn, I.5
Leiser, N.6
Czajkowski, G.7
-
36
-
-
79958258284
-
Dremel: Interactive analysis of web-scale datasets
-
S. Melnik, A. Gubarev, J. J. Long, G. Romer, S. Shivakumar, M. Tolton, and T. Vassilakis. Dremel: Interactive analysis of web-scale datasets. Proceedings of the VLDB Endowment, 3(1-2):330-339, 2010.
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.1-2
, pp. 330-339
-
-
Melnik, S.1
Gubarev, A.2
Long, J.J.3
Romer, G.4
Shivakumar, S.5
Tolton, M.6
Vassilakis, T.7
-
37
-
-
84885629677
-
Speculative execution in a distributed file system
-
ACM
-
E. B. Nightingale, P. M. Chen, and J. Flinn. Speculative execution in a distributed file system. In ACM SIGOPS Operating Systems Review, Volume 39, pages 191-205. ACM, 2005.
-
(2005)
ACM SIGOPS Operating Systems Review
, vol.39
, pp. 191-205
-
-
Nightingale, E.B.1
Chen, P.M.2
Flinn, J.3
-
38
-
-
55349148888
-
Pig Latin: A not-so-foreign language for data processing
-
C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: A not-so-foreign language for data processing. In SIGMOD '08, pages 1099-1110.
-
SIGMOD '08
, pp. 1099-1110
-
-
Olston, C.1
Reed, B.2
Srivastava, U.3
Kumar, R.4
Tomkins, A.5
-
39
-
-
79960006372
-
The case for RAMCloud
-
J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, D. Ongaro, G. Parulkar, et al. The case for RAMCloud. Communications of the ACM, 54(7):121-130, 2011.
-
(2011)
Communications of the ACM
, vol.54
, Issue.7
, pp. 121-130
-
-
Ousterhout, J.1
Agrawal, P.2
Erickson, D.3
Kozyrakis, C.4
Leverich, J.5
Mazières, D.6
Mitra, S.7
Narayanan, A.8
Ongaro, D.9
Parulkar, G.10
-
40
-
-
0003820750
-
An overview of checkpointing in uniprocessor and distributed systems, focusing on implementation and performance
-
J. Plank. An Overview of Checkpointing in Uniprocessor and Distributed Systems, Focusing on Implementation and Performance. In Technical Report, University of Tennessee, 1997.
-
(1997)
Technical Report, University of Tennessee
-
-
Plank, J.1
-
44
-
-
84870524514
-
Heterogeneity and dynamicity of clouds at scale: Google trace analysis
-
ACM
-
C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, and M. A. Kozuch. Heterogeneity and dynamicity of clouds at scale: Google trace analysis. In Proceedings of the Third ACM Symposium on Cloud Computing. ACM, 2012.
-
(2012)
Proceedings of the Third ACM Symposium on Cloud Computing
-
-
Reiss, C.1
Tumanov, A.2
Ganger, G.R.3
Katz, R.H.4
Kozuch, M.A.5
-
45
-
-
77957838299
-
The Hadoop distributed file system
-
K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The Hadoop distributed file system. In 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pages 1-10. IEEE, 2010.
-
(2010)
2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)
, pp. 1-10
-
-
Shvachko, K.1
Kuang, H.2
Radia, S.3
Chansler, R.4
-
46
-
-
77952775707
-
Hive - a petabyte scale data warehouse using Hadoop
-
A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, N. Zhang, S. Antony, H. Liu, and R. Murthy. Hive - a petabyte scale data warehouse using Hadoop. In 2010 IEEE 26th International Conference on Data Engineering (ICDE), pages 996-1005. IEEE, 2010.
-
(2010)
2010 IEEE 26th International Conference on Data Engineering (ICDE)
, pp. 996-1005
-
-
Thusoo, A.1
Sarma, J.S.2
Jain, N.3
Shao, Z.4
Chakka, P.5
Zhang, N.6
Antony, S.7
Liu, H.8
Murthy, R.9
-
48
-
-
0031388399
-
Impact of checkpoint latency on overhead ratio of a checkpointing scheme
-
N. H. Vaidya. Impact of Checkpoint Latency on Overhead Ratio of a Checkpointing Scheme. In IEEE Trans. Computers 1997.
-
IEEE Trans. Computers 1997
-
-
Vaidya, N.H.1
-
49
-
-
85038881415
-
Ceph: A scalable, high-performance distributed file system
-
USENIX Association
-
S. A. Weil, S. A. Brandt, E. L. Miller, D. D. Long, and C. Maltzahn. Ceph: A scalable, high-performance distributed file system. In Proceedings of the 7th symposium on Operating systems design and implementation, pages 307-320. USENIX Association, 2006.
-
(2006)
Proceedings of the 7th Symposium on Operating Systems Design and Implementation
, pp. 307-320
-
-
Weil, S.A.1
Brandt, S.A.2
Miller, E.L.3
Long, D.D.4
Maltzahn, C.5
-
50
-
-
84976846528
-
A first order approximation to the optimum checkpoint interval
-
Sept
-
J. W. Young. A first order approximation to the optimum checkpoint interval. Commun. ACM, 17:530-531, Sept 1974. ISSN 0001-0782.
-
(1974)
Commun. ACM
, vol.17
, pp. 530-531
-
-
Young, J.W.1
-
51
-
-
85076882757
-
DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language
-
USENIX Association
-
Y. Yu, M. Isard, D. Fetterly, M. Budiu, Ú. Erlingsson, P. K. Gunda, and J. Currey. DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, pages 1-14. USENIX Association, 2008.
-
(2008)
Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation
, pp. 1-14
-
-
Yu, Y.1
Isard, M.2
Fetterly, D.3
Budiu, M.4
Erlingsson, U.5
Gunda, P.K.6
Currey, J.7
-
52
-
-
77954636142
-
Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling
-
M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, and I. Stoica. Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling. In EuroSys 10, 2010.
-
(2010)
EuroSys 10
-
-
Zaharia, M.1
Borthakur, D.2
Sen Sarma, J.3
Elmeleegy, K.4
Shenker, S.5
Stoica, I.6
-
53
-
-
85040175609
-
Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing
-
USENIX Association
-
M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation. USENIX Association, 2012.
-
(2012)
Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation
-
-
Zaharia, M.1
Chowdhury, M.2
Das, T.3
Dave, A.4
Ma, J.5
McCauley, M.6
Franklin, M.J.7
Shenker, S.8
Stoica, I.9
-
54
-
-
84889637396
-
Discretized streams: Fault-tolerant streaming computation at scale
-
ACM
-
M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker, and I. Stoica. Discretized streams: Fault-tolerant streaming computation at scale. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 423-438. ACM, 2013.
-
(2013)
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
, pp. 423-438
-
-
Zaharia, M.1
Das, T.2
Li, H.3
Hunter, T.4
Shenker, S.5
Stoica, I.6