-
1
-
-
84878806200
-
-
20Newsgroups Dataset. Available at: http://people.csail.mit.edu/jrennie/20newsgroups/.
-
20Newsgroups Dataset
-
-
-
2
-
-
79957809015
-
HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads
-
A. Abouzeid, K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz, and A. Rasin. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. PVLDB, 2(1):922-933, 2009.
-
(2009)
PVLDB
, vol.2
, Issue.1
, pp. 922-933
-
-
Abouzeid, A.1
Bajda-Pawlikowski, K.2
Abadi, D.3
Silberschatz, A.4
Rasin, A.5
-
5
-
-
0141607824
-
Latent dirichlet allocation
-
D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. JMLR, 3:993-1022, 2003.
-
(2003)
JMLR
, vol.3
, pp. 993-1022
-
-
Blei, D.M.1
Ng, A.Y.2
Jordan, M.I.3
-
6
-
-
79956351190
-
HaLoop: Efficient iterative data processing on large clusters
-
Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst. HaLoop: Efficient iterative data processing on large clusters. PVLDB, 3(1-2):285-296, 2010.
-
(2010)
PVLDB
, vol.3
, Issue.1-2
, pp. 285-296
-
-
Bu, Y.1
Howe, B.2
Balazinska, M.3
Ernst, M.D.4
-
7
-
-
84860560293
-
SCOPE: Easy and efficient parallel processing of massive data sets
-
R. Chaiken, B. Jenkins, P. Larson, B. Ramsey, D. Shakib, S. Weaver, and J. Zhou. SCOPE: easy and efficient parallel processing of massive data sets. PVLDB, 1(2):1265-1276, 2008.
-
(2008)
PVLDB
, vol.1
, Issue.2
, pp. 1265-1276
-
-
Chaiken, R.1
Jenkins, B.2
Larson, P.3
Ramsey, B.4
Shakib, D.5
Weaver, S.6
Zhou, J.7
-
8
-
-
77954727236
-
FlumeJava: Easy, efficient data-parallel pipelines
-
C. Chambers, A. Raniwala, F. Perry, S. Adams, R. R. Henry, R. Bradshaw, and N. Weizenbaum. FlumeJava: Easy, efficient data-parallel pipelines. In PLDI, 2010.
-
(2010)
PLDI
-
-
Chambers, C.1
Raniwala, A.2
Perry, F.3
Adams, S.4
Henry, R.R.5
Bradshaw, R.6
Weizenbaum, N.7
-
9
-
-
84862684677
-
Tenzing: A SQL implementation on the MapReduce framework
-
B. Chattopadhyay, L. Lin, W. Liu, S. Mittal, P. Aragonda, V. Lychagina, Y. Kwon, and M. Wong. Tenzing: A SQL implementation on the MapReduce framework. PVLDB, 4(12):1318-1327, 2011.
-
(2011)
PVLDB
, vol.4
, Issue.12
, pp. 1318-1327
-
-
Chattopadhyay, B.1
Lin, L.2
Liu, W.3
Mittal, S.4
Aragonda, P.5
Lychagina, V.6
Kwon, Y.7
Wong, M.8
-
10
-
-
85075599381
-
Map-reduce for machine learning on multicore
-
C.-T. Chu, S. K. Kim, Y.-A. Lin, Y. Yu, G. Bradski, A. Y. Ng, and K. Olukotun. Map-reduce for machine learning on multicore. In NIPS, 2006.
-
(2006)
NIPS
-
-
Chu, C.-T.1
Kim, S.K.2
Lin, Y.-A.3
Yu, Y.4
Bradski, G.5
Ng, A.Y.6
Olukotun, K.7
-
12
-
-
77954751910
-
Ricardo: Integrating R and Hadoop
-
S. Das, Y. Sismanis, K. S. Beyer, R. Gemulla, P. J. Haas, and J. McPherson. Ricardo: Integrating R and Hadoop. In SIGMOD, 2010.
-
(2010)
SIGMOD
-
-
Das, S.1
Sismanis, Y.2
Beyer, K.S.3
Gemulla, R.4
Haas, P.J.5
McPherson, J.6
-
13
-
-
77954729163
-
On probabilistic fixpoint and Markov chain query languages
-
D. Deutch, C. Koch, and T. Milo. On probabilistic fixpoint and Markov chain query languages. In PODS, pages 215-226, 2010.
-
(2010)
PODS
, pp. 215-226
-
-
Deutch, D.1
Koch, C.2
Milo, T.3
-
14
-
-
79957859069
-
SystemML: Declarative machine learning on mapreduce
-
A. Ghoting, R. Krishnamurthy, E. Pednault, B. Reinwald, V. Sindhwani, S. Tatikonda, Y. Tian, and S. Vaithyanathan. SystemML: Declarative machine learning on mapreduce. In ICDE, 2011.
-
(2011)
ICDE
-
-
Ghoting, A.1
Krishnamurthy, R.2
Pednault, E.3
Reinwald, B.4
Sindhwani, V.5
Tatikonda, S.6
Tian, Y.7
Vaithyanathan, S.8
-
15
-
-
0034829234
-
Optimizing queries using materialized views: A practical, scalable solution
-
J. Goldstein and P. Larson. Optimizing Queries Using Materialized Views: A practical, scalable solution. In SIGMOD, 2001.
-
(2001)
SIGMOD
-
-
Goldstein, J.1
Larson, P.2
-
16
-
-
34548770803
-
Models for incomplete and probabilistic information
-
T. J. Green and V. Tannen. Models for incomplete and probabilistic information. IEEE Data Eng. Bull., 29(1):17-24, 2006.
-
(2006)
IEEE Data Eng. Bull.
, vol.29
, Issue.1
, pp. 17-24
-
-
Green, T.J.1
Tannen, V.2
-
17
-
-
0027872183
-
Optimal histograms for limiting worst-case error propagation in the size of join results
-
Y. E. Ioannidis and S. Christodoulakis. Optimal histograms for limiting worst-case error propagation in the size of join results. TODS, 18(4):709-748, 1993.
-
(1993)
TODS
, vol.18
, Issue.4
, pp. 709-748
-
-
Ioannidis, Y.E.1
Christodoulakis, S.2
-
18
-
-
80052344771
-
The Monte Carlo database system: Stochastic analysis close to the data
-
R. Jampani, F. Xu, M. Wu, L. L. Perez, C. Jermaine, and P. J. Haas. The Monte Carlo Database System: Stochastic analysis close to the data. TODS, 36(3):1-41, 2011.
-
(2011)
TODS
, vol.36
, Issue.3
, pp. 1-41
-
-
Jampani, R.1
Xu, F.2
Wu, M.3
Perez, L.L.4
Jermaine, C.5
Haas, P.J.6
-
19
-
-
77951152705
-
Pegasus: A peta-scale graph mining system - Implementation and observations
-
U. Kang, C. E. Tsourakakis, and C. Faloutsos. Pegasus: A peta-scale graph mining system - implementation and observations. In ICDM, 2009.
-
(2009)
ICDM
-
-
Kang, U.1
Tsourakakis, C.E.2
Faloutsos, C.3
-
20
-
-
79959942894
-
Jigsaw: Efficient optimization over uncertain enterprise data
-
O. Kennedy and S. Nath. Jigsaw: Efficient optimization over uncertain enterprise data. In SIGMOD, 2011.
-
(2011)
SIGMOD
-
-
Kennedy, O.1
Nath, S.2
-
21
-
-
63749084566
-
On query algebras for probabilistic databases
-
C. Koch. On query algebras for probabilistic databases. SIGMOD Record, 37(4):78-85, 2008.
-
(2008)
SIGMOD Record
, vol.37
, Issue.4
, pp. 78-85
-
-
Koch, C.1
-
22
-
-
84939166370
-
Scalable clustering for n-body simulations in a shared-nothing cluster
-
Y. Kwon, D. Nunley, J. D. Gardner, M. Balazinska, B. Howe, and S. Loebman. Scalable clustering for n-body simulations in a shared-nothing cluster. In SSDBM, 2010.
-
(2010)
SSDBM
-
-
Kwon, Y.1
Nunley, D.2
Gardner, J.D.3
Balazinska, M.4
Howe, B.5
Loebman, S.6
-
23
-
-
79955694310
-
PLDA+: Parallel latent Dirichlet allocation with data placement and pipeline processing
-
Z. Liu, Y. Zhang, E. Y. Chang, and M. Sun. PLDA+: Parallel Latent Dirichlet Allocation with Data Placement and Pipeline Processing. ACM TIST, 2(3):26:1-26:18, 2011.
-
(2011)
ACM TIST
, vol.2
, Issue.3
, pp. 26:1-26:18
-
-
Liu, Z.1
Zhang, Y.2
Chang, E.Y.3
Sun, M.4
-
24
-
-
80052875653
-
GraphLab: A new parallel framework for machine learning
-
Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. GraphLab: A New Parallel Framework for Machine Learning. In UAI, 2010.
-
(2010)
UAI
-
-
Low, Y.1
Gonzalez, J.2
Kyrola, A.3
Bickson, D.4
Guestrin, C.5
Hellerstein, J.M.6
-
25
-
-
84880564635
-
-
Apache Mahout. Available at: http://mahout.apache.org/.
-
-
-
Mahout, A.1
-
26
-
-
84863448825
-
Efficiently compiling efficient query plans for modern hardware
-
T. Neumann. Efficiently compiling efficient query plans for modern hardware. PVLDB, 4(9):539-550, 2011.
-
(2011)
PVLDB
, vol.4
, Issue.9
, pp. 539-550
-
-
Neumann, T.1
-
27
-
-
77955032649
-
Planet: Massively parallel learning of tree ensembles with mapreduce
-
B. Panda, J. S. Herbach, S. Basu, and R. J. Bayardo. Planet: Massively parallel learning of tree ensembles with mapreduce. PVLDB, 2(2):1426-1437, 2009.
-
(2009)
PVLDB
, vol.2
, Issue.2
, pp. 1426-1437
-
-
Panda, B.1
Herbach, J.S.2
Basu, S.3
Bayardo, R.J.4
-
28
-
-
67149126890
-
DisCo: Distributed co-clustering with map-reduce: A case study towards petabyte-scale end-to-end mining
-
S. Papadimitriou and J. Sun. DisCo: Distributed co-clustering with map-reduce: A case study towards petabyte-scale end-to-end mining. In ICDM, 2008.
-
(2008)
ICDM
-
-
Papadimitriou, S.1
Sun, J.2
-
29
-
-
65449138803
-
Fast collapsed gibbs sampling for latent dirichlet allocation
-
I. Porteous, D. Newman, A. T. Ihler, A. Asuncion, P. Smyth, and M. Welling. Fast collapsed Gibbs sampling for Latent Dirichlet Allocation. In SIGKDD, 2008.
-
(2008)
SIGKDD
-
-
Porteous, I.1
Newman, D.2
Ihler, A.T.3
Asuncion, A.4
Smyth, P.5
Welling, M.6
-
33
-
-
80052119994
-
An architecture for parallel topic models
-
A. J. Smola and S. Narayanamurthy. An Architecture for Parallel Topic models. PVLDB, 3(1):703-710, 2010.
-
(2010)
PVLDB
, vol.3
, Issue.1
, pp. 703-710
-
-
Smola, A.J.1
Narayanamurthy, S.2
-
34
-
-
84868325513
-
Hive: A warehousing solution over a map-reduce framework
-
A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy. Hive: a warehousing solution over a map-reduce framework. PVLDB, 2(2):1626-1629, 2009.
-
(2009)
PVLDB
, vol.2
, Issue.2
, pp. 1626-1629
-
-
Thusoo, A.1
Sarma, J.S.2
Jain, N.3
Shao, Z.4
Chakka, P.5
Anthony, S.6
Liu, H.7
Wyckoff, P.8
Murthy, R.9
-
35
-
-
80052694199
-
Behavioral simulations in mapreduce
-
G. Wang, M. V. Salles, B. Sowell, X. Wang, T. Cao, A. Demers, J. Gehrke, and W. White. Behavioral simulations in mapreduce. PVLDB, 3(1-2):952-963, 2010.
-
(2010)
PVLDB
, vol.3
, Issue.1-2
, pp. 952-963
-
-
Wang, G.1
Salles, M.V.2
Sowell, B.3
Wang, X.4
Cao, T.5
Demers, A.6
Gehrke, J.7
White, W.8
-
36
-
-
79961186434
-
Scalable probabilistic databases with factor graphs and MCMC
-
M. Wick, A. McCallum, and G. Miklau. Scalable probabilistic databases with factor graphs and MCMC. In VLDB, 2010.
-
(2010)
VLDB
-
-
Wick, M.1
McCallum, A.2
Miklau, G.3
-
37
-
-
70350591395
-
DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language
-
Y. Yu, M. Isard, D. Fetterly, M. Budiu, U. Erlingsson, P. K. Gunda, and J. Currey. DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. In OSDI, 2008.
-
(2008)
OSDI
-
-
Yu, Y.1
Isard, M.2
Fetterly, D.3
Budiu, M.4
Erlingsson, U.5
Gunda, P.K.6
Currey, J.7
-
38
-
-
79959979871
-
Hybrid in-database inference for declarative information extraction
-
D. Z. Zhang, M. J. Franklin, M. Garofalakis, J. M. Hellerstein, and M. L. Wick. Hybrid in-database inference for declarative information extraction. In SIGMOD, 2011.
-
(2011)
SIGMOD
-
-
Zhang, D.Z.1
Franklin, M.J.2
Garofalakis, M.3
Hellerstein, J.M.4
Wick, M.L.5