-
1
-
-
79957809015
-
An architectural hybrid of MapReduce and DBMS technologies for analytical workloads
-
A. Abouzeid, K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz, and A. Rasin. HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads. VLDB, pp. 922-933, 2009.
-
(2009)
VLDB
, pp. 922-933
-
-
Abouzeid, A.1
Bajda-Pawlikowski, K.2
Abadi, D.3
Silberschatz, A.4
Rasin, A.5
-
2
-
-
0029212693
-
Mining sequential patterns
-
R. Agrawal and R. Srikant. Mining sequential patterns. ICDE, pp. 3-14, 1995.
-
(1995)
ICDE
, pp. 3-14
-
-
Agrawal, R.1
Srikant, R.2
-
3
-
-
0025183708
-
Basic local alignment search tool
-
S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 215(3):403-410, 1990.
-
(1990)
Journal of Molecular Biology
, vol.215
, Issue.3
, pp. 403-410
-
-
Altschul, S.1
Gish, W.2
Miller, W.3
Myers, E.4
Lipman, D.5
-
4
-
-
79959994432
-
Efficient processing of data warehousing queries in a split execution environment
-
K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz, and E. Paulson. Efficient processing of data warehousing queries in a split execution environment. SIGMOD, pp. 1165-1176, 2011.
-
(2011)
SIGMOD
, pp. 1165-1176
-
-
Bajda-Pawlikowski, K.1
Abadi, D.2
Silberschatz, A.3
Paulson, E.4
-
5
-
-
79960018131
-
Apache Hadoop goes realtime at Facebook
-
D. Borthakur, J. Gray, J. Sarma, K. Muthukkaruppan, N. Spiegelberg, H. Kuang, K. Ranganathan, D. Molkov, A. Menon, S. Rash, R. Schmidt, and A. Aiyer. Apache Hadoop goes realtime at Facebook. SIGMOD, pp. 1071-1080, 2011.
-
(2011)
SIGMOD
, pp. 1071-1080
-
-
Borthakur, D.1
Gray, J.2
Sarma, J.3
Muthukkaruppan, K.4
Spiegelberg, N.5
Kuang, H.6
Ranganathan, K.7
Molkov, D.8
Menon, A.9
Rash, S.10
Schmidt, R.11
Aiyer, A.12
-
6
-
-
84936824188
-
Word association norms, mutual information, and lexicography
-
K. Church and P. Hanks. Word association norms, mutual information, and lexicography. Computational Linguistics, 16(1):22-29, 1990.
-
(1990)
Computational Linguistics
, vol.16
, Issue.1
, pp. 22-29
-
-
Church, K.1
Hanks, P.2
-
7
-
-
85030321143
-
MapReduce: Simplified data processing on large clusters
-
J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. OSDI, pp. 137-150, 2004.
-
(2004)
OSDI
, pp. 137-150
-
-
Dean, J.1
Ghemawat, S.2
-
8
-
-
80053521271
-
Hadoop++: Making a yellow elephant run like a cheetah (without it even noticing)
-
J. Dittrich, J.-A. Quiané-Ruiz, A. Jindal, Y. Kargin, V. Setty, and J. Schad. Hadoop++: Making a yellow elephant run like a cheetah (without it even noticing). VLDB, pp. 515-529, 2010.
-
(2010)
VLDB
, pp. 515-529
-
-
Dittrich, J.1
Quiané-Ruiz, J.-A.2
Jindal, A.3
Kargin, Y.4
Setty, V.5
Schad, J.6
-
9
-
-
85055298348
-
Accurate methods for the statistics of surprise and coincidence
-
T. Dunning. Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19(1):61-74, 1993.
-
(1993)
Computational Linguistics
, vol.19
, Issue.1
, pp. 61-74
-
-
Dunning, T.1
-
10
-
-
77952278077
-
Building a high-level dataflow system on top of MapReduce: The Pig experience
-
A. Gates, O. Natkovich, S. Chopra, P. Kamath, S. Narayanamurthy, C. Olston, B. Reed, S. Srinivasan, and U. Srivastava. Building a high-level dataflow system on top of MapReduce: The Pig experience. VLDB, pp. 1414-1425, 2009.
-
(2009)
VLDB
, pp. 1414-1425
-
-
Gates, A.1
Natkovich, O.2
Chopra, S.3
Kamath, P.4
Narayanamurthy, S.5
Olston, C.6
Reed, B.7
Srinivasan, S.8
Srivastava, U.9
-
11
-
-
79957794587
-
RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems
-
Y. He, R. Lee, Y. Huai, Z. Shao, N. Jain, X. Zhang, and Z. Xu. RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems. ICDE, pp. 1199-1208, 2011.
-
(2011)
ICDE
, pp. 1199-1208
-
-
He, Y.1
Lee, R.2
Huai, Y.3
Shao, Z.4
Jain, N.5
Zhang, X.6
Xu, Z.7
-
12
-
-
85077109271
-
ZooKeeper: Wait-free coordination for Internet-scale systems
-
P. Hunt, M. Konar, F. Junqueira, and B. Reed. ZooKeeper: Wait-free coordination for Internet-scale systems. USENIX ATC, pp. 145-158, 2010.
-
(2010)
USENIX ATC
, pp. 145-158
-
-
Hunt, P.1
Konar, M.2
Junqueira, F.3
Reed, B.4
-
13
-
-
82155168632
-
Trojan data layouts: Right shoes for a running elephant
-
A. Jindal, J.-A. Quiané-Ruiz, and J. Dittrich. Trojan data layouts: Right shoes for a running elephant. SoCC, pp. 21:1-21:14, 2011.
-
(2011)
SoCC
-
-
Jindal, A.1
Quiané-Ruiz, J.-A.2
Dittrich, J.3
-
14
-
-
84925370208
-
Speech and Language Processing
-
D. Jurafsky and J. Martin. Speech and Language Processing. Pearson, 2009.
-
(2009)
Pearson
-
-
Jurafsky, D.1
Martin, J.2
-
15
-
-
85061893834
-
A generative constituent-context model for improved grammar induction
-
D. Klein and C. Manning. A generative constituent-context model for improved grammar induction. ACL, pp. 128-135, 2002.
-
(2002)
ACL
, pp. 128-135
-
-
Klein, D.1
Manning, C.2
-
16
-
-
36849051407
-
Practical guide to controlled experiments on the web: Listen to your customers not to the HiPPO
-
R. Kohavi, R. Henne, and D. Sommerfield. Practical guide to controlled experiments on the web: Listen to your customers not to the HiPPO. KDD, pp. 959-967, 2007.
-
(2007)
KDD
, pp. 959-967
-
-
Kohavi, R.1
Henne, R.2
Sommerfield, D.3
-
17
-
-
84873171258
-
Kafka: A distributed messaging system for log processing
-
J. Kreps, N. Narkhede, and J. Rao. Kafka: A distributed messaging system for log processing. NetDB, 2011.
-
(2011)
NetDB
-
-
Kreps, J.1
Narkhede, N.2
Rao, J.3
-
18
-
-
84862684679
-
Large-scale machine learning at Twitter
-
J. Lin and A. Kolcz. Large-scale machine learning at Twitter. SIGMOD, pp. 793-804, 2012.
-
(2012)
SIGMOD
, pp. 793-804
-
-
Lin, J.1
Kolcz, A.2
-
19
-
-
79961034447
-
Full-text indexing for optimizing selection operations in large-scale data analytics
-
J. Lin, D. Ryaboy, and K. Weil. Full-text indexing for optimizing selection operations in large-scale data analytics. MAPREDUCE Workshop, pp. 59-66, 2011.
-
(2011)
MAPREDUCE Workshop
, pp. 59-66
-
-
Lin, J.1
Ryaboy, D.2
Weil, K.3
-
20
-
-
67650932418
-
Modeling actions of PubMed users with n-gram language models
-
J. Lin and W. Wilbur. Modeling actions of PubMed users with n-gram language models. Information Retrieval, 12(4):487-503, 2009.
-
(2009)
Information Retrieval
, vol.12
, Issue.4
, pp. 487-503
-
-
Lin, J.1
Wilbur, W.2
-
21
-
-
79959945877
-
Llama: Leveraging columnar storage for scalable join processing in the MapReduce framework
-
Y. Lin, D. Agrawal, C. Chen, B. Ooi, and S. Wu. Llama: Leveraging columnar storage for scalable join processing in the MapReduce framework. SIGMOD, pp. 961-972, 2011.
-
(2011)
SIGMOD
, pp. 961-972
-
-
Lin, Y.1
Agrawal, D.2
Chen, C.3
Ooi, B.4
Wu, S.5
-
22
-
-
0035789606
-
Funnel report mining for the MSN network
-
T. Mah, H. Hoek, and Y. Li. Funnel report mining for the MSN network. KDD, pp. 450-455, 2001.
-
(2001)
KDD
, pp. 450-455
-
-
Mah, T.1
Hoek, H.2
Li, Y.3
-
24
-
-
11144311052
-
Modeling online browsing and path analysis using clickstream data
-
A. Montgomery, S. Li, K. Srinivasan, and J. Liechty. Modeling online browsing and path analysis using clickstream data. Marketing Science, 23(4):579-595, 2004.
-
(2004)
Marketing Science
, vol.23
, Issue.4
, pp. 579-595
-
-
Montgomery, A.1
Li, S.2
Srinivasan, K.3
Liechty, J.4
-
25
-
-
79957860992
-
Distributed cube materialization on holistic measures
-
A. Nandi, C. Yu, P. Bohannon, and R. Ramakrishnan. Distributed cube materialization on holistic measures. ICDE, pp. 183-194, 2011.
-
(2011)
ICDE
, pp. 183-194
-
-
Nandi, A.1
Yu, C.2
Bohannon, P.3
Ramakrishnan, R.4
-
26
-
-
55349148888
-
Pig Latin: A not-so-foreign language for data processing
-
C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: A not-so-foreign language for data processing. SIGMOD, pp. 1099-1110, 2008.
-
(2008)
SIGMOD
, pp. 1099-1110
-
-
Olston, C.1
Reed, B.2
Srivastava, U.3
Kumar, R.4
Tomkins, A.5
-
27
-
-
85016826214
-
A comparative evaluation of collocation extraction techniques
-
D. Pearce. A comparative evaluation of collocation extraction techniques. LREC, pp. 1530-1536, 2002.
-
(2002)
LREC
, pp. 1530-1536
-
-
Pearce, D.1
-
28
-
-
77952775707
-
Hive - a petabyte scale data warehouse using Hadoop
-
A. Thusoo, J. Sarma, N. Jain, Z. Shao, P. Chakka, N. Zhang, S. Anthony, H. Liu, and R. Murthy. Hive - a petabyte scale data warehouse using Hadoop. ICDE, 2010.
-
(2010)
ICDE
-
-
Thusoo, A.1
Sarma, J.2
Jain, N.3
Shao, Z.4
Chakka, P.5
Zhang, N.6
Anthony, S.7
Liu, H.8
Murthy, R.9
-
29
-
-
77954709174
-
Data warehousing and analytics infrastructure at Facebook
-
A. Thusoo, Z. Shao, S. Anthony, D. Borthakur, N. Jain, J. Sarma, R. Murthy, and H. Liu. Data warehousing and analytics infrastructure at Facebook. SIGMOD, 2010.
-
(2010)
SIGMOD
-
-
Thusoo, A.1
Shao, Z.2
Anthony, S.3
Borthakur, D.4
Jain, N.5
Sarma, J.6
Murthy, R.7
Liu, H.8
-
30
-
-
79957944320
-
LifeFlow: Visualizing an overview of event sequences
-
K. Wongsuphasawat, J. Gómez, C. Plaisant, T. Wang, M. Taieb-Maimon, and B. Shneiderman. LifeFlow: Visualizing an overview of event sequences. CHI Extended Abstracts, pp. 507-510, 2011.
-
(2011)
CHI Extended Abstracts
, pp. 507-510
-
-
Wongsuphasawat, K.1
Gómez, J.2
Plaisant, C.3
Wang, T.4
Taieb-Maimon, M.5
Shneiderman, B.6