-
4
-
-
80052917068
-
Sampling the repairs of functional dependency violations under hard constraints
-
G. Beskales, I. F. Ilyas, and L. Golab. Sampling the repairs of functional dependency violations under hard constraints. PVLDB, 3:197-207, 2010.
-
(2010)
PVLDB
, vol.3
, pp. 197-207
-
-
Beskales, G.1
Ilyas, I.F.2
Golab, L.3
-
5
-
-
29844436973
-
A cost-based model and effective heuristic for repairing constraints by value modification
-
P. Bohannon, M. Flaster, W. Fan, and R. Rastogi. A cost-based model and effective heuristic for repairing constraints by value modification. In SIGMOD, 2005.
-
(2005)
SIGMOD
-
-
Bohannon, P.1
Flaster, M.2
Fan, W.3
Rastogi, R.4
-
6
-
-
84881365460
-
Holistic data cleaning: Putting violations into context
-
X. Chu, I. F. Ilyas, and P. Papotti. Holistic Data Cleaning: Putting Violations into Context. In ICDE, 2013.
-
(2013)
ICDE
-
-
Chu, X.1
Ilyas, I.F.2
Papotti, P.3
-
7
-
-
84880546390
-
NADEEF: A commodity data cleaning system
-
M. Dallachiesa, A. Ebaid, A. Eldawy, A. Elmagarmid, I. F. Ilyas, M. Ouzzani, and N. Tang. NADEEF: A Commodity Data Cleaning System. In SIGMOD, 2013.
-
(2013)
SIGMOD
-
-
Dallachiesa, M.1
Ebaid, A.2
Eldawy, A.3
Elmagarmid, A.4
Ilyas, I.F.5
Ouzzani, M.6
Tang, N.7
-
8
-
-
37549003336
-
MapReduce: Simplified data processing on large clusters
-
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM, 51(1):107-113, 2008.
-
(2008)
Communications of the ACM
, vol.51
, Issue.1
, pp. 107-113
-
-
Dean, J.1
Ghemawat, S.2
-
11
-
-
46649106686
-
Conditional functional dependencies for capturing data inconsistencies
-
W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis. Conditional Functional Dependencies for Capturing Data Inconsistencies. ACM Transactions on Database Systems (TODS), 33(2):6:1-6:48, 2008.
-
(2008)
ACM Transactions on Database Systems (TODS)
, vol.33
, Issue.2
, pp. 6:1-6:48
-
-
Fan, W.1
Geerts, F.2
Jia, X.3
Kementsietsidis, A.4
-
12
-
-
84907031191
-
Conflict resolution with data currency and consistency
-
W. Fan, F. Geerts, N. Tang, and W. Yu. Conflict resolution with data currency and consistency. J. Data and Information Quality, 5(1-2):6, 2014.
-
(2014)
J. Data and Information Quality
, vol.5
, Issue.1-2
, pp. 6
-
-
Fan, W.1
Geerts, F.2
Tang, N.3
Yu, W.4
-
13
-
-
79959944062
-
Interaction between record matching and data repairing
-
W. Fan, J. Li, S. Ma, N. Tang, and W. Yu. Interaction between record matching and data repairing. In SIGMOD, 2011.
-
(2011)
SIGMOD
-
-
Fan, W.1
Li, J.2
Ma, S.3
Tang, N.4
Yu, W.5
-
14
-
-
84864198280
-
Incremental detection of inconsistencies in distributed data
-
W. Fan, J. Li, N. Tang, and W. Yu. Incremental Detection of Inconsistencies in Distributed Data. In ICDE, 2012.
-
(2012)
ICDE
-
-
Fan, W.1
Li, J.2
Tang, N.3
Yu, W.4
-
15
-
-
1542305821
-
A systematic approach to automatic edit and imputation
-
I. Fellegi and D. Holt. A systematic approach to automatic edit and imputation. J. American Statistical Association, 71(353), 1976.
-
(1976)
J. American Statistical Association
, vol.71
, Issue.353
-
-
Fellegi, I.1
Holt, D.2
-
17
-
-
84882696854
-
The LLUNATIC data-cleaning framework
-
F. Geerts, G. Mecca, P. Papotti, and D. Santoro. The LLUNATIC Data-Cleaning Framework. PVLDB, 6(9):625-636, 2013.
-
(2013)
PVLDB
, vol.6
, Issue.9
, pp. 625-636
-
-
Geerts, F.1
Mecca, G.2
Papotti, P.3
Santoro, D.4
-
19
-
-
84940824296
-
Proof positive and negative in data cleaning
-
M. Interlandi and N. Tang. Proof positive and negative in data cleaning. In ICDE, 2015.
-
(2015)
ICDE
-
-
Interlandi, M.1
Tang, N.2
-
20
-
-
84863535860
-
Automatic optimization for MapReduce programs
-
E. Jahani, M. J. Cafarella, and C. Ré. Automatic optimization for MapReduce programs. PVLDB, 4(6):385-396, 2011.
-
(2011)
PVLDB
, vol.4
, Issue.6
, pp. 385-396
-
-
Jahani, E.1
Cafarella, M.J.2
Ré, C.3
-
21
-
-
84880540611
-
Cartilage: Adding flexibility to the Hadoop skeleton
-
A. Jindal, J.-A. Quiané-Ruiz, and S. Madden. Cartilage: Adding Flexibility to the Hadoop Skeleton. In SIGMOD, 2013.
-
(2013)
SIGMOD
-
-
Jindal, A.1
Quiané-Ruiz, J.-A.2
Madden, S.3
-
23
-
-
77951101246
-
On approximating optimum repairs for functional dependency violations
-
S. Kolahi and L. V. S. Lakshmanan. On Approximating Optimum Repairs for Functional Dependency Violations. In ICDT, 2009.
-
(2009)
ICDT
-
-
Kolahi, S.1
Lakshmanan, L.V.S.2
-
24
-
-
84872977079
-
Dedoop: Efficient deduplication with Hadoop
-
L. Kolb, A. Thor, and E. Rahm. Dedoop: Efficient Deduplication with Hadoop. PVLDB, 2012.
-
(2012)
PVLDB
-
-
Kolb, L.1
Thor, A.2
Rahm, E.3
-
25
-
-
77954723629
-
Pregel: A system for large-scale graph processing
-
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A System for Large-scale Graph Processing. In SIGMOD, 2010.
-
(2010)
SIGMOD
-
-
Malewicz, G.1
Austern, M.H.2
Bik, A.J.3
Dehnert, J.C.4
Horn, I.5
Leiser, N.6
Czajkowski, G.7
-
26
-
-
77954714416
-
ERACER: A database approach for statistical inference and data cleaning
-
C. Mayfield, J. Neville, and S. Prabhakar. ERACER: a database approach for statistical inference and data cleaning. In SIGMOD, 2010.
-
(2010)
SIGMOD
-
-
Mayfield, C.1
Neville, J.2
Prabhakar, S.3
-
27
-
-
79960020260
-
Processing theta-joins using MapReduce
-
A. Okcan and M. Riedewald. Processing theta-joins using MapReduce. In SIGMOD, 2011.
-
(2011)
SIGMOD
-
-
Okcan, A.1
Riedewald, M.2
-
28
-
-
55349148888
-
Pig Latin: A not-so-foreign language for data processing
-
C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: A Not-so-foreign Language for Data Processing. In SIGMOD, 2008.
-
(2008)
SIGMOD
-
-
Olston, C.1
Reed, B.2
Srivastava, U.3
Kumar, R.4
Tomkins, A.5
-
29
-
-
0002490026
-
Data cleaning: Problems and current approaches
-
E. Rahm and H. H. Do. Data cleaning: Problems and current approaches. IEEE Data Eng. Bull., 23(4):3-13, 2000.
-
(2000)
IEEE Data Eng. Bull.
, vol.23
, Issue.4
, pp. 3-13
-
-
Rahm, E.1
Do, H.H.2
-
31
-
-
48249116542
-
Gartner warns firms of 'dirty data'
-
N. Swartz. Gartner warns firms of 'dirty data'. Information Management Journal, 41(3), 2007.
-
(2007)
Information Management Journal
, vol.41
, Issue.3
-
-
Swartz, N.1
-
32
-
-
84868325513
-
Hive: A warehousing solution over a map-reduce framework
-
A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy. Hive: A Warehousing Solution over a Map-reduce Framework. PVLDB, 2(2):1626-1629, 2009.
-
(2009)
PVLDB
, vol.2
, Issue.2
, pp. 1626-1629
-
-
Thusoo, A.1
Sarma, J.S.2
Jain, N.3
Shao, Z.4
Chakka, P.5
Anthony, S.6
Liu, H.7
Wyckoff, P.8
Murthy, R.9
-
33
-
-
84901749646
-
CrowdCleaner: Data cleaning for multi-version data on the web via crowdsourcing
-
Y. Tong, C. C. Cao, C. J. Zhang, Y. Li, and L. Chen. CrowdCleaner: Data cleaning for multi-version data on the web via crowdsourcing. In ICDE, 2014.
-
(2014)
ICDE
-
-
Tong, Y.1
Cao, C.C.2
Zhang, C.J.3
Li, Y.4
Chen, L.5
-
35
-
-
84904301041
-
A sample-and-clean framework for fast and accurate query processing on dirty data
-
J. Wang, S. Krishnan, M. J. Franklin, K. Goldberg, T. Kraska, and T. Milo. A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data. In SIGMOD, 2014.
-
(2014)
SIGMOD
-
-
Wang, J.1
Krishnan, S.2
Franklin, M.J.3
Goldberg, K.4
Kraska, T.5
Milo, T.6
-
36
-
-
84904293819
-
Towards dependable data repairing with fixing rules
-
J. Wang and N. Tang. Towards dependable data repairing with fixing rules. In SIGMOD, 2014.
-
(2014)
SIGMOD
-
-
Wang, J.1
Tang, N.2
-
37
-
-
84880566945
-
GraphX: A resilient distributed graph system on Spark
-
ACM
-
R. S. Xin, J. E. Gonzalez, M. J. Franklin, and I. Stoica. GraphX: A Resilient Distributed Graph System on Spark. In First International Workshop on Graph Data Management Experiences and Systems, GRADES. ACM, 2013.
-
(2013)
First International Workshop on Graph Data Management Experiences and Systems, GRADES
-
-
Xin, R.S.1
Gonzalez, J.E.2
Franklin, M.J.3
Stoica, I.4
-
38
-
-
84880533620
-
Shark: SQL and rich analytics at scale
-
R. S. Xin, J. Rosen, M. Zaharia, M. J. Franklin, S. Shenker, and I. Stoica. Shark: SQL and Rich Analytics at Scale. In SIGMOD, 2013.
-
(2013)
SIGMOD
-
-
Xin, R.S.1
Rosen, J.2
Zaharia, M.3
Franklin, M.J.4
Shenker, S.5
Stoica, I.6
-
39
-
-
84880515658
-
Don't be SCAREd: Use SCalable Automatic REpairing with maximal likelihood and bounded changes
-
M. Yakout, L. Berti-Equille, and A. K. Elmagarmid. Don't be SCAREd: use SCalable Automatic REpairing with maximal likelihood and bounded changes. In SIGMOD, 2013.
-
(2013)
SIGMOD
-
-
Yakout, M.1
Berti-Equille, L.2
Elmagarmid, A.K.3
-
40
-
-
80052313567
-
Guided data repair
-
M. Yakout, A. K. Elmagarmid, J. Neville, M. Ouzzani, and I. F. Ilyas. Guided data repair. PVLDB, 2011.
-
(2011)
PVLDB
-
-
Yakout, M.1
Elmagarmid, A.K.2
Neville, J.3
Ouzzani, M.4
Ilyas, I.F.5
-
41
-
-
85085251984
-
Spark: Cluster computing with working sets
-
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster computing with working sets. In HotCloud, 2010.
-
(2010)
HotCloud
-
-
Zaharia, M.1
Chowdhury, M.2
Franklin, M.J.3
Shenker, S.4
Stoica, I.5