-
3
-
-
56349095491
-
Aggregating inconsistent information: Ranking and clustering
-
N. Ailon, M. Charikar, and A. Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008.
-
(2008)
Journal of the ACM (JACM)
, vol.55
, Issue.5
, pp. 23
-
-
Ailon, N.1
Charikar, M.2
Newman, A.3
-
9
-
-
80052917068
-
Sampling the repairs of functional dependency violations under hard constraints
-
G. Beskales, I. F. Ilyas, and L. Golab. Sampling the repairs of functional dependency violations under hard constraints. Proceedings of the VLDB Endowment, 3(1-2):197-207, 2010.
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.1-2
, pp. 197-207
-
-
Beskales, G.1
Ilyas, I.F.2
Golab, L.3
-
10
-
-
84881320841
-
On the relative trust between inconsistent data and inaccurate constraints
-
G. Beskales, I. F. Ilyas, L. Golab, and A. Galiullin. On the relative trust between inconsistent data and inaccurate constraints. In 29th IEEE International Conference on Data Engineering, pages 541-552, 2013.
-
(2013)
29th IEEE International Conference on Data Engineering
, pp. 541-552
-
-
Beskales, G.1
Ilyas, I.F.2
Golab, L.3
Galiullin, A.4
-
11
-
-
84892822296
-
Sampling from repairs of conditional functional dependency violations
-
G. Beskales, I. F. Ilyas, L. Golab, and A. Galiullin. Sampling from repairs of conditional functional dependency violations. The VLDB Journal, 23(1):103-128, 2014.
-
(2014)
The VLDB Journal
, vol.23
, Issue.1
, pp. 103-128
-
-
Beskales, G.1
Ilyas, I.F.2
Golab, L.3
Galiullin, A.4
-
12
-
-
77954695997
-
Modeling and querying possible repairs in duplicate detection
-
G. Beskales, M. A. Soliman, I. F. Ilyas, and S. Ben-David. Modeling and querying possible repairs in duplicate detection. Proceedings of the VLDB Endowment, pages 598-609, 2009.
-
(2009)
Proceedings of the VLDB Endowment
, pp. 598-609
-
-
Beskales, G.1
Soliman, M.A.2
Ilyas, I.F.3
Ben-David, S.4
-
17
-
-
29844436973
-
A cost-based model and effective heuristic for repairing constraints by value modification
-
ACM
-
P. Bohannon, W. Fan, M. Flaster, and R. Rastogi. A cost-based model and effective heuristic for repairing constraints by value modification. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pages 143-154. ACM, 2005.
-
(2005)
Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data
, pp. 143-154
-
-
Bohannon, P.1
Fan, W.2
Flaster, M.3
Rastogi, R.4
-
18
-
-
34548731840
-
Conditional functional dependencies for data cleaning
-
P. Bohannon, W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis. Conditional functional dependencies for data cleaning. In Proceedings of the 23rd International Conference on Data Engineering, pages 746-755, 2007.
-
(2007)
Proceedings of the 23rd International Conference on Data Engineering
, pp. 746-755
-
-
Bohannon, P.1
Fan, W.2
Geerts, F.3
Jia, X.4
Kementsietsidis, A.5
-
19
-
-
84904358688
-
Descriptive and prescriptive data cleaning
-
A. Chalamalla, I. F. Ilyas, M. Ouzzani, and P. Papotti. Descriptive and prescriptive data cleaning. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pages 445-456, 2014.
-
(2014)
Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
, pp. 445-456
-
-
Chalamalla, A.1
Ilyas, I.F.2
Ouzzani, M.3
Papotti, P.4
-
20
-
-
85011029434
-
Example-driven design of efficient record matching queries
-
S. Chaudhuri, B. Chen, V. Ganti, and R. Kaushik. Example-driven design of efficient record matching queries. In Proceedings of the 33rd International Conference on Very Large Data Bases, pages 327-338, 2007.
-
(2007)
Proceedings of the 33rd International Conference on Very Large Data Bases
, pp. 327-338
-
-
Chaudhuri, S.1
Chen, B.2
Ganti, V.3
Kaushik, R.4
-
24
-
-
14744293228
-
Minimal-change integrity maintenance using tuple deletions
-
J. Chomicki and J. Marcinkowski. Minimal-change integrity maintenance using tuple deletions. Information and Computation, 197(1):90-121, 2005.
-
(2005)
Information and Computation
, vol.197
, Issue.1
, pp. 90-121
-
-
Chomicki, J.1
Marcinkowski, J.2
-
25
-
-
84891066910
-
Discovering denial constraints
-
X. Chu, I. F. Ilyas, and P. Papotti. Discovering denial constraints. Proceedings of the VLDB Endowment, 6(13):1498-1509, 2013.
-
(2013)
Proceedings of the VLDB Endowment
, vol.6
, Issue.13
, pp. 1498-1509
-
-
Chu, X.1
Ilyas, I.F.2
Papotti, P.3
-
27
-
-
84901784972
-
Ruleminer: Data quality rules discovery
-
X. Chu, I. F. Ilyas, P. Papotti, and Y. Ye. Ruleminer: Data quality rules discovery. In IEEE 30th International Conference on Data Engineering, pages 1222-1225, 2014.
-
(2014)
IEEE 30th International Conference on Data Engineering
, pp. 1222-1225
-
-
Chu, X.1
Ilyas, I.F.2
Papotti, P.3
Ye, Y.4
-
28
-
-
84957586399
-
KATARA: A data cleaning system powered by knowledge bases and crowdsourcing
-
X. Chu, J. Morcos, I. F. Ilyas, M. Ouzzani, P. Papotti, N. Tang, and Y. Ye. KATARA: A data cleaning system powered by knowledge bases and crowdsourcing. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1247-1261, 2015.
-
(2015)
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
, pp. 1247-1261
-
-
Chu, X.1
Morcos, J.2
Ilyas, I.F.3
Ouzzani, M.4
Papotti, P.5
Tang, N.6
Ye, Y.7
-
29
-
-
84968376348
-
7 facts about data quality
-
S. Clemens. 7 facts about data quality. InsightSquared, 2012.
-
(2012)
InsightSquared
-
-
Clemens, S.1
-
30
-
-
0032091575
-
Integration of heterogeneous databases without common domains using queries based on textual similarity
-
W. W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity. In ACM SIGMOD Record, volume 27, pages 201-212, 1998.
-
(1998)
ACM SIGMOD Record
, vol.27
, pp. 201-212
-
-
Cohen, W.W.1
-
31
-
-
84959912087
-
Improving data quality: Consistency and accuracy
-
VLDB Endowment
-
G. Cong, W. Fan, F. Geerts, X. Jia, and S. Ma. Improving data quality: Consistency and accuracy. In Proceedings of the 33rd International Conference on Very Large Data Bases, pages 315-326. VLDB Endowment, 2007.
-
(2007)
Proceedings of the 33rd International Conference on Very Large Data Bases
, pp. 315-326
-
-
Cong, G.1
Fan, W.2
Geerts, F.3
Jia, X.4
Ma, S.5
-
32
-
-
84880546390
-
Nadeef: A commodity data cleaning system
-
M. Dallachiesa, A. Ebaid, A. Eldawy, A. Elmagarmid, I. F. Ilyas, M. Ouzzani, and N. Tang. Nadeef: a commodity data cleaning system. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 541-552, 2013.
-
(2013)
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
, pp. 541-552
-
-
Dallachiesa, M.1
Ebaid, A.2
Eldawy, A.3
Elmagarmid, A.4
Ilyas, I.F.5
Ouzzani, M.6
Tang, N.7
-
34
-
-
58549093024
-
Unary and n-ary inclusion dependency discovery in relational databases
-
F. De Marchi, S. Lopes, and J.-M. Petit. Unary and n-ary inclusion dependency discovery in relational databases. Journal of Intelligent Information Systems, 32(1):53-73, 2009.
-
(2009)
Journal of Intelligent Information Systems
, vol.32
, Issue.1
, pp. 53-73
-
-
De Marchi, F.1
Lopes, S.2
Petit, J.-M.3
-
35
-
-
37549003336
-
MapReduce: Simplified data processing on large clusters
-
J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107-113, 2008.
-
(2008)
Communications of the ACM
, vol.51
, Issue.1
, pp. 107-113
-
-
Dean, J.1
Ghemawat, S.2
-
36
-
-
84866896733
-
-
McGraw-Hill
-
D. Deroos, C. Eaton, G. Lapis, P. Zikopoulos, and T. Deutsch. Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. McGraw-Hill, 2011.
-
(2011)
Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data
-
-
Deroos, D.1
Eaton, C.2
Lapis, G.3
Zikopoulos, P.4
Deutsch, T.5
-
38
-
-
2342615638
-
Profile-based object matching for information integration
-
A. Doan, Y. Lu, Y. Lee, and J. Han. Profile-based object matching for information integration. IEEE Intelligent Systems, 18(5):54-59, 2003.
-
(2003)
IEEE Intelligent Systems
, vol.18
, Issue.5
, pp. 54-59
-
-
Doan, A.1
Lu, Y.2
Lee, Y.3
Han, J.4
-
39
-
-
84863067746
-
Data fusion: Resolving data conflicts for integration
-
X. L. Dong and F. Naumann. Data fusion: resolving data conflicts for integration. Proceedings of the VLDB Endowment, 2(2):1654-1655, 2009.
-
(2009)
Proceedings of the VLDB Endowment
, vol.2
, Issue.2
, pp. 1654-1655
-
-
Dong, X.L.1
Naumann, F.2
-
40
-
-
33845667955
-
Duplicate record detection: A survey
-
A. K. Elmagarmid, P. G. Ipeirotis, and V. S. Verykios. Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering, 19(1):1-16, 2007.
-
(2007)
IEEE Transactions on Knowledge and Data Engineering
, vol.19
, Issue.1
, pp. 1-16
-
-
Elmagarmid, A.K.1
Ipeirotis, P.G.2
Verykios, V.S.3
-
41
-
-
74549151555
-
You talking to me? A corpus and algorithm for conversation disentanglement
-
M. Elsner and E. Charniak. You talking to me? a corpus and algorithm for conversation disentanglement. In Association for Computational Linguistics (ACL), pages 834-842, 2008.
-
(2008)
Association for Computational Linguistics (ACL)
, pp. 834-842
-
-
Elsner, M.1
Charniak, E.2
-
45
-
-
79953230060
-
Discovering conditional functional dependencies
-
W. Fan, F. Geerts, J. Li, and M. Xiong. Discovering conditional functional dependencies. IEEE Transactions on Knowledge and Data Engineering, 23(5):683-698, 2011.
-
(2011)
IEEE Transactions on Knowledge and Data Engineering
, vol.23
, Issue.5
, pp. 683-698
-
-
Fan, W.1
Geerts, F.2
Li, J.3
Xiong, M.4
-
46
-
-
77952749687
-
Detecting inconsistencies in distributed data
-
W. Fan, F. Geerts, S. Ma, and H. Müller. Detecting inconsistencies in distributed data. In Proceedings of the 26th International Conference on Data Engineering, pages 64-75, 2010.
-
(2010)
Proceedings of the 26th International Conference on Data Engineering
, pp. 64-75
-
-
Fan, W.1
Geerts, F.2
Ma, S.3
Müller, H.4
-
47
-
-
84881326725
-
Inferring data currency and consistency for conflict resolution
-
W. Fan, F. Geerts, N. Tang, and W. Yu. Inferring data currency and consistency for conflict resolution. In 29th IEEE International Conference on Data Engineering, pages 470-481, 2013.
-
(2013)
29th IEEE International Conference on Data Engineering
, pp. 470-481
-
-
Fan, W.1
Geerts, F.2
Tang, N.3
Yu, W.4
-
48
-
-
84907031191
-
Conflict resolution with data currency and consistency
-
W. Fan, F. Geerts, N. Tang, and W. Yu. Conflict resolution with data currency and consistency. Journal of Data and Information Quality, 5(1-2):6:1-6:37, 2014.
-
(2014)
Journal of Data and Information Quality
, vol.5
, Issue.1-2
, pp. 6:1-6:37
-
-
Fan, W.1
Geerts, F.2
Tang, N.3
Yu, W.4
-
49
-
-
84865086832
-
Reasoning about record matching rules
-
W. Fan, X. Jia, J. Li, and S. Ma. Reasoning about record matching rules. Proceedings of the VLDB Endowment, 2(1):407-418, 2009.
-
(2009)
Proceedings of the VLDB Endowment
, vol.2
, Issue.1
, pp. 407-418
-
-
Fan, W.1
Jia, X.2
Li, J.3
Ma, S.4
-
50
-
-
84858615261
-
Towards certain fixes with editing rules and master data
-
W. Fan, J. Li, S. Ma, N. Tang, and W. Yu. Towards certain fixes with editing rules and master data. Proceedings of the VLDB Endowment, 3(1-2):173-184, 2010.
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.1-2
, pp. 173-184
-
-
Fan, W.1
Li, J.2
Ma, S.3
Tang, N.4
Yu, W.5
-
51
-
-
79959944062
-
Interaction between record matching and data repairing
-
ACM
-
W. Fan, J. Li, S. Ma, N. Tang, and W. Yu. Interaction between record matching and data repairing. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pages 469-480. ACM, 2011.
-
(2011)
Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data
, pp. 469-480
-
-
Fan, W.1
Li, J.2
Ma, S.3
Tang, N.4
Yu, W.5
-
52
-
-
84902202624
-
Incremental detection of inconsistencies in distributed data
-
W. Fan, J. Li, N. Tang, and W. Yu. Incremental detection of inconsistencies in distributed data. IEEE Transactions on Knowledge and Data Engineering, 26(6):1367-1383, 2014.
-
(2014)
IEEE Transactions on Knowledge and Data Engineering
, vol.26
, Issue.6
, pp. 1367-1383
-
-
Fan, W.1
Li, J.2
Tang, N.3
Yu, W.4
-
53
-
-
0344756845
-
Declarative data cleaning: Language, model, and algorithms
-
H. Galhardas, D. Florescu, D. Shasha, E. Simon, and C. Saita. Declarative data cleaning: Language, model, and algorithms. In VLDB 2001, Proceedings of 27th International Conference on Very Large Data Bases, pages 371-380, 2001.
-
(2001)
VLDB 2001, Proceedings of 27th International Conference on Very Large Data Bases
, pp. 371-380
-
-
Galhardas, H.1
Florescu, D.2
Shasha, D.3
Simon, E.4
Saita, C.5
-
54
-
-
80052307108
-
Support for user involvement in data cleaning
-
H. Galhardas, A. Lopes, and E. Santos. Support for user involvement in data cleaning. In Data Warehousing and Knowledge Discovery-13th International Conference, DaWaK 2011, pages 136-151, 2011.
-
(2011)
Data Warehousing and Knowledge Discovery-13th International Conference, DaWaK 2011
, pp. 136-151
-
-
Galhardas, H.1
Lopes, A.2
Santos, E.3
-
55
-
-
84882696854
-
The LLUNATIC data-cleaning framework
-
F. Geerts, G. Mecca, P. Papotti, and D. Santoro. The LLUNATIC data-cleaning framework. Proceedings of the VLDB Endowment, 6(9):625-636, 2013.
-
(2013)
Proceedings of the VLDB Endowment
, vol.6
, Issue.9
, pp. 625-636
-
-
Geerts, F.1
Mecca, G.2
Papotti, P.3
Santoro, D.4
-
56
-
-
84901745035
-
Mapping and cleaning
-
F. Geerts, G. Mecca, P. Papotti, and D. Santoro. Mapping and cleaning. In IEEE 30th International Conference on Data Engineering, pages 232-243, 2014.
-
(2014)
IEEE 30th International Conference on Data Engineering
, pp. 232-243
-
-
Geerts, F.1
Mecca, G.2
Papotti, P.3
Santoro, D.4
-
57
-
-
84905824914
-
That's all folks! LLUNATIC goes open source
-
F. Geerts, G. Mecca, P. Papotti, and D. Santoro. That's all folks! LLUNATIC goes open source. Proceedings of the VLDB Endowment, 7(13):1565-1568, 2014.
-
(2014)
Proceedings of the VLDB Endowment
, vol.7
, Issue.13
, pp. 1565-1568
-
-
Geerts, F.1
Mecca, G.2
Papotti, P.3
Santoro, D.4
-
58
-
-
84873162472
-
Entity resolution: Theory, practice & open challenges
-
L. Getoor and A. Machanavajjhala. Entity resolution: theory, practice & open challenges. Proceedings of the VLDB Endowment, 5(12):2018-2019, 2012.
-
(2012)
Proceedings of the VLDB Endowment
, vol.5
, Issue.12
, pp. 2018-2019
-
-
Getoor, L.1
Machanavajjhala, A.2
-
59
-
-
84904317392
-
Corleone: Hands-off crowdsourcing for entity matching
-
C. Gokhale, S. Das, A. Doan, J. F. Naughton, N. Rampalli, J. Shavlik, and X. Zhu. Corleone: Hands-off crowdsourcing for entity matching. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pages 601-612, 2014.
-
(2014)
Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
, pp. 601-612
-
-
Gokhale, C.1
Das, S.2
Doan, A.3
Naughton, J.F.4
Rampalli, N.5
Shavlik, J.6
Zhu, X.7
-
60
-
-
70349846180
-
On generating near-optimal tableaux for conditional functional dependencies
-
L. Golab, H. Karloff, F. Korn, D. Srivastava, and B. Yu. On generating near-optimal tableaux for conditional functional dependencies. Proceedings of the VLDB Endowment, 1(1):376-390, 2008.
-
(2008)
Proceedings of the VLDB Endowment
, vol.1
, Issue.1
, pp. 376-390
-
-
Golab, L.1
Karloff, H.2
Korn, F.3
Srivastava, D.4
Yu, B.5
-
63
-
-
84901796427
-
Incremental record linkage
-
A. Gruenheid, X. L. Dong, and D. Srivastava. Incremental record linkage. Proceedings of the VLDB Endowment, 7(9):697-708, 2014.
-
(2014)
Proceedings of the VLDB Endowment
, vol.7
, Issue.9
, pp. 697-708
-
-
Gruenheid, A.1
Dong, X.L.2
Srivastava, D.3
-
65
-
-
29844438087
-
Clio grows up: From research prototype to industrial tool
-
L. M. Haas, M. A. Hernández, H. Ho, L. Popa, and M. Roth. Clio grows up: from research prototype to industrial tool. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pages 805-810, 2005.
-
(2005)
Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data
, pp. 805-810
-
-
Haas, L.M.1
Hernández, M.A.2
Ho, H.3
Popa, L.4
Roth, M.5
-
68
-
-
84976856849
-
The merge/purge problem for large databases
-
M. A. Hernández and S. J. Stolfo. The merge/purge problem for large databases. ACM SIGMOD Record, 24(2):127-138, 1995.
-
(1995)
ACM SIGMOD Record
, vol.24
, Issue.2
, pp. 127-138
-
-
Hernández, M.A.1
Stolfo, S.J.2
-
69
-
-
0013331361
-
Real-world data is dirty: Data cleansing and the merge/purge problem
-
M. A. Hernández and S. J. Stolfo. Real-world data is dirty: Data cleansing and the merge/purge problem. Data Mining and Knowledge Discovery, 2(1):9-37, 1998.
-
(1998)
Data Mining and Knowledge Discovery
, vol.2
, Issue.1
, pp. 9-37
-
-
Hernández, M.A.1
Stolfo, S.J.2
-
71
-
-
0345201769
-
TANE: An efficient algorithm for discovering functional and approximate dependencies
-
Y. Huhtala, J. Kärkkäinen, P. Porkka, and H. Toivonen. TANE: An efficient algorithm for discovering functional and approximate dependencies. Computer Journal, 42(2):100-111, 1999.
-
(1999)
Computer Journal
, vol.42
, Issue.2
, pp. 100-111
-
-
Huhtala, Y.1
Kärkkäinen, J.2
Porkka, P.3
Toivonen, H.4
-
73
-
-
0004037050
-
Unimatch: A record linkage system: User's manual
-
M. A. Jaro. Unimatch: A record linkage system: User's manual. U.S. Bureau of the Census, 1976.
-
(1976)
U.S. Bureau of the Census
-
-
Jaro, M.A.1
-
76
-
-
84949872769
-
Bigdansing: A system for big data cleansing
-
Z. Khayyat, I. F. Ilyas, A. Jindal, S. Madden, M. Ouzzani, P. Papotti, J.-A. Quiané-Ruiz, N. Tang, and S. Yin. Bigdansing: A system for big data cleansing. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1215-1230, 2015.
-
(2015)
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
, pp. 1215-1230
-
-
Khayyat, Z.1
Ilyas, I.F.2
Jindal, A.3
Madden, S.4
Ouzzani, M.5
Papotti, P.6
Quiané-Ruiz, J.-A.7
Tang, N.8
Yin, S.9
-
78
-
-
84872977079
-
Dedoop: Efficient deduplication with Hadoop
-
L. Kolb, A. Thor, and E. Rahm. Dedoop: efficient deduplication with Hadoop. Proceedings of the VLDB Endowment, 5(12):1878-1881, 2012.
-
(2012)
Proceedings of the VLDB Endowment
, vol.5
, Issue.12
, pp. 1878-1881
-
-
Kolb, L.1
Thor, A.2
Rahm, E.3
-
80
-
-
67649655745
-
Metric functional dependencies
-
N. Koudas, A. Saha, D. Srivastava, and S. Venkatasubramanian. Metric functional dependencies. In Proceedings of the 25th International Conference on Data Engineering, pages 1275-1278, 2009.
-
(2009)
Proceedings of the 25th International Conference on Data Engineering
, pp. 1275-1278
-
-
Koudas, N.1
Saha, A.2
Srivastava, D.3
Venkatasubramanian, S.4
-
82
-
-
0001116877
-
Binary codes capable of correcting deletions, insertions and reversals
-
V. I. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. In Soviet Physics Doklady, volume 10, page 707, 1966.
-
(1966)
Soviet Physics Doklady
, vol.10
, pp. 707
-
-
Levenshtein, V.I.1
-
83
-
-
84878700965
-
Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics
-
A. Lopatenko and L. E. Bertossi. Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. In 11th International Conference on Database Theory, pages 179-193, 2007.
-
(2007)
11th International Conference on Database Theory
, pp. 179-193
-
-
Lopatenko, A.1
Bertossi, L.E.2
-
84
-
-
84890119754
-
Extending inclusion dependencies with conditions
-
S. Ma, W. Fan, and L. Bravo. Extending inclusion dependencies with conditions. Theoretical Computer Science, 515:64-95, 2014.
-
(2014)
Theoretical Computer Science
, vol.515
, pp. 64-95
-
-
Ma, S.1
Fan, W.2
Bravo, L.3
-
87
-
-
79959996140
-
Tracing data errors with view-conditioned causality
-
A. Meliou, W. Gatterbauer, S. Nath, and D. Suciu. Tracing data errors with view-conditioned causality. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pages 505-516, 2011.
-
(2011)
Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data
, pp. 505-516
-
-
Meliou, A.1
Gatterbauer, W.2
Nath, S.3
Suciu, D.4
-
89
-
-
85018108837
-
The field matching problem: Algorithms and applications
-
A. E. Monge and C. Elkan. The field matching problem: Algorithms and applications. In Knowledge Discovery and Data Mining, pages 267-270, 1996.
-
(1996)
Knowledge Discovery and Data Mining
, pp. 267-270
-
-
Monge, A.E.1
Elkan, C.2
-
92
-
-
85013638127
-
Divide & conquer-based inclusion dependency discovery
-
T. Papenbrock, S. Kruse, J.-A. Quiané-Ruiz, and F. Naumann. Divide & conquer-based inclusion dependency discovery. Proceedings of the VLDB Endowment, 8(7):774-785, 2015.
-
(2015)
Proceedings of the VLDB Endowment
, vol.8
, Issue.7
, pp. 774-785
-
-
Papenbrock, T.1
Kruse, S.2
Quiané-Ruiz, J.-A.3
Naumann, F.4
-
93
-
-
0002490026
-
Data cleaning: Problems and current approaches
-
E. Rahm and H. H. Do. Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin, 23, 2000.
-
(2000)
IEEE Data Engineering Bulletin
, vol.23
-
-
Rahm, E.1
Do, H.H.2
-
95
-
-
84958034478
-
-
Apr. 2 US Patent
-
R. Russell. Index, Apr. 2 1918. US Patent 1,261,167.
-
(1918)
Index
-
-
Russell, R.1
-
97
-
-
84905106551
-
Clusterjoin: A similarity joins framework using map-reduce
-
A. D. Sarma, Y. He, and S. Chaudhuri. Clusterjoin: A similarity joins framework using map-reduce. Proceedings of the VLDB Endowment, 7(12):1059-1070, 2014.
-
(2014)
Proceedings of the VLDB Endowment
, vol.7
, Issue.12
, pp. 1059-1070
-
-
Sarma, A.D.1
He, Y.2
Chaudhuri, S.3
-
98
-
-
84871075183
-
An automatic blocking mechanism for large-scale de-duplication tasks
-
A. D. Sarma, A. Jain, A. Machanavajjhala, and P. Bohannon. An automatic blocking mechanism for large-scale de-duplication tasks. In 21st ACM International Conference on Information and Knowledge Management, pages 1055-1064, 2012.
-
(2012)
21st ACM International Conference on Information and Knowledge Management
, pp. 1055-1064
-
-
Sarma, A.D.1
Jain, A.2
Machanavajjhala, A.3
Bohannon, P.4
-
103
-
-
0039891959
-
A machine learning approach to coreference resolution of noun phrases
-
W. M. Soon, H. T. Ng, and D. C. Y. Lim. A machine learning approach to coreference resolution of noun phrases. Computational Linguistics, 27(4):521-544, 2001.
-
(2001)
Computational Linguistics
, vol.27
, Issue.4
, pp. 521-544
-
-
Soon, W.M.1
Ng, H.T.2
Lim, D.C.Y.3
-
105
-
-
85084016251
-
Data curation at scale: The data tamer system
-
M. Stonebraker, D. Bruckner, I. F. Ilyas, G. Beskales, M. Cherniack, S. B. Zdonik, A. Pagan, and S. Xu. Data curation at scale: The data tamer system. In CIDR 2013, 6th Biennial Conference on Innovative Data Systems Research, 2013.
-
(2013)
CIDR 2013, 6th Biennial Conference on Innovative Data Systems Research
-
-
Stonebraker, M.1
Bruckner, D.2
Ilyas, I.F.3
Beskales, G.4
Cherniack, M.5
Zdonik, S.B.6
Pagan, A.7
Xu, S.8
-
106
-
-
48249116542
-
Gartner warns firms of "dirty data"
-
N. Swartz. Gartner warns firms of "dirty data". Information Management Journal, 41(3), 2007.
-
(2007)
Information Management Journal
, vol.41
, Issue.3
-
-
Swartz, N.1
-
107
-
-
32444450026
-
-
Special report (New York State Identification and Intelligence System). Bureau of Systems Development, New York State Identification and Intelligence System
-
R. Taft. Name Search Techniques. Special report (New York State Identification and Intelligence System). Bureau of Systems Development, New York State Identification and Intelligence System, 1970.
-
(1970)
Name Search Techniques
-
-
Taft, R.1
-
108
-
-
0035545848
-
Learning object identification rules for information integration
-
S. Tejada, C. A. Knoblock, and S. Minton. Learning object identification rules for information integration. Information Systems, 26(8):607-633, 2001.
-
(2001)
Information Systems
, vol.26
, Issue.8
, pp. 607-633
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
109
-
-
0042868698
-
Support vector machine active learning with applications to text classification
-
S. Tong and D. Koller. Support vector machine active learning with applications to text classification. The Journal of Machine Learning Research, 2:45-66, 2002.
-
(2002)
The Journal of Machine Learning Research
, vol.2
, pp. 45-66
-
-
Tong, S.1
Koller, D.2
-
111
-
-
0034228352
-
Automating the approximate record-matching process
-
V. S. Verykios, A. K. Elmagarmid, and E. N. Houstis. Automating the approximate record-matching process. Information Sciences, 126(1):83-98, 2000.
-
(2000)
Information Sciences
, vol.126
, Issue.1
, pp. 83-98
-
-
Verykios, V.S.1
Elmagarmid, A.K.2
Houstis, E.N.3
-
113
-
-
84901814945
-
Continuous data cleaning
-
M. Volkovs, F. Chiang, J. Szlichta, and R. J. Miller. Continuous data cleaning. In IEEE 30th International Conference on Data Engineering, pages 244-255, 2014.
-
(2014)
IEEE 30th International Conference on Data Engineering
, pp. 244-255
-
-
Volkovs, M.1
Chiang, F.2
Szlichta, J.3
Miller, R.J.4
-
114
-
-
84872946975
-
Crowder: Crowdsourcing entity resolution
-
J. Wang, T. Kraska, M. J. Franklin, and J. Feng. Crowder: Crowdsourcing entity resolution. Proceedings of the VLDB Endowment, 5(11):1483-1494, 2012.
-
(2012)
Proceedings of the VLDB Endowment
, vol.5
, Issue.11
, pp. 1483-1494
-
-
Wang, J.1
Kraska, T.2
Franklin, M.J.3
Feng, J.4
-
115
-
-
84904301041
-
A sample-and-clean framework for fast and accurate query processing on dirty data
-
J. Wang, S. Krishnan, M. J. Franklin, K. Goldberg, T. Kraska, and T. Milo. A sample-and-clean framework for fast and accurate query processing on dirty data. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pages 469-480, 2014.
-
(2014)
Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
, pp. 469-480
-
-
Wang, J.1
Krishnan, S.2
Franklin, M.J.3
Goldberg, K.4
Kraska, T.5
Milo, T.6
-
116
-
-
84880551539
-
Leveraging transitive relations for crowdsourced joins
-
J. Wang, G. Li, T. Kraska, M. J. Franklin, and J. Feng. Leveraging transitive relations for crowdsourced joins. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 229-240, 2013.
-
(2013)
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
, pp. 229-240
-
-
Wang, J.1
Li, G.2
Kraska, T.3
Franklin, M.J.4
Feng, J.5
-
118
-
-
0001642687
-
Some biological sequence metrics
-
M. S. Waterman, T. F. Smith, and W. A. Beyer. Some biological sequence metrics. Advances in Mathematics, 20(3):367-387, 1976.
-
(1976)
Advances in Mathematics
, vol.20
, Issue.3
, pp. 367-387
-
-
Waterman, M.S.1
Smith, T.F.2
Beyer, W.A.3
-
119
-
-
77956549963
-
Industry-scale duplicate detection
-
M. Weis, F. Naumann, U. Jehle, J. Lufter, and H. Schuster. Industry-scale duplicate detection. Proceedings of the VLDB Endowment, 1(2):1253-1264, 2008.
-
(2008)
Proceedings of the VLDB Endowment
, vol.1
, Issue.2
, pp. 1253-1264
-
-
Weis, M.1
Naumann, F.2
Jehle, U.3
Lufter, J.4
Schuster, H.5
-
121
-
-
0008976521
-
String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage
-
W. E. Winkler. String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. Proceedings of the Section on Survey Research, 1990.
-
(1990)
Proceedings of the Section on Survey Research
-
-
Winkler, W.E.1
-
123
-
-
84881115711
-
Scorpion: Explaining away outliers in aggregate queries
-
E. Wu and S. Madden. Scorpion: Explaining away outliers in aggregate queries. Proceedings of the VLDB Endowment, 6(8):553-564, 2013.
-
(2013)
Proceedings of the VLDB Endowment
, vol.6
, Issue.8
, pp. 553-564
-
-
Wu, E.1
Madden, S.2
-
124
-
-
84873113399
-
A demonstration of dbwipes: Clean as you query
-
E. Wu, S. Madden, and M. Stonebraker. A demonstration of dbwipes: clean as you query. Proceedings of the VLDB Endowment, 5(12):1894-1897, 2012.
-
(2012)
Proceedings of the VLDB Endowment
, vol.5
, Issue.12
, pp. 1894-1897
-
-
Wu, E.1
Madden, S.2
Stonebraker, M.3
-
125
-
-
84958780501
-
FastFDs: A heuristic-driven, depth-first algorithm for mining functional dependencies from relation instances
-
C. M. Wyss, C. Giannella, and E. L. Robertson. FastFDs: A heuristic-driven, depth-first algorithm for mining functional dependencies from relation instances. In International Conference on Big Data Analytics and Knowledge Discovery, pages 101-110, 2001.
-
(2001)
International Conference on Big Data Analytics and Knowledge Discovery
, pp. 101-110
-
-
Wyss, C.M.1
Giannella, C.2
Robertson, E.L.3
-
126
-
-
80052313567
-
Guided data repair
-
M. Yakout, A. K. Elmagarmid, J. Neville, M. Ouzzani, and I. F. Ilyas. Guided data repair. Proceedings of the VLDB Endowment, 4(5):279-289, 2011.
-
(2011)
Proceedings of the VLDB Endowment
, vol.4
, Issue.5
, pp. 279-289
-
-
Yakout, M.1
Elmagarmid, A.K.2
Neville, J.3
Ouzzani, M.4
Ilyas, I.F.5
-
127
-
-
85085251984
-
Spark: Cluster computing with working sets
-
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: cluster computing with working sets. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, volume 10, page 10, 2010.
-
(2010)
Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing
, vol.10
, pp. 10
-
-
Zaharia, M.1
Chowdhury, M.2
Franklin, M.J.3
Shenker, S.4
Stoica, I.5