-
1
-
-
85104914015
-
Efficient exact set-similarity joins
-
Arvind Arasu, Venkatesh Ganti, and Raghav Kaushik. Efficient exact set-similarity joins. In VLDB, pages 918-929, 2006.
-
(2006)
VLDB
, pp. 918-929
-
-
Arasu, A.1
Ganti, V.2
Kaushik, R.3
-
2
-
-
77954717287
-
On active learning of record matching packages
-
Arvind Arasu, Michaela Götz, and Raghav Kaushik. On active learning of record matching packages. In SIGMOD Conference, pages 783-794, 2010.
-
(2010)
SIGMOD Conference
, pp. 783-794
-
-
Arasu, A.1
Götz, M.2
Kaushik, R.3
-
3
-
-
29844440778
-
Towards a robust query optimizer: A principled and practical approach
-
Brian Babcock and Surajit Chaudhuri. Towards a robust query optimizer: a principled and practical approach. In SIGMOD, pages 119-130, 2005.
-
(2005)
SIGMOD
, pp. 119-130
-
-
Babcock, B.1
Chaudhuri, S.2
-
4
-
-
0034837020
-
Sampling algorithms: Lower bounds and applications
-
Ziv Bar-Yossef, Ravi Kumar, and D. Sivakumar. Sampling algorithms: lower bounds and applications. In STOC, pages 266-275, 2001.
-
(2001)
STOC
, pp. 266-275
-
-
Bar-Yossef, Z.1
Kumar, R.2
Sivakumar, D.3
-
7
-
-
84871099550
-
Adaptive blocking: Learning to scale up record linkage and clustering
-
Mikhail Bilenko, Beena Kamath, and Raymond J. Mooney. Adaptive blocking: Learning to scale up record linkage and clustering. In ICDM, 2006.
-
(2006)
ICDM
-
-
Bilenko, M.1
Kamath, B.2
Mooney, R.J.3
-
8
-
-
33749597967
-
A primitive operator for similarity joins in data cleaning
-
Surajit Chaudhuri, Venkatesh Ganti, and Raghav Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
-
(2006)
ICDE
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
9
-
-
0000301097
-
A greedy heuristic for the set-covering problem
-
Aug
-
V. Chvatal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4:233-235, Aug 1979.
-
(1979)
Mathematics of Operations Research
, vol.4
, pp. 233-235
-
-
Chvatal, V.1
-
12
-
-
29844452555
-
Reference reconciliation in complex information spaces
-
Xin Dong, Alon Halevy, and Jayant Madhavan. Reference reconciliation in complex information spaces. In SIGMOD, pages 85-96, 2005.
-
(2005)
SIGMOD
, pp. 85-96
-
-
Dong, X.1
Halevy, A.2
Madhavan, J.3
-
14
-
-
85027780074
-
Creating probabilistic databases from information extraction models
-
Rahul Gupta and Sunita Sarawagi. Creating probabilistic databases from information extraction models. In VLDB, pages 965-976, 2006.
-
(2006)
VLDB
, pp. 965-976
-
-
Gupta, R.1
Sarawagi, S.2
-
15
-
-
65449122439
-
Unsupervised deduplication using cross-field dependencies
-
Rob Hall, Charles Sutton, and Andrew McCallum. Unsupervised deduplication using cross-field dependencies. In KDD, pages 310-317, 2008.
-
(2008)
KDD
, pp. 310-317
-
-
Hall, R.1
Sutton, C.2
McCallum, A.3
-
16
-
-
77956220112
-
Scalable similarity search with optimized kernel hashing
-
Junfeng He, Wei Liu, and Shih-Fu Chang. Scalable similarity search with optimized kernel hashing. In KDD, pages 1129-1138, 2010.
-
(2010)
KDD
, pp. 1129-1138
-
-
He, J.1
Liu, W.2
Chang, S.-F.3
-
17
-
-
0013331361
-
Real-world data is dirty: Data cleansing and the merge/purge problem
-
January
-
Mauricio A. Hernández and Salvatore J. Stolfo. Real-world data is dirty: Data cleansing and the merge/purge problem. Data Min. Knowl. Discov., 2:9-37, January 1998.
-
(1998)
Data Min. Knowl. Discov.
, vol.2
, pp. 9-37
-
-
Hernández, M.A.1
Stolfo, S.J.2
-
18
-
-
0036083445
-
A data complexity analysis of comparative advantages of decision forest constructors
-
Tin Kam Ho. A data complexity analysis of comparative advantages of decision forest constructors. Pattern Anal. Appl., pages 102-112, 2002.
-
(2002)
Pattern Anal. Appl.
, pp. 102-112
-
-
Ho, T.K.1
-
21
-
-
84950419860
-
Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida
-
Matthew A. Jaro. Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida. Journal of the American Statistical Association, 84(406):414-420, 1989.
-
(1989)
Journal of the American Statistical Association
, vol.84
, Issue.406
, pp. 414-420
-
-
Jaro, M.A.1
-
23
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
Andrew McCallum, Kamal Nigam, and Lyle H. Ungar. Efficient clustering of high-dimensional data sets with application to reference matching. In Knowledge Discovery and Data Mining, pages 169-178, 2000.
-
(2000)
Knowledge Discovery and Data Mining
, pp. 169-178
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.H.3
-
25
-
-
33750728911
-
Learning blocking schemes for record linkage
-
Matthew Michelson and Craig A. Knoblock. Learning blocking schemes for record linkage. In AAAI, pages 440-445, 2006.
-
(2006)
AAAI
, pp. 440-445
-
-
Michelson, M.1
Knoblock, C.A.2
-
26
-
-
0345566149
-
A guided tour to approximate string matching
-
Gonzalo Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001.
-
(2001)
ACM Comput. Surv.
, vol.33
, Issue.1
, pp. 31-88
-
-
Navarro, G.1
-
27
-
-
0001592068
-
Automatic linkage of vital records
-
October
-
H. B. Newcombe, J. M. Kennedy, S. J. Axford, and A. P. James. Automatic Linkage of Vital Records. Science, 130:954-959, October 1959.
-
(1959)
Science
, vol.130
, pp. 954-959
-
-
Newcombe, H.B.1
Kennedy, J.M.2
Axford, S.J.3
James, A.P.4
-
28
-
-
1842751587
-
Identity uncertainty and citation matching
-
H. Pasula, B. Marthi, B. Milch, S. Russell, and I. Shpitser. Identity uncertainty and citation matching. In NIPS, 2002.
-
(2002)
NIPS
-
-
Pasula, H.1
Marthi, B.2
Milch, B.3
Russell, S.4
Shpitser, I.5
-
30
-
-
84891103393
-
An automatic blocking mechanism for large-scale de-duplication tasks
-
Anish Das Sarma, Ankur Jain, Ashwin Machanavajjhala, and Philip Bohannon. An automatic blocking mechanism for large-scale de-duplication tasks. In CIKM, 2012.
-
(2012)
CIKM
-
-
Sarma, A.D.1
Jain, A.2
Machanavajjhala, A.3
Bohannon, P.4
-
31
-
-
84878044770
-
Entity resolution with Markov logic
-
Parag Singla and Pedro Domingos. Entity resolution with Markov logic. In ICDM, pages 572-582, 2006.
-
(2006)
ICDM
, pp. 572-582
-
-
Singla, P.1
Domingos, P.2