-
4
-
-
33745448357
-
A latent Dirichlet model for unsupervised entity resolution
-
I. Bhattacharya and L. Getoor. A latent Dirichlet model for unsupervised entity resolution. In SDM, 2006.
-
(2006)
SDM
-
-
Bhattacharya, I.1
Getoor, L.2
-
5
-
-
77952372966
-
Adaptive duplicate detection using learnable string similarity measures
-
M. Bilenko and R. J. Mooney. Adaptive duplicate detection using learnable string similarity measures. In KDD, 2003.
-
(2003)
KDD
-
-
Bilenko, M.1
Mooney, R.J.2
-
8
-
-
0242540438
-
Learning to match and cluster large high-dimensional data sets for data integration
-
W. Cohen and J. Richman. Learning to match and cluster large high-dimensional data sets for data integration. In KDD, 2002.
-
(2002)
KDD
-
-
Cohen, W.1
Richman, J.2
-
9
-
-
33845667955
-
Duplicate record detection: A survey
-
A. K. Elmagarmid, P. G. Ipeirotis, and V. S. Verykios. Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering, 19(1):1-16, 2007.
-
(2007)
IEEE Transactions on Knowledge and Data Engineering
, vol.19
, Issue.1
, pp. 1-16
-
-
Elmagarmid, A.K.1
Ipeirotis, P.G.2
Verykios, V.S.3
-
10
-
-
70849098813
-
Entity resolution with iterative blocking
-
S. E. Whang, D. Menestrina, G. Koutrika, M. Theobald, and H. Garcia-Molina. Entity resolution with iterative blocking. In SIGMOD, 2009.
-
(2009)
SIGMOD
-
-
Whang, S.E.1
Menestrina, D.2
Koutrika, G.3
Theobald, M.4
Garcia-Molina, H.5
-
11
-
-
84865086832
-
Reasoning about record matching rules
-
W. Fan, X. Jia, J. Li, and S. Ma. Reasoning about record matching rules. In VLDB, 2009.
-
(2009)
VLDB
-
-
Fan, W.1
Jia, X.2
Li, J.3
Ma, S.4
-
13
-
-
84880467474
-
Text joins in an RDBMS for web data integration
-
L. Gravano, P. G. Ipeirotis, N. Koudas, and D. Srivastava. Text joins in an RDBMS for web data integration. In WWW, pages 90-101, 2003.
-
(2003)
WWW
, pp. 90-101
-
-
Gravano, L.1
Ipeirotis, P.G.2
Koudas, N.3
Srivastava, D.4
-
15
-
-
72649086387
-
Framework for evaluating clustering algorithms in duplicate detection
-
O. Hassanzadeh, F. Chiang, H. C. Lee, and R. J. Miller. Framework for evaluating clustering algorithms in duplicate detection. In VLDB, 2009.
-
(2009)
VLDB
-
-
Hassanzadeh, O.1
Chiang, F.2
Lee, H.C.3
Miller, R.J.4
-
16
-
-
0013331361
-
Real-world data is dirty: Data cleansing and the merge/purge problem
-
M. A. Hernandez and S. Stolfo. Real-world data is dirty: Data cleansing and the merge/purge problem. Data Mining and Knowledge Discovery, 2(1):9-37, 1998.
-
(1998)
Data Mining and Knowledge Discovery
, vol.2
, Issue.1
, pp. 9-37
-
-
Hernandez, M.A.1
Stolfo, S.2
-
17
-
-
0001116877
-
Binary codes capable of correcting deletions, insertions, and reversals
-
V. I. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8):707-710, 1966.
-
(1966)
Soviet Physics Doklady
, vol.10
, Issue.8
, pp. 707-710
-
-
Levenshtein, V.I.1
-
19
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
A. McCallum, K. Nigam, and L. Ungar. Efficient clustering of high-dimensional data sets with application to reference matching. In KDD, pages 169-178, 2000.
-
(2000)
KDD
, pp. 169-178
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.3
-
20
-
-
0001592068
-
Automatic linkage of vital records
-
H. B. Newcombe, J. M. Kennedy, S. J. Axford, and A. P. James. Automatic linkage of vital records. Science, 130(3381):954-959, 1959.
-
(1959)
Science
, vol.130
, Issue.3381
, pp. 954-959
-
-
Newcombe, H.B.1
Kennedy, J.M.2
Axford, S.J.3
James, A.P.4
-
21
-
-
37649028224
-
Finding and evaluating community structure in networks
-
M. E. J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Review E, 69(2):026113, Feb 2004.
-
(2004)
Physical Review E
, vol.69
, Issue.2
, pp. 026113
-
-
Newman, M.E.J.1
Girvan, M.2
-
22
-
-
85156206690
-
Identity uncertainty and citation matching
-
H. Pasula, B. Marthi, B. Milch, S. Russell, and I. Shpitser. Identity uncertainty and citation matching. In NIPS, 2002.
-
(2002)
NIPS
-
-
Pasula, H.1
Marthi, B.2
Milch, B.3
Russell, S.4
Shpitser, I.5
-
24
-
-
85166310944
-
Methods for linking and mining massive heterogeneous databases
-
J. C. Pinheiro and D. X. Sun. Methods for linking and mining massive heterogeneous databases. In KDD, 1998.
-
(1998)
KDD
-
-
Pinheiro, J.C.1
Sun, D.X.2
-
25
-
-
0016572913
-
A vector space model for automatic indexing
-
G. Salton, A. Wong, and C. S. Yang. A vector space model for automatic indexing. Communications of the ACM, 18(11):613-620, 1975.
-
(1975)
Communications of the ACM
, vol.18
, Issue.11
, pp. 613-620
-
-
Salton, G.1
Wong, A.2
Yang, C.S.3
-
26
-
-
0242456811
-
Interactive deduplication using active learning
-
S. Sarawagi and A. Bhamidipaty. Interactive deduplication using active learning. In KDD, pages 269-278, 2002.
-
(2002)
KDD
, pp. 269-278
-
-
Sarawagi, S.1
Bhamidipaty, A.2
-
27
-
-
79957852084
-
Efficient spectral neighborhood blocking for entity resolution
-
L. Shu, A. Chen, M. Xiong, and W. Meng. Efficient spectral neighborhood blocking for entity resolution. In ICDE, pages 1067-1078, Hannover, Germany, April 2011.
-
(2011)
ICDE
, pp. 1067-1078
-
-
Shu, L.1
Chen, A.2
Xiong, M.3
Meng, W.4
-
28
-
-
79957866123
-
A latent topic model for complete entity resolution
-
L. Shu, B. Long, and W. Meng. A latent topic model for complete entity resolution. In ICDE, 2009.
-
(2009)
ICDE
-
-
Shu, L.1
Long, B.2
Meng, W.3
-
29
-
-
0027113212
-
Approximate string matching with q-grams and maximal matches
-
E. Ukkonen. Approximate string matching with q-grams and maximal matches. Theoretical Computer Science, 92(1):191-211, 1992.
-
(1992)
Theoretical Computer Science
, vol.92
, Issue.1
, pp. 191-211
-
-
Ukkonen, E.1
-
30
-
-
0034228352
-
Automating the approximate record matching process
-
V. S. Verykios and A. K. Elmagarmid. Automating the approximate record matching process. Information Sciences, 126:83-98, 1999.
-
(1999)
Information Sciences
, vol.126
, pp. 83-98
-
-
Verykios, V.S.1
Elmagarmid, A.K.2