-
1
-
-
85104914015
-
Efficient exact set-similarity joins
-
A. Arasu, V. Ganti, R. Kaushik, Efficient exact set-similarity joins, in: Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB '06), 2006, pp. 918-929.
-
(2006)
Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB '06)
, pp. 918-929
-
-
Arasu, A.1
Ganti, V.2
Kaushik, R.3
-
2
-
-
67649649597
-
Large-scale deduplication with constraints using dedupalog
-
A. Arasu, C. Ré, D. Suciu, Large-scale deduplication with constraints using dedupalog, in: Proceedings of the 25th International Conference on Data Engineering (ICDE '09), 2009, pp. 952-963.
-
(2009)
Proceedings of the 25th International Conference on Data Engineering (ICDE '09)
, pp. 952-963
-
-
Arasu, A.1
Ré, C.2
Suciu, D.3
-
4
-
-
5444258997
-
A comparison of fast blocking methods for record linkage
-
R. Baxter, P. Christen, T. Churches, A comparison of fast blocking methods for record linkage, in: Proceedings of the Ninth ACM SIGKDD Workshop on Data Cleaning, Record Linkage and Object Consolidation, 2003, pp. 25-27.
-
(2003)
Proceedings of the Ninth ACM SIGKDD Workshop on Data Cleaning, Record Linkage and Object Consolidation
, pp. 25-27
-
-
Baxter, R.1
Christen, P.2
Churches, T.3
-
5
-
-
58149472338
-
Swoosh: a generic approach to entity resolution
-
Benjelloun O., Garcia-Molina H., Menestrina D., Su Q., Whang S.E., and Widom J. Swoosh: a generic approach to entity resolution. VLDB J. 18 1 (2009) 255-276
-
(2009)
VLDB J.
, vol.18
, Issue.1
, pp. 255-276
-
-
Benjelloun, O.1
Garcia-Molina, H.2
Menestrina, D.3
Su, Q.4
Whang, S.E.5
Widom, J.6
-
9
-
-
33746054079
-
Adaptive product normalization: Using online learning for record linkage in comparison shopping
-
M. Bilenko, S. Basu, M. Sahami, Adaptive product normalization: using online learning for record linkage in comparison shopping, in: Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM '05), 2005, pp. 58-65.
-
(2005)
Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM '05)
, pp. 58-65
-
-
Bilenko, M.1
Basu, S.2
Sahami, M.3
-
10
-
-
84878049861
-
Adaptive blocking: Learning to scale up record linkage
-
M. Bilenko, B. Kamath, R.J. Mooney, Adaptive blocking: learning to scale up record linkage, in: Proceedings of the Sixth IEEE International Conference on Data Mining (ICDM '06), 2006, pp. 87-96.
-
(2006)
Proceedings of the Sixth IEEE International Conference on Data Mining (ICDM '06)
, pp. 87-96
-
-
Bilenko, M.1
Kamath, B.2
Mooney, R.J.3
-
12
-
-
9444249661
-
On evaluation and training-set construction for duplicate detection
-
M. Bilenko, R.J. Mooney, On evaluation and training-set construction for duplicate detection, in: Proceedings of the KDD-2003 Workshop on Data Cleaning, Record Linkage, and Object Consolidation, 2003, pp. 7-12.
-
(2003)
Proceedings of the KDD-2003 Workshop on Data Cleaning, Record Linkage, and Object Consolidation
, pp. 7-12
-
-
Bilenko, M.1
Mooney, R.J.2
-
13
-
-
85011029434
-
Example-driven design of efficient record matching queries
-
S. Chaudhuri, B.-C. Chen, V. Ganti, R. Kaushik, Example-driven design of efficient record matching queries, in: Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB '07), 2007, pp. 327-338.
-
(2007)
Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB '07)
, pp. 327-338
-
-
Chaudhuri, S.1
Chen, B.-C.2
Ganti, V.3
Kaushik, R.4
-
14
-
-
72649102401
-
Mining document collections to facilitate accurate approximate entity matching
-
Chaudhuri S., Ganti V., and Xin D. Mining document collections to facilitate accurate approximate entity matching. PVLDB 2 1 (2009) 395-406
-
(2009)
PVLDB
, vol.2
, Issue.1
, pp. 395-406
-
-
Chaudhuri, S.1
Ganti, V.2
Xin, D.3
-
15
-
-
85016708008
-
Exploiting relationships for object consolidation
-
Z. Chen, D.V. Kalashnikov, S. Mehrotra, Exploiting relationships for object consolidation, in: Proceedings of the International Workshop on Information Quality in Information Systems (IQIS '05), 2005, pp. 47-58.
-
(2005)
Proceedings of the International Workshop on Information Quality in Information Systems (IQIS '05)
, pp. 47-58
-
-
Chen, Z.1
Kalashnikov, D.V.2
Mehrotra, S.3
-
16
-
-
70849083729
-
Exploiting context analysis for combining multiple entity resolution systems
-
Z. Chen, D.V. Kalashnikov, S. Mehrotra, Exploiting context analysis for combining multiple entity resolution systems, in: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD '09), 2009, pp. 207-218.
-
(2009)
Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD '09)
, pp. 207-218
-
-
Chen, Z.1
Kalashnikov, D.V.2
Mehrotra, S.3
-
18
-
-
67649649496
-
FEBRL: A freely available record linkage system with a graphical user interface
-
Darlinghurst, Australia
-
P. Christen, FEBRL: a freely available record linkage system with a graphical user interface, in: Proceedings of the Second Australasian Workshop on Health Data and Knowledge Management (HDKM '08), Australian Computer Society Inc., Darlinghurst, Australia, 2008, pp. 17-25.
-
(2008)
Proceedings of the Second Australasian Workshop on Health Data and Knowledge Management (HDKM '08), Australian Computer Society Inc., Darlinghurst
, pp. 17-25
-
-
Christen, P.1
-
19
-
-
0034592802
-
Hardening soft information sources
-
W.W. Cohen, H.A. Kautz, D.A. McAllester, Hardening soft information sources, in: Proceedings of the Sixth ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '00), 2000, pp. 255-259.
-
(2000)
Proceedings of the Sixth ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '00)
, pp. 255-259
-
-
Cohen, W.W.1
Kautz, H.A.2
McAllester, D.A.3
-
20
-
-
11144240583
-
A comparison of string distance metrics for name-matching tasks
-
W.W. Cohen, P. Ravikumar, S.E. Fienberg, A comparison of string distance metrics for name-matching tasks, in: Proceedings of IJCAI-03 Workshop on Information Integration on the Web (IIWeb '03), 2003, pp. 73-78.
-
(2003)
Proceedings of IJCAI-03 Workshop on Information Integration on the Web (IIWeb '03)
, pp. 73-78
-
-
Cohen, W.W.1
Ravikumar, P.2
Fienberg, S.E.3
-
22
-
-
29844452555
-
Reference reconciliation in complex information spaces
-
X. Dong, A.Y. Halevy, J. Madhavan, Reference reconciliation in complex information spaces, in: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data (SIGMOD '05), 2005, pp. 85-96.
-
(2005)
Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data (SIGMOD '05)
, pp. 85-96
-
-
Dong, X.1
Halevy, A.Y.2
Madhavan, J.3
-
23
-
-
84893947008
-
A comparison and generalization of blocking and windowing algorithms for duplicate detection
-
U. Draisbach, F. Naumann, A comparison and generalization of blocking and windowing algorithms for duplicate detection, in: Proceedings of QDB 2009 Workshop at VLDB, 2009.
-
(2009)
Proceedings of QDB 2009 Workshop at VLDB
-
-
Draisbach, U.1
Naumann, F.2
-
24
-
-
0036203458
-
TAILOR: A record linkage tool box
-
M.G. Elfeky, A.K. Elmagarmid, V.S. Verykios, TAILOR: a record linkage tool box, in: Proceedings of the 18th International Conference on Data Engineering (ICDE '02), 2002, pp. 17-28.
-
(2002)
Proceedings of the 18th International Conference on Data Engineering (ICDE '02)
, pp. 17-28
-
-
Elfeky, M.G.1
Elmagarmid, A.K.2
Verykios, V.S.3
-
26
-
-
84947399464
-
A theory for record linkage
-
Fellegi I.P., and Sunter A.B. A theory for record linkage. J. Am. Stat. Assoc. 64 328 (1969) 1183-1210
-
(1969)
J. Am. Stat. Assoc.
, vol.64
, Issue.328
, pp. 1183-1210
-
-
Fellegi, I.P.1
Sunter, A.B.2
-
27
-
-
85012212427
-
AJAX: An extensible data cleaning tool
-
H. Galhardas, D. Florescu, D. Shasha, E. Simon, AJAX: an extensible data cleaning tool, in: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD '00), 2000, p. 590.
-
(2000)
Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD '00)
, p. 590
-
-
Galhardas, H.1
Florescu, D.2
Shasha, D.3
Simon, E.4
-
28
-
-
33845350152
-
Record linkage: Current practice and future directions
-
Tech. Rep., CSIRO Mathematical and Information Sciences
-
L. Gu, R. Baxter, D. Vickers, C. Rainsford, Record linkage: current practice and future directions, Tech. Rep., CSIRO Mathematical and Information Sciences, 2003.
-
(2003)
-
-
Gu, L.1
Baxter, R.2
Vickers, D.3
Rainsford, C.4
-
29
-
-
79953162324
-
Merging the results of approximate match operations
-
S. Guha, N. Koudas, A. Marathe, D. Srivastava, Merging the results of approximate match operations, in: Proceedings of the 30th International Conference on Very Large Data Bases (VLDB '04), 2004, pp. 636-647.
-
(2004)
Proceedings of the 30th International Conference on Very Large Data Bases (VLDB '04)
, pp. 636-647
-
-
Guha, S.1
Koudas, N.2
Marathe, A.3
Srivastava, D.4
-
30
-
-
52649145249
-
Fast indexes and algorithms for set similarity selection queries
-
M. Hadjieleftheriou, A. Chandel, N. Koudas, D. Srivastava, Fast indexes and algorithms for set similarity selection queries, in: Proceedings of the 24th International Conference on Data Engineering (ICDE '08), 2008, pp. 267-276.
-
(2008)
Proceedings of the 24th International Conference on Data Engineering (ICDE '08)
, pp. 267-276
-
-
Hadjieleftheriou, M.1
Chandel, A.2
Koudas, N.3
Srivastava, D.4
-
32
-
-
72649086387
-
Framework for evaluating clustering algorithms in duplicate detection
-
Hassanzadeh O., Chiang F., Miller R.J., and Lee H.C. Framework for evaluating clustering algorithms in duplicate detection. PVLDB 2 1 (2009) 1282-1293
-
(2009)
PVLDB
, vol.2
, Issue.1
, pp. 1282-1293
-
-
Hassanzadeh, O.1
Chiang, F.2
Miller, R.J.3
Lee, H.C.4
-
34
-
-
47649126673
-
Interactive entity resolution in relational data: a visual analytic tool and its evaluation
-
Kang H., Getoor L., Shneiderman B., Bilgic M., and Licamele L. Interactive entity resolution in relational data: a visual analytic tool and its evaluation. IEEE Trans. Vis. Comput. Graph. 14 5 (2008) 999-1014
-
(2008)
IEEE Trans. Vis. Comput. Graph.
, vol.14
, Issue.5
, pp. 999-1014
-
-
Kang, H.1
Getoor, L.2
Shneiderman, B.3
Bilgic, M.4
Licamele, L.5
-
35
-
-
34250670467
-
Record linkage: Similarity measures and algorithms
-
N. Koudas, S. Sarawagi, D. Srivastava, Record linkage: similarity measures and algorithms, in: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD '06), 2006, pp. 802-803.
-
(2006)
Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD '06)
, pp. 802-803
-
-
Koudas, N.1
Sarawagi, S.2
Srivastava, D.3
-
37
-
-
0034592786
-
Intelliclean: A knowledge-based intelligent data cleaner
-
M.-L. Lee, T.W. Ling, W.L. Low, Intelliclean: a knowledge-based intelligent data cleaner, in: Proceedings of the Sixth ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '00), 2000, pp. 290-294.
-
(2000)
Proceedings of the Sixth ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '00)
, pp. 290-294
-
-
Lee, M.-L.1
Ling, T.W.2
Low, W.L.3
-
38
-
-
63449096532
-
Structure-based inference of XML similarity for fuzzy duplicate detection
-
L. Leitão, P. Calado, M. Weis, Structure-based inference of XML similarity for fuzzy duplicate detection, in: Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM '07), 2007, pp. 293-302.
-
(2007)
Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM '07)
, pp. 293-302
-
-
Leitão, L.1
Calado, P.2
Weis, M.3
-
39
-
-
70349158570
-
Time-completeness trade-offs in record linkage using adaptive query processing
-
ACM, New York, NY, USA
-
R. Lengu, P. Missier, A.A.A. Fernandes, G. Guerrini, M. Mesiti, Time-completeness trade-offs in record linkage using adaptive query processing, in: Proceedings of the 12th International Conference on Extending Database Technology (EDBT '09), ACM, New York, NY, USA, 2009, pp. 851-861.
-
(2009)
Proceedings of the 12th International Conference on Extending Database Technology (EDBT '09)
, pp. 851-861
-
-
Lengu, R.1
Missier, P.2
Fernandes, A.A.A.3
Guerrini, G.4
Mesiti, M.5
-
40
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
A. McCallum, K. Nigam, L.H. Ungar, Efficient clustering of high-dimensional data sets with application to reference matching, in: Proceedings of the Sixth ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '00), 2000, pp. 169-178.
-
(2000)
Proceedings of the Sixth ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '00)
, pp. 169-178
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.H.3
-
41
-
-
33646398530
-
Conditional models of identity uncertainty with application to noun coreference
-
A. McCallum, B. Wellner, Conditional models of identity uncertainty with application to noun coreference, in: Advances in Neural Information Processing Systems, vol. 17, 2004, pp. 905-912.
-
(2004)
Advances in Neural Information Processing Systems
, vol.17
, pp. 905-912
-
-
McCallum, A.1
Wellner, B.2
-
43
-
-
33750728576
-
A heterogeneous field matching method for record linkage
-
S. Minton, C. Nanjo, C.A. Knoblock, M. Michalowski, M. Michelson, A heterogeneous field matching method for record linkage, in: Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM '05), 2005, pp. 314-321.
-
(2005)
Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM '05)
, pp. 314-321
-
-
Minton, S.1
Nanjo, C.2
Knoblock, C.A.3
Michalowski, M.4
Michelson, M.5
-
44
-
-
72649087996
-
Object identification quality
-
M. Neiling, S. Jurk, H.J. Lenz, F. Naumann, Object identification quality, in: Proceedings of the International Workshop on Data Quality in Cooperative Information Systems (DQCIS '03), 2003, pp. 187-198.
-
(2003)
Proceedings of the International Workshop on Data Quality in Cooperative Information Systems (DQCIS '03)
, pp. 187-198
-
-
Neiling, M.1
Jurk, S.2
Lenz, H.J.3
Naumann, F.4
-
47
-
-
0002490026
-
Data cleaning: problems and current approaches
-
Rahm E., and Do H.H. Data cleaning: problems and current approaches. IEEE Data Eng. Bull. 23 4 (2000) 3-13
-
(2000)
IEEE Data Eng. Bull.
, vol.23
, Issue.4
, pp. 3-13
-
-
Rahm, E.1
Do, H.H.2
-
53
-
-
0035545848
-
Learning object identification rules for information integration
-
Tejada S., Knoblock C.A., and Minton S. Learning object identification rules for information integration. Inf. Syst. 26 8 (2001) 607-633
-
(2001)
Inf. Syst.
, vol.26
, Issue.8
, pp. 607-633
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
54
-
-
0242456803
-
Learning domain-independent string transformation weights for high accuracy object identification
-
S. Tejada, C.A. Knoblock, S. Minton, Learning domain-independent string transformation weights for high accuracy object identification, in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02), 2002, pp. 350-359.
-
(2002)
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02)
, pp. 350-359
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
56
-
-
0038208065
-
A Bayesian decision model for cost optimal record matching
-
Verykios V.S., Moustakides G.V., and Elfeky M.G. A Bayesian decision model for cost optimal record matching. VLDB J. 12 1 (2003) 28-40
-
(2003)
VLDB J.
, vol.12
, Issue.1
, pp. 28-40
-
-
Verykios, V.S.1
Moustakides, G.V.2
Elfeky, M.G.3
-
59
-
-
70849098813
-
Entity resolution with iterative blocking
-
S.E. Whang, D. Menestrina, G. Koutrika, M. Theobald, H. Garcia-Molina, Entity resolution with iterative blocking, in: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD '09), 2009, pp. 219-232.
-
(2009)
Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD '09)
, pp. 219-232
-
-
Whang, S.E.1
Menestrina, D.2
Koutrika, G.3
Theobald, M.4
Garcia-Molina, H.5
-
60
-
-
33845615644
-
Overview of record linkage and current research directions
-
Tech. Rep., US Bureau of the Census, Washington, DC
-
W.E. Winkler, Overview of record linkage and current research directions, Tech. Rep., US Bureau of the Census, Washington, DC, 2006.
-
(2006)
-
-
Winkler, W.E.1
-
61
-
-
70849105253
-
Ed-join: an efficient algorithm for similarity joins with edit distance constraints
-
Xiao C., Wang W., and Lin X. Ed-join: an efficient algorithm for similarity joins with edit distance constraints. PVLDB 1 1 (2008) 933-944
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 933-944
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
-
62
-
-
5644287747
-
Entity identification for heterogeneous database integration - a Multiple Classifier System approach and empirical evaluation
-
Zhao H., and Ram S. Entity identification for heterogeneous database integration - a Multiple Classifier System approach and empirical evaluation. Inf. Syst. 30 2 (2005) 119-132
-
(2005)
Inf. Syst.
, vol.30
, Issue.2
, pp. 119-132
-
-
Zhao, H.1
Ram, S.2
-
63
-
-
47849087202
-
Entity matching across heterogeneous data sources: an approach based on constrained cascade generalization
-
Zhao H., and Ram S. Entity matching across heterogeneous data sources: an approach based on constrained cascade generalization. Data Knowl. Eng. 66 3 (2008) 368-381
-
(2008)
Data Knowl. Eng.
, vol.66
, Issue.3
, pp. 368-381
-
-
Zhao, H.1
Ram, S.2