-
1
-
-
0034228041
-
Rock: A robust clustering algorithm for categorical attributes
-
S. Guha, R. Rastogi, and K. Shim, "Rock: A robust clustering algorithm for categorical attributes," Inf. Syst., vol. 25, no. 5, pp. 345-366, 2000.
-
(2000)
Inf. Syst
, vol.25
, Issue.5
, pp. 345-366
-
-
Guha, S.1
Rastogi, R.2
Shim, K.3
-
2
-
-
35448937301
-
Leveraging aggregate constraints for deduplication
-
S. Chaudhuri, A. D. Sarma, V. Ganti, and R. Kaushik, "Leveraging aggregate constraints for deduplication," in SIGMOD Conference, 2007, pp. 437-448.
-
(2007)
SIGMOD Conference
, pp. 437-448
-
-
Chaudhuri, S.1
Sarma, A.D.2
Ganti, V.3
Kaushik, R.4
-
3
-
-
0032091575
-
Integration of heterogeneous databases without common domains using queries based on textual similarity
-
W. W. Cohen, "Integration of heterogeneous databases without common domains using queries based on textual similarity," in SIGMOD Conference, 1998, pp. 201-212.
-
(1998)
SIGMOD Conference
, pp. 201-212
-
-
Cohen, W.W.1
-
4
-
-
84949423737
-
Constraint-based clustering in large databases
-
A. K. H. Tung, R. T. Ng, L. V. S. Lakshmanan, and J. Han, "Constraint-based clustering in large databases," in ICDT, 2001, pp. 405-419.
-
(2001)
ICDT
, pp. 405-419
-
-
Tung, A.K.H.1
Ng, R.T.2
Lakshmanan, L.V.S.3
Han, J.4
-
5
-
-
33745448357
-
-
A latent Dirichlet model for unsupervised entity resolution
-
I. Bhattacharya and L. Getoor, "A latent Dirichlet model for unsupervised entity resolution," in SDM, 2006.
-
-
-
-
6
-
-
29344435802
-
Constraint-based entity matching
-
W. Shen, X. Li, and A. Doan, "Constraint-based entity matching," in AAAI, 2005, pp. 862-867.
-
(2005)
AAAI
, pp. 862-867
-
-
Shen, W.1
Li, X.2
Doan, A.3
-
7
-
-
85156206690
-
Identity uncertainty and citation matching
-
H. Pasula, B. Marthi, B. Milch, S. J. Russell, and I. Shpitser, "Identity uncertainty and citation matching," in NIPS, 2002, pp. 1401-1408.
-
(2002)
NIPS
, pp. 1401-1408
-
-
Pasula, H.1
Marthi, B.2
Milch, B.3
Russell, S.J.4
Shpitser, I.5
-
8
-
-
35048857464
-
Limbo: Scalable clustering of categorical data
-
P. Andritsos, P. Tsaparas, R. J. Miller, and K. C. Sevcik, "Limbo: Scalable clustering of categorical data," in EDBT, 2004, pp. 123-146.
-
(2004)
EDBT
, pp. 123-146
-
-
Andritsos, P.1
Tsaparas, P.2
Miller, R.J.3
Sevcik, K.C.4
-
9
-
-
33750288047
-
Measuring constraint-set utility for partitional clustering algorithms
-
I. Davidson, K. Wagstaff, and S. Basu, "Measuring constraint-set utility for partitional clustering algorithms," in PKDD, 2006, pp. 115-126.
-
(2006)
PKDD
, pp. 115-126
-
-
Davidson, I.1
Wagstaff, K.2
Basu, S.3
-
10
-
-
57549092754
-
When is constrained clustering beneficial, and why?
-
K. Wagstaff, S. Basu, and I. Davidson, "When is constrained clustering beneficial, and why?" in AAAI, 2006.
-
(2006)
AAAI
-
-
Wagstaff, K.1
Basu, S.2
Davidson, I.3
-
11
-
-
0344756851
-
Eliminating fuzzy duplicates in data warehouses
-
R. Ananthakrishna, S. Chaudhuri, and V. Ganti, "Eliminating fuzzy duplicates in data warehouses," in VLDB, 2002, pp. 586-597.
-
(2002)
VLDB
, pp. 586-597
-
-
Ananthakrishna, R.1
Chaudhuri, S.2
Ganti, V.3
-
13
-
-
84880903471
-
Semantics and inference for recursive probability models
-
A. Pfeffer and D. Koller, "Semantics and inference for recursive probability models," in AAAI/IAAI, 2000, pp. 538-544.
-
(2000)
AAAI/IAAI
, pp. 538-544
-
-
Pfeffer, A.1
Koller, D.2
-
14
-
-
84878044770
-
Entity resolution with Markov logic
-
P. Singla and P. Domingos, "Entity resolution with Markov logic," in ICDM, 2006, pp. 572-582.
-
(2006)
ICDM
, pp. 572-582
-
-
Singla, P.1
Domingos, P.2
-
15
-
-
33746868385
-
Correlation clustering in general weighted graphs
-
E. D. Demaine, D. Emanuel, A. Fiat, and N. Immorlica, "Correlation clustering in general weighted graphs," Theor. Comput. Sci., vol. 361, no. 2-3, pp. 172-187, 2006.
-
(2006)
Theor. Comput. Sci
, vol.361
, Issue.2-3
, pp. 172-187
-
-
Demaine, E.D.1
Emanuel, D.2
Fiat, A.3
Immorlica, N.4
-
16
-
-
67649651583
-
-
Available
-
[Online]. Available: http://www.cs.umass.edu/~mccallum/code-data.html
-
-
-
-
17
-
-
0019004898
-
Equality and domain closure in first-order databases
-
R. Reiter, "Equality and domain closure in first-order databases," J. ACM, vol. 27, no. 2, pp. 235-249, 1980.
-
(1980)
J. ACM
, vol.27
, Issue.2
, pp. 235-249
-
-
Reiter, R.1
-
18
-
-
67649639188
-
Alias: An active learning led interactive deduplication system
-
S. Sarawagi, A. Bhamidipaty, A. Kirpal, and C. Mouli, "Alias: An active learning led interactive deduplication system," in VLDB, 2002, pp. 1103-1106.
-
(2002)
VLDB
, pp. 1103-1106
-
-
Sarawagi, S.1
Bhamidipaty, A.2
Kirpal, A.3
Mouli, C.4
-
19
-
-
33745661243
-
D-dupe: An interactive tool for entity resolution in social networks
-
M. Bilgic, L. Licamele, L. Getoor, and B. Shneiderman, "D-dupe: An interactive tool for entity resolution in social networks," in Graph Drawing, 2005, pp. 505-507.
-
(2005)
Graph Drawing
, pp. 505-507
-
-
Bilgic, M.1
Licamele, L.2
Getoor, L.3
Shneiderman, B.4
-
21
-
-
34248229658
-
Collective entity resolution in relational data
-
I. Bhattacharya and L. Getoor, "Collective entity resolution in relational data," TKDD, vol. 1, no. 1, 2007.
-
(2007)
TKDD
, vol.1
, Issue.1
-
-
Bhattacharya, I.1
Getoor, L.2
-
22
-
-
34548731840
-
Conditional functional dependencies for data cleaning
-
P. Bohannon, W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis, "Conditional functional dependencies for data cleaning," in ICDE, 2007, pp. 746-755.
-
(2007)
ICDE
, pp. 746-755
-
-
Bohannon, P.1
Fan, W.2
Geerts, F.3
Jia, X.4
Kementsietsidis, A.5
-
23
-
-
57549084481
-
Dependencies revisited for improving data quality
-
W. Fan, "Dependencies revisited for improving data quality," in PODS, 2008, pp. 159-170.
-
(2008)
PODS
, pp. 159-170
-
-
Fan, W.1
-
24
-
-
3142665421
-
Correlation clustering
-
N. Bansal, A. Blum, and S. Chawla, "Correlation clustering," Machine Learning, vol. 56, no. 1-3, pp. 89-113, 2004.
-
(2004)
Machine Learning
, vol.56
, Issue.1-3
, pp. 89-113
-
-
Bansal, N.1
Blum, A.2
Chawla, S.3
-
25
-
-
24644456480
-
Clustering with qualitative information
-
M. Charikar, V. Guruswami, and A. Wirth, "Clustering with qualitative information," J. Comput. Syst. Sci., vol. 71, no. 3, pp. 360-383, 2005.
-
(2005)
J. Comput. Syst. Sci
, vol.71
, Issue.3
, pp. 360-383
-
-
Charikar, M.1
Guruswami, V.2
Wirth, A.3
-
26
-
-
34848818026
-
Aggregating inconsistent information: Ranking and clustering
-
N. Ailon, M. Charikar, and A. Newman, "Aggregating inconsistent information: Ranking and clustering," in STOC, 2005, pp. 684-693.
-
(2005)
STOC
, pp. 684-693
-
-
Ailon, N.1
Charikar, M.2
Newman, A.3
-
29
-
-
85104914015
-
Efficient exact set-similarity joins
-
A. Arasu, V. Ganti, and R. Kaushik, "Efficient exact set-similarity joins," in VLDB, 2006, pp. 918-929.
-
(2006)
VLDB
, pp. 918-929
-
-
Arasu, A.1
Ganti, V.2
Kaushik, R.3
-
30
-
-
34248168069
-
Clustering aggregation
-
A. Gionis, H. Mannila, and P. Tsaparas, "Clustering aggregation," TKDD, vol. 1, no. 1, 2007.
-
(2007)
TKDD
, vol.1
, Issue.1
-
-
Gionis, A.1
Mannila, H.2
Tsaparas, P.3
-
31
-
-
0028514351
-
On the hardness of approximating minimization problems
-
C. Lund and M. Yannakakis, "On the hardness of approximating minimization problems," J. ACM, vol. 41, no. 5, pp. 960-981, 1994.
-
(1994)
J. ACM
, vol.41
, Issue.5
, pp. 960-981
-
-
Lund, C.1
Yannakakis, M.2
-
32
-
-
33845667955
-
Duplicate record detection: A survey
-
A. K. Elmagarmid, P. G. Ipeirotis, and V. S. Verykios, "Duplicate record detection: A survey," IEEE Trans. Knowl. Data Eng., vol. 19, no. 1, pp. 1-16, 2007.
-
(2007)
IEEE Trans. Knowl. Data Eng
, vol.19
, Issue.1
, pp. 1-16
-
-
Elmagarmid, A.K.1
Ipeirotis, P.G.2
Verykios, V.S.3
-
33
-
-
77952372966
-
Adaptive duplicate detection using learnable string similarity measures
-
M. Bilenko and R. J. Mooney, "Adaptive duplicate detection using learnable string similarity measures," in KDD, 2003, pp. 39-48.
-
(2003)
KDD
, pp. 39-48
-
-
Bilenko, M.1
Mooney, R.J.2
-
34
-
-
1142279457
-
Robust and efficient fuzzy match for online data cleaning
-
S. Chaudhuri, K. Ganjam, V. Ganti, and R. Motwani, "Robust and efficient fuzzy match for online data cleaning," in SIGMOD Conference, 2003, pp. 313-324.
-
(2003)
SIGMOD Conference
, pp. 313-324
-
-
Chaudhuri, S.1
Ganjam, K.2
Ganti, V.3
Motwani, R.4
-
35
-
-
84880467474
-
Text joins in an RDBMS for web data integration
-
L. Gravano, P. G. Ipeirotis, N. Koudas, and D. Srivastava, "Text joins in an RDBMS for web data integration," in WWW, 2003, pp. 90-101.
-
(2003)
WWW
, pp. 90-101
-
-
Gravano, L.1
Ipeirotis, P.G.2
Koudas, N.3
Srivastava, D.4
-
36
-
-
67649639189
-
-
Unimatch: A record linkage system: Users manual
-
M. A. Jaro, "Unimatch: A record linkage system: Users manual," US Bureau of the Census, Washington, D.C., Tech. Rep., 1976.
-
-
-
-
37
-
-
84947399464
-
A theory for record linkage
-
I. P. Fellegi and A. B. Sunter, "A theory for record linkage," J. Am. Statistical Assoc., vol. 64, no. 328, pp. 1183-1210, 1969.
-
(1969)
J. Am. Statistical Assoc
, vol.64
, Issue.328
, pp. 1183-1210
-
-
Fellegi, I.P.1
Sunter, A.B.2
-
38
-
-
0242456811
-
Interactive deduplication using active learning
-
S. Sarawagi and A. Bhamidipaty, "Interactive deduplication using active learning," in KDD, 2002, pp. 269-278.
-
(2002)
KDD
, pp. 269-278
-
-
Sarawagi, S.1
Bhamidipaty, A.2
-
39
-
-
52649145249
-
Fast indexes and algorithms for set similarity selection queries
-
M. Hadjieleftheriou, A. Chandel, N. Koudas, and D. Srivastava, "Fast indexes and algorithms for set similarity selection queries," in ICDE, 2008, pp. 267-276.
-
(2008)
ICDE
, pp. 267-276
-
-
Hadjieleftheriou, M.1
Chandel, A.2
Koudas, N.3
Srivastava, D.4
-
41
-
-
3142777876
-
Efficient set joins on similarity predicates
-
S. Sarawagi and A. Kirpal, "Efficient set joins on similarity predicates," in SIGMOD Conference, 2004, pp. 743-754.
-
(2004)
SIGMOD Conference
, pp. 743-754
-
-
Sarawagi, S.1
Kirpal, A.2
-
43
-
-
0036203458
-
Tailor: A record linkage tool box
-
M. G. Elfeky, A. K. Elmagarmid, and V. S. Verykios, "Tailor: A record linkage tool box," in ICDE, 2002, pp. 17-28.
-
(2002)
ICDE
, pp. 17-28
-
-
Elfeky, M.G.1
Elmagarmid, A.K.2
Verykios, V.S.3
-
44
-
-
29844434654
-
Spider: Flexible matching in databases
-
N. Koudas, A. Marathe, and D. Srivastava, "Spider: Flexible matching in databases," in SIGMOD Conference, 2005, pp. 876-878.
-
(2005)
SIGMOD Conference
, pp. 876-878
-
-
Koudas, N.1
Marathe, A.2
Srivastava, D.3
-
45
-
-
26444550791
-
Robust identification of fuzzy duplicates
-
S. Chaudhuri, V. Ganti, and R. Motwani, "Robust identification of fuzzy duplicates," in ICDE, 2005, pp. 865-876.
-
(2005)
ICDE
, pp. 865-876
-
-
Chaudhuri, S.1
Ganti, V.2
Motwani, R.3
-
46
-
-
0344756845
-
Declarative data cleaning: Language, model, and algorithms
-
H. Galhardas, D. Florescu, D. Shasha, E. Simon, and C.-A. Saita, "Declarative data cleaning: Language, model, and algorithms," in VLDB, 2001, pp. 371-380.
-
(2001)
VLDB
, pp. 371-380
-
-
Galhardas, H.1
Florescu, D.2
Shasha, D.3
Simon, E.4
Saita, C.-A.5
-
47
-
-
84944315993
-
Potter's wheel: An interactive data cleaning system
-
V. Raman and J. M. Hellerstein, "Potter's wheel: An interactive data cleaning system," in VLDB, 2001, pp. 381-390.
-
(2001)
VLDB
, pp. 381-390
-
-
Raman, V.1
Hellerstein, J.M.2
-
48
-
-
29844452555
-
Reference reconciliation in complex information spaces
-
X. Dong, A. Y. Halevy, and J. Madhavan, "Reference reconciliation in complex information spaces," in SIGMOD Conference, 2005, pp. 85-96.
-
(2005)
SIGMOD Conference
, pp. 85-96
-
-
Dong, X.1
Halevy, A.Y.2
Madhavan, J.3
-
49
-
-
84878044770
-
Entity resolution with Markov logic
-
P. Singla and P. Domingos, "Entity resolution with Markov logic," in ICDM, 2006, pp. 572-582.
-
(2006)
ICDM
, pp. 572-582
-
-
Singla, P.1
Domingos, P.2
-
50
-
-
33745776306
-
Joint deduplication of multiple record types in relational data
-
A. Culotta and A. McCallum, "Joint deduplication of multiple record types in relational data," in CIKM, 2005, pp. 257-258.
-
(2005)
CIKM
, pp. 257-258
-
-
Culotta, A.1
McCallum, A.2
-
51
-
-
34548759040
-
Fast identification of relational constraint violations
-
A. Chandel, N. Koudas, K. Q. Pu, and D. Srivastava, "Fast identification of relational constraint violations," in ICDE, 2007, pp. 776-785.
-
(2007)
ICDE
, pp. 776-785
-
-
Chandel, A.1
Koudas, N.2
Pu, K.Q.3
Srivastava, D.4
-
52
-
-
33745329531
-
A cost-based model and effective heuristic for repairing constraints by value modification
-
P. Bohannon, M. Flaster, W. Fan, and R. Rastogi, "A cost-based model and effective heuristic for repairing constraints by value modification," in SIGMOD Conference, 2005, pp. 143-154.
-
(2005)
SIGMOD Conference
, pp. 143-154
-
-
Bohannon, P.1
Flaster, M.2
Fan, W.3
Rastogi, R.4
-
53
-
-
84959912087
-
Improving data quality: Consistency and accuracy
-
G. Cong, W. Fan, F. Geerts, X. Jia, and S. Ma, "Improving data quality: Consistency and accuracy," in VLDB, 2007, pp. 315-326.
-
(2007)
VLDB
, pp. 315-326
-
-
Cong, G.1
Fan, W.2
Geerts, F.3
Jia, X.4
Ma, S.5
-
54
-
-
29844448776
-
Conquer: Efficient management of inconsistent databases
-
A. Fuxman, E. Fazli, and R. J. Miller, "Conquer: Efficient management of inconsistent databases," in SIGMOD Conference, 2005, pp. 155-166.
-
(2005)
SIGMOD Conference
, pp. 155-166
-
-
Fuxman, A.1
Fazli, E.2
Miller, R.J.3