-
1
-
-
85088761870
-
Mining reference tables for automatic text segmentation
-
E. Agichtein and V. Ganti. Mining reference tables for automatic text segmentation. In KDD-04.
-
KDD-04
-
-
Agichtein, E.1
Ganti, V.2
-
2
-
-
33749588820
-
Clean answers over dirty databases: A probabilistic approach
-
P. Andritsos, A. Fuxman, and R. J. Miller. Clean answers over dirty databases: A probabilistic approach. In ICDE-06.
-
ICDE-06
-
-
Andritsos, P.1
Fuxman, A.2
Miller, R.J.3
-
4
-
-
33750452514
-
Swoosh: A generic approach to entity resolution
-
Technical report, Stanford University, March
-
O. Benjelloun, H. Garcia-Molina, Q. Su, and J. Widom. Swoosh: A generic approach to entity resolution. Technical report, Stanford University, March 2005.
-
(2005)
-
-
Benjelloun, O.1
Garcia-Molina, H.2
Su, Q.3
Widom, J.4
-
6
-
-
2342447399
-
Adaptive name matching in information integration
-
M. Bilenko, R. J. Mooney, W. W. Cohen, P. Ravikumar, and S. E. Fienberg. Adaptive name matching in information integration. IEEE Intelligent Systems, 2003.
-
(2003)
IEEE Intelligent Systems
-
-
Bilenko, M.1
Mooney, R.J.2
Cohen, W.W.3
Ravikumar, P.4
Fienberg, S.E.5
-
7
-
-
33745600025
-
Automatic text segmentation for extracting structured records
-
V. Borkar, K. Deshmukh, and S. Sarawagi. Automatic text segmentation for extracting structured records. In SIGMOD-01.
-
SIGMOD-01
-
-
Borkar, V.1
Deshmukh, K.2
Sarawagi, S.3
-
8
-
-
33749597967
-
A primitive operator for similarity joins in data cleaning
-
S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE-06.
-
ICDE-06
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
10
-
-
0032091575
-
Integration of heterogeneous databases without common domains using queries based on textual similarity
-
W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity. In SIGMOD-98.
-
SIGMOD-98
-
-
Cohen, W.1
-
12
-
-
34548351048
-
Community information management
-
A. Doan, R. Ramakrishnan, F. Chen, P. DeRose, Y. Lee, R. McCann, M. Sayyadian, and W. Shen. Community information management. IEEE Data Engineering Bulletin, 29(1), 2006.
-
(2006)
IEEE Data Engineering Bulletin
, vol.29
, Issue.1
-
-
Doan, A.1
Ramakrishnan, R.2
Chen, F.3
DeRose, P.4
Lee, Y.5
McCann, R.6
Sayyadian, M.7
Shen, W.8
-
13
-
-
85086954029
-
Reference reconciliation in complex information spaces
-
X. Dong, A. Halevy, and J. Madhavan. Reference reconciliation in complex information spaces. In SIGMOD-05.
-
SIGMOD-05
-
-
Dong, X.1
Halevy, A.2
Madhavan, J.3
-
16
-
-
0344756845
-
Declarative data cleaning: Language, model, and algorithms
-
H. Galhardas, D. Florescu, D. Shasha, E. Simon, and C.-A. Saita. Declarative data cleaning: Language, model, and algorithms. In VLDB-01.
-
VLDB-01
-
-
Galhardas, H.1
Florescu, D.2
Shasha, D.3
Simon, E.4
Saita, C.-A.5
-
17
-
-
84944318804
-
Approximate string joins in a database (almost) for free
-
L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate string joins in a database (almost) for free. In VLDB-01.
-
VLDB-01
-
-
Gravano, L.1
Ipeirotis, P.G.2
Jagadish, H.V.3
Koudas, N.4
Muthukrishnan, S.5
Srivastava, D.6
-
19
-
-
84882989940
-
The merge/purge problem for large databases
-
M. Hernandez and S. Stolfo. The merge/purge problem for large databases. In SIGMOD-95.
-
SIGMOD-95
-
-
Hernandez, M.1
Stolfo, S.2
-
20
-
-
84958184841
-
Email alias detection using network analysis
-
R. Holzer, B. Malin, and L. Sweeney. Email alias detection using network analysis. In SIGKDD Workshop on Link Discovery: Issues, Approaches, and Applications, 2005.
-
(2005)
SIGKDD Workshop on Link Discovery: Issues, Approaches, and Applications
-
-
Holzer, R.1
Malin, B.2
Sweeney, L.3
-
21
-
-
85088720915
-
Exploiting relationships for domain-independent data cleaning
-
D. V. Kalashnikov, S. Mehrotra, and Z. Chen. Exploiting relationships for domain-independent data cleaning. In SIAM-05.
-
SIAM-05
-
-
Kalashnikov, D.V.1
Mehrotra, S.2
Chen, Z.3
-
22
-
-
85123004356
-
Flexible string matching against large databases in practice
-
N. Koudas, A. Marathe, and D. Srivastava. Flexible string matching against large databases in practice. In VLDB-04.
-
VLDB-04
-
-
Koudas, N.1
Marathe, A.2
Srivastava, D.3
-
23
-
-
34250670467
-
Record linkage: Similarity measures and algorithms (tutorial)
-
N. Koudas, S. Sarawagi, and D. Srivastava. Record linkage: Similarity measures and algorithms (tutorial). In SIGMOD-06.
-
SIGMOD-06
-
-
Koudas, N.1
Sarawagi, S.2
Srivastava, D.3
-
24
-
-
34548777065
-
Identification and tracing of ambiguous names: Discriminative and generative approaches
-
X. Li, P. Morie, and D. Roth. Identification and tracing of ambiguous names: Discriminative and generative approaches. In AAAI-04.
-
AAAI-04
-
-
Li, X.1
Morie, P.2
Roth, D.3
-
25
-
-
0035545906
-
A knowledge-based approach for duplicate elimination in data cleaning
-
W. L. Low, M.-L. Lee, and T. W. Ling. A knowledge-based approach for duplicate elimination in data cleaning. Inf. Syst., 2001.
-
(2001)
Inf. Syst.
-
-
Low, W.L.1
Lee, M.-L.2
Ling, T.W.3
-
26
-
-
34548711352
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
A. McCallum, K. Nigam, and L. Ungar. Efficient clustering of high-dimensional data sets with application to reference matching. In KDD-00.
-
KDD-00
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.3
-
27
-
-
0003780986
-
The PageRank citation ranking: Bringing order to the Web
-
Technical report, Stanford Digital Library Technologies Project
-
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the Web. Technical report, Stanford Digital Library Technologies Project, 1998.
-
(1998)
-
-
Page, L.1
Brin, S.2
Motwani, R.3
Winograd, T.4
-
29
-
-
36349034741
-
Object identification with attribute-mediated dependences
-
P. Singla and P. Domingos. Object identification with attribute-mediated dependences. In PKDD-05.
-
PKDD-05
-
-
Singla, P.1
Domingos, P.2
-
30
-
-
85084776016
-
Learning domain-independent string transformation weights for high accuracy object identification
-
S. Tejada, C. Knoblock, and S. Minton. Learning domain-independent string transformation weights for high accuracy object identification. In KDD-02.
-
KDD-02
-
-
Tejada, S.1
Knoblock, C.2
Minton, S.3
-
32
-
-
29844441371
-
Dogmatix tracks down duplicates in XML
-
M. Weis and F. Naumann. Dogmatix tracks down duplicates in XML. In SIGMOD-05.
-
SIGMOD-05
-
-
Weis, M.1
Naumann, F.2
-
33
-
-
34548763879
-
Extracting web data using instance-based learning
-
Y. Zhai and B. Liu. Extracting web data using instance-based learning. In WISE-05.
-
WISE-05
-
-
Zhai, Y.1
Liu, B.2