-
1
-
-
84920600570
-
Efficient record linkage using a double embedding scheme
-
N. Adly, "Efficient record linkage using a double embedding scheme," in Proc. Int. Conf. Data Mining, 2009, pp. 274-281.
-
(2009)
Proc. Int. Conf. Data Mining
, pp. 274-281
-
-
Adly, N.1
-
3
-
-
37549058056
-
Near-optimal hashing algorithms for near neighbor problem in high dimension
-
A. Andoni and P. Indyk, "Near-optimal hashing algorithms for near neighbor problem in high dimension," Commun. ACM, vol. 51, no. 1, pp. 117-122, 2008.
-
(2008)
Commun. ACM
, vol.51
, Issue.1
, pp. 117-122
-
-
Andoni, A.1
Indyk, P.2
-
4
-
-
77954717287
-
On active learning of record matching packages
-
A. Arasu, M. Götz, and R. Kaushik, "On active learning of record matching packages," in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2010, pp. 783-794.
-
(2010)
Proc. ACM SIGMOD Int. Conf. Manage. Data
, pp. 783-794
-
-
Arasu, A.1
Götz, M.2
Kaushik, R.3
-
5
-
-
38149034062
-
LSH forest: Self-tuning indexes for similarity search
-
M. Bawa, T. Condie, and P. Ganesan, "LSH forest: Self-tuning indexes for similarity search," in Proc. Int. Conf. World Wide Web, 2005, pp. 651-660.
-
(2005)
Proc. Int. Conf. World Wide Web
, pp. 651-660
-
-
Bawa, M.1
Condie, T.2
Ganesan, P.3
-
6
-
-
5444258997
-
A comparison of fast blocking methods for record linkage
-
R. Baxter, P. Christen, and T. Churches, "A comparison of fast blocking methods for record linkage," in Proc. ACM Int. Conf. Knowl. Discovery Data Mining, 2003, vol. 3, pp. 25-27.
-
(2003)
Proc. ACM Int. Conf. Knowl. Discovery Data Mining
, vol.3
, pp. 25-27
-
-
Baxter, R.1
Christen, P.2
Churches, T.3
-
7
-
-
84878049861
-
Adaptive blocking: Learning to scale up record linkage
-
M. Bilenko, B. Kamath, and R. J. Mooney, "Adaptive blocking: Learning to scale up record linkage," in Proc. Int. Conf. Data Mining, 2006, pp. 87-96.
-
(2006)
Proc. Int. Conf. Data Mining
, pp. 87-96
-
-
Bilenko, M.1
Kamath, B.2
Mooney, R.J.3
-
8
-
-
0034207121
-
Min-wise independent permutations
-
A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher, "Min-wise independent permutations," J. Comput. Syst. Sci., vol. 60, no. 3, pp. 630-659, 2000.
-
(2000)
J. Comput. Syst. Sci.
, vol.60
, Issue.3
, pp. 630-659
-
-
Broder, A.Z.1
Charikar, M.2
Frieze, A.M.3
Mitzenmacher, M.4
-
10
-
-
65449165865
-
Towards parameter-free blocking for scalable record linkage
-
Australian Nat. Univ., CA, Australia, Tech. Rep. TR-CS-07-03
-
P. Christen, "Towards parameter-free blocking for scalable record linkage," Dept. Comput. Sci., Australian Nat. Univ., CA, Australia, Tech. Rep. TR-CS-07-03, 2007.
-
(2007)
Dept. Comput. Sci.
-
-
Christen, P.1
-
12
-
-
84920595044
-
A survey of indexing techniques for scalable record linkage and deduplication
-
Sep.
-
P. Christen, "A survey of indexing techniques for scalable record linkage and deduplication," IEEE Trans. Knowl. Data Eng., vol. 24, no. 9, pp. 1537-1555, Sep. 2012.
-
(2012)
IEEE Trans. Knowl. Data Eng.
, vol.24
, Issue.9
, pp. 1537-1555
-
-
Christen, P.1
-
15
-
-
74549152150
-
Robust record linkage blocking using suffix arrays
-
T. De Vries, H. Ke, S. Chawla, and P. Christen, "Robust record linkage blocking using suffix arrays," in Proc. 18th ACM Conf. Inf. Knowl. Manage., 2009, pp. 305-314.
-
(2009)
Proc. 18th ACM Conf. Inf. Knowl. Manage.
, pp. 305-314
-
-
De Vries, T.1
Ke, H.2
Chawla, S.3
Christen, P.4
-
16
-
-
0345566262
-
Learning to match ontologies on the semantic web
-
A. Doan, J. Madhavan, R. Dhamankar, P. Domingos, and A. Halevy, "Learning to match ontologies on the semantic web," VLDB J., vol. 12, no. 4, pp. 303-319, 2003.
-
(2003)
VLDB J.
, vol.12
, Issue.4
, pp. 303-319
-
-
Doan, A.1
Madhavan, J.2
Dhamankar, R.3
Domingos, P.4
Halevy, A.5
-
17
-
-
0036203458
-
Tailor: A record linkage toolbox
-
M. G. Elfeky, V. S. Verykios, and A. K. Elmagarmid, "Tailor: A record linkage toolbox," in Proc. Int. Conf. Data Eng., 2002, pp. 17-28.
-
(2002)
Proc. Int. Conf. Data Eng.
, pp. 17-28
-
-
Elfeky, M.G.1
Verykios, V.S.2
Elmagarmid, A.K.3
-
18
-
-
84947399464
-
A theory for record linkage
-
I. Fellegi and A. Sunter, "A theory for record linkage," J. Amer. Stat. Assoc., vol. 64, no. 328, pp. 1183-1210, 1969.
-
(1969)
J. Amer. Stat. Assoc.
, vol.64
, Issue.328
, pp. 1183-1210
-
-
Fellegi, I.1
Sunter, A.2
-
19
-
-
84880915872
-
Computing semantic relatedness using Wikipedia-based explicit semantic analysis
-
E. Gabrilovich and S. Markovitch, "Computing semantic relatedness using Wikipedia-based explicit semantic analysis," in Proc. 20th Int. Joint Conf. Artif. Intell., 2007, pp. 1606-1611.
-
(2007)
Proc. 20th Int. Joint Conf. Artif. Intell.
, pp. 1606-1611
-
-
Gabrilovich, E.1
Markovitch, S.2
-
20
-
-
0001944742
-
Similarity search in high dimensions via hashing
-
A. Gionis, P. Indyk, and R. Motwani, "Similarity search in high dimensions via hashing," in Proc. 25th Int. Conf. Very Large Databases, 1999, pp. 518-529.
-
(1999)
Proc. 25th Int. Conf. Very Large Databases
, pp. 518-529
-
-
Gionis, A.1
Indyk, P.2
Motwani, R.3
-
21
-
-
84976856849
-
The merge/purge problem for large databases
-
M. A. Hernández and S. J. Stolfo, "The merge/purge problem for large databases," ACM SIGMOD Rec., vol. 24, pp. 127-138, 1995.
-
(1995)
ACM SIGMOD Rec.
, vol.24
, pp. 127-138
-
-
Hernández, M.A.1
Stolfo, S.J.2
-
22
-
-
0013331361
-
Real-world data is dirty: Data cleansing and the merge/purge problem
-
M. A. Hernández and S. J. Stolfo, "Real-world data is dirty: Data cleansing and the merge/purge problem," Data Mining Knowl. Discovery, vol. 2, no. 1, pp. 9-37, 1998.
-
(1998)
Data Mining Knowl. Discovery
, vol.2
, Issue.1
, pp. 9-37
-
-
Hernández, M.A.1
Stolfo, S.J.2
-
23
-
-
0031644241
-
Approximate nearest neighbors: Towards removing the curse of dimensionality
-
P. Indyk and R. Motwani, "Approximate nearest neighbors: Towards removing the curse of dimensionality," in Proc. Annu. ACM Symp. Theory Comput., 1998, pp. 604-613.
-
(1998)
Proc. Annu. ACM Symp. Theory Comput.
, pp. 604-613
-
-
Indyk, P.1
Motwani, R.2
-
24
-
-
77954405011
-
Efficient semantic-aware detection of near duplicate resources
-
E. Ioannou, O. Papapetrou, D. Skoutas, and W. Nejdl, "Efficient semantic-aware detection of near duplicate resources," in Proc. 7th Int. Conf. Semantic Web: Res. Appl., 2010, pp. 136-150.
-
(2010)
Proc. 7th Int. Conf. Semantic Web: Res. Appl.
, pp. 136-150
-
-
Ioannou, E.1
Papapetrou, O.2
Skoutas, D.3
Nejdl, W.4
-
25
-
-
84943425383
-
Efficient record linkage in large data sets
-
L. Jin, C. Li, and S. Mehrotra, "Efficient record linkage in large data sets," in Proc. 8th Int. Conf. Database Syst. Adv. Appl., 2003, pp. 137-146.
-
(2003)
Proc. 8th Int. Conf. Database Syst. Adv. Appl.
, pp. 137-146
-
-
Jin, L.1
Li, C.2
Mehrotra, S.3
-
26
-
-
84894647271
-
An unsupervised algorithm for learning blocking schemes
-
M. Kejriwal and D. P. Miranker, "An unsupervised algorithm for learning blocking schemes," in Proc. Int. Conf. Data Mining, 2013, pp. 340-349.
-
(2013)
Proc. Int. Conf. Data Mining
, pp. 340-349
-
-
Kejriwal, M.1
Miranker, D.P.2
-
27
-
-
84892971761
-
MFIBlocks: An effective blocking algorithm for entity resolution
-
B. Kenig and A. Gal, "MFIBlocks: An effective blocking algorithm for entity resolution," Inf. Syst., vol. 38, no. 6, pp. 908-926, 2013.
-
(2013)
Inf. Syst.
, vol.38
, Issue.6
, pp. 908-926
-
-
Kenig, B.1
Gal, A.2
-
29
-
-
84955245129
-
Multiprobe LSH: Efficient indexing for high-dimensional similarity search
-
Q. Lv, W. Josephson, Z. Wang, M. Charikar, and K. Li, "Multiprobe LSH: Efficient indexing for high-dimensional similarity search," in Proc. Int. Conf. Very Large Databases, 2007, pp. 950-961.
-
(2007)
Proc. Int. Conf. Very Large Databases
, pp. 950-961
-
-
Lv, Q.1
Josephson, W.2
Wang, Z.3
Charikar, M.4
Li, K.5
-
31
-
-
84923644625
-
Graph-parallel entity resolution using LSH & IMM
-
P. Malhotra, P. Agarwal, and G. Shroff, "Graph-parallel entity resolution using LSH & IMM," in Proc. EDBT/ICDT Workshops, 2014, pp. 41-49.
-
(2014)
Proc. EDBT/ICDT Workshops
, pp. 41-49
-
-
Malhotra, P.1
Agarwal, P.2
Shroff, G.3
-
32
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
A. McCallum, K. Nigam, and L. H. Ungar, "Efficient clustering of high-dimensional data sets with application to reference matching," in Proc. 6th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2000, pp. 169-178.
-
(2000)
Proc. 6th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining
, pp. 169-178
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.H.3
-
34
-
-
33244462877
-
Entropy based nearest neighbor search in high dimensions
-
R. Panigrahy, "Entropy based nearest neighbor search in high dimensions," in Proc. Annu. ACM-SIAM Symp. Discrete Algorithms, 2006, pp. 1186-1195.
-
(2006)
Proc. Annu. ACM-SIAM Symp. Discrete Algorithms
, pp. 1186-1195
-
-
Panigrahy, R.1
-
35
-
-
79960519872
-
Eliminating the redundancy in blocking-based entity resolution methods
-
G. Papadakis, E. Ioannou, C. Niederée, T. Palpanas, and W. Nejdl, "Eliminating the redundancy in blocking-based entity resolution methods," in Proc. 11th Annu. Int. ACM/IEEE Joint Conf. Digital Libraries, 2011, pp. 85-94.
-
(2011)
Proc. 11th Annu. Int. ACM/IEEE Joint Conf. Digital Libraries
, pp. 85-94
-
-
Papadakis, G.1
Ioannou, E.2
Niederée, C.3
Palpanas, T.4
Nejdl, W.5
-
36
-
-
84858041897
-
Beyond 100 million entities: Large-scale blocking-based resolution for heterogeneous data
-
G. Papadakis, E. Ioannou, C. Niederée, T. Palpanas, and W. Nejdl, "Beyond 100 million entities: Large-scale blocking-based resolution for heterogeneous data," in Proc. ACM Int. Conf. Web Search Data Mining, 2012, pp. 53-62.
-
(2012)
Proc. ACM Int. Conf. Web Search Data Mining
, pp. 53-62
-
-
Papadakis, G.1
Ioannou, E.2
Niederée, C.3
Palpanas, T.4
Nejdl, W.5
-
37
-
-
84904650785
-
Metablocking: Taking entity resolution to the next level
-
Aug.
-
G. Papadakis, G. Koutrika, T. Palpanas, and W. Nejdl, "Metablocking: Taking entity resolution to the next level," IEEE Trans. Knowl. Data Eng., vol. 26, no. 8, pp. 1946-1960, Aug. 2014.
-
(2014)
IEEE Trans. Knowl. Data Eng.
, vol.26
, Issue.8
, pp. 1946-1960
-
-
Papadakis, G.1
Koutrika, G.2
Palpanas, T.3
Nejdl, W.4
-
38
-
-
0003033112
-
Using information content to evaluate semantic similarity in a taxonomy
-
P. Resnik, "Using information content to evaluate semantic similarity in a taxonomy," in Proc. Int. Joint Conf. Artif. Intell., 1995, pp. 448-453.
-
(1995)
Proc. Int. Joint Conf. Artif. Intell.
, pp. 448-453
-
-
Resnik, P.1
-
39
-
-
70849098813
-
Entity resolution with iterative blocking
-
S. E. Whang, D. Menestrina, G. Koutrika, M. Theobald, and H. Garcia-Molina, "Entity resolution with iterative blocking," in Proc. SIGMOD Int. Conf. Manage. Data, 2009, pp. 219-232.
-
(2009)
Proc. SIGMOD Int. Conf. Manage. Data
, pp. 219-232
-
-
Whang, S.E.1
Menestrina, D.2
Koutrika, G.3
Theobald, M.4
Garcia-Molina, H.5
-
40
-
-
84863154010
-
Towards a probabilistic taxonomy of many concepts
-
WA, USA, Tech. Rep. MSR-TR-2011-25
-
W. Wu, H. Li, H. Wang, and K. Zhu, "Towards a probabilistic taxonomy of many concepts," Microsoft Res., Redmond, WA, USA, Tech. Rep. MSR-TR-2011-25, 2011.
-
(2011)
Microsoft Res., Redmond
-
-
Wu, W.1
Li, H.2
Wang, H.3
Zhu, K.4
-
41
-
-
36348961379
-
Adaptive sorted neighborhood methods for efficient record linkage
-
S. Yan, D. Lee, M.-Y. Kan, and L. C. Giles, "Adaptive sorted neighborhood methods for efficient record linkage," in Proc. 7th ACM/IEEE-CS Joint Conf. Digital Libraries, 2007, pp. 185-194.
-
(2007)
Proc. 7th ACM/IEEE-CS Joint Conf. Digital Libraries
, pp. 185-194
-
-
Yan, S.1
Lee, D.2
Kan, M.-Y.3
Giles, L.C.4
-
42
-
-
33748559731
-
Similarity Search: The Metric Space Approach
-
New York, NY, USA: Springer
-
P. Zezula, G. Amato, V. Dohnal, and M. Batko, Similarity Search: The Metric Space Approach, vol. 32, New York, NY, USA: Springer, 2006.
-
(2006)
Similarity Search: The Metric Space Approach
, vol.32
-
-
Zezula, P.1
Amato, G.2
Dohnal, V.3
Batko, M.4