[2] P. Christen. A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication. IEEE Trans. Knowl. Data Eng., 24(9), 2012.
[4] T. Elsayed, J. J. Lin, and D. W. Oard. Pairwise Document Similarity in Large Collections with MapReduce. In ACL (Short Papers), 2008.
[5] L. Kolb, A. Thor, and E. Rahm. Dedoop: Efficient Deduplication with Hadoop. PVLDB, 5(12), 2012.
[6] L. Kolb, A. Thor, and E. Rahm. Load Balancing for MapReduce-based Entity Resolution. In ICDE, 2012.
[7] L. Kolb, A. Thor, and E. Rahm. Multi-pass Sorted Neighborhood Blocking with MapReduce. Computer Science - R&D, 27(1), 2012.
[8] H. Köpcke and E. Rahm. Frameworks for Entity Matching: A Comparison. Data Knowl. Eng., 69(2), 2010.
[9] N. McNeill, H. Kardes, and A. Borthwick. Dynamic Record Blocking: Efficient Linking of Massive Databases in MapReduce. In QDB, 2012.
[10] M. Mendes and L. Sacks. Evaluating Fuzzy Clustering for Relevance-based Information Access. In IEEE FUZZ, volume 1, 2003.
[11] C. Moretti, H. Bui, K. Hollingsworth, et al. All-Pairs: An Abstraction for Data-Intensive Computing on Campus Grids. IEEE Trans. Parallel Distrib. Syst., 21(1), 2010.
[12] G. Papadakis, E. Ioannou, C. Niederée, et al. Eliminating the Redundancy in Blocking-based Entity Resolution Methods. In JCDL, 2011.
[13] G. Papadakis and W. Nejdl. Efficient Entity Resolution for Large Heterogeneous Information Spaces. In ICDE Workshops, 2011.
[14] M. C. Schatz. CloudBurst: Highly Sensitive Read Mapping with MapReduce. Bioinformatics, 25(11), 2009.
[15] R. Vernica, M. J. Carey, and C. Li. Efficient Parallel Set-Similarity Joins Using MapReduce. In SIGMOD, 2010.
[16] C. Xiao, W. Wang, X. Lin, et al. Efficient Similarity Joins for Near-Duplicate Detection. In WWW, 2008.