-
1
-
-
33845363891
-
A fast linkage detection scheme for multi-source information integration
-
Washington, DC, USA
-
A. Aizawa and K. Oyama. A fast linkage detection scheme for multi-source information integration. In Proceedings of the International Workshop on Challenges in Web Information Retrieval and Integration, pages 30-39, Washington, DC, USA, 2005.
-
(2005)
Proceedings of the International Workshop on Challenges in Web Information Retrieval and Integration
, pp. 30-39
-
-
Aizawa, A.1
Oyama, K.2
-
2
-
-
5444258997
-
A comparison of fast blocking methods for record linkage
-
C. F. Epidemiology
-
R. Baxter, P. Christen, and C. F. Epidemiology. A comparison of fast blocking methods for record linkage. In Proceedings of Workshop Data Cleaning, Record Linkage, and Object Consolidation, 2003.
-
Proceedings of Workshop Data Cleaning, Record Linkage, and Object Consolidation, 2003
-
-
Baxter, R.1
Christen, P.2
-
3
-
-
34848900466
-
D-Swoosh: A family of algorithms for generic, distributed entity resolution
-
0
-
O. Benjelloun, H. Garcia-Molina, H. Gong, H. Kawai, T. E. Larson, D. Menestrina, and S. Thavisomboon. D-Swoosh: A family of algorithms for generic, distributed entity resolution. Distributed Computing Systems, International Conference on, 0:37, 2007.
-
(2007)
Distributed Computing Systems, International Conference on
, pp. 37
-
-
Benjelloun, O.1
Garcia-Molina, H.2
Gong, H.3
Kawai, H.4
Larson, T.E.5
Menestrina, D.6
Thavisomboon, S.7
-
4
-
-
26444478506
-
Probabilistic data generation for deduplication and data linkage
-
Springer Berlin / Heidelberg
-
P. Christen. Probabilistic data generation for deduplication and data linkage. In Intelligent Data Engineering and Automated Learning - IDEAL 2005, pages 109-116. Springer Berlin / Heidelberg, 2005.
-
(2005)
Intelligent Data Engineering and Automated Learning - IDEAL 2005
, pp. 109-116
-
-
Christen, P.1
-
5
-
-
7444251738
-
Febrl - A parallel open source data linkage system
-
P. Christen, T. Churches, and M. Hegland. Febrl - a parallel open source data linkage system. In PAKDD, pages 638-647, 2004.
-
(2004)
PAKDD
, pp. 638-647
-
-
Christen, P.1
Churches, T.2
Hegland, M.3
-
6
-
-
74549152150
-
Robust record linkage blocking using suffix arrays
-
New York, NY, USA, ACM
-
T. de Vries, H. Ke, S. Chawla, and P. Christen. Robust record linkage blocking using suffix arrays. In CIKM '09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 305-314, New York, NY, USA, 2009. ACM.
-
(2009)
CIKM '09: Proceeding of the 18th ACM Conference on Information and Knowledge Management
, pp. 305-314
-
-
De Vries, T.1
Ke, H.2
Chawla, S.3
Christen, P.4
-
7
-
-
85030321143
-
MapReduce: Simplified data processing on large clusters
-
J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th conference on Symposium on Operating Systems Design and Implementation, Berkeley, CA, USA, 2004.
-
Proceedings of the 6th Conference on Symposium on Operating Systems Design and Implementation, Berkeley, CA, USA, 2004
-
-
Dean, J.1
Ghemawat, S.2
-
8
-
-
37549003336
-
MapReduce: Simplified data processing on large clusters
-
J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Commun. ACM, 51(1):107-113, 2008.
-
(2008)
Commun. ACM
, vol.51
, Issue.1
, pp. 107-113
-
-
Dean, J.1
Ghemawat, S.2
-
11
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
ACM
-
A. McCallum, K. Nigam, and L. H. Ungar. Efficient clustering of high-dimensional data sets with application to reference matching. In KDD '00: ACM SIGKDD. ACM, 2000.
-
(2000)
KDD '00: ACM SIGKDD
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.H.3
-
12
-
-
79959324551
-
Performance and scalability of fast blocking techniques for deduplication and data linkage
-
P. Christen. Performance and scalability of fast blocking techniques for deduplication and data linkage. Proc. VLDB Endow., 1(2):1253-1264, 2007.
-
(2007)
Proc. VLDB Endow.
, vol.1
, Issue.2
, pp. 1253-1264
-
-
Christen, P.1
-
13
-
-
34547679939
-
Evaluating MapReduce for multi-core and multiprocessor systems
-
C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis. Evaluating MapReduce for multi-core and multiprocessor systems. In HPCA '07, pages 13-24, 2007.
-
(2007)
HPCA '07
, pp. 13-24
-
-
Ranger, C.1
Raghuraman, R.2
Penmetsa, A.3
Bradski, G.4
Kozyrakis, C.5
-
14
-
-
47249097469
-
A scalable parallel deduplication algorithm
-
0
-
W. Santos, T. Teixeira, C. Machado, W. M. Jr., R. Ferreira, D. Guedes, and A. S. D. Silva. A scalable parallel deduplication algorithm. Computer Architecture and High Performance Computing, Symposium on, 0:79-86, 2007.
-
(2007)
Computer Architecture and High Performance Computing, Symposium on
, pp. 79-86
-
-
Santos, W.1
Teixeira, T.2
Machado, C.3
M Jr., W.4
Ferreira, R.5
Guedes, D.6
Silva, A.S.D.7