-
1
-
-
5444258997
-
A comparison of fast blocking methods for record linkage
-
Washington DC
-
R. Baxter, P. Christen, and T. Churches. A comparison of fast blocking methods for record linkage. In ACM KDD'03 workshop on Data Cleaning, Record Linkage and Object Consolidation, pages 25-27, Washington DC, 2003.
-
(2003)
ACM KDD'03 workshop on Data Cleaning, Record Linkage and Object Consolidation
, pp. 25-27
-
-
Baxter, R.1
Christen, P.2
Churches, T.3
-
3
-
-
77952372966
-
Adaptive duplicate detection using learnable string similarity measures
-
Washington DC
-
M. Bilenko and R. J. Mooney. Adaptive duplicate detection using learnable string similarity measures. In ACM KDD'03, pages 39-48, Washington DC, 2003.
-
(2003)
ACM KDD'03
, pp. 39-48
-
-
Bilenko, M.1
Mooney, R.J.2
-
4
-
-
0003710380
-
-
Department of Computer Science, National Taiwan University
-
C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. Manual, Department of Computer Science, National Taiwan University, 2001. Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm.
-
(2001)
Manual, LIBSVM: A library for support vector machines
-
-
Chang, C.-C.1
Lin, C.-J.2
-
5
-
-
26444478506
-
Probabilistic data generation for deduplication and data linkage
-
IDEAL'05, Brisbane
-
P. Christen. Probabilistic data generation for deduplication and data linkage. In IDEAL'05, Springer LNCS 3578, pages 109-116, Brisbane, 2005.
-
(2005)
Springer LNCS
, vol.3578
, pp. 109-116
-
-
Christen, P.1
-
6
-
-
44649135932
-
A two-step classification approach to unsupervised record linkage
-
Gold Coast, Australia
-
P. Christen. A two-step classification approach to unsupervised record linkage. In AusDM'07, CRPIT vol. 70, pages 111-119, Gold Coast, Australia, 2007.
-
(2007)
AusDM'07, CRPIT
, vol.70
, pp. 111-119
-
-
Christen, P.1
-
7
-
-
44649093306
-
Automatic training example selection for scalable unsupervised record linkage
-
PAKDD'08, Springer, Osaka
-
P. Christen. Automatic training example selection for scalable unsupervised record linkage. In PAKDD'08, Springer LNAI 5012, pages 511-518, Osaka, 2008.
-
(2008)
LNAI
, vol.5012
, pp. 511-518
-
-
Christen, P.1
-
8
-
-
67649649496
-
Febrl - A freely available record linkage system with a graphical user interface
-
Wollongong, Australia
-
P. Christen. Febrl - A freely available record linkage system with a graphical user interface. In HDKM'08, CRPIT vol. 80, Wollongong, Australia, 2008.
-
(2008)
HDKM'08, CRPIT
, vol.80
-
-
Christen, P.1
-
9
-
-
33846428121
-
-
P. Christen and K. Goiser. Quality and complexity measures for data linkage and deduplication. In F. Guillet and H. Hamilton, editors, Quality Measures in Data Mining, volume 43 of Studies in Computational Intelligence. Springer, 2007.
-
-
-
-
10
-
-
84884417241
-
Preparation of name and address data for record linkage using hidden Markov models
-
T. Churches, P. Christen, K. Lim, and J. X. Zhu. Preparation of name and address data for record linkage using hidden Markov models. BioMed Central Medical Informatics and Decision Making, 2(9), 2002.
-
(2002)
BioMed Central Medical Informatics and Decision Making
, vol.2
, Issue.9
-
-
Churches, T.1
Christen, P.2
Lim, K.3
Zhu, J.X.4
-
12
-
-
0242540438
-
Learning to match and cluster large high-dimensional data sets for data integration
-
Edmonton
-
W. Cohen and J. Richman. Learning to match and cluster large high-dimensional data sets for data integration. In ACM KDD'02, pages 475-480, Edmonton, 2002.
-
(2002)
ACM KDD'02
, pp. 475-480
-
-
Cohen, W.1
Richman, J.2
-
13
-
-
0036203458
-
TAILOR: A record linkage toolbox
-
San Jose
-
M. Elfeky, V. Verykios, and A. Elmagarmid. TAILOR: A record linkage toolbox. In ICDE'02, pages 17-28, San Jose, 2002.
-
(2002)
ICDE'02
, pp. 17-28
-
-
Elfeky, M.1
Verykios, V.2
Elmagarmid, A.3
-
14
-
-
33845667955
-
Duplicate record detection: A survey
-
A. Elmagarmid, P. Ipeirotis, and V. Verykios. Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering, 19(1):1-16, 2007.
-
(2007)
IEEE Transactions on Knowledge and Data Engineering
, vol.19
, Issue.1
, pp. 1-16
-
-
Elmagarmid, A.1
Ipeirotis, P.2
Verykios, V.3
-
16
-
-
65449179112
-
Towards automated record linkage
-
Sydney
-
K. Goiser and P. Christen. Towards automated record linkage. In AusDM'06, CRPIT vol. 61, pages 23-31, Sydney, 2006.
-
(2006)
AusDM'06, CRPIT
, vol.61
, pp. 23-31
-
-
Goiser, K.1
Christen, P.2
-
18
-
-
45849148052
-
Effective counterterrorism and the limited role of predictive data mining
-
J. Jonas and J. Harper. Effective counterterrorism and the limited role of predictive data mining. Policy Analysis, (584), 2006.
-
(2006)
Policy Analysis
, Issue.584
-
-
Jonas, J.1
Harper, J.2
-
19
-
-
0037867900
-
Two approaches to handling noisy variation in text mining
-
Sydney
-
U. Y. Nahm, M. Bilenko, and R. J. Mooney. Two approaches to handling noisy variation in text mining. In TextML'02, pages 18-27, Sydney, 2002.
-
(2002)
TextML'02
, pp. 18-27
-
-
Nahm, U.Y.1
Bilenko, M.2
Mooney, R.J.3
-
20
-
-
15744370005
-
Efficient nearest neighbor classification with data reduction and fast search algorithms
-
J. S. Sanchez, J. M. Sotoca, and F. Pla. Efficient nearest neighbor classification with data reduction and fast search algorithms. In IEEE International Conference on Systems, Man and Cybernetics, volume 5, pages 4757-4762, 2004.
-
(2004)
IEEE International Conference on Systems, Man and Cybernetics
, vol.5
, pp. 4757-4762
-
-
Sanchez, J.S.1
Sotoca, J.M.2
Pla, F.3
-
21
-
-
0242456811
-
Interactive deduplication using active learning
-
Edmonton
-
S. Sarawagi and A. Bhamidipaty. Interactive deduplication using active learning. In ACM KDD'02, pages 269-278, Edmonton, 2002.
-
(2002)
ACM KDD'02
, pp. 269-278
-
-
Sarawagi, S.1
Bhamidipaty, A.2
-
22
-
-
0242456803
-
Learning domain-independent string transformation weights for high accuracy object identification
-
Edmonton
-
S. Tejada, C. Knoblock, and S. Minton. Learning domain-independent string transformation weights for high accuracy object identification. In ACM KDD'02, pages 350-359, Edmonton, 2002.
-
(2002)
ACM KDD'02
, pp. 350-359
-
-
Tejada, S.1
Knoblock, C.2
Minton, S.3
-
23
-
-
33846411033
-
Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage
-
Technical Report RR2000/05, US Bureau of the Census
-
W. E. Winkler. Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage. Technical Report RR2000/05, US Bureau of the Census, 2000.
-
(2000)
-
-
Winkler, W.E.1
-
24
-
-
2942709772
-
Methods for evaluating and creating data quality
-
W. E. Winkler. Methods for evaluating and creating data quality. Elsevier Information Systems, 29(7):531-550, 2004.
-
(2004)
Elsevier Information Systems
, vol.29
, Issue.7
, pp. 531-550
-
-
Winkler, W.E.1
-
25
-
-
18744413274
-
Text classification from positive and unlabeled documents
-
New Orleans
-
H. Yu, C. X. Zhai, and J. Han. Text classification from positive and unlabeled documents. In CIKM'03, pages 232-239, New Orleans, 2003.
-
(2003)
CIKM'03
, pp. 232-239
-
-
Yu, H.1
Zhai, C.X.2
Han, J.3