-
2
-
-
61449154352
-
Evaluation of data quality in the cancer registry: principles and methods. Part II. Completeness
-
Parkin DM, Bray F. Evaluation of data quality in the cancer registry: principles and methods. Part II. Completeness. Eur J Cancer 2009;45:756-64.
-
(2009)
Eur J Cancer
, vol.45
, pp. 756-64
-
-
Parkin, D.M.1
Bray, F.2
-
3
-
-
0031745566
-
Evaluation of the effect of breast cancer screening by record linkage with the cancer registry, the Netherlands
-
Schouten LJ, de Rijke JM, Schlangen JT, et al. Evaluation of the effect of breast cancer screening by record linkage with the cancer registry, the Netherlands. J Med Screen 1998;5:37-41.
-
(1998)
J Med Screen
, vol.5
, pp. 37-41
-
-
Schouten, L.J.1
de Rijke, J.M.2
Schlangen, J.T.3
-
4
-
-
33846428121
-
Quality and complexity measures for data linkage and deduplication
-
43rd edn. Heidelberg: Springer Berlin, In: Guillet F, Hamilton H, eds
-
Christen P, Goiser K. Quality and complexity measures for data linkage and deduplication. In: Guillet F, Hamilton H, eds. Quality Measures in Data Mining, 43rd edn. Heidelberg: Springer Berlin, 2007:127-51.
-
(2007)
Quality Measures in Data Mining
, pp. 127-51
-
-
Christen, P.1
Goiser, K.2
-
6
-
-
26444478506
-
Probabilistic data generation for deduplication and data linkage
-
Proceedings of the Sixth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL'05), Brisbane, July 2005. Lecture notes in computer science, Springer
-
Christen P. Probabilistic data generation for deduplication and data linkage. Intelligent Data Engineering and Automated Learning Ideal 2005, Proceedings of the Sixth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL'05), Brisbane, July 2005. Lecture notes in computer science, Springer, 2005;3578:109-16.
-
(2005)
Intelligent Data Engineering and Automated Learning Ideal 2005
, vol.3578
, pp. 109-16
-
-
Christen, P.1
-
7
-
-
0013331361
-
Real-world data is dirty: data cleansing and the merge/purge problem
-
Hernandez MA, Stolfo SJ. Real-world data is dirty: data cleansing and the merge/purge problem. Data Min Knowl Discov 1998;2:9-37.
-
(1998)
Data Min Knowl Discov
, vol.2
, pp. 9-37
-
-
Hernandez, M.A.1
Stolfo, S.J.2
-
8
-
-
17244376008
-
Automatically utilizing secondary sources to align information across sources
-
Michalowski M, Thakkar S, Knoblock CA. Automatically utilizing secondary sources to align information across sources. AI Magazine 2005;26:33-44.
-
(2005)
AI Magazine
, vol.26
, pp. 33-44
-
-
Michalowski, M.1
Thakkar, S.2
Knoblock, C.A.3
-
9
-
-
71749103414
-
An integrated framework for de-identifying unstructured medical data
-
Gardner J, Xiong L. An integrated framework for de-identifying unstructured medical data. Data Knowl Eng 2009;68:1441-51.
-
(2009)
Data Knowl Eng
, vol.68
, pp. 1441-51
-
-
Gardner, J.1
Xiong, L.2
-
10
-
-
0037198576
-
An empirical comparison of record linkage procedures
-
Gomatam S, Carter R, Ariet M, et al. An empirical comparison of record linkage procedures. Stat Med 2002;21:1485-96.
-
(2002)
Stat Med
, vol.21
, pp. 1485-96
-
-
Gomatam, S.1
Carter, R.2
Ariet, M.3
-
11
-
-
84950419860
-
Advances in record-linkage methodology as applied to matching the 1985 census of Tampa/Florida
-
Jaro MA. Advances in record-linkage methodology as applied to matching the 1985 census of Tampa/Florida. J Am Stat Assoc 1989;84:414-20.
-
(1989)
J Am Stat Assoc
, vol.84
, pp. 414-20
-
-
Jaro, M.A.1
-
12
-
-
34147207121
-
Duplicate detection in adverse drug reaction surveillance
-
Noren GN, Orre R, Bate A, et al. Duplicate detection in adverse drug reaction surveillance. Data Min Knowl Discov 2007;14:305-28.
-
(2007)
Data Min Knowl Discov
, vol.14
, pp. 305-28
-
-
Noren, G.N.1
Orre, R.2
Bate, A.3
-
13
-
-
0022643388
-
The art and science of record linkage: methods that work with few identifiers
-
Roos LL Jr, Wajda A, Nicol JP. The art and science of record linkage: methods that work with few identifiers. Comput Biol Med 1986;16:45-57.
-
(1986)
Comput Biol Med
, vol.16
, pp. 45-57
-
-
Roos Jr., L.L.1
Wajda, A.2
Nicol, J.P.3
-
14
-
-
84895221824
-
-
Data Quality and Record Linkage Techniques. New York, NY: Springer
-
Herzog TN, Scheuren FJ, Winkler WE. Data Quality and Record Linkage Techniques. New York, NY: Springer, 2007.
-
(2007)
-
-
Herzog, T.N.1
Scheuren, F.J.2
Winkler, W.E.3
-
15
-
-
79953289666
-
Results from simulated data sets: probabilistic record linkage outperforms deterministic record linkage
-
Tromp M, Ravelli AC, Bonsel GJ, et al. Results from simulated data sets: probabilistic record linkage outperforms deterministic record linkage. J Clin Epidemiol 2011;64:565-72.
-
(2011)
J Clin Epidemiol
, vol.64
, pp. 565-72
-
-
Tromp, M.1
Ravelli, A.C.2
Bonsel, G.J.3
-
16
-
-
0002215719
-
-
Advanced methods for Record Linkage. Statistical Research Division U.S. Census Bureau, Suitland, Maryland
-
Winkler WE. Advanced methods for Record Linkage. Statistical Research Division U.S. Census Bureau, Suitland, Maryland, 1994. Technical Report. http://www.census.gov/srd/papers/pdf/rr94-5.pdf
-
(1994)
Technical Report
-
-
Winkler, W.E.1
-
17
-
-
33645723230
-
Working with missing values
-
Acock AC. Working with missing values. J Marriage Fam 2005;67:1012-28.
-
(2005)
J Marriage Fam
, vol.67
, pp. 1012-28
-
-
Acock, A.C.1
-
18
-
-
0035285349
-
Analyzing incomplete political science data: an alternative algorithm for multiple imputation
-
King G, Honaker J, Joseph A, et al. Analyzing incomplete political science data: an alternative algorithm for multiple imputation. Am Polit Sci Rev 2001;95:49-69.
-
(2001)
Am Polit Sci Rev
, vol.95
, pp. 49-69
-
-
King, G.1
Honaker, J.2
Joseph, A.3
-
19
-
-
84857183817
-
A survey of indexing techniques for scalable record linkage and deduplication
-
Christen P. A survey of indexing techniques for scalable record linkage and deduplication. IEEE Trans Knowl Data Eng 2011;99:196-214.
-
(2011)
IEEE Trans Knowl Data Eng
, vol.99
, pp. 196-214
-
-
Christen, P.1
-
20
-
-
84863550642
-
-
Statistical Analysis with Missing Data. Hoboken, NJ: Wiley-Interscience
-
Little RJ, Rubin DB. Statistical Analysis with Missing Data. Hoboken, NJ: Wiley-Interscience, 2002.
-
(2002)
-
-
Little, R.J.1
Rubin, D.B.2
-
21
-
-
84947810053
-
-
Missing Data in Clinical Studies. Chichester: Wiley
-
Molenberghs G, Kenward MG. Missing Data in Clinical Studies. Chichester: Wiley, 2007.
-
(2007)
-
-
Molenberghs, G.1
Kenward, M.G.2
-
22
-
-
76749117619
-
An investigation of missing data methods for classification trees applied to binary response data
-
Ding YF, Simonoff JS. An investigation of missing data methods for classification trees applied to binary response data. J Mach Learn Res 2010;11:131-70.
-
(2010)
J Mach Learn Res
, vol.11
, pp. 131-70
-
-
Ding, Y.F.1
Simonoff, J.S.2
-
23
-
-
48249097676
-
A Bayesian record linkage methodology for multiple imputation of missing links
-
Alexandria, VA, 7-10 August 2004
-
McGlincy MH. A Bayesian record linkage methodology for multiple imputation of missing links. ASA Proceedings of the Joint Statistical Meetings. Alexandria, VA, 7-10 August 2004:4001-8.
-
ASA Proceedings of the Joint Statistical Meetings
, pp. 4001-8
-
-
McGlincy, M.H.1
-
24
-
-
0346202075
-
A record linkage approach to imputation of missing data: analyzing tag retention in a tag-recapture experiment
-
Robison-Cox JF. A record linkage approach to imputation of missing data: analyzing tag retention in a tag-recapture experiment. J Agric Biol Environ Stat 1998;3:48-61.
-
(1998)
J Agric Biol Environ Stat
, vol.3
, pp. 48-61
-
-
Robison-Cox, J.F.1
-
25
-
-
34547696888
-
Handling missing values when applying classification models
-
Saar-Tsechansky M, Provost F. Handling missing values when applying classification models. J Mach Learn Res 2007;8:1625-57.
-
(2007)
J Mach Learn Res
, vol.8
, pp. 1625-57
-
-
Saar-Tsechansky, M.1
Provost, F.2
-
26
-
-
8344282137
-
-
C4.5: Programs for machine learning by J. Ross Quinlan. Morgan Kaufmann Publishers, Inc., 1993
-
Salzberg SL. C4.5: Programs for machine learning by J. Ross Quinlan. Morgan Kaufmann Publishers, Inc., 1993. Mach Learn 1994;16:235-40.
-
(1994)
Mach Learn
, vol.16
, pp. 235-40
-
-
Salzberg, S.L.1
-
27
-
-
34948822426
-
Die Kölner Phonetik. Ein Verfahren zur Identifizierung von Personennamen auf der Grundlage der Gestaltanalyse
-
Postel HJ. Die Kölner Phonetik. Ein Verfahren zur Identifizierung von Personennamen auf der Grundlage der Gestaltanalyse. IBM-Nachrichten 1969;19:925-31.
-
(1969)
IBM-Nachrichten
, vol.19
, pp. 925-31
-
-
Postel, H.J.1
-
29
-
-
76649106456
-
Evaluation of record linkage methods for iterative insertions
-
Sariyar M, Borg A, Pommerening K. Evaluation of record linkage methods for iterative insertions. Methods Inf Med 2009;48:429-37.
-
(2009)
Methods Inf Med
, vol.48
, pp. 429-37
-
-
Sariyar, M.1
Borg, A.2
Pommerening, K.3
-
31
-
-
84880898060
-
-
Classification and Regression Trees. Belmont, California: Wadsworth
-
Breiman L, Friedman J, Olshen R, et al. Classification and Regression Trees. Belmont, California: Wadsworth, 1984.
-
(1984)
-
-
Breiman, L.1
Friedman, J.2
Olshen, R.3
-
32
-
-
84880865195
-
-
rpart: Recursive Partitioning. R package [computer program]. Version 3.1-47
-
rpart: Recursive Partitioning. R package [computer program]. Version 3.1-47. 2010. http://cran.r-project.org/package=rpart
-
(2010)
-
-
-
34
-
-
0242456811
-
Interactive deduplication using active learning
-
Edmonton, AB, Canada, July 23-25
-
Sarawagi S, Bhamidipaty A. Interactive deduplication using active learning. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada, July 23-25, 2002:269-78.
-
(2002)
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
, pp. 269-78
-
-
Sarawagi, S.1
Bhamidipaty, A.2
-
35
-
-
0035545848
-
Learning object identification rules for information integration
-
Tejada S, Knoblock CA, Minton S. Learning object identification rules for information integration. Information Systems 2001;26:607-33.
-
(2001)
Information Systems
, vol.26
, pp. 607-33
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
36
-
-
84880884703
-
-
Record Linkage in R. R package [computer program]. Version 0.3-4. Mainz, Germany
-
Sariyar M, Borg A. Record Linkage in R. R package [computer program]. Version 0.3-4. Mainz, Germany, 2011. http://cran.r-project.org/package=RecordLinkage
-
(2011)
-
-
Sariyar, M.1
Borg, A.2
-
37
-
-
84863549453
-
The RecordLinkage Package: Detecting Errors in Data
-
Sariyar M, Borg A. The RecordLinkage Package: Detecting Errors in Data. The R Journal 2010;2:61-7.
-
(2010)
The R Journal
, vol.2
, pp. 61-7
-
-
Sariyar, M.1
Borg, A.2
-
39
-
-
0030211964
-
Bagging predictors
-
Breiman L. Bagging predictors. Mach Learn 1996;24:123-40.
-
(1996)
Mach Learn
, vol.24
, pp. 123-40
-
-
Breiman, L.1
-
42
-
-
84970843768
-
Quartiles, quintiles, centiles, and other quantiles
-
Altman DG, Bland JM. Quartiles, quintiles, centiles, and other quantiles. BMJ 1994;309:996.
-
(1994)
BMJ
, vol.309
, pp. 996
-
-
Altman, D.G.1
Bland, J.M.2