-
1
-
-
0005540823
-
Modern information retrieval
-
Addison-Wesley Longman Publishing Co., Boston
-
Baeza-Yates RA, Ribeiro-Neto B. Modern information retrieval. Addison-Wesley Longman Publishing Co., Boston, 1999.
-
(1999)
-
-
Baeza-Yates, R.A.1
Ribeiro-Neto, B.2
-
2
-
-
84908414603
-
Statistical linkage keys: How effective are they?
-
In Available online at
-
Bass J. Statistical linkage keys: How effective are they? In Symposium on Health Data Linkage, Sydney, 2002. Available online at: http://www.publichealth.gov.au/symposium.html.
-
(2002)
Symposium on Health Data Linkage, Sydney
-
-
Bass, J.1
-
3
-
-
5444258997
-
A comparison of fast blocking methods for record linkage
-
In Washington DC
-
Baxter R, Christen P, Churches T. A comparison of fast blocking methods for record linkage. In Proceedings of ACM SIGKDD workshop on Data Cleaning, Record Linkage and Object Consolidation, pages 25-27, Washington DC, 2003.
-
(2003)
Proceedings of ACM SIGKDD Workshop on Data Cleaning, Record Linkage and Object Consolidation
, pp. 25-27
-
-
Baxter, R.1
Christen, P.2
Churches, T.3
-
5
-
-
34249831790
-
Auction algorithms for network flow problems: A tutorial introduction
-
Bertsekas DP. Auction algorithms for network flow problems: A tutorial introduction. Computational Optimization and Applications, 1:7-66, 1992.
-
(1992)
Computational Optimization and Applications
, vol.1
, pp. 7-66
-
-
Bertsekas, D.P.1
-
6
-
-
77952372966
-
Adaptive duplicate detection using learnable string similarity measures
-
In Washington DC
-
Bilenko M, Mooney RJ. Adaptive duplicate detection using learnable string similarity measures. In Proceedings of ACM SIGKDD, pages 39-48, Washington DC, 2003.
-
(2003)
Proceedings of ACM SIGKDD
, pp. 39-48
-
-
Bilenko, M.1
Mooney, R.J.2
-
8
-
-
0036990263
-
Probabilistic record linkage and a method to calculate the positive predictive value
-
Blakely T, Salmond C. Probabilistic record linkage and a method to calculate the positive predictive value. International Journal of Epidemiology, 31(6):1246-1252, 2002.
-
(2002)
International Journal of Epidemiology
, vol.31
, Issue.6
, pp. 1246-1252
-
-
Blakely, T.1
Salmond, C.2
-
9
-
-
33846428768
-
New South Wales mothers and babies 2001
-
Centre for Epidemiology and Research, NSW Department of Health
-
Centre for Epidemiology and Research, NSW Department of Health. New South Wales mothers and babies 2001. NSW Public Health Bull, 13:S-4, 2001.
-
(2001)
NSW Public Health Bull
, vol.13
-
-
-
10
-
-
1142279457
-
Robust and efficient fuzzy match for online data cleaning
-
In San Diego
-
Chaudhuri S, Ganjam K, Ganti V, Motwani R. Robust and efficient fuzzy match for online data cleaning. In Proceedings of ACM SIGMOD, pages 313-324, San Diego, 2003.
-
(2003)
Proceedings of ACM SIGMOD
, pp. 313-324
-
-
Chaudhuri, S.1
Ganjam, K.2
Ganti, V.3
Motwani, R.4
-
12
-
-
7444251738
-
Febrl - A parallel open source data linkage system
-
In Sydney
-
Christen P, Churches T, Hegland M. Febrl - a parallel open source data linkage system. In Proceedings of the 8th PAKDD, Springer LNAI 3056, pages 638-647, Sydney, 2004.
-
(2004)
Proceedings of the 8th PAKDD, Springer LNAI 3056
, pp. 638-647
-
-
Christen, P.1
Churches, T.2
Hegland, M.3
-
13
-
-
84884417241
-
Preparation of name and address data for record linkage using hidden Markov models
-
Available online at
-
Churches T, Christen P, Lim K, Zhu JX. Preparation of name and address data for record linkage using hidden Markov models. BioMed Central Medical Informatics and Decision Making, 2(9), 2002. Available online at: http://www.biomedcentral.com/1472-6947/2/9/.
-
(2002)
BioMed Central Medical Informatics and Decision Making
, vol.2
, Issue.9
-
-
Churches, T.1
Christen, P.2
Lim, K.3
Zhu, J.X.4
-
14
-
-
0032091575
-
Integration of heterogeneous databases without common domains using queries based on textual similarity
-
In Seattle
-
Cohen WW. Integration of heterogeneous databases without common domains using queries based on textual similarity. In Proceedings of ACM SIGMOD, pages 201-212, Seattle, 1998.
-
(1998)
Proceedings of ACM SIGMOD
, pp. 201-212
-
-
Cohen, W.W.1
-
16
-
-
0242540438
-
Learning to match and cluster large high-dimensional data sets for data integration
-
In Edmonton
-
Cohen WW, Richman J. Learning to match and cluster large high-dimensional data sets for data integration. In Proceedings of ACM SIGKDD, pages 475-480, Edmonton, 2002.
-
(2002)
Proceedings of ACM SIGKDD
, pp. 475-480
-
-
Cohen, W.W.1
Richman, J.2
-
17
-
-
0017918892
-
Foundations of probabilistic and utility-theoretic indexing
-
Cooper WS, Maron ME. Foundations of probabilistic and utility-theoretic indexing. Journal of the ACM, 25(1):67-80, 1978.
-
(1978)
Journal of the ACM
, vol.25
, Issue.1
, pp. 67-80
-
-
Cooper, W.S.1
Maron, M.E.2
-
19
-
-
0345438685
-
ROC Graphs: Notes and practical considerations for researchers
-
Technical Report HPL-2003-4, HP Laboratories, Palo Alto
-
Fawcett T. ROC Graphs: Notes and practical considerations for researchers. Technical Report HPL-2003-4, HP Laboratories, Palo Alto, 2004.
-
(2004)
-
-
Fawcett, T.1
-
21
-
-
0033891155
-
An extensible framework for data cleaning
-
Galhardas H, Florescu D, Shasha D, Simon E. An extensible framework for data cleaning. In Proceedings of ICDE, page 312, 2000.
-
(2000)
Proceedings of ICDE
, p. 312
-
-
Galhardas, H.1
Florescu, D.2
Shasha, D.3
Simon, E.4
-
22
-
-
1642332418
-
Methods for automatic record matching and linking and their use in national statistics
-
Technical Report National Statistics Methodology Series, no 25, National Statistics, London
-
Gill L. Methods for automatic record matching and linking and their use in national statistics. Technical Report National Statistics Methodology Series, no 25, National Statistics, London, 2001.
-
(2001)
-
-
Gill, L.1
-
23
-
-
0037198576
-
An empirical comparison of record linkage procedures
-
Gomatam S, Carter R, Ariet M, Mitchell G. An empirical comparison of record linkage procedures. Statistics in Medicine, 21(10):1485-1496, 2002.
-
(2002)
Statistics in Medicine
, vol.21
, Issue.10
, pp. 1485-1496
-
-
Gomatam, S.1
Carter, R.2
Ariet, M.3
Mitchell, G.4
-
26
-
-
84976856849
-
The merge/purge problem for large databases
-
In San Jose
-
Hernandez MA, Stolfo SJ. The merge/purge problem for large databases. In Proceedings of ACM SIGMOD, pages 127-138, San Jose, 1995.
-
(1995)
Proceedings of ACM SIGMOD
, pp. 127-138
-
-
Hernandez, M.A.1
Stolfo, S.J.2
-
27
-
-
0013331361
-
Real-world data is dirty: Data cleansing and the merge/purge problem
-
Hernandez MA, Stolfo SJ. Real-world data is dirty: Data cleansing and the merge/purge problem. Data Mining and Knowledge Discovery, 2(1):9-37, 1998.
-
(1998)
Data Mining and Knowledge Discovery
, vol.2
, Issue.1
, pp. 9-37
-
-
Hernandez, M.A.1
Stolfo, S.J.2
-
29
-
-
0034592786
-
IntelliClean: A knowledge-based intelligent data cleaner
-
In Boston
-
Lee ML, Ling TW, Low WL. IntelliClean: A knowledge-based intelligent data cleaner. In Proceedings of ACM SIGKDD, pages 290-294, Boston, 2000.
-
(2000)
Proceedings of ACM SIGKDD
, pp. 290-294
-
-
Lee, M.L.1
Ling, T.W.2
Low, W.L.3
-
32
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
In Boston
-
McCallum A, Nigam K, Ungar LH. Efficient clustering of high-dimensional data sets with application to reference matching. In Proceedings of ACM SIGKDD, pages 169-178, Boston, 2000.
-
(2000)
Proceedings of ACM SIGKDD
, pp. 169-178
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.H.3
-
33
-
-
85018108837
-
The field-matching problem: Algorithm and applications
-
In Portland
-
Monge A, Elkan C. The field-matching problem: Algorithm and applications. In Proceedings of ACM SIGKDD, pages 267-270, Portland, 1996.
-
(1996)
Proceedings of ACM SIGKDD
, pp. 267-270
-
-
Monge, A.1
Elkan, C.2
-
35
-
-
0001139918
-
Record linkage: Making maximum use of the discriminating power of identifying information
-
Newcombe HB, Kennedy JM. Record linkage: Making maximum use of the discriminating power of identifying information. Communications of the ACM, 5(11):563-566, 1962.
-
(1962)
Communications of the ACM
, vol.5
, Issue.11
, pp. 563-566
-
-
Newcombe, H.B.1
Kennedy, J.M.2
-
36
-
-
33745834241
-
UCI repository of machine learning databases
-
URL
-
Newman DJ, Hettich S, Blake CL, Merz CJ. UCI repository of machine learning databases, 1998. URL: http://www.ics.uci.edu/~mlearn/MLRepository.html.
-
(1998)
-
-
Newman, D.J.1
Hettich, S.2
Blake, C.L.3
Merz, C.J.4
-
37
-
-
24344459318
-
Approximate string comparison and its effect on an advanced record linkage system
-
Technical Report RR97/02, US Bureau of the Census
-
Porter E, Winkler WE. Approximate string comparison and its effect on an advanced record linkage system. Technical Report RR97/02, US Bureau of the Census, 1997.
-
(1997)
-
-
Porter, E.1
Winkler, W.E.2
-
38
-
-
0003766191
-
Data preparation for data mining
-
Morgan Kaufmann Publishers, San Francisco
-
Pyle D. Data preparation for data mining. Morgan Kaufmann Publishers, San Francisco, 1999.
-
(1999)
-
-
Pyle, D.1
-
39
-
-
0002490026
-
Data cleaning: Problems and current approaches
-
Rahm E, Do HH. Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin, 23(4):3-13, 2000.
-
(2000)
IEEE Data Engineering Bulletin
, vol.23
, Issue.4
, pp. 3-13
-
-
Rahm, E.1
Do, H.H.2
-
41
-
-
27144463192
-
On comparing classifiers: Pitfalls to avoid and a recommended approach
-
Salzberg S. On comparing classifiers: Pitfalls to avoid and a recommended approach. Data Mining and Knowledge Discovery, 1(3):317-328, 1997.
-
(1997)
Data Mining and Knowledge Discovery
, vol.1
, Issue.3
, pp. 317-328
-
-
Salzberg, S.1
-
42
-
-
0242456811
-
Interactive deduplication using active learning
-
In Edmonton
-
Sarawagi S, Bhamidipaty A. Interactive deduplication using active learning. In Proceedings of ACM SIGKDD, pages 269-278, Edmonton, 2002.
-
(2002)
Proceedings of ACM SIGKDD
, pp. 269-278
-
-
Sarawagi, S.1
Bhamidipaty, A.2
-
43
-
-
7444228338
-
The CRISP-DM model: The new blueprint for data mining
-
Shearer C. The CRISP-DM model: The new blueprint for data mining. Journal of Data Warehousing, 5(4):13-22, 2000.
-
(2000)
Journal of Data Warehousing
, vol.5
, Issue.4
, pp. 13-22
-
-
Shearer, C.1
-
44
-
-
0018743442
-
Accuracies of computer versus manual linkages of routine health records
-
Smith ME, Newcombe HB. Accuracies of computer versus manual linkages of routine health records. Methods of Information in Medicine, 18(2):89-97, 1979.
-
(1979)
Methods of Information in Medicine
, vol.18
, Issue.2
, pp. 89-97
-
-
Smith, M.E.1
Newcombe, H.B.2
-
45
-
-
0242456803
-
Learning domain-independent string transformation weights for high accuracy object identification
-
In Edmonton
-
Tejada S, Knoblock CA, Minton S. Learning domain-independent string transformation weights for high accuracy object identification. In Proceedings of ACM SIGKDD, pages 350-359, Edmonton, 2002.
-
(2002)
Proceedings of ACM SIGKDD
, pp. 350-359
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
46
-
-
33846411033
-
Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage
-
Technical Report RR00/05, US Bureau of the Census
-
Winkler WE. Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage. Technical Report RR00/05, US Bureau of the Census, 2000.
-
(2000)
-
-
Winkler, W.E.1
-
47
-
-
2942741943
-
Methods for record linkage and Bayesian networks
-
Technical Report RR2002/05, US Bureau of the Census
-
Winkler WE. Methods for record linkage and Bayesian networks. Technical Report RR2002/05, US Bureau of the Census, 2002.
-
(2002)
-
-
Winkler, W.E.1
-
48
-
-
33845615644
-
Overview of record linkage and current research directions
-
Technical Report RR2006/02, US Bureau of the Census
-
Winkler WE. Overview of record linkage and current research directions. Technical Report RR2006/02, US Bureau of the Census, 2006.
-
(2006)
-
-
Winkler, W.E.1
-
49
-
-
27544453079
-
An application of the Fellegi-Sunter model of record linkage to the 1990 U.S. decennial census
-
Technical Report RR91/09, US Bureau of the Census
-
Winkler WE, Thibaudeau Y. An application of the Fellegi-Sunter model of record linkage to the 1990 U.S. decennial census. Technical Report RR91/09, US Bureau of the Census, 1991.
-
(1991)
-
-
Winkler, W.E.1
Thibaudeau, Y.2
-
50
-
-
33845622202
-
BigMatch: A program for extracting probable matches from a large file for record linkage
-
Technical Report RRC2002/01, US Bureau of the Census
-
Yancey WE. BigMatch: A program for extracting probable matches from a large file for record linkage. Technical Report RRC2002/01, US Bureau of the Census, 2002.
-
(2002)
-
-
Yancey, W.E.1
-
51
-
-
21144446452
-
An adaptive string comparator for record linkage
-
Technical Report RR2004/02, US Bureau of the Census
-
Yancey WE. An adaptive string comparator for record linkage. Technical Report RR2004/02, US Bureau of the Census, 2004.
-
(2004)
-
-
Yancey, W.E.1
-
53
-
-
1342281224
-
Linking hospital discharge and death records - Accuracy and sources of bias
-
Zingmond DS, Ye Z, Ettner SL, Liu H. Linking hospital discharge and death records - accuracy and sources of bias. Journal of Clinical Epidemiology, 57:21-29, 2004.
-
(2004)
Journal of Clinical Epidemiology
, vol.57
, pp. 21-29
-
-
Zingmond, D.S.1
Ye, Z.2
Ettner, S.L.3
Liu, H.4