-
1
-
-
84920600570
-
Efficient record linkage using a double embedding scheme
-
Las Vegas
-
Adly, N.: Efficient record linkage using a double embedding scheme. In: DMIN, pp. 274-281. Las Vegas (2009)
-
(2009)
DMIN
, pp. 274-281
-
-
Adly, N.1
-
2
-
-
63349112872
-
Managing and Mining Uncertain Data
-
Springer
-
Aggarwal, C.C.: Managing and Mining Uncertain Data, Advances in Database Systems, vol. 35. Springer (2009)
-
(2009)
Advances in Database Systems
, vol.35
-
-
Aggarwal, C.C.1
-
3
-
-
0034592763
-
The IGrid index: Reversing the dimensionality curse for similarity indexing in high dimensional space
-
Boston
-
Aggarwal, C.C., Yu, P.S.: The IGrid index: Reversing the dimensionality curse for similarity indexing in high dimensional space. In: ACM SIGKDD, pp. 119-129. Boston (2000)
-
(2000)
ACM SIGKDD
, pp. 119-129
-
-
Aggarwal, C.C.1
Yu, P.S.2
-
4
-
-
80052799499
-
Privacy-preserving data mining: Models and algorithms
-
Springer
-
Aggarwal, C.C., Yu, P.S.: Privacy-preserving data mining: models and algorithms, Advances in Database Systems, vol. 34. Springer (2008)
-
(2008)
Advances in Database Systems
, vol.34
-
-
Aggarwal, C.C.1
Yu, P.S.2
-
5
-
-
12244298488
-
Mining reference tables for automatic text segmentation
-
Seattle
-
Agichtein, E., Ganti, V.: Mining reference tables for automatic text segmentation. In: ACM SIGKDD, pp. 20-29. Seattle (2004)
-
(2004)
ACM SIGKDD
, pp. 20-29
-
-
Agichtein, E.1
Ganti, V.2
-
6
-
-
1142303699
-
Information sharing across private databases
-
San Diego
-
Agrawal, R., Evfimievski, A., Srikant, R.: Information sharing across private databases. In: ACM SIGMOD, pp. 86-97. San Diego (2003)
-
(2003)
ACM SIGMOD
, pp. 86-97
-
-
Agrawal, R.1
Evfimievski, A.2
Srikant, R.3
-
7
-
-
33845363891
-
A fast linkage detection scheme for multi-source information integration
-
Tokyo
-
Aizawa, A., Oyama, K.: A fast linkage detection scheme for multi-source information integration. In: WIRI, pp. 30-39. Tokyo (2005)
-
(2005)
WIRI
, pp. 30-39
-
-
Aizawa, A.1
Oyama, K.2
-
8
-
-
77952259855
-
Blocking-aware private record linkage
-
Al-Lawati, A., Lee, D., McDaniel, P.: Blocking-aware private record linkage. In: International Workshop on Information Quality in Information Systems, pp. 59-68 (2005)
-
(2005)
International Workshop on Information Quality in Information Systems
, pp. 59-68
-
-
Al-Lawati, A.1
Lee, D.2
McDaniel, P.3
-
9
-
-
85048686662
-
Interstate voter registration database matching: The Oregon-Washington 2008 pilot project
-
USENIX Association
-
Alvarez, R., Jonas, J., Winkler, W., Wright, R.: Interstate voter registration database matching: the Oregon-Washington 2008 pilot project. In: Workshop on Trustworthy Elections, pp. 17-17. USENIX Association (2009)
-
(2009)
Workshop on Trustworthy Elections
, pp. 17-17
-
-
Alvarez, R.1
Jonas, J.2
Winkler, W.3
Wright, R.4
-
10
-
-
45149118041
-
Identity theft
-
Anderson, K., Durbin, E., Salinger, M.: Identity theft. The Journal of Economic Perspectives 22(2), 171-192 (2008)
-
(2008)
The Journal of Economic Perspectives
, vol.22
, Issue.2
, pp. 171-192
-
-
Anderson, K.1
Durbin, E.2
Salinger, M.3
-
11
-
-
77954717287
-
On active learning of record matching packages
-
Indianapolis
-
Arasu, A., Götz, M., Kaushik, R.: On active learning of record matching packages. In: ACM SIGMOD, pp. 783-794. Indianapolis (2010)
-
(2010)
ACM SIGMOD
, pp. 783-794
-
-
Arasu, A.1
Götz, M.2
Kaushik, R.3
-
12
-
-
70849095483
-
A grammar-based entity representation framework for data cleaning
-
Providence, Rhode Island
-
Arasu, A., Kaushik, R.: A grammar-based entity representation framework for data cleaning. In: ACM SIGMOD, pp. 233-244. Providence, Rhode Island (2009)
-
(2009)
ACM SIGMOD
, pp. 233-244
-
-
Arasu, A.1
Kaushik, R.2
-
14
-
-
1642327704
-
Secure and private sequence comparisons
-
ACM
-
Atallah, M., Kerschbaum, F., Du, W.: Secure and private sequence comparisons. In: Workshop on Privacy in the Electronic Society, pp. 39-44. ACM (2003)
-
(2003)
Workshop on Privacy in the Electronic Society
, pp. 39-44
-
-
Atallah, M.1
Kerschbaum, F.2
Du, W.3
-
17
-
-
84892966962
-
A privacy-preserving framework for accuracy and completeness quality assessment
-
Barone, D., Maurino, A., Stella, F., Batini, C.: A privacy-preserving framework for accuracy and completeness quality assessment. Emerging Paradigms in Informatics, Systems and Communication p. 83 (2009)
-
(2009)
Emerging Paradigms in Informatics, Systems and Communication
, pp. 83
-
-
Barone, D.1
Maurino, A.2
Stella, F.3
Batini, C.4
-
18
-
-
84958759968
-
String matching with metric trees using an approximate distance
-
Lisbon, Portugal
-
Bartolini, I., Ciaccia, P., Patella, M.: String matching with metric trees using an approximate distance. In: String Processing and Information Retrieval, LNCS 2476, pp. 271-283. Lisbon, Portugal (2002)
-
(2002)
String Processing and Information Retrieval, LNCS 2476
, pp. 271-283
-
-
Bartolini, I.1
Ciaccia, P.2
Patella, M.3
-
20
-
-
5444258997
-
A comparison of fast blocking methods for record linkage
-
Washington DC
-
Baxter, R., Christen, P., Churches, T.: A comparison of fast blocking methods for record linkage. In: ACM SIGKDD Workshop on Data Cleaning, Record Linkage and Object Consolidation, pp. 25-27. Washington DC (2003)
-
(2003)
ACM SIGKDD Workshop on Data Cleaning, Record Linkage and Object Consolidation
, pp. 25-27
-
-
Baxter, R.1
Christen, P.2
Churches, T.3
-
21
-
-
35348849154
-
Scaling up all pairs similarity search
-
Banff, Canada
-
Bayardo, R., Ma, Y., Srikant, R.: Scaling up all pairs similarity search. In: WWW, pp. 131-140. Banff, Canada (2007)
-
(2007)
WWW
, pp. 131-140
-
-
Bayardo, R.1
Ma, Y.2
Srikant, R.3
-
22
-
-
67649641448
-
Space-constrained gram-based indexing for efficient approximate string search
-
Shanghai
-
Behm, A., Ji, S., Li, C., Lu, J.: Space-constrained gram-based indexing for efficient approximate string search. In: IEEE ICDE, pp. 604-615. Shanghai (2009)
-
(2009)
IEEE ICDE
, pp. 604-615
-
-
Behm, A.1
Ji, S.2
Li, C.3
Lu, J.4
-
25
-
-
34848900466
-
D-Swoosh: A family of algorithms for generic, distributed entity resolution
-
Benjelloun, O., Garcia-Molina, H., Gong, H., Kawai, H., Larson, T., Menestrina, D., Thavisomboon, S.: D-Swoosh: A family of algorithms for generic, distributed entity resolution. In: International Conference on Distributed Computing Systems, pp. 37-37 (2007)
-
(2007)
International Conference on Distributed Computing Systems
, pp. 37-37
-
-
Benjelloun, O.1
Garcia-Molina, H.2
Gong, H.3
Kawai, H.4
Larson, T.5
Menestrina, D.6
Thavisomboon, S.7
-
26
-
-
58149472338
-
Swoosh: A generic approach to entity resolution
-
Benjelloun, O., Garcia-Molina, H., Menestrina, D., Su, Q., Whang, S., Widom, J.: Swoosh: a generic approach to entity resolution. The VLDB Journal 18(1), 255-276 (2009)
-
(2009)
The VLDB Journal
, vol.18
, Issue.1
, pp. 255-276
-
-
Benjelloun, O.1
Garcia-Molina, H.2
Menestrina, D.3
Su, Q.4
Whang, S.5
Widom, J.6
-
27
-
-
84964528874
-
A survey of longest common subsequence algorithms
-
A Coruña, Spain
-
Bergroth, L., Hakonen, H., Raita, T.: A survey of longest common subsequence algorithms. In: String Processing and Information Retrieval, pp. 39-48. A Coruña, Spain (2000)
-
(2000)
String Processing and Information Retrieval
, pp. 39-48
-
-
Bergroth, L.1
Hakonen, H.2
Raita, T.3
-
28
-
-
77955134730
-
Scalable probabilistic similarity ranking in uncertain databases
-
Bernecker, T., Kriegel, H.P., Mamoulis, N., Renz, M., Zuefle, A.: Scalable probabilistic similarity ranking in uncertain databases. IEEE Transactions on Knowledge and Data Engineering 22(9), 1234-1246 (2010)
-
(2010)
IEEE Transactions on Knowledge and Data Engineering
, vol.22
, Issue.9
, pp. 1234-1246
-
-
Bernecker, T.1
Kriegel, H.P.2
Mamoulis, N.3
Renz, M.4
Zuefle, A.5
-
30
-
-
34249831790
-
Auction algorithms for network flow problems: A tutorial introduction
-
Bertsekas, D.P.: Auction algorithms for network flow problems: A tutorial introduction. Computational Optimization and Applications 1, 7-66 (1992)
-
(1992)
Computational Optimization and Applications
, vol.1
, pp. 7-66
-
-
Bertsekas, D.P.1
-
33
-
-
85089829325
-
Adaptive product normalization: Using online learning for record linkage in comparison shopping
-
Houston
-
Bilenko, M., Basu, S., Sahami, M.: Adaptive product normalization: Using online learning for record linkage in comparison shopping. In: IEEE ICDM, pp. 58-65. Houston (2005)
-
(2005)
IEEE ICDM
, pp. 58-65
-
-
Bilenko, M.1
Basu, S.2
Sahami, M.3
-
34
-
-
84878049861
-
Adaptive blocking: Learning to scale up record linkage
-
Hong Kong
-
Bilenko, M., Kamath, B., Mooney, R.J.: Adaptive blocking: Learning to scale up record linkage. In: IEEE ICDM, pp. 87-96. Hong Kong (2006)
-
(2006)
IEEE ICDM
, pp. 87-96
-
-
Bilenko, M.1
Kamath, B.2
Mooney, R.J.3
-
35
-
-
77952372966
-
Adaptive duplicate detection using learnable string similarity measures
-
Washington DC
-
Bilenko, M., Mooney, R.J.: Adaptive duplicate detection using learnable string similarity measures. In: ACM SIGKDD, pp. 39-48. Washington DC (2003)
-
(2003)
ACM SIGKDD
, pp. 39-48
-
-
Bilenko, M.1
Mooney, R.J.2
-
36
-
-
39049123817
-
D-dupe: An interactive tool for entity resolution in social networks
-
Bilgic, M., Licamele, L., Getoor, L., Shneiderman, B.: D-dupe: An interactive tool for entity resolution in social networks. In: IEEE Symposium on Visual Analytics, Science and Technology, pp. 43-50 (2006)
-
(2006)
IEEE Symposium on Visual Analytics, Science and Technology
, pp. 43-50
-
-
Bilgic, M.1
Licamele, L.2
Getoor, L.3
Shneiderman, B.4
-
37
-
-
0036990263
-
Probabilistic record linkage and a method to calculate the positive predictive value
-
Blakely, T., Salmond, C.: Probabilistic record linkage and a method to calculate the positive predictive value. International Journal of Epidemiology 31(6), 1246-1252 (2002)
-
(2002)
International Journal of Epidemiology
, vol.31
, Issue.6
, pp. 1246-1252
-
-
Blakely, T.1
Salmond, C.2
-
39
-
-
0014814325
-
Space/time trade-offs in hash coding with allowable errors
-
Bloom, B.: Space/time trade-offs in hash coding with allowable errors. Communications of the ACM 13(7), 422-426 (1970)
-
(1970)
Communications of the ACM
, vol.13
, Issue.7
, pp. 422-426
-
-
Bloom, B.1
-
40
-
-
84989573853
-
Getty’s Synoname™ and its cousins: A survey of applications of personal name-matching algorithms
-
Borgman, C.L., Siegfried, S.L.: Getty’s Synoname™ and its cousins: A survey of applications of personal name-matching algorithms. Journal of the American Society for Information Science 43(7), 459-476 (1992)
-
(1992)
Journal of the American Society for Information Science
, vol.43
, Issue.7
, pp. 459-476
-
-
Borgman, C.L.1
Siegfried, S.L.2
-
41
-
-
0040748315
-
Automatic segmentation of text into structured records
-
Borkar, V., Deshmukh, K., Sarawagi, S.: Automatic segmentation of text into structured records. ACM SIGMOD Record 30(2), 175-186 (2001)
-
(2001)
ACM SIGMOD Record
, vol.30
, Issue.2
, pp. 175-186
-
-
Borkar, V.1
Deshmukh, K.2
Sarawagi, S.3
-
42
-
-
0003802343
-
-
Chapman and Hall/CRC
-
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and regression trees. Chapman and Hall/CRC (1984)
-
(1984)
Classification and regression trees
-
-
Breiman, L.1
Friedman, J.2
Olshen, R.3
Stone, C.4
-
43
-
-
18744398431
-
Efficient query evaluation using a two-level retrieval process
-
New Orleans
-
Broder, A., Carmel, D., Herscovici, M., Soffer, A., Zien, J.: Efficient query evaluation using a two-level retrieval process. In: ACM CIKM, pp. 426-434. New Orleans (2003)
-
(2003)
ACM CIKM
, pp. 426-434
-
-
Broder, A.1
Carmel, D.2
Herscovici, M.3
Soffer, A.4
Zien, J.5
-
44
-
-
40449103107
-
Public good through data linkage: Measuring research outputs from the Western Australian data linkage system
-
Brook, E., Rosman, D., Holman, C.: Public good through data linkage: measuring research outputs from the Western Australian data linkage system. Australian and New Zealand Journal of Public Health 32(1), 19-23 (2008)
-
(2008)
Australian and New Zealand Journal of Public Health
, vol.32
, Issue.1
, pp. 19-23
-
-
Brook, E.1
Rosman, D.2
Holman, C.3
-
45
-
-
39049194535
-
Reverse geocoding: Concerns about patient confidentiality in the display of geospatial health data
-
American Medical Informatics Association
-
Brownstein, J.S., Cassa, C., Kohane, I.S., Mandl, K.D.: Reverse geocoding: Concerns about patient confidentiality in the display of geospatial health data. In: AMIA Annual Symposium Proceedings, p. 905. American Medical Informatics Association (2005)
-
(2005)
AMIA Annual Symposium Proceedings
, pp. 905
-
-
Brownstein, J.S.1
Cassa, C.2
Kohane, I.S.3
Mandl, K.D.4
-
46
-
-
33750089757
-
No place to hide-reverse identification of patients from published maps
-
Brownstein, J.S., Cassa, C., Mandl, K.D.: No place to hide-reverse identification of patients from published maps. New England Journal of Medicine 355(16), 1741-1742 (2006)
-
(2006)
New England Journal of Medicine
, vol.355
, Issue.16
, pp. 1741-1742
-
-
Brownstein, J.S.1
Cassa, C.2
Mandl, K.D.3
-
47
-
-
39049148128
-
Record linkage software in the public domain: A comparison of Link Plus, The Link King, and a basic deterministic algorithm
-
Campbell, K., Deck, D., Krupski, A.: Record linkage software in the public domain: a comparison of Link Plus, The Link King, and a basic deterministic algorithm. Health Informatics Journal 14(1), 5 (2008)
-
(2008)
Health Informatics Journal
, vol.14
, Issue.1
, pp. 5
-
-
Campbell, K.1
Deck, D.2
Krupski, A.3
-
49
-
-
51849162587
-
Common pitfalls using the normalized compression distance: What to watch out for in a compressor
-
Cebrián, M., Alfonseca, M., Ortega, A.: Common pitfalls using the normalized compression distance: What to watch out for in a compressor. Communications in Information and Systems 5(4), 367-384 (2005)
-
(2005)
Communications in Information and Systems
, vol.5
, Issue.4
, pp. 367-384
-
-
Cebrián, M.1
Alfonseca, M.2
Ortega, A.3
-
52
-
-
26444550791
-
Robust identification of fuzzy duplicates
-
Tokyo
-
Chaudhuri, S., Ganti, V., Motwani, R.: Robust identification of fuzzy duplicates. In: IEEE ICDE, pp. 865-876. Tokyo (2005)
-
(2005)
IEEE ICDE
, pp. 865-876
-
-
Chaudhuri, S.1
Ganti, V.2
Motwani, R.3
-
53
-
-
67650262114
-
Privacy advisors for personal information management
-
Seattle, Washington
-
Chaytor, R., Brown, E., Wareham, T.: Privacy advisors for personal information management. In: SIGIR Workshop on Personal Information Management, pp. 28-31. Seattle, Washington (2006)
-
(2006)
SIGIR Workshop on Personal Information Management
, pp. 28-31
-
-
Chaytor, R.1
Brown, E.2
Wareham, T.3
-
54
-
-
1942500388
-
Crime data mining: A general framework and some examples
-
Chen, H., Chung, W., Xu, J., Wang, G., Qin, Y., Chau, M.: Crime data mining: a general framework and some examples. IEEE Computer 37(4), 50-56 (2004)
-
(2004)
IEEE Computer
, vol.37
, Issue.4
, pp. 50-56
-
-
Chen, H.1
Chung, W.2
Xu, J.3
Wang, G.4
Qin, Y.5
Chau, M.6
-
55
-
-
0032201622
-
Private information retrieval
-
Chor, B., Kushilevitz, E., Goldreich, O., Sudan, M.: Private information retrieval. Journal of the ACM (JACM) 45(6), 965-981 (1998)
-
(1998)
Journal of the ACM (JACM)
, vol.45
, Issue.6
, pp. 965-981
-
-
Chor, B.1
Kushilevitz, E.2
Goldreich, O.3
Sudan, M.4
-
56
-
-
26444478506
-
Probabilistic data generation for deduplication and data linkage
-
Brisbane
-
Christen, P.: Probabilistic data generation for deduplication and data linkage. In: IDEAL, Springer LNCS, vol. 3578, pp. 109-116. Brisbane (2005)
-
(2005)
IDEAL, Springer LNCS, vol. 3578
, pp. 109-116
-
-
Christen, P.1
-
57
-
-
78449293191
-
A comparison of personal name matching: Techniques and practical issues
-
Hong Kong
-
Christen, P.: A comparison of personal name matching: Techniques and practical issues. In: Workshop on Mining Complex Data, held at IEEE ICDM. Hong Kong (2006)
-
(2006)
Workshop on Mining Complex Data, held at IEEE ICDM
-
-
Christen, P.1
-
58
-
-
67650258952
-
Privacy-preserving data linkage and geocoding: Current approaches and research directions
-
Hong Kong
-
Christen, P.: Privacy-preserving data linkage and geocoding: Current approaches and research directions. In: Workshop on Privacy Aspects of Data Mining, held at IEEE ICDM. Hong Kong (2006)
-
(2006)
Workshop on Privacy Aspects of Data Mining, held at IEEE ICDM
-
-
Christen, P.1
-
59
-
-
65449139594
-
Automatic record linkage using seeded nearest neighbour and support vector machine classification
-
Las Vegas
-
Christen, P.: Automatic record linkage using seeded nearest neighbour and support vector machine classification. In: ACM SIGKDD, pp. 151-159. Las Vegas (2008)
-
(2008)
ACM SIGKDD
, pp. 151-159
-
-
Christen, P.1
-
60
-
-
44649093306
-
Automatic training example selection for scalable unsupervised record linkage
-
Osaka
-
Christen, P.: Automatic training example selection for scalable unsupervised record linkage. In: PAKDD, Springer LNAI, vol. 5012, pp. 511-518. Osaka (2008)
-
(2008)
PAKDD, Springer LNAI, vol. 5012
, pp. 511-518
-
-
Christen, P.1
-
61
-
-
65449178105
-
Febrl: An open source data cleaning, deduplication and record linkage system with a graphical user interface
-
Las Vegas
-
Christen, P.: Febrl: An open source data cleaning, deduplication and record linkage system with a graphical user interface. In: ACM SIGKDD, pp. 1065-1068. Las Vegas (2008)
-
(2008)
ACM SIGKDD
, pp. 1065-1068
-
-
Christen, P.1
-
62
-
-
74049138802
-
Development and user experiences of an open source data cleaning, deduplication and record linkage system
-
Christen, P.: Development and user experiences of an open source data cleaning, deduplication and record linkage system. SIGKDD Explorations 11(1), 39-48 (2009)
-
(2009)
SIGKDD Explorations
, vol.11
, Issue.1
, pp. 39-48
-
-
Christen, P.1
-
64
-
-
84857183817
-
A survey of indexing techniques for scalable record linkage and deduplication
-
Christen, P.: A survey of indexing techniques for scalable record linkage and deduplication. IEEE Transactions on Knowledge and Data Engineering X(Y) (2011)
-
(2011)
IEEE Transactions on Knowledge and Data Engineering
, vol.X
, Issue.Y
-
-
Christen, P.1
-
65
-
-
84857149294
-
Automated probabilistic address standardisation and verification
-
Sydney
-
Christen, P., Belacic, D.: Automated probabilistic address standardisation and verification. In: AusDM, pp. 53-67. Sydney (2005)
-
(2005)
AusDM
, pp. 53-67
-
-
Christen, P.1
Belacic, D.2
-
66
-
-
7444251738
-
Febrl-A parallel open source data linkage system
-
Sydney
-
Christen, P., Churches, T., Hegland, M.: Febrl-A parallel open source data linkage system. In: PAKDD, Springer LNAI, vol. 3056, pp. 638-647. Sydney (2004)
-
(2004)
PAKDD, Springer LNAI, vol. 3056
, pp. 638-647
-
-
Christen, P.1
Churches, T.2
Hegland, M.3
-
67
-
-
85031020894
-
A probabilistic geocoding system based on a national address file
-
Cairns
-
Christen, P., Churches, T., Willmore, A.: A probabilistic geocoding system based on a national address file. In: AusDM. Cairns (2004)
-
(2004)
AusDM
-
-
Christen, P.1
Churches, T.2
Willmore, A.3
-
69
-
-
67650216370
-
Towards scalable real-time entity resolution using a similarity-aware inverted index approach
-
Glenelg, Australia
-
Christen, P., Gayler, R.: Towards scalable real-time entity resolution using a similarity-aware inverted index approach. In: AusDM, CRPIT, vol. 87, pp. 51-60. Glenelg, Australia (2008)
-
(2008)
AusDM, CRPIT, vol. 87
, pp. 51-60
-
-
Christen, P.1
Gayler, R.2
-
70
-
-
74549185155
-
Similarity-aware indexing for real-time entity resolution
-
Hong Kong
-
Christen, P., Gayler, R., Hawking, D.: Similarity-aware indexing for real-time entity resolution. In: ACM CIKM, pp. 1565-1568. Hong Kong (2009)
-
(2009)
ACM CIKM
, pp. 1565-1568
-
-
Christen, P.1
Gayler, R.2
Hawking, D.3
-
71
-
-
33846428121
-
Quality and complexity measures for data linkage and deduplication
-
F. Guillet, H. Hamilton (eds.), Springer
-
Christen, P., Goiser, K.: Quality and complexity measures for data linkage and deduplication. In: F. Guillet, H. Hamilton (eds.) Quality Measures in Data Mining, Studies in Computational Intelligence, vol. 43, pp. 127-151. Springer (2007)
-
(2007)
Quality Measures in Data Mining, Studies in Computational Intelligence, vol. 43
, pp. 127-151
-
-
Christen, P.1
Goiser, K.2
-
72
-
-
67650700151
-
Accurate synthetic generation of realistic personal information
-
Bangkok, Thailand
-
Christen, P., Pudjijono, A.: Accurate synthetic generation of realistic personal information. In: PAKDD, Springer LNAI, vol. 5476, pp. 507-514. Bangkok, Thailand (2009)
-
(2009)
PAKDD, Springer LNAI, vol. 5476
, pp. 507-514
-
-
Christen, P.1
Pudjijono, A.2
-
73
-
-
0642275698
-
A proposed architecture and method of operation for improving the protection of privacy and confidentiality in disease registers
-
Churches, T.: A proposed architecture and method of operation for improving the protection of privacy and confidentiality in disease registers. BioMed Central Medical Research Methodology 3(1) (2003)
-
(2003)
BioMed Central Medical Research Methodology
, vol.3
, Issue.1
-
-
Churches, T.1
-
74
-
-
7444258692
-
Blind data linkage using n-gram similarity comparisons
-
Sydney
-
Churches, T., Christen, P.: Blind data linkage using n-gram similarity comparisons. In: PAKDD, Springer LNAI, vol. 3056, pp. 121-126. Sydney (2004)
-
(2004)
PAKDD, Springer LNAI, vol. 3056
, pp. 121-126
-
-
Churches, T.1
Christen, P.2
-
76
-
-
84884417241
-
Preparation of name and address data for record linkage using hidden Markov models
-
Churches, T., Christen, P., Lim, K., Zhu, J.X.: Preparation of name and address data for record linkage using hidden Markov models. BioMed Central Medical Informatics and Decision Making 2(9) (2002)
-
(2002)
BioMed Central Medical Informatics and Decision Making
, vol.2
, Issue.9
-
-
Churches, T.1
Christen, P.2
Lim, K.3
Zhu, J.X.4
-
78
-
-
4344570142
-
Practical introduction to record linkage for injury research
-
Clark, D.E.: Practical introduction to record linkage for injury research. Injury Prevention 10, 186-191 (2004)
-
(2004)
Injury Prevention
, vol.10
, pp. 186-191
-
-
Clark, D.E.1
-
79
-
-
34748816024
-
Privacy-preserving data integration and sharing
-
Clifton, C., Kantarcioglu, M., Doan, A., Schadow, G., Vaidya, J., Elmagarmid, A., Suciu, D.: Privacy-preserving data integration and sharing. In: ACM SIGMOD workshop on Research issues in Data Mining and Knowledge Discovery, pp. 19-26 (2004)
-
(2004)
ACM SIGMOD workshop on Research issues in Data Mining and Knowledge Discovery
, pp. 19-26
-
-
Clifton, C.1
Kantarcioglu, M.2
Doan, A.3
Schadow, G.4
Vaidya, J.5
Elmagarmid, A.6
Suciu, D.7
-
80
-
-
0035452641
-
Efficient data reconciliation
-
Cochinwala, M., Kurien, V., Lalk, G., Shasha, D.: Efficient data reconciliation. Information Sciences 137(1-4), 1-15 (2001)
-
(2001)
Information Sciences
, vol.137
, Issue.1-4
, pp. 1-15
-
-
Cochinwala, M.1
Kurien, V.2
Lalk, G.3
Shasha, D.4
-
81
-
-
0010355394
-
The WHIRL approach to data integration
-
Cohen, W.: The WHIRL approach to data integration. IEEE Intelligent Systems 13(3), 20-24 (1998)
-
(1998)
IEEE Intelligent Systems
, vol.13
, Issue.3
, pp. 20-24
-
-
Cohen, W.1
-
82
-
-
0000666461
-
Data integration using similarity joins and a word-based information representation language
-
Cohen, W.: Data integration using similarity joins and a word-based information representation language. ACM Transactions on Information Systems 18(3), 288-321 (2000)
-
(2000)
ACM Transactions on Information Systems
, vol.18
, Issue.3
, pp. 288-321
-
-
Cohen, W.1
-
83
-
-
0032091575
-
Integration of heterogeneous databases without common domains using queries based on textual similarity
-
Seattle
-
Cohen, W.: Integration of heterogeneous databases without common domains using queries based on textual similarity. In: ACM SIGMOD, pp. 201-212. Seattle (1998)
-
(1998)
ACM SIGMOD
, pp. 201-212
-
-
Cohen, W.1
-
84
-
-
11144240583
-
A comparison of string distance metrics for name-matching tasks
-
Acapulco
-
Cohen, W., Ravikumar, P., Fienberg, S.: A comparison of string distance metrics for name-matching tasks. In: Workshop on Information Integration on the Web, held at IJCAI, pp. 73-78. Acapulco (2003)
-
(2003)
Workshop on Information Integration on the Web, held at IJCAI
, pp. 73-78
-
-
Cohen, W.1
Ravikumar, P.2
Fienberg, S.3
-
85
-
-
0242540438
-
Learning to match and cluster large high-dimensional data sets for data integration
-
Edmonton
-
Cohen, W., Richman, J.: Learning to match and cluster large high-dimensional data sets for data integration. In: ACM SIGKDD, pp. 475-480. Edmonton (2002)
-
(2002)
ACM SIGKDD
, pp. 475-480
-
-
Cohen, W.1
Richman, J.2
-
87
-
-
33750469045
-
Spatial confidentiality and GIS: Re-engineering mortality locations from published maps about Hurricane Katrina
-
Curtis, A.J., Mills, J.W., Leitner, M.: Spatial confidentiality and GIS: Re-engineering mortality locations from published maps about Hurricane Katrina. International Journal of Health Geographics 5(1), 44-56 (2006)
-
(2006)
International Journal of Health Geographics
, vol.5
, Issue.1
, pp. 44-56
-
-
Curtis, A.J.1
Mills, J.W.2
Leitner, M.3
-
88
-
-
79959294918
-
A fast approach for parallel deduplication on multicore processors
-
Dal Bianco, G., Galante, R., Heuser, C.: A fast approach for parallel deduplication on multicore processors. In: ACM Symposium on Applied Computing, pp. 1027-1032 (2011)
-
(2011)
ACM Symposium on Applied Computing
, pp. 1027-1032
-
-
Dal Bianco, G.1
Galante, R.2
Heuser, C.3
-
89
-
-
84941869105
-
A technique for computer detection and correction of spelling errors
-
Damerau, F.J.: A technique for computer detection and correction of spelling errors. Communications of the ACM 7(3), 171-176 (1964)
-
(1964)
Communications of the ACM
, vol.7
, Issue.3
, pp. 171-176
-
-
Damerau, F.J.1
-
91
-
-
79251527493
-
Efficient techniques for online record linkage
-
Dey, D., Mookerjee, V., Liu, D.: Efficient techniques for online record linkage. IEEE Transactions on Knowledge and Data Engineering 23(3), 373-387 (2010)
-
(2010)
IEEE Transactions on Knowledge and Data Engineering
, vol.23
, Issue.3
, pp. 373-387
-
-
Dey, D.1
Mookerjee, V.2
Liu, D.3
-
92
-
-
0348062787
-
Disclosure risk assessment in statistical microdata protection via advanced record linkage
-
Domingo-Ferrer, J., Torra, V.: Disclosure risk assessment in statistical microdata protection via advanced record linkage. Statistics and Computing 13(4), 343-354 (2003)
-
(2003)
Statistics and Computing
, vol.13
, Issue.4
, pp. 343-354
-
-
Domingo-Ferrer, J.1
Torra, V.2
-
93
-
-
29844452555
-
Reference reconciliation in complex information spaces
-
Baltimore
-
Dong, X., Halevy, A., Madhavan, J.: Reference reconciliation in complex information spaces. In: ACM SIGMOD, pp. 85-96. Baltimore (2005)
-
(2005)
ACM SIGMOD
, pp. 85-96
-
-
Dong, X.1
Halevy, A.2
Madhavan, J.3
-
94
-
-
84888417083
-
A comparison and generalization of blocking and windowing algorithms for duplicate detection
-
Lyon
-
Draisbach, U., Naumann, F.: A comparison and generalization of blocking and windowing algorithms for duplicate detection. In: Workshop on Quality in Databases, held at VLDB. Lyon (2009)
-
(2009)
Workshop on Quality in Databases, held at VLDB
-
-
Draisbach, U.1
Naumann, F.2
-
98
-
-
84964941330
-
Private medical record linkage with approximate matching
-
American Medical Informatics Association
-
Durham, E., Xue, Y., Kantarcioglu, M., Malin, B.: Private medical record linkage with approximate matching. In: AMIA Annual Symposium Proceedings, p. 182. American Medical Informatics Association (2010)
-
(2010)
AMIA Annual Symposium Proceedings
, pp. 182
-
-
Durham, E.1
Xue, Y.2
Kantarcioglu, M.3
Malin, B.4
-
99
-
-
84870509565
-
-
Information Fusion In Press
-
Durham, E., Xue, Y., Kantarcioglu, M., Malin, B.: Quantifying the correctness, computational complexity, and security of privacy-preserving string comparators for record linkage. Information Fusion In Press (2011)
-
(2011)
Quantifying the correctness, computational complexity, and security of privacy-preserving string comparators for record linkage
-
-
Durham, E.1
Xue, Y.2
Kantarcioglu, M.3
Malin, B.4
-
100
-
-
84874780558
-
-
Ph.D. thesis, Faculty of the Graduate School of Vanderbilt University, Nashville, TN
-
Durham, E.: A framework for accurate, efficient private record linkage. Ph.D. thesis, Faculty of the Graduate School of Vanderbilt University, Nashville, TN (2012)
-
(2012)
A framework for accurate, efficient private record linkage
-
-
Durham, E.1
-
102
-
-
0036203458
-
TAILOR: A record linkage toolbox
-
San Jose
-
Elfeky, M.G., Verykios, V., Elmagarmid, A.K.: TAILOR: A record linkage toolbox. In: IEEE ICDE, pp. 17-28. San Jose (2002)
-
(2002)
IEEE ICDE
, pp. 17-28
-
-
Elfeky, M.G.1
Verykios, V.2
Elmagarmid, A.K.3
-
103
-
-
33845667955
-
Duplicate record detection: A survey
-
Elmagarmid, A.K., Ipeirotis, P.G., Verykios, V.: Duplicate record detection: A survey. IEEE Transactions on Knowledge and Data Engineering 19(1), 1-16 (2007)
-
(2007)
IEEE Transactions on Knowledge and Data Engineering
, vol.19
, Issue.1
, pp. 1-16
-
-
Elmagarmid, A.K.1
Ipeirotis, P.G.2
Verykios, V.3
-
104
-
-
0030150177
-
Comparing information without leaking it
-
Fagin, R., Naor, M., Winkler, P.: Comparing information without leaking it. Communications of the ACM 39(5), 77-85 (1996)
-
(1996)
Communications of the ACM
, vol.39
, Issue.5
, pp. 77-85
-
-
Fagin, R.1
Naor, M.2
Winkler, P.3
-
105
-
-
84976803260
-
Fastmap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets
-
San Jose
-
Faloutsos, C., Lin, K.I.: Fastmap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. In: ACM SIGMOD, pp. 163-174. San Jose (1995)
-
(1995)
ACM SIGMOD
, pp. 163-174
-
-
Faloutsos, C.1
Lin, K.I.2
-
110
-
-
33748542906
-
Privacy and confidentiality in an e-commerce world: Data mining, data warehousing, matching and disclosure limitation
-
Fienberg, S.: Privacy and confidentiality in an e-commerce world: Data mining, data warehousing, matching and disclosure limitation. Statistical Science 21(2), 143-154 (2006)
-
(2006)
Statistical Science
, vol.21
, Issue.2
, pp. 143-154
-
-
Fienberg, S.1
-
112
-
-
67649351564
-
On Bayesian record linkage
-
Fortini, M., Liseo, B., Nuccitelli, A., Scanu, M.: On Bayesian record linkage. Research in Official Statistics 4(1), 185-198 (2001)
-
(2001)
Research in Official Statistics
, vol.4
, Issue.1
, pp. 185-198
-
-
Fortini, M.1
Liseo, B.2
Nuccitelli, A.3
Scanu, M.4
-
113
-
-
0026786263
-
Tolerating spelling errors during patient validation
-
Friedman, C., Sideli, R.: Tolerating spelling errors during patient validation. Computers and Biomedical Research 25, 486-509 (1992)
-
(1992)
Computers and Biomedical Research
, vol.25
, pp. 486-509
-
-
Friedman, C.1
Sideli, R.2
-
114
-
-
84857170753
-
Automatic cleaning and linking of historical census data using household information
-
Vancouver
-
Fu, Z., Christen, P., Boot, M.: Automatic cleaning and linking of historical census data using household information. In: Workshop on Domain Driven Data Mining, held at IEEE ICDM. Vancouver (2011)
-
(2011)
Workshop on Domain Driven Data Mining, held at IEEE ICDM
-
-
Fu, Z.1
Christen, P.2
Boot, M.3
-
115
-
-
84861452649
-
A supervised learning and group linking method for historical census household linkage
-
Ballarat, Australia
-
Fu, Z., Christen, P., Boot, M.: A supervised learning and group linking method for historical census household linkage. In: AusDM, CRPIT, vol. 125. Ballarat, Australia (2011)
-
(2011)
AusDM, CRPIT
, vol.125
-
-
Fu, Z.1
Christen, P.2
Boot, M.3
-
116
-
-
84861452098
-
Multiple instance learning for group record linkage
-
Kuala Lumpur, Malaysia
-
Fu, Z., Zhou, J., Christen, P., Boot, M.: Multiple instance learning for group record linkage. In: PAKDD, Springer LNAI. Kuala Lumpur, Malaysia (2012)
-
(2012)
PAKDD, Springer LNAI
-
-
Fu, Z.1
Zhou, J.2
Christen, P.3
Boot, M.4
-
117
-
-
0033891155
-
An extensible framework for data cleaning
-
San Diego
-
Galhardas, H., Florescu, D., Shasha, D., Simon, E.: An extensible framework for data cleaning. In: IEEE ICDE. San Diego (2000)
-
(2000)
IEEE ICDE
-
-
Galhardas, H.1
Florescu, D.2
Shasha, D.3
Simon, E.4
-
118
-
-
13244269176
-
OX-LINK: The Oxford medical record linkage system
-
Arlington, Virginia
-
Gill, L.: OX-LINK: The Oxford medical record linkage system. In: Proc. Int'l Record Linkage Workshop and Exposition, pp. 15-33. Arlington, Virginia (1997)
-
(1997)
Proc. Int'l Record Linkage Workshop and Exposition
, pp. 15-33
-
-
Gill, L.1
-
119
-
-
1642332418
-
Methods for automatic record matching and linking and their use in national statistics
-
National Statistics, London
-
Gill, L.: Methods for automatic record matching and linking and their use in national statistics. Tech. Rep. Methodology Series, no. 25, National Statistics, London (2001)
-
(2001)
Tech. Rep. Methodology Series
, vol.25
-
-
Gill, L.1
-
120
-
-
38149140884
-
Semantic matching: Algorithms and implementation
-
Giunchiglia, F., Yatskevich, M., Shvaiko, P.: Semantic matching: Algorithms and implementation. Journal on Data Semantics IX pp. 1-38 (2007)
-
(2007)
Journal on Data Semantics
, vol.9
, pp. 1-38
-
-
Giunchiglia, F.1
Yatskevich, M.2
Shvaiko, P.3
-
121
-
-
38949127609
-
Cohort profile: The Western Australian family connections genealogical project
-
Glasson, E., De Klerk, N., Bass, A., Rosman, D., Palmer, L., Holman, C.: Cohort profile: The Western Australian family connections genealogical project. International Journal of Epidemiology 37(1), 30-35 (2008)
-
(2008)
International Journal of Epidemiology
, vol.37
, Issue.1
, pp. 30-35
-
-
Glasson, E.1
De Klerk, N.2
Bass, A.3
Rosman, D.4
Palmer, L.5
Holman, C.6
-
123
-
-
0003839182
-
-
Tech. rep., Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Israel
-
Goldreich, O.: Secure multi-party computation. Tech. rep., Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Israel (2002)
-
(2002)
Secure multi-party computation
-
-
Goldreich, O.1
-
124
-
-
0037198576
-
An empirical comparison of record linkage procedures
-
Gomatam, S., Carter, R., Ariet, M., Mitchell, G.: An empirical comparison of record linkage procedures. Statistics in Medicine 21(10), 1485-1496 (2002)
-
(2002)
Statistics in Medicine
, vol.21
, Issue.10
, pp. 1485-1496
-
-
Gomatam, S.1
Carter, R.2
Ariet, M.3
Mitchell, G.4
-
125
-
-
32244443683
-
Syllable alignment: A novel model for phonetic string search
-
Gong, R., Chan, T.K.: Syllable alignment: A novel model for phonetic string search. IEICE Transactions on Information and Systems E89-D(1), 332-339 (2006)
-
(2006)
IEICE Transactions on Information and Systems
, vol.E89-D
, Issue.1
, pp. 332-339
-
-
Gong, R.1
Chan, T.K.2
-
126
-
-
2342641297
-
-
Addison-Wesley Longman Publishing Co., Inc
-
Grama, A., Karypis, G., Kumar, V., Gupta, A.: Introduction to parallel computing, 2 edn. Addison-Wesley Longman Publishing Co., Inc. (2003)
-
(2003)
Introduction to parallel computing, 2 edn
-
-
Grama, A.1
Karypis, G.2
Kumar, V.3
Gupta, A.4
-
127
-
-
84944318804
-
Approximate string joins in a database (almost) for free
-
Roma
-
Gravano, L., Ipeirotis, P.G., Jagadish, H.V., Koudas, N., Muthukrishnan, S., Srivastava, D.: Approximate string joins in a database (almost) for free. In: VLDB, pp. 491-500. Roma (2001)
-
(2001)
VLDB
, pp. 491-500
-
-
Gravano, L.1
Ipeirotis, P.G.2
Jagadish, H.V.3
Koudas, N.4
Muthukrishnan, S.5
Srivastava, D.6
-
130
-
-
70350623193
-
Address standardization with latent semantic association
-
Paris
-
Guo, H., Zhu, H., Guo, Z., Zhang, X., Su, Z.: Address standardization with latent semantic association. In: ACM SIGKDD, pp. 1155-1164. Paris (2009)
-
(2009)
ACM SIGKDD
, pp. 1155-1164
-
-
Guo, H.1
Zhu, H.2
Guo, Z.3
Zhang, X.4
Su, Z.5
-
131
-
-
77956039068
-
Adaptive near-duplicate detection via similarity learning
-
Geneva, Switzerland
-
Hajishirzi, H., Yih, W., Kolcz, A.: Adaptive near-duplicate detection via similarity learning. In: ACM SIGIR, pp. 419-426. Geneva, Switzerland (2010)
-
(2010)
ACM SIGIR
, pp. 419-426
-
-
Hajishirzi, H.1
Yih, W.2
Kolcz, A.3
-
132
-
-
76749092270
-
The WEKA data mining software: An update
-
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA data mining software: an update. ACM SIGKDD Explorations 11(1), 10-18 (2009)
-
(2009)
ACM SIGKDD Explorations
, vol.11
, Issue.1
, pp. 10-18
-
-
Hall, M.1
Frank, E.2
Holmes, G.3
Pfahringer, B.4
Reutemann, P.5
Witten, I.6
-
134
-
-
78049382188
-
Privacy-preserving record linkage
-
Corfu, Greece
-
Hall, R., Fienberg, S.: Privacy-preserving record linkage. In: Privacy in Statistical Databases, Springer LNCS 6344, pp. 269-283. Corfu, Greece (2010)
-
(2010)
Privacy in Statistical Databases, Springer LNCS 6344
, pp. 269-283
-
-
Hall, R.1
Fienberg, S.2
-
136
-
-
33745886270
-
Classifier technology and the illusion of progress
-
Hand, D.: Classifier technology and the illusion of progress. Statistical Science 21(1), 1-14 (2006)
-
(2006)
Statistical Science
, vol.21
, Issue.1
, pp. 1-14
-
-
Hand, D.1
-
137
-
-
70349826301
-
Creating probabilistic databases from duplicated data
-
Hassanzadeh, O., Miller, R.: Creating probabilistic databases from duplicated data. The VLDB Journal 18(5), 1141-1166 (2009)
-
(2009)
The VLDB Journal
, vol.18
, Issue.5
, pp. 1141-1166
-
-
Hassanzadeh, O.1
Miller, R.2
-
139
-
-
33750296887
-
Finding near-duplicate web pages: A large-scale evaluation of algorithms
-
Seattle
-
Henzinger, M.: Finding near-duplicate web pages: a large-scale evaluation of algorithms. In: ACM SIGIR, pp. 284-291. Seattle (2006)
-
(2006)
ACM SIGIR
, pp. 284-291
-
-
Henzinger, M.1
-
140
-
-
84976856849
-
The merge/purge problem for large databases
-
San Jose
-
Hernandez, M.A., Stolfo, S.J.: The merge/purge problem for large databases. In: ACM SIGMOD, pp. 127-138. San Jose (1995)
-
(1995)
ACM SIGMOD
, pp. 127-138
-
-
Hernandez, M.A.1
Stolfo, S.J.2
-
141
-
-
0013331361
-
Real-world data is dirty: Data cleansing and the merge/purge problem
-
Hernandez, M.A., Stolfo, S.J.: Real-world data is dirty: Data cleansing and the merge/purge problem. Data Mining and Knowledge Discovery 2(1), 9-37 (1998)
-
(1998)
Data Mining and Knowledge Discovery
, vol.2
, Issue.1
, pp. 9-37
-
-
Hernandez, M.A.1
Stolfo, S.J.2
-
142
-
-
84927513403
-
Scalable iterative graph duplicate detection
-
Herschel, M., Naumann, F., Szott, S., Taubert, M.: Scalable iterative graph duplicate detection. IEEE Transactions on Knowledge and Data Engineering X(Y) (2011)
-
(2011)
IEEE Transactions on Knowledge and Data Engineering
, vol.X
, Issue.Y
-
-
Herschel, M.1
Naumann, F.2
Szott, S.3
Taubert, M.4
-
144
-
-
28044445101
-
An index to quantify an individual’s scientific research output
-
Hirsch, J.: An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America 102(46), 16569-16572 (2005)
-
(2005)
Proceedings of the National Academy of Sciences of the United States of America
, vol.102
, Issue.46
, pp. 16569-16572
-
-
Hirsch, J.1
-
146
-
-
52649087713
-
A hybrid approach to private record linkage
-
Inan, A., Kantarcioglu, M., Bertino, E., Scannapieco, M.: A hybrid approach to private record linkage. In: IEEE ICDE, pp. 496-505 (2008)
-
(2008)
IEEE ICDE
, pp. 496-505
-
-
Inan, A.1
Kantarcioglu, M.2
Bertino, E.3
Scannapieco, M.4
-
147
-
-
77952279809
-
Private record matching using differential privacy
-
Inan, A., Kantarcioglu, M., Ghinita, G., Bertino, E.: Private record matching using differential privacy. In: International Conference on Extending Database Technology, pp. 123-134 (2010)
-
(2010)
International Conference on Extending Database Technology
, pp. 123-134
-
-
Inan, A.1
Kantarcioglu, M.2
Ghinita, G.3
Bertino, E.4
-
148
-
-
79959927816
-
On-the-fly entity-aware query processing in the presence of linkage
-
Ioannou, E., Nejdl, W., Niederée, C., Velegrakis, Y.: On-the-fly entity-aware query processing in the presence of linkage. Proceedings of the VLDB Endowment 3(1) (2010)
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.1
-
-
Ioannou, E.1
Nejdl, W.2
Niederée, C.3
Velegrakis, Y.4
-
149
-
-
84950419860
-
Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida
-
Jaro, M.A.: Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida. Journal of the American Statistical Association 84, 414-420 (1989)
-
(1989)
Journal of the American Statistical Association
, vol.84
, pp. 414-420
-
-
Jaro, M.A.1
-
151
-
-
84943425383
-
Efficient record linkage in large data sets
-
Tokyo
-
Jin, L., Li, C., Mehrotra, S.: Efficient record linkage in large data sets. In: DASFAA, pp. 137-146. Tokyo (2003)
-
(2003)
DASFAA
, pp. 137-146
-
-
Jin, L.1
Li, C.2
Mehrotra, S.3
-
152
-
-
0030412523
-
A comparison of approximate string matching algorithms
-
Jokinen, P., Tarhio, J., Ukkonen, E.: A comparison of approximate string matching algorithms. Software-Practice and Experience 26(12), 1439-1458 (1996)
-
(1996)
Software-Practice and Experience
, vol.26
, Issue.12
, pp. 1439-1458
-
-
Jokinen, P.1
Tarhio, J.2
Ukkonen, E.3
-
153
-
-
45849148052
-
Effective counterterrorism and the limited role of predictive data mining
-
Jonas, J., Harper, J.: Effective counterterrorism and the limited role of predictive data mining. Policy Analysis (584) (2006)
-
(2006)
Policy Analysis
, Issue.584
-
-
Jonas, J.1
Harper, J.2
-
154
-
-
73949088415
-
FRIL: A tool for comparative record linkage
-
American Medical Informatics Association
-
Jurczyk, P., Lu, J., Xiong, L., Cragan, J., Correa, A.: FRIL: A tool for comparative record linkage. In: AMIA Annual Symposium Proceedings, p. 440. American Medical Informatics Association (2008)
-
(2008)
AMIA Annual Symposium Proceedings
, pp. 440
-
-
Jurczyk, P.1
Lu, J.2
Xiong, L.3
Cragan, J.4
Correa, A.5
-
155
-
-
33745266392
-
Domain-independent data cleaning via analysis of entity-relationship graph
-
Kalashnikov, D., Mehrotra, S.: Domain-independent data cleaning via analysis of entity-relationship graph. ACM Transactions on Database Systems 31(2), 716-767 (2006)
-
(2006)
ACM Transactions on Database Systems
, vol.31
, Issue.2
, pp. 716-767
-
-
Kalashnikov, D.1
Mehrotra, S.2
-
156
-
-
47649126673
-
Interactive entity resolution in relational data: A visual analytic tool and its evaluation
-
Kang, H., Getoor, L., Shneiderman, B., Bilgic, M., Licamele, L.: Interactive entity resolution in relational data: A visual analytic tool and its evaluation. IEEE Transactions on Visualization and Computer Graphics 14(5), 999-1014 (2008)
-
(2008)
IEEE Transactions on Visualization and Computer Graphics
, vol.14
, Issue.5
, pp. 999-1014
-
-
Kang, H.1
Getoor, L.2
Shneiderman, B.3
Bilgic, M.4
Licamele, L.5
-
159
-
-
84863589864
-
Fake injection strategies for private phonetic matching
-
Leuven, Belgium
-
Karakasidis, A., Verykios, V., Christen, P.: Fake injection strategies for private phonetic matching. In: International Workshop on Data Privacy Management. Leuven, Belgium (2011)
-
(2011)
International Workshop on Data Privacy Management
-
-
Karakasidis, A.1
Verykios, V.2
Christen, P.3
-
160
-
-
47249144557
-
-
Tech. Rep. 2006-19, Department of Computer Science, Stanford University
-
Kawai, H., Garcia-Molina, H., Benjelloun, O., Menestrina, D., Whang, E., Gong, H.: P-Swoosh: Parallel algorithm for generic entity resolution. Tech. Rep. 2006-19, Department of Computer Science, Stanford University (2006)
-
(2006)
P-Swoosh: Parallel algorithm for generic entity resolution
-
-
Kawai, H.1
Garcia-Molina, H.2
Benjelloun, O.3
Menestrina, D.4
Whang, E.5
Gong, H.6
-
161
-
-
0036450652
-
Research use of linked health data-A best practice protocol
-
Kelman, C.W., Bass, J., Holman, D.: Research use of linked health data-A best practice protocol. Aust NZ Journal of Public Health 26, 251-255 (2002)
-
(2002)
Aust NZ Journal of Public Health
, vol.26
, pp. 251-255
-
-
Kelman, C.W.1
Bass, J.2
Holman, D.3
-
162
-
-
0142218940
-
Non-adjacent digrams improve matching of cross-lingual spelling variants
-
Manaus, Brazil
-
Keskustalo, H., Pirkola, A., Visala, K., Leppanen, E., Jarvelin, K.: Non-adjacent digrams improve matching of cross-lingual spelling variants. In: String Processing and Information Retrieval, LNCS 2857, pp. 252-265. Manaus, Brazil (2003)
-
(2003)
String Processing and Information Retrieval, LNCS 2857
, pp. 252-265
-
-
Keskustalo, H.1
Pirkola, A.2
Visala, K.3
Leppanen, E.4
Jarvelin, K.5
-
163
-
-
63449096255
-
Parallel linkage
-
Lisboa, Portugal
-
Kim, H., Lee, D.: Parallel linkage. In: ACM CIKM, pp. 283-292. Lisboa, Portugal (2007)
-
(2007)
ACM CIKM
, pp. 283-292
-
-
Kim, H.1
Lee, D.2
-
164
-
-
77952280581
-
Harra: Fast iterative hashed record linkage for large-scale data collections
-
Lausanne, Switzerland
-
Kim, H., Lee, D.: Harra: fast iterative hashed record linkage for large-scale data collections. In: International Conference on Extending Database Technology, pp. 525-536. Lausanne, Switzerland (2010)
-
(2010)
International Conference on Extending Database Technology
, pp. 525-536
-
-
Kim, H.1
Lee, D.2
-
165
-
-
84876687819
-
Data partitioning for parallel entity matching
-
Kirsten, T., Kolb, L., Hartung, M., Gross, A., Köpcke, H., Rahm, E.: Data partitioning for parallel entity matching. Proceedings of the VLDB Endowment 3(2) (2010)
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.2
-
-
Kirsten, T.1
Kolb, L.2
Hartung, M.3
Gross, A.4
Köpcke, H.5
Rahm, E.6
-
166
-
-
76249090414
-
The normalized compression distance as a distance measure in entity identification
-
Klenk, S., Thom, D., Heidemann, G.: The normalized compression distance as a distance measure in entity identification. Advances in Data Mining. Applications and Theoretical Aspects pp. 325-337 (2009)
-
(2009)
Advances in Data Mining. Applications and Theoretical Aspects
, pp. 325-337
-
-
Klenk, S.1
Thom, D.2
Heidemann, G.3
-
167
-
-
84857053051
-
Multi-pass sorted neighborhood blocking with Map-Reduce
-
Kolb, L., Thor, A., Rahm, E.: Multi-pass sorted neighborhood blocking with Map-Reduce. Computer Science - Research and Development pp. 1-19 (2011)
-
(2011)
Computer Science - Research and Development
, pp. 1-19
-
-
Kolb, L.1
Thor, A.2
Rahm, E.3
-
168
-
-
72649095071
-
Frameworks for entity matching: A comparison
-
Köpcke, H., Rahm, E.: Frameworks for entity matching: A comparison. Data and Knowledge Engineering 69(2), 197-210 (2010)
-
(2010)
Data and Knowledge Engineering
, vol.69
, Issue.2
, pp. 197-210
-
-
Köpcke, H.1
Rahm, E.2
-
169
-
-
80455148340
-
Evaluation of entity resolution approaches on real-world match problems
-
Köpcke, H., Thor, A., Rahm, E.: Evaluation of entity resolution approaches on real-world match problems. Proceedings of the VLDB Endowment 3(1-2), 484-493 (2010)
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.1-2
, pp. 484-493
-
-
Köpcke, H.1
Thor, A.2
Rahm, E.3
-
170
-
-
85123004356
-
Flexible string matching against large databases in practice
-
Toronto
-
Koudas, N., Marathe, A., Srivastava, D.: Flexible string matching against large databases in practice. In: VLDB, pp. 1086-1094. Toronto (2004)
-
(2004)
VLDB
, pp. 1086-1094
-
-
Koudas, N.1
Marathe, A.2
Srivastava, D.3
-
172
-
-
0026979939
-
Techniques for automatically correcting words in text
-
Kukich, K.: Techniques for automatically correcting words in text. ACM Computing Surveys 24(4), 377-439 (1992)
-
(1992)
ACM Computing Surveys
, vol.24
, Issue.4
, pp. 377-439
-
-
Kukich, K.1
-
173
-
-
79961178764
-
A constraint satisfaction cryptanalysis of Bloom filters in private record linkage
-
Springer
-
Kuzu, M., Kantarcioglu, M., Durham, E., Malin, B.: A constraint satisfaction cryptanalysis of Bloom filters in private record linkage. In: Privacy Enhancing Technologies, pp. 226-245. Springer (2011)
-
(2011)
Privacy Enhancing Technologies
, pp. 226-245
-
-
Kuzu, M.1
Kantarcioglu, M.2
Durham, E.3
Malin, B.4
-
175
-
-
33645619459
-
-
Tech. rep., Department of Computer Science, University of Newcastle upon Tyne
-
Lait, A., Randell, B.: An assessment of name matching algorithms. Tech. rep., Department of Computer Science, University of Newcastle upon Tyne (1993)
-
(1993)
An assessment of name matching algorithms
-
-
Lait, A.1
Randell, B.2
-
176
-
-
36949029197
-
Are your citations clean?
-
Lee, D., Kang, J., Mitra, P., Giles, C.L., On, B.W.: Are your citations clean? Communications of the ACM 50, 33-38 (2007)
-
(2007)
Communications of the ACM
, vol.50
, pp. 33-38
-
-
Lee, D.1
Kang, J.2
Mitra, P.3
Giles, C.L.4
On, B.W.5
-
177
-
-
34748823295
-
-
The MIT Press
-
Lee, Y., Pipino, L., Funk, J., Wang, R.: Journey to data quality. The MIT Press (2009)
-
(2009)
Journey to data quality
-
-
Lee, Y.1
Pipino, L.2
Funk, J.3
Wang, R.4
-
179
-
-
80055030019
-
Linking temporal records
-
Li, P., Dong, X., Maurino, A., Srivastava, D.: Linking temporal records. Proceedings of the VLDB Endowment 4(11) (2011)
-
(2011)
Proceedings of the VLDB Endowment
, vol.4
, Issue.11
-
-
Li, P.1
Dong, X.2
Maurino, A.3
Srivastava, D.4
-
180
-
-
36049037582
-
K-unlinkability: A privacy protection model for distributed data
-
Malin, B.: K-unlinkability: A privacy protection model for distributed data. Data and Knowledge Engineering 64(1), 294-311 (2008)
-
(2008)
Data and Knowledge Engineering
, vol.64
, Issue.1
, pp. 294-311
-
-
Malin, B.1
-
181
-
-
33644776482
-
A network analysis model for disambiguation of names in lists
-
Malin, B., Airoldi, E., Carley, K.: A network analysis model for disambiguation of names in lists. Computational and Mathematical Organization Theory 11(2), 119-139 (2005)
-
(2005)
Computational and Mathematical Organization Theory
, vol.11
, Issue.2
, pp. 119-139
-
-
Malin, B.1
Airoldi, E.2
Carley, K.3
-
182
-
-
74249091457
-
Technical and policy approaches to balancing patient privacy and data sharing in clinical and translational research
-
Malin, B., Karp, D., Scheuermann, R.: Technical and policy approaches to balancing patient privacy and data sharing in clinical and translational research. Journal of investigative medicine: the official publication of the American Federation for Clinical Research 58(1), 11 (2010)
-
(2010)
Journal of investigative medicine: The official publication of the American Federation for Clinical Research
, vol.58
, Issue.1
, pp. 11
-
-
Malin, B.1
Karp, D.2
Scheuermann, R.3
-
184
-
-
0000806922
-
Automating the construction of Internet portals with machine learning
-
McCallum, A., Nigam, K., Rennie, J., Seymore, K.: Automating the construction of Internet portals with machine learning. Information Retrieval 3(2), 127-163 (2000)
-
(2000)
Information Retrieval
, vol.3
, Issue.2
, pp. 127-163
-
-
McCallum, A.1
Nigam, K.2
Rennie, J.3
Seymore, K.4
-
185
-
-
0034592784
-
Efficient clustering of high-dimensional data sets with application to reference matching
-
Boston
-
McCallum, A., Nigam, K., Ungar, L.H.: Efficient clustering of high-dimensional data sets with application to reference matching. In: ACM SIGKDD, pp. 169-178. Boston (2000)
-
(2000)
ACM SIGKDD
, pp. 169-178
-
-
McCallum, A.1
Nigam, K.2
Ungar, L.H.3
-
186
-
-
34547960421
-
Generic entity resolution with data confidences
-
Seoul, South Korea
-
Menestrina, D., Benjelloun, O., Garcia-Molina, H.: Generic entity resolution with data confidences. In: First International VLDB Workshop on Clean Databases. Seoul, South Korea (2006)
-
(2006)
First International VLDB Workshop on Clean Databases
-
-
Menestrina, D.1
Benjelloun, O.2
Garcia-Molina, H.3
-
187
-
-
79960270026
-
Evaluating entity resolution results
-
Menestrina, D., Whang, S., Garcia-Molina, H.: Evaluating entity resolution results. Proceedings of the VLDB Endowment 3(1-2), 208-219 (2010)
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.1-2
, pp. 208-219
-
-
Menestrina, D.1
Whang, S.2
Garcia-Molina, H.3
-
188
-
-
36348932551
-
Learning blocking schemes for record linkage
-
Boston
-
Michelson, M., Knoblock, C.A.: Learning blocking schemes for record linkage. In: AAAI. Boston (2006)
-
(2006)
AAAI
-
-
Michelson, M.1
Knoblock, C.A.2
-
190
-
-
0002089617
-
Matching algorithms within a duplicate detection system
-
Monge, A.E.: Matching algorithms within a duplicate detection system. IEEE Data Engineering Bulletin 23(4), 14-20 (2000)
-
(2000)
IEEE Data Engineering Bulletin
, vol.23
, Issue.4
, pp. 14-20
-
-
Monge, A.E.1
-
191
-
-
85018108837
-
The field-matching problem: Algorithm and applications
-
Portland
-
Monge, A.E., Elkan, C.P.: The field-matching problem: Algorithm and applications. In: ACM SIGKDD, pp. 267-270. Portland (1996)
-
(1996)
ACM SIGKDD
, pp. 267-270
-
-
Monge, A.E.1
Elkan, C.P.2
-
192
-
-
80052900431
-
Robust similarity measures for named entities matching
-
Association for Computational Linguistics
-
Moreau, E., Yvon, F., Cappé, O.: Robust similarity measures for named entities matching. In: 22nd International Conference on Computational Linguistics-Volume 1, pp. 593-600. Association for Computational Linguistics (2008)
-
(2008)
22nd International Conference on Computational Linguistics-Volume 1
, pp. 593-600
-
-
Moreau, E.1
Yvon, F.2
Cappé, O.3
-
194
-
-
77953213147
-
Myths and fallacies of personally identifiable information
-
Narayanan, A., Shmatikov, V.: Myths and fallacies of personally identifiable information. Communications of the ACM 53(6), 24-26 (2010)
-
(2010)
Communications of the ACM
, vol.53
, Issue.6
, pp. 24-26
-
-
Narayanan, A.1
Shmatikov, V.2
-
196
-
-
0345566149
-
A guided tour to approximate string matching
-
Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31-88 (2001)
-
(2001)
ACM Computing Surveys
, vol.33
, Issue.1
, pp. 31-88
-
-
Navarro, G.1
-
197
-
-
0001139918
-
Record linkage: Making maximum use of the discriminating power of identifying information
-
Newcombe, H., Kennedy, J.: Record linkage: making maximum use of the discriminating power of identifying information. Communications of the ACM 5(11), 563-566 (1962)
-
(1962)
Communications of the ACM
, vol.5
, Issue.11
, pp. 563-566
-
-
Newcombe, H.1
Kennedy, J.2
-
198
-
-
0001592068
-
Automatic linkage of vital records
-
Newcombe, H., Kennedy, J., Axford, S., James, A.: Automatic linkage of vital records. Science 130(3381), 954-959 (1959)
-
(1959)
Science
, vol.130
, Issue.3381
, pp. 954-959
-
-
Newcombe, H.1
Kennedy, J.2
Axford, S.3
James, A.4
-
199
-
-
0003659171
-
-
Oxford University Press, Inc., New York, NY, USA
-
Newcombe, H.B.: Handbook of record linkage: methods for health and statistical studies, administration, and business. Oxford University Press, Inc., New York, NY, USA (1988)
-
(1988)
Handbook of record linkage: Methods for health and statistical studies, administration, and business
-
-
Newcombe, H.B.1
-
200
-
-
47949115568
-
On the use of semantic blocking techniques for data cleansing and integration
-
Banff, Canada
-
Nin, J., Muntes-Mulero, V., Martinez-Bazan, N., Larriba-Pey, J.L.: On the use of semantic blocking techniques for data cleansing and integration. In: IDEAS, pp. 190-198. Banff, Canada (2007)
-
(2007)
IDEAS
, pp. 190-198
-
-
Nin, J.1
Muntes-Mulero, V.2
Martinez-Bazan, N.3
Larriba-Pey, J.L.4
-
202
-
-
20444470394
-
Privacy-preserving data linkage protocols
-
Washington DC
-
O’Keefe, C., Yung, M., Gu, L., Baxter, R.: Privacy-preserving data linkage protocols. In: ACM Workshop on Privacy in the Electronic Society, pp. 94-102. Washington DC (2004)
-
(2004)
ACM Workshop on Privacy in the Electronic Society
, pp. 94-102
-
-
O’Keefe, C.1
Yung, M.2
Gu, L.3
Baxter, R.4
-
203
-
-
85031041199
-
Data matching and merging: An overview
-
Okner, B.: Data matching and merging: An overview. NBER Chapters pp. 49-54 (1974)
-
(1974)
NBER Chapters
, pp. 49-54
-
-
Okner, B.1
-
204
-
-
47249101877
-
Improving grouped-entity resolution using quasi-cliques
-
On, B.W., Elmacioglu, E., Lee, D., Kang, J., Pei, J.: Improving grouped-entity resolution using quasi-cliques. In: IEEE ICDM, pp. 1008-1015 (2006)
-
(2006)
IEEE ICDM
, pp. 1008-1015
-
-
On, B.W.1
Elmacioglu, E.2
Lee, D.3
Kang, J.4
Pei, J.5
-
205
-
-
34548725521
-
Group linkage
-
Istanbul
-
On, B.W., Koudas, N., Lee, D., Srivastava, D.: Group linkage. In: IEEE ICDE, pp. 496-505. Istanbul (2007)
-
(2007)
IEEE ICDE
, pp. 496-505
-
-
On, B.W.1
Koudas, N.2
Lee, D.3
Srivastava, D.4
-
206
-
-
56249109568
-
Synthetic identity fraud: Unseen identity challenge
-
Oscherwitz, T.: Synthetic identity fraud: unseen identity challenge. Bank Security News 3(7) (2005)
-
(2005)
Bank Security News
, vol.3
, Issue.7
-
-
Oscherwitz, T.1
-
207
-
-
62249167366
-
Privacy-preserving fuzzy matching using a public reference table
-
Pang, C., Gu, L., Hansen, D., Maeder, A.: Privacy-preserving fuzzy matching using a public reference table. Intelligent Patient Management pp. 71-89 (2009)
-
(2009)
Intelligent Patient Management
, pp. 71-89
-
-
Pang, C.1
Gu, L.2
Hansen, D.3
Maeder, A.4
-
209
-
-
37149010102
-
-
PSMA Australia Limited, Griffith, ACT, Australia
-
Paull, D.: A geocoded national address file for Australia: The G-NAF what, why, who and when? PSMA Australia Limited, Griffith, ACT, Australia (2003)
-
(2003)
A geocoded national address file for Australia: The G-NAF what, why, who and when?
-
-
Paull, D.1
-
210
-
-
0030290473
-
Retrieval effectiveness of proper name search methods
-
Pfeifer, U., Poersch, T., Fuhr, N.: Retrieval effectiveness of proper name search methods. Information Processing and Management 32(6), 667-679 (1996)
-
(1996)
Information Processing and Management
, vol.32
, Issue.6
, pp. 667-679
-
-
Pfeifer, U.1
Poersch, T.2
Fuhr, N.3
-
211
-
-
13344267227
-
The double-metaphone search algorithm
-
Philips, L.: The double-metaphone search algorithm. C/C++ User’s Journal 18(6) (2000)
-
(2000)
C/C++ User’s Journal
, vol.18
, Issue.6
-
-
Philips, L.1
-
212
-
-
84856412670
-
Resilient identity crime detection
-
Phua, C., Smith-Miles, K., Lee, V., Gayler, R.: Resilient identity crime detection. IEEE Transactions on Knowledge and Data Engineering 24(3) (2012)
-
(2012)
IEEE Transactions on Knowledge and Data Engineering
, vol.24
, Issue.3
-
-
Phua, C.1
Smith-Miles, K.2
Lee, V.3
Gayler, R.4
-
213
-
-
84879363944
-
Total information awareness (TIA)
-
Poindexter, J., Popp, R., Sharkey, B.: Total information awareness (TIA). In: IEEE Aerospace Conference, 2003, vol. 6, pp. 2937-2944 (2003)
-
(2003)
IEEE Aerospace Conference, 2003
, vol.6
, pp. 2937-2944
-
-
Poindexter, J.1
Popp, R.2
Sharkey, B.3
-
214
-
-
84976776121
-
Automatic spelling correction in scientific and scholarly text
-
Pollock, J.J., Zamora, A.: Automatic spelling correction in scientific and scholarly text. Communications of the ACM 27(4), 358-368 (1984)
-
(1984)
Communications of the ACM
, vol.27
, Issue.4
, pp. 358-368
-
-
Pollock, J.J.1
Zamora, A.2
-
216
-
-
84930816594
-
Indexing uncertain data
-
C.C. Aggarwal (ed.), Springer
-
Prabhakar, S., Shah, R., Singh, S.: Indexing uncertain data. In: C.C. Aggarwal (ed.) Managing and Mining Uncertain Data, Advances in Database Systems, vol. 35, pp. 299-325. Springer (2009)
-
(2009)
Managing and Mining Uncertain Data, Advances in Database Systems, vol. 35
, pp. 299-325
-
-
Prabhakar, S.1
Shah, R.2
Singh, S.3
-
217
-
-
80051933614
-
Data cleansing techniques for large enterprise datasets
-
San Jose, USA
-
Prasad, K., Faruquie, T., Joshi, S., Chaturvedi, S., Subramaniam, L., Mohania, M.: Data cleansing techniques for large enterprise datasets. In: SRII Global Conference, pp. 135-144. San Jose, USA (2009)
-
(2009)
SRII Global Conference
, pp. 135-144
-
-
Prasad, K.1
Faruquie, T.2
Joshi, S.3
Chaturvedi, S.4
Subramaniam, L.5
Mohania, M.6
-
219
-
-
0032032036
-
How to ensure data quality of an epidemiological follow-up: Quality assessment of an anonymous record linkage procedure
-
Quantin, C., Bouzelat, H., Allaert, F., Benhamiche, A., Faivre, J., Dusserre, L.: How to ensure data quality of an epidemiological follow-up: Quality assessment of an anonymous record linkage procedure. International Journal of Medical Informatics 49(1), 117-122 (1998)
-
(1998)
International Journal of Medical Informatics
, vol.49
, Issue.1
, pp. 117-122
-
-
Quantin, C.1
Bouzelat, H.2
Allaert, F.3
Benhamiche, A.4
Faivre, J.5
Dusserre, L.6
-
220
-
-
0031656168
-
Automatic record hash coding and linkage for epidemiological follow-up data confidentiality
-
Quantin, C., Bouzelat, H., Allaert, F.A., Benhamiche, A.M., Faivre, J., Dusserre, L.: Automatic record hash coding and linkage for epidemiological follow-up data confidentiality. Methods of Information in Medicine 37(3), 271-277 (1998)
-
(1998)
Methods of Information in Medicine
, vol.37
, Issue.3
, pp. 271-277
-
-
Quantin, C.1
Bouzelat, H.2
Allaert, F.A.3
Benhamiche, A.M.4
Faivre, J.5
Dusserre, L.6
-
221
-
-
0029822648
-
Irreversible encryption method by generation of polynomials
-
Quantin, C., Bouzelat, H., Dusserre, L.: Irreversible encryption method by generation of polynomials. Medical Informatics and the Internet in Medicine 21(2), 113-121 (1996)
-
(1996)
Medical Informatics and the Internet in Medicine
, vol.21
, Issue.2
, pp. 113-121
-
-
Quantin, C.1
Bouzelat, H.2
Dusserre, L.3
-
223
-
-
0024610919
-
A tutorial on hidden Markov models and selected applications in speech recognition
-
Rabiner, L.: A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2), 257-286 (1989)
-
(1989)
Proceedings of the IEEE
, vol.77
, Issue.2
, pp. 257-286
-
-
Rabiner, L.1
-
224
-
-
0002490026
-
Data cleaning: Problems and current approaches
-
Rahm, E., Do, H.H.: Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin 23(4), 3-13 (2000)
-
(2000)
IEEE Data Engineering Bulletin
, vol.23
, Issue.4
, pp. 3-13
-
-
Rahm, E.1
Do, H.H.2
-
225
-
-
83055169894
-
Large-scale collective entity matching
-
Rastogi, V., Dalvi, N., Garofalakis, M.: Large-scale collective entity matching. VLDB Endowment 4, 208-218 (2011)
-
(2011)
VLDB Endowment
, vol.4
, pp. 208-218
-
-
Rastogi, V.1
Dalvi, N.2
Garofalakis, M.3
-
226
-
-
33751069306
-
A secure protocol for computing string distance metrics
-
Brighton, UK
-
Ravikumar, P., Cohen, W., Fienberg, S.: A secure protocol for computing string distance metrics. In: Workshop on Privacy and Security Aspects of Data Mining held at IEEE ICDM, pp. 40-46. Brighton, UK (2004)
-
(2004)
Workshop on Privacy and Security Aspects of Data Mining held at IEEE ICDM
, pp. 40-46
-
-
Ravikumar, P.1
Cohen, W.2
Fienberg, S.3
-
227
-
-
77955652421
-
Linking historical censuses: A new approach
-
Ruggles, S.: Linking historical censuses: A new approach. History and Computing 14(1-2), 213-224 (2002)
-
(2002)
History and Computing
, vol.14
, Issue.1-2
, pp. 213-224
-
-
Ruggles, S.1
-
228
-
-
31644443877
-
Geocoding in cancer research: A review
-
Rushton, G., Armstrong, M., Gittler, J., Greene, B., Pavlik, C., West, M., Zimmerman, D.: Geocoding in cancer research: A review. American Journal of Preventive Medicine 30(2), S16-S24 (2006)
-
(2006)
American Journal of Preventive Medicine
, vol.30
, Issue.2
, pp. S16-S24
-
-
Rushton, G.1
Armstrong, M.2
Gittler, J.3
Greene, B.4
Pavlik, C.5
West, M.6
Zimmerman, D.7
-
231
-
-
0242456811
-
Interactive deduplication using active learning
-
Edmonton
-
Sarawagi, S., Bhamidipaty, A.: Interactive deduplication using active learning. In: ACM SIGKDD, pp. 269-278. Edmonton (2002)
-
(2002)
ACM SIGKDD
, pp. 269-278
-
-
Sarawagi, S.1
Bhamidipaty, A.2
-
232
-
-
85088005959
-
Efficient set joins on similarity predicates
-
Paris
-
Sarawagi, S., Kirpal, A.: Efficient set joins on similarity predicates. In: ACM SIGMOD, pp. 754-765. Paris (2004)
-
(2004)
ACM SIGMOD
, pp. 754-765
-
-
Sarawagi, S.1
Kirpal, A.2
-
233
-
-
84863549453
-
The RecordLinkage package: Detecting errors in data
-
Sariyar, M., Borg, A.: The RecordLinkage package: Detecting errors in data. The R Journal 2(2), 61-67 (2010)
-
(2010)
The R Journal
, vol.2
, Issue.2
, pp. 61-67
-
-
Sariyar, M.1
Borg, A.2
-
234
-
-
35448932873
-
Privacy preserving schema and data matching
-
Scannapieco, M., Figotin, I., Bertino, E., Elmagarmid, A.: Privacy preserving schema and data matching. In: ACM SIGMOD, pp. 653-664 (2007)
-
(2007)
ACM SIGMOD
, pp. 653-664
-
-
Scannapieco, M.1
Figotin, I.2
Bertino, E.3
Elmagarmid, A.4
-
236
-
-
84860685505
-
On the decidability and complexity of identity knowledge representation
-
Busan, South Korea
-
Schewe, K., Wang, Q.: On the decidability and complexity of identity knowledge representation. In: Database Systems for Advanced Applications, Springer LNCS 7238, pp. 288-302. Busan, South Korea (2012)
-
(2012)
Database Systems for Advanced Applications, Springer LNCS 7238
, pp. 288-302
-
-
Schewe, K.1
Wang, Q.2
-
237
-
-
0003855464
-
-
John Wiley and Sons, Inc., New York
-
Schneier, B.: Applied cryptography: Protocols, algorithms, and source code in C, 2 edn. John Wiley and Sons, Inc., New York (1996)
-
(1996)
Applied cryptography: Protocols, algorithms, and source code in C, 2 edn
-
-
Schneier, B.1
-
238
-
-
33749384202
-
A toolbox for record linkage
-
Schnell, R., Bachteler, T., Bender, S.: A toolbox for record linkage. Austrian Journal of Statistics 33(1-2), 125-133 (2004)
-
(2004)
Austrian Journal of Statistics
, vol.33
, Issue.1-2
, pp. 125-133
-
-
Schnell, R.1
Bachteler, T.2
Bender, S.3
-
239
-
-
70349817125
-
Privacy-preserving record linkage using Bloom filters
-
Schnell, R., Bachteler, T., Reiher, J.: Privacy-preserving record linkage using Bloom filters. BioMed Central Medical Informatics and Decision Making 9(1) (2009)
-
(2009)
BioMed Central Medical Informatics and Decision Making
, vol.9
, Issue.1
-
-
Schnell, R.1
Bachteler, T.2
Reiher, J.3
-
240
-
-
0037826642
-
Learning hidden Markov model structure for information extraction
-
Seymore, K., McCallum, A., Rosenfeld, R.: Learning hidden Markov model structure for information extraction. In: AAAI Workshop on Machine Learning for Information Extraction, pp. 37-42 (1999)
-
(1999)
AAAI Workshop on Machine Learning for Information Extraction
, pp. 37-42
-
-
Seymore, K.1
McCallum, A.2
Rosenfeld, R.3
-
241
-
-
0016792139
-
Methods for computer linkage of hospital admission-separation records into cumulative health histories
-
Smith, M., Newcombe, H.: Methods for computer linkage of hospital admission-separation records into cumulative health histories. Methods of Information in Medicine 14(3), 118-125 (1975)
-
(1975)
Methods of Information in Medicine
, vol.14
, Issue.3
, pp. 118-125
-
-
Smith, M.1
Newcombe, H.2
-
242
-
-
0018743442
-
Accuracies of computer versus manual linkages of routine health records
-
Smith, M., Newcombe, H.: Accuracies of computer versus manual linkages of routine health records. Methods of Information in Medicine 18(2), 89-97 (1979)
-
(1979)
Methods of Information in Medicine
, vol.18
, Issue.2
, pp. 89-97
-
-
Smith, M.1
Newcombe, H.2
-
244
-
-
0033705124
-
Practical techniques for searches on encrypted data
-
Song, D., Wagner, D., Perrig, A.: Practical techniques for searches on encrypted data. In: IEEE Symposium on Security and Privacy, pp. 44-55 (2000)
-
(2000)
IEEE Symposium on Security and Privacy
, pp. 44-55
-
-
Song, D.1
Wagner, D.2
Perrig, A.3
-
245
-
-
77649261370
-
Record matching over query results from multiple web databases
-
Su, W., Wang, J., Lochovsky, F.H.: Record matching over query results from multiple web databases. IEEE Transactions on Knowledge and Data Engineering 22(4), 578-589 (2009)
-
(2009)
IEEE Transactions on Knowledge and Data Engineering
, vol.22
, Issue.4
, pp. 578-589
-
-
Su, W.1
Wang, J.2
Lochovsky, F.H.3
-
246
-
-
33750141665
-
Automated geocoding of routinely collected health data in New South Wales
-
Summerhayes, R., Holder, P., Beard, J., Morgan, G., Christen, P., Willmore, A., Churches, T.: Automated geocoding of routinely collected health data in New South Wales. New South Wales Public Health Bulletin 17(4), 33-38 (2006)
-
(2006)
New South Wales Public Health Bulletin
, vol.17
, Issue.4
, pp. 33-38
-
-
Summerhayes, R.1
Holder, P.2
Beard, J.3
Morgan, G.4
Christen, P.5
Willmore, A.6
Churches, T.7
-
250
-
-
84871606066
-
SOG: A synthetic occupancy generator to support entity resolution instruction and research
-
Potsdam, Germany
-
Talburt, J.R., Zhou, Y., Shivaiah, S.Y.: SOG: A synthetic occupancy generator to support entity resolution instruction and research. In: International Conference on Information Quality, pp. 91-105. Potsdam, Germany (2009)
-
(2009)
International Conference on Information Quality
, pp. 91-105
-
-
Talburt, J.R.1
Zhou, Y.2
Shivaiah, S.Y.3
-
252
-
-
0242456803
-
Learning domain-independent string transformation weights for high accuracy object identification
-
Edmonton
-
Tejada, S., Knoblock, C.A., Minton, S.: Learning domain-independent string transformation weights for high accuracy object identification. In: ACM SIGKDD, pp. 350-359. Edmonton (2002)
-
(2002)
ACM SIGKDD
, pp. 350-359
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
254
-
-
66249148065
-
Data mining methods for linking data coming from several sources
-
Monographs in Official Statistics. Luxembourg
-
Torra, V., Domingo-Ferrer, J., Torres, A.: Data mining methods for linking data coming from several sources. In: Third Joint UN/ECE-Eurostat Work Session on Statistical Data Confidentiality, Eurostat. Monographs in Official Statistics. Luxembourg (2004)
-
(2004)
Third Joint UN/ECE-Eurostat Work Session on Statistical Data Confidentiality, Eurostat
-
-
Torra, V.1
Domingo-Ferrer, J.2
Torres, A.3
-
255
-
-
70349844175
-
Privacy-preserving string comparisons in record linkage systems: A review
-
Trepetin, S.: Privacy-preserving string comparisons in record linkage systems: a review. Information Security Journal: A Global Perspective 17(5), 253-266 (2008)
-
(2008)
Information Security Journal: A Global Perspective
, vol.17
, Issue.5
, pp. 253-266
-
-
Trepetin, S.1
-
256
-
-
3042553478
-
Homeland Security and Geographic Information Systems: How GIS and mapping technology can save lives and protect property in post-September 11th America
-
US Federal Geographic Data Committee: Homeland Security and Geographic Information Systems: How GIS and mapping technology can save lives and protect property in post-September 11th America. Public Health GIS News and Information (52), 21-23 (2003)
-
(2003)
Public Health GIS News and Information
, Issue.52
, pp. 21-23
-
-
-
257
-
-
63449119775
-
-
Springer
-
Vaidya, J., Clifton, C., Zhu, M.: Privacy preserving data mining, vol. 19. Springer (2006)
-
(2006)
Privacy preserving data mining, vol. 19
-
-
Vaidya, J.1
Clifton, C.2
Zhu, M.3
-
258
-
-
79952272717
-
Triphone analysis: A combined method for the correction of orthographical and typographical errors
-
Austin
-
Van Berkel, B., De Smedt, K.: Triphone analysis: A combined method for the correction of orthographical and typographical errors. In: Second Conference on Applied Natural Language Processing, pp. 77-83. Austin (1988)
-
(1988)
Second Conference on Applied Natural Language Processing
, pp. 77-83
-
-
Van Berkel, B.1
De Smedt, K.2
-
260
-
-
84870477881
-
An efficient two-party protocol for approximate matching in private record linkage
-
Ballarat, Australia
-
Vatsalan, D., Christen, P., Verykios, V.: An efficient two-party protocol for approximate matching in private record linkage. In: AusDM, CRPIT, vol. 121. Ballarat, Australia (2011)
-
(2011)
AusDM, CRPIT
, vol.121
-
-
Vatsalan, D.1
Christen, P.2
Verykios, V.3
-
261
-
-
0034228352
-
Automating the approximate record-matching process
-
Verykios, V., Elmagarmid, A., Houstis, E.: Automating the approximate record-matching process. Information Sciences 126(1-4), 83-98 (2000)
-
(2000)
Information Sciences
, vol.126
, Issue.1-4
, pp. 83-98
-
-
Verykios, V.1
Elmagarmid, A.2
Houstis, E.3
-
262
-
-
70349836319
-
Privacy preserving record linkage approaches
-
Verykios, V., Karakasidis, A., Mitrogiannis, V.: Privacy preserving record linkage approaches. Int. J. of Data Mining, Modelling and Management 1(2), 206-221 (2009)
-
(2009)
Int. J. of Data Mining, Modelling and Management
, vol.1
, Issue.2
, pp. 206-221
-
-
Verykios, V.1
Karakasidis, A.2
Mitrogiannis, V.3
-
263
-
-
0038208065
-
A Bayesian decision model for cost optimal record matching
-
Verykios, V., George, M.V., Elfeky, M.G.: A Bayesian decision model for cost optimal record matching. The VLDB Journal 12(1), 28-40 (2003)
-
(2003)
The VLDB Journal
, vol.12
, Issue.1
, pp. 28-40
-
-
Verykios, V.1
George, M.V.2
Elfeky, M.G.3
-
264
-
-
84861750648
-
Silk - a link discovery framework for the web of data
-
Volz, J., Bizer, C., Gaedke, M., Kobilarov, G.: Silk - a link discovery framework for the web of data. In: Second Linked Data on the Web Workshop (2009)
-
(2009)
Second Linked Data on the Web Workshop
-
-
Volz, J.1
Bizer, C.2
Gaedke, M.3
Kobilarov, G.4
-
265
-
-
74549152150
-
Robust record linkage blocking using suffix arrays
-
Hong Kong
-
de Vries, T., Ke, H., Chawla, S., Christen, P.: Robust record linkage blocking using suffix arrays. In: ACM CIKM, pp. 305-314. Hong Kong (2009)
-
(2009)
ACM CIKM
, pp. 305-314
-
-
de Vries, T.1
Ke, H.2
Chawla, S.3
Christen, P.4
-
266
-
-
79952543891
-
Robust record linkage blocking using suffix arrays and Bloom filters
-
de Vries, T., Ke, H., Chawla, S., Christen, P.: Robust record linkage blocking using suffix arrays and Bloom filters. ACM Transactions on Knowledge Discovery from Data 5(2) (2011)
-
(2011)
ACM Transactions on Knowledge Discovery from Data
, vol.5
, Issue.2
-
-
de Vries, T.1
Ke, H.2
Chawla, S.3
Christen, P.4
-
267
-
-
1942443495
-
Automatically detecting deceptive criminal identities
-
Wang, G., Chen, H., Atabakhsh, H.: Automatically detecting deceptive criminal identities. Communications of the ACM 47(3), 70-76 (2004)
-
(2004)
Communications of the ACM
, vol.47
, Issue.3
, pp. 70-76
-
-
Wang, G.1
Chen, H.2
Atabakhsh, H.3
-
270
-
-
29844441371
-
Dogmatix tracks down duplicates in XML
-
Baltimore
-
Weis, M., Naumann, F.: Dogmatix tracks down duplicates in XML. In: ACM SIGMOD, pp. 431-442. Baltimore (2005)
-
(2005)
ACM SIGMOD
, pp. 431-442
-
-
Weis, M.1
Naumann, F.2
-
272
-
-
77956549963
-
Industry-scale duplicate detection
-
Weis, M., Naumann, F., Jehle, U., Lufter, J., Schuster, H.: Industry-scale duplicate detection. Proceedings of the VLDB Endowment 1(2), 1253-1264 (2008)
-
(2008)
Proceedings of the VLDB Endowment
, vol.1
, Issue.2
, pp. 1253-1264
-
-
Weis, M.1
Naumann, F.2
Jehle, U.3
Lufter, J.4
Schuster, H.5
-
276
-
-
85031041797
-
Joint entity resolution
-
Arlington, Virginia
-
Whang, S.E., Garcia-Molina, H.: Joint entity resolution. In: IEEE ICDE. Arlington, Virginia (2012)
-
(2012)
IEEE ICDE
-
-
Whang, S.E.1
Garcia-Molina, H.2
-
277
-
-
70849098813
-
Entity resolution with iterative blocking
-
Providence, Rhode Island
-
Whang, S.E., Menestrina, D., Koutrika, G., Theobald, M., Garcia-Molina, H.: Entity resolution with iterative blocking. In: ACM SIGMOD, pp. 219-232. Providence, Rhode Island (2009)
-
(2009)
ACM SIGMOD
, pp. 219-232
-
-
Whang, S.E.1
Menestrina, D.2
Koutrika, G.3
Theobald, M.4
Garcia-Molina, H.5
-
278
-
-
84883401319
-
Rattle: A data mining GUI for R
-
Williams, G.J.: Rattle: a data mining GUI for R. The R Journal 1(2), 45-55 (2009)
-
(2009)
The R Journal
, vol.1
, Issue.2
, pp. 45-55
-
-
Williams, G.J.1
-
279
-
-
0008976521
-
String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage
-
American Statistical Association
-
Winkler, W.: String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In: Proceedings of the Section on Survey Research Methods, pp. 354-359. American Statistical Association (1990)
-
(1990)
Proceedings of the Section on Survey Research Methods
, pp. 354-359
-
-
Winkler, W.1
-
286
-
-
27544453079
-
-
Tech. Rep. RR1991/09, US Bureau of the Census, Washington, DC
-
Winkler, W.E., Thibaudeau, Y.: An application of the Fellegi-Sunter model of record linkage to the 1990 U.S. decennial census. Tech. Rep. RR1991/09, US Bureau of the Census, Washington, DC (1991)
-
(1991)
An application of the Fellegi-Sunter model of record linkage to the 1990 U.S. decennial census
-
-
Winkler, W.E.1
Thibaudeau, Y.2
-
287
-
-
84893021148
-
Fast record linkage of very large files in support of decennial and administrative records projects
-
American Statistical Association
-
Winkler, W.E., Yancey, W.E., Porter, E.H.: Fast record linkage of very large files in support of decennial and administrative records projects. In: Proceedings of the Section on Survey Research Methods, pp. 2120-2130. American Statistical Association (2010)
-
(2010)
Proceedings of the Section on Survey Research Methods
, pp. 2120-2130
-
-
Winkler, W.E.1
Yancey, W.E.2
Porter, E.H.3
-
288
-
-
0003756969
-
-
Morgan Kaufmann
-
Witten, I.H., Moffat, A., Bell, T.C.: Managing Gigabytes, 2 edn. Morgan Kaufmann (1999)
-
(1999)
Managing Gigabytes, 2 edn
-
-
Witten, I.H.1
Moffat, A.2
Bell, T.C.3
-
289
-
-
70849105253
-
Ed-join: An efficient algorithm for similarity joins with edit distance constraints
-
Xiao, C., Wang, W., Lin, X.: Ed-join: an efficient algorithm for similarity joins with edit distance constraints. Proceedings of the VLDB Endowment 1(1), 933-944 (2008)
-
(2008)
Proceedings of the VLDB Endowment
, vol.1
, Issue.1
, pp. 933-944
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
-
290
-
-
67649644357
-
Efficient private record linkage
-
Yakout, M., Atallah, M., Elmagarmid, A.: Efficient private record linkage. In: IEEE ICDE, pp. 1283-1286 (2009)
-
(2009)
IEEE ICDE
, pp. 1283-1286
-
-
Yakout, M.1
Atallah, M.2
Elmagarmid, A.3
-
291
-
-
84856597650
-
Behavior based record linkage
-
Yakout, M., Elmagarmid, A., Elmeleegy, H., Ouzzani, M., Qi, A.: Behavior based record linkage. Proceedings of the VLDB Endowment 3(1-2), 439-448 (2010)
-
(2010)
Proceedings of the VLDB Endowment
, vol.3
, Issue.1-2
, pp. 439-448
-
-
Yakout, M.1
Elmagarmid, A.2
Elmeleegy, H.3
Ouzzani, M.4
Qi, A.5
-
292
-
-
36348961379
-
Adaptive sorted neighborhood methods for efficient record linkage
-
Yan, S., Lee, D., Kan, M.Y., Giles, L.C.: Adaptive sorted neighborhood methods for efficient record linkage. In: ACM/IEEE-CS joint conference on Digital Libraries, pp. 185-194 (2007)
-
(2007)
ACM/IEEE-CS joint conference on Digital Libraries
, pp. 185-194
-
-
Yan, S.1
Lee, D.2
Kan, M.Y.3
Giles, L.C.4
-
296
-
-
84919835260
-
-
Springer
-
Yu, P., Han, J., Faloutsos, C.: Link Mining: Models, Algorithms, and Applications. Springer (2010)
-
(2010)
Link Mining: Models, Algorithms, and Applications
-
-
Yu, P.1
Han, J.2
Faloutsos, C.3
-
298
-
-
77955171784
-
Effectively indexing the uncertain space
-
Zhang, Y., Lin, X., Zhang, W., Wang, J., Lin, Q.: Effectively indexing the uncertain space. IEEE Transactions on Knowledge and Data Engineering 22(9), 1247-1261 (2010)
-
(2010)
IEEE Transactions on Knowledge and Data Engineering
, vol.22
, Issue.9
, pp. 1247-1261
-
-
Zhang, Y.1
Lin, X.2
Zhang, W.3
Wang, J.4
Lin, Q.5
-
299
-
-
33845920025
-
Semantic matching across heterogeneous data sources
-
Zhao, H.: Semantic matching across heterogeneous data sources. Communications of the ACM 50(1), 45-50 (2007)
-
(2007)
Communications of the ACM
, vol.50
, Issue.1
, pp. 45-50
-
-
Zhao, H.1
-
301
-
-
1342281224
-
Linking hospital discharge and death records - accuracy and sources of bias
-
Zingmond, D., Ye, Z., Ettner, S., Liu, H.: Linking hospital discharge and death records - accuracy and sources of bias. Journal of Clinical Epidemiology 57, 21-29 (2004)
-
(2004)
Journal of Clinical Epidemiology
, vol.57
, pp. 21-29
-
-
Zingmond, D.1
Ye, Z.2
Ettner, S.3
Liu, H.4
-
302
-
-
0030379050
-
Phonetic string matching: Lessons from information retrieval
-
Zürich, Switzerland
-
Zobel, J., Dart, P.: Phonetic string matching: Lessons from information retrieval. In: ACM SIGIR, pp. 166-172. Zürich, Switzerland (1996)
-
(1996)
ACM SIGIR
, pp. 166-172
-
-
Zobel, J.1
Dart, P.2
-
303
-
-
33747729581
-
Inverted files for text search engines
-
Zobel, J., Moffat, A.: Inverted files for text search engines. ACM Computing Surveys 38(2), 6 (2006)
-
(2006)
ACM Computing Surveys
, vol.38
, Issue.2
, pp. 6
-
-
Zobel, J.1
Moffat, A.2