-
1
-
-
0002696312
-
Matching records in a national medical patient index
-
Bell G.B., and Sethi A. Matching records in a national medical patient index. Communications of the ACM 44 9 (2001) 83-88
-
(2001)
Communications of the ACM
, vol.44
, Issue.9
, pp. 83-88
-
-
Bell, G.B.1
Sethi, A.2
-
2
-
-
77954003729
-
-
I. Bhattacharya, L. Getoor, Iterative record linkage for cleaning and integration, in: Proceedings of the Ninth ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2004.
-
-
-
-
-
3
-
-
77952372966
-
-
M. Bilenko, R.J. Mooney, Adaptive duplicate detection using learnable string similarity measures, in: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington DC, 2003, pp. 39-48.
-
-
-
-
-
4
-
-
0030211964
-
Bagging predictors
-
Breiman L. Bagging predictors. Machine Learning 24 2 (1996) 123-140
-
(1996)
Machine Learning
, vol.24
, Issue.2
, pp. 123-140
-
-
Breiman, L.1
-
5
-
-
0001858087
-
Multivariate decision trees
-
Brodley C.E., and Utgoff P.E. Multivariate decision trees. Machine Learning 19 1 (1995) 45-77
-
(1995)
Machine Learning
, vol.19
, Issue.1
, pp. 45-77
-
-
Brodley, C.E.1
Utgoff, P.E.2
-
6
-
-
47849111084
-
-
C.D. Budzinsky, Automated spelling correction, Unpublished Report, Statistics Canada, Ottawa, 1991.
-
-
-
-
-
7
-
-
0029271867
-
Rule based joins in heterogeneous databases
-
Chatterjee A., and Segev A. Rule based joins in heterogeneous databases. Decision Support Systems 13 3-4 (1995) 313-333
-
(1995)
Decision Support Systems
, vol.13
, Issue.3-4
, pp. 313-333
-
-
Chatterjee, A.1
Segev, A.2
-
8
-
-
26444550791
-
-
S. Chaudhuri, V. Ganti, R. Motwani, Robust identification of fuzzy duplicates, in: Proceedings of the International Conference on Data Engineering, Tokyo, Japan, 2005, pp. 865-876.
-
-
-
-
-
11
-
-
84863154946
-
-
P.T. Davis, D.K. Elson, J.L. Klavans, Methods for precise named entity matching in digital collections, in: Proceedings of the 2003 Joint Conference on Digital Libraries, 2003, pp. 27-31.
-
-
-
-
-
12
-
-
4344686171
-
Record matching in data warehouses: a decision model for data consolidation
-
Dey D. Record matching in data warehouses: a decision model for data consolidation. Operations Research 51 2 (2003) 240-254
-
(2003)
Operations Research
, vol.51
, Issue.2
, pp. 240-254
-
-
Dey, D.1
-
13
-
-
0032182242
-
A probabilistic decision model for entity matching in heterogeneous databases
-
Dey D., Sarkar S., and De P. A probabilistic decision model for entity matching in heterogeneous databases. Management Science 44 10 (1998) 1379-1395
-
(1998)
Management Science
, vol.44
, Issue.10
, pp. 1379-1395
-
-
Dey, D.1
Sarkar, S.2
De, P.3
-
14
-
-
0036565014
-
A distance-based approach to entity reconciliation in heterogeneous databases
-
Dey D., Sarkar S., and De P. A distance-based approach to entity reconciliation in heterogeneous databases. IEEE Transactions on Knowledge and Data Engineering 14 3 (2002) 567-582
-
(2002)
IEEE Transactions on Knowledge and Data Engineering
, vol.14
, Issue.3
, pp. 567-582
-
-
Dey, D.1
Sarkar, S.2
De, P.3
-
15
-
-
2342615638
-
Profile-based object matching for information integration
-
Doan A., Lu Y., Lee Y., and Han J. Profile-based object matching for information integration. IEEE Intelligent Systems 18 5 (2003) 54-59
-
(2003)
IEEE Intelligent Systems
, vol.18
, Issue.5
, pp. 54-59
-
-
Doan, A.1
Lu, Y.2
Lee, Y.3
Han, J.4
-
16
-
-
47849091034
-
-
P. Domingos, A unified bias-variance decomposition and its applications, in: Proceedings of 17th International Conference on Machine Learning, 2000, pp. 231-238.
-
-
-
-
-
17
-
-
47849104793
-
-
M.E. Fair, Record linkage in an information age society, in: Proceedings of Record Linkage Techniques - 1997, 1997, pp. 427-441.
-
-
-
-
-
19
-
-
47849101981
-
-
Y. Freund, R.E. Schapire, Experiments with a new boosting algorithm, in: Proceedings of the Thirteenth International Conference on Machine Learning, 1996, pp. 148-156.
-
-
-
-
-
20
-
-
0034541162
-
Cascade generalization
-
Gama J., and Brazdil P. Cascade generalization. Machine Learning 41 3 (2000) 315-343
-
(2000)
Machine Learning
, vol.41
, Issue.3
, pp. 315-343
-
-
Gama, J.1
Brazdil, P.2
-
21
-
-
47849104368
-
-
M. Ganesh, J. Srivastava, T. Richardson, Mining entity-identification rules for database integration, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), 1996, pp. 291-294.
-
-
-
-
-
22
-
-
47849132294
-
-
K. Gilhooly, Dirty data blights the bottom line, Computerworld, November 07, 2005.
-
-
-
-
-
23
-
-
47849106891
-
-
I.J. Haimowitz, Ö. Gür-Ali, H. Schwarz, Integrating and mining distributed customer databases, in: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97), 1997, pp. 179-182.
-
-
-
-
-
24
-
-
0003987805
-
-
MIT Press, Cambridge, MA
-
Hand D., Mannila H., and Smyth P. Principles of Data Mining (2001), MIT Press, Cambridge, MA
-
(2001)
Principles of Data Mining
-
-
Hand, D.1
Mannila, H.2
Smyth, P.3
-
25
-
-
0013331361
-
Real-world data is dirty: data cleansing and the merge/purge problem
-
Hernández M.A., and Stolfo S.J. Real-world data is dirty: data cleansing and the merge/purge problem. Data Mining and Knowledge Discovery 2 1 (1998) 9-37
-
(1998)
Data Mining and Knowledge Discovery
, vol.2
, Issue.1
, pp. 9-37
-
-
Hernández, M.A.1
Stolfo, S.J.2
-
27
-
-
0026390019
-
Classifying schematic and data heterogeneity in multidatabase systems
-
Kim W., and Seo J. Classifying schematic and data heterogeneity in multidatabase systems. IEEE Computer 24 12 (1991) 12-18
-
(1991)
IEEE Computer
, vol.24
, Issue.12
, pp. 12-18
-
-
Kim, W.1
Seo, J.2
-
28
-
-
47849122086
-
-
R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, in: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995, pp. 1137-1143.
-
-
-
-
-
29
-
-
47849123092
-
-
M. Kubat, S. Matwin, Addressing the curse of imbalanced training sets: One-sided selection, in: Proceedings of the Fourteenth International Conference on Machine Learning, 1997, pp. 179-186.
-
-
-
-
30
-
-
8644242446
-
-
W. Lam, R. Huang, P.-S. Cheung, Learning phonetic similarity for matching named entity translations and mining new translations, in: Proceedings of the Twenty-seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004, pp. 289-296.
-
-
-
-
-
32
-
-
1342347426
-
The design and implementation of a corporate householding knowledge processor to improve data quality
-
Madnick S.E., Wang Y.R., and Xian X. The design and implementation of a corporate householding knowledge processor to improve data quality. Journal of Management Information Systems 20 3 (2003) 41-69
-
(2003)
Journal of Management Information Systems
, vol.20
, Issue.3
, pp. 41-69
-
-
Madnick, S.E.1
Wang, Y.R.2
Xian, X.3
-
33
-
-
33646765912
-
Conditional models of identity uncertainty with application to noun coreference
-
McCallum A., and Wellner B. Conditional models of identity uncertainty with application to noun coreference. Advances in Neural Information Processing Systems 17 (2005) 905-912
-
(2005)
Advances in Neural Information Processing Systems
, vol.17
, pp. 905-912
-
-
McCallum, A.1
Wellner, B.2
-
34
-
-
6444245574
-
Enhancing information systems management with natural language processing techniques
-
Métais E. Enhancing information systems management with natural language processing techniques. Data & Knowledge Engineering 41 2-3 (2002) 247-272
-
(2002)
Data & Knowledge Engineering
, vol.41
, Issue.2-3
, pp. 247-272
-
-
Métais, E.1
-
35
-
-
47849119593
-
-
A.E. Monge, C.P. Elkan, The field matching problem: algorithms and applications, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996, pp. 267-270.
-
-
-
-
36
-
-
0002431740
-
Automatic construction of decision trees from data: a multi-disciplinary survey
-
Murthy S.K. Automatic construction of decision trees from data: a multi-disciplinary survey. Data Mining and Knowledge Discovery 2 4 (1998) 345-389
-
(1998)
Data Mining and Knowledge Discovery
, vol.2
, Issue.4
, pp. 345-389
-
-
Murthy, S.K.1
-
38
-
-
0001592068
-
Automatic linkage of vital records
-
Newcombe H.B., Kennedy J.M., Axford S.J., and James A.P. Automatic linkage of vital records. Science 130 3381 (1959) 954-959
-
(1959)
Science
, vol.130
, Issue.3381
, pp. 954-959
-
-
Newcombe, H.B.1
Kennedy, J.M.2
Axford, S.J.3
James, A.P.4
-
39
-
-
47849119104
-
-
J.C. Pinheiro, D.X. Sun, Methods for linking and mining massive heterogeneous databases, in: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, 1998, pp. 309-313.
-
-
-
-
-
40
-
-
1142288175
-
Element matching across data-oriented XML sources using a multi-strategy clustering model
-
Pluempitiwiriyawej C., and Hammer J. Element matching across data-oriented XML sources using a multi-strategy clustering model. Data & Knowledge Engineering 48 3 (2004) 297-333
-
(2004)
Data & Knowledge Engineering
, vol.48
, Issue.3
, pp. 297-333
-
-
Pluempitiwiriyawej, C.1
Hammer, J.2
-
41
-
-
0002442571
-
Discovering rules by induction from large collections of examples
-
Michie D. (Ed), Edinburgh University Press, Edinburgh
-
Quinlan J.R. Discovering rules by induction from large collections of examples. In: Michie D. (Ed). Expert Systems in the Micro-electronic Age (1979), Edinburgh University Press, Edinburgh 168-201
-
(1979)
Expert Systems in the Micro-electronic Age
, pp. 168-201
-
-
Quinlan, J.R.1
-
43
-
-
0346970930
-
Multimedia indexing through multi-source and multi-language information extraction: the MUMIS project
-
Saggion H., Cunningham H., Bontcheva K., Maynard D., Hamza O., and Wilks Y. Multimedia indexing through multi-source and multi-language information extraction: the MUMIS project. Data & Knowledge Engineering 48 2 (2004) 247-264
-
(2004)
Data & Knowledge Engineering
, vol.48
, Issue.2
, pp. 247-264
-
-
Saggion, H.1
Cunningham, H.2
Bontcheva, K.3
Maynard, D.4
Hamza, O.5
Wilks, Y.6
-
44
-
-
0242456811
-
-
S. Sarawagi, A. Bhamidipaty, Interactive deduplication using active learning, in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alberta, 2002.
-
-
-
-
-
46
-
-
12244257337
-
Data integration: where does the time go?
-
Seligman L., Rosenthal A., Lehner P., and Smith A. Data integration: where does the time go? IEEE Data Engineering Bulletin 25 3 (2002) 3-10
-
(2002)
IEEE Data Engineering Bulletin
, vol.25
, Issue.3
, pp. 3-10
-
-
Seligman, L.1
Rosenthal, A.2
Lehner, P.3
Smith, A.4
-
48
-
-
0003893064
-
-
World Scientific Publishing Co. Pte. Ltd., River Edge, NJ
-
Stephen G.A. String Searching Algorithms (1994), World Scientific Publishing Co. Pte. Ltd., River Edge, NJ
-
(1994)
String Searching Algorithms
-
-
Stephen, G.A.1
-
50
-
-
0035545848
-
Learning object identification rules for information integration
-
Tejada S., Knoblock C.A., and Minton S. Learning object identification rules for information integration. Information Systems 26 8 (2001) 607-633
-
(2001)
Information Systems
, vol.26
, Issue.8
, pp. 607-633
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
51
-
-
12344288685
-
A probabilistic similarity metric for Medline records: a model for author name disambiguation
-
Torvik V., Weeber M., Swanson D.R., and Smalheiser N.R. A probabilistic similarity metric for Medline records: a model for author name disambiguation. Journal of the American Society for Information Science and Technology 56 2 (2005) 140-158
-
(2005)
Journal of the American Society for Information Science and Technology
, vol.56
, Issue.2
, pp. 140-158
-
-
Torvik, V.1
Weeber, M.2
Swanson, D.R.3
Smalheiser, N.R.4
-
52
-
-
47849097021
-
-
A.C. Trembly, Poor data quality: A $600 billion issue, National Underwriter Property & Casualty - Risk & Benefits Management, March 18, 2002 Edition.
-
-
-
-
-
53
-
-
0001164493
-
Shift of Bias for Inductive Concept Learning
-
Michalski R., Carbonell J., and Mitchell T. (Eds), Morgan Kaufmann, Los Altos, CA (Chapter 5)
-
Utgoff P.E. Shift of Bias for Inductive Concept Learning. In: Michalski R., Carbonell J., and Mitchell T. (Eds). Machine Learning: An Artificial Intelligence Approach, Vol. II (1986), Morgan Kaufmann, Los Altos, CA 107-148 (Chapter 5)
-
(1986)
Machine Learning: An Artificial Intelligence Approach, Vol. II
, pp. 107-148
-
-
Utgoff, P.E.1
-
55
-
-
0003932630
-
-
Morgan Kaufmann
-
Weiss S.M., and Kulikowski C.A. Computer Systems That Learn - Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems (1991), Morgan Kaufmann
-
(1991)
Computer Systems That Learn - Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems
-
-
Weiss, S.M.1
Kulikowski, C.A.2
-
56
-
-
47849126967
-
-
W.E. Winkler, Matching and record linkage, in: Proceedings of Record Linkage Techniques - 1997, 1997, pp. 374-403.
-
-
-
-
-
57
-
-
47849119350
-
-
W.E. Winkler, Record linkage software and methods for merging administrative lists, Exchange of Technology and Know-How, Luxembourg, 1999, pp. 313-323.
-
-
-
-
-
59
-
-
0026692226
-
Stacked generalization
-
Wolpert D.H. Stacked generalization. Neural Networks 5 2 (1992) 241-259
-
(1992)
Neural Networks
, vol.5
, Issue.2
, pp. 241-259
-
-
Wolpert, D.H.1
-
60
-
-
85133070181
-
The relationship between PAC, the statistical physics framework, the Bayesian framework, and the VC framework
-
Addison-Wesley
-
Wolpert D.H. The relationship between PAC, the statistical physics framework, the Bayesian framework, and the VC framework. Proceedings of the SFI/CNLS Workshop on Formal Approaches to Supervised Learning (1994), Addison-Wesley 117-214
-
(1994)
Proceedings of the SFI/CNLS Workshop on Formal Approaches to Supervised Learning
, pp. 117-214
-
-
Wolpert, D.H.1
-
61
-
-
3142679542
-
-
W. Wu, C. Yu, A. Doan, W. Meng, An interactive clustering-based approach to integrating source query interfaces on the deep Web, in: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, 2004, pp. 95-106.
-
-
-
-
-
62
-
-
33845920025
-
Semantic matching across heterogeneous data sources
-
Zhao H. Semantic matching across heterogeneous data sources. Communications of the ACM 50 1 (2007) 45-50
-
(2007)
Communications of the ACM
, vol.50
, Issue.1
, pp. 45-50
-
-
Zhao, H.1
-
64
-
-
5644287747
-
Entity identification for heterogeneous database integration - a multiple classifier system approach and empirical evaluation
-
Zhao H., and Ram S. Entity identification for heterogeneous database integration - a multiple classifier system approach and empirical evaluation. Information Systems 30 2 (2005) 119-132
-
(2005)
Information Systems
, vol.30
, Issue.2
, pp. 119-132
-
-
Zhao, H.1
Ram, S.2
-
66
-
-
33947161876
-
Combining schema and instance information for integrating heterogeneous data sources
-
Zhao H., and Ram S. Combining schema and instance information for integrating heterogeneous data sources. Data & Knowledge Engineering 61 2 (2007) 281-303
-
(2007)
Data & Knowledge Engineering
, vol.61
, Issue.2
, pp. 281-303
-
-
Zhao, H.1
Ram, S.2