-
1
-
-
84868257170
-
Aggregative digital library systems in the driver infrastructure
-
Artini, M., Candela, L., Castelli, D., Manghi, P., Mikulicic, M. and Pagano, P. (1991) 'Aggregative digital library systems in the driver infrastructure', World Digital Libraries Journal, December, ISSN 0974-567X, Vol. 2, No. 2, pp.113-130.
-
(1991)
World Digital Libraries Journal, December, ISSN 0974-567X
, vol.2
, Issue.2
, pp. 113-130
-
-
Artini, M.1
Candela, L.2
Castelli, D.3
Manghi, P.4
Mikulicic, M.5
Pagano, P.6
-
2
-
-
58149472338
-
Swoosh: A generic approach to entity resolution
-
March.
-
Benjelloun, O., Garcia-Molina, H., Su, Q. and Widom, J. (2005) 'Swoosh: a generic approach to entity resolution', Stanford University technical report, March. Vol. 18, No. 1, pp.255-276.
-
(2005)
Stanford University Technical Report
, vol.18
, Issue.1
, pp. 255-276
-
-
Benjelloun, O.1
Garcia-Molina, H.2
Su, Q.3
Widom, J.4
-
4
-
-
0036040277
-
Similarity estimation techniques from rounding algorithms
-
May, Montreal, Quebec, Canada
-
Charikar, M. (2002) 'Similarity estimation techniques from rounding algorithms', in 34th Annual Symposium on Theory and Computing, May, Montreal, Quebec, Canada.
-
(2002)
34th Annual Symposium on Theory and Computing
-
-
Charikar, M.1
-
5
-
-
84857183817
-
A survey of indexing techniques for scalable record linkage and deduplication
-
ISSN 1041-4347. doi: 10.1109/TKDE
-
Christen, P. (2011) 'A survey of indexing techniques for scalable record linkage and deduplication', Knowledge and Data Engineering, IEEE Transactions on, ISSN 1041-4347. doi: 10.1109/TKDE, Vol. 127, No. 1, pp.99.
-
(2011)
Knowledge and Data Engineering, IEEE Transactions on
, vol.127
, Issue.1
, pp. 99
-
-
Christen, P.1
-
6
-
-
65449178105
-
An open source data cleaning, deduplication and record linkage system with a graphical user interface
-
New York, NY, USA, ACM. ISBN 978-1-60558-193-4. doi, URL http://doi.acm.org/10.1145/1401890.1402020
-
Christen, P.F. (2008) 'An open source data cleaning, deduplication and record linkage system with a graphical user interface', In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '08, New York, NY, USA, ACM. ISBN 978-1-60558-193-4. doi: http://doi.acm.org/10.1145/1401890.1402020, URL http://doi.acm.org/10.1145/1401890.1402020, pp.1065-1068.
-
(2008)
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '08
, pp. 1065-1068
-
-
Christen, P.F.1
-
7
-
-
67650700151
-
Accurate synthetic generation of realistic personal information, in Thanaruk Theeramunkong
-
(Eds.): Kijsirikul, B., Cercone, N. and Ho, T-B 10.1007/978-3-642-01307-2.47 Springer Berlin, Heidelberg, ISBN 978-3-642-01306-5. URL
-
Christen, P. and Pudjijono, A. (2009) 'Accurate synthetic generation of realistic personal information', In Thanaruk Theeramunkong, (Eds.): Kijsirikul, B., Cercone, N. and Ho, T-B., Advances in Knowledge Discovery and Data Mining, of Lecture Notes in Computer Science, Springer Berlin, Heidelberg, ISBN 978-3-642-01306-5. URL http://dx.doi.org/10.1007/978-3-642-01307-2-47. 10.1007/978-3-642-01307-2.47, Vol. 5476 pp.507-514.
-
(2009)
Advances in Knowledge Discovery and Data Mining, of Lecture Notes in Computer Science
, vol.5476
, pp. 507-514
-
-
Christen, P.1
Pudjijono, A.2
-
8
-
-
84868243350
-
-
The Australian Data Mining Workshop, November
-
Christen, T., Churches, P. and Zhu, J.X. (2002) 'Probabilistic name and address cleaning and standardization', The Australian Data Mining Workshop, November. http://datamining.anu.edu.au/projects/linkage-publications.html.
-
(2002)
Probabilistic Name and Address Cleaning and Standardization
-
-
Christen, T.1
Churches, P.2
Zhu, J.X.3
-
9
-
-
84884417241
-
Preparation of name and address data for record linkage using hidden Markov models
-
Online journal
-
Churches, T., Christen, P., Lu, J. and Zhu, J.X. (2002) 'Preparation of name and address data for record linkage using hidden Markov models', Bio-Med Central Medical Informatics and Decision Making, Vol. 2, No. 9, Online journal.
-
(2002)
Bio-Med Central Medical Informatics and Decision Making
, vol.2
, Issue.9
-
-
Churches, T.1
Christen, P.2
Lu, J.3
Zhu, J.X.4
-
10
-
-
11144240583
-
A comparison of string distance metrics for name-matching tasks
-
Online proceedings
-
Cohen, W.W., Ravikumar, P. and Fienberg, S.E. (2003) 'A comparison of string distance metrics for name-matching tasks', International Joint Conference on Artificial Intelligence, Proceedings of the Workshop on Information Integration on the Web, August. pp. 73-78. Online proceedings, http://www.isi.edu/info-agents/workshops/ijcai03/proceedings.htm.
-
(2003)
International Joint Conference on Artificial Intelligence, Proceedings of the Workshop on Information Integration on the Web, August.
, pp. 73-78
-
-
Cohen, W.W.1
Ravikumar, P.2
Fienberg, S.E.3
-
11
-
-
0242540438
-
Learning to match and cluster large high-dimensional data sets for data integration
-
ACM ISBN 1-58113-567-X. doi New York, USA. URL http://doi.acm.org/10.1145/775047.775116
-
Cohen, W.W. and Richman, J. (2002) 'Learning to match and cluster large high-dimensional data sets for data integration', in Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '02, ACM ISBN 1-58113-567-X. doi: http://doi.acm.org/10.1145/775047.775116. URL http://doi.acm.org/10.1145/775047.775116, New York, USA, pp.475-480.
-
(2002)
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '02
, pp. 475-480
-
-
Cohen, W.W.1
Richman, J.2
-
12
-
-
3042649466
-
From authority control to informed retrieval: Framing the expanded domain of subject access
-
Dalrymple, P.W. and Young, J.A. (1991) 'From authority control to informed retrieval: Framing the expanded domain of subject access', College and Research Libraries, doi: http://hdl.handle.net/1860/3173, Vol. 52, pp.139-149.
-
(1991)
College and Research Libraries
, vol.52
, pp. 139-149
-
-
Dalrymple, P.W.1
Young, J.A.2
-
13
-
-
33845667955
-
Duplicate record detection: A survey
-
DOI 10.1109/TKDE.2007.250581
-
Elmagarmid, A.K., Ipeirotis, P.G. and Verykios, V.S. (2007) 'Duplicate record detection: a survey', Knowledge and Data Engineering, IEEE Transactions on, Jan., ISSN 1041-4347, doi: 10.1109/TKDE.2007.250581, URL http://www.cs.purdue.edu/homes/ake/pub/-survey2.pdf, Vol. 19, No. 1, pp.1-16. (Pubitemid 44955773)
-
(2007)
IEEE Transactions on Knowledge and Data Engineering
, vol.19
, Issue.1
, pp. 1-16
-
-
Elmagarmid, A.K.1
Ipeirotis, P.G.2
Verykios, V.S.3
-
14
-
-
74049107745
-
Record linkage performance for large data sets
-
ACM. ISBN 978-1-60558-804-9. doi URL http://doi.acm.org/10.1145/1651449.1651453 New York, NY, USA
-
Gomez-Bao, J., Larriba-Pey, J-L. and Puig, J.R. (2009) 'Record linkage performance for large data sets', In Proceedings of the ACM first international workshop on Privacy and anonymity for very large databases, PAVLAD '09, ACM. ISBN 978-1-60558-804-9. doi: http://doi.acm.org/10.1145/1651449.1651453, URL http://doi.acm.org/10.1145/1651449.1651453, New York, NY, USA, pp.09-16.
-
(2009)
Proceedings of the ACM First International Workshop on Privacy and Anonymity for Very Large Databases, PAVLAD '09
, pp. 09-16
-
-
Gomez-Bao, J.1
Larriba-Pey, J.-L.2
Puig, J.R.3
-
15
-
-
44649164477
-
Detecting near-duplicates in large-scale short text databases
-
DOI 10.1007/978-3-540-68125-0-87, Advances in Knowledge Discovery and Data Mining - 12th Pacific-Asia Conference, PAKDD 2008, Proceedings LNAI
-
Gong, C., Huang, Y., Cheng, X. and Bai, S. (2008) 'Detecting near-duplicates in large-scale short text databases', in (Eds.): Washio, T., Suzuki, E., Ting, K. and Inokuchi, A., Advances in Knowledge Discovery and Data Mining, of Lecture Notes in Computer Science, Springer Berlin Heidelberg, URL http://dx.doi.org/10.1007/978-3-540-68125-0-87, Vol.5012, pp.877-883. (Pubitemid 351776381)
-
(2008)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
, vol.5012
, pp. 877-883
-
-
Gong, C.1
Huang, Y.2
Cheng, X.3
Bai, S.4
-
16
-
-
84905814283
-
Authority control in the context of bibliographic control in the electronic environment
-
Gorman, M. (2003) 'Authority control in the context of bibliographic control in the electronic environment', International Conference Authority Control: Definition and International Experiences, Florence, February 10-12, doi: http://hdl.handle.net/10760/4164. Vol. 38, No. 3-4, pp.11-22. (Pubitemid 40055125)
-
(2004)
Cataloging and Classification Quarterly
, vol.38
, Issue.3-4
, pp. 11-22
-
-
Gorman, M.1
-
17
-
-
84880467474
-
Text joins in an RDBMS for web data integration
-
ACM. ISBN 1-58113680-3. doi URL http://doi.acm.org/10.1145/775152.775166 New York, NY, USA
-
Gravano, L., Panagiotis, G., Koudas, I.N. and Srivastava, D. (2003) 'Text joins in an RDBMS for web data integration', In Proceedings of the 12th international conference on World Wide Web, WWW '03, ACM. ISBN 1-58113680-3. doi: http://doi.acm.org/10.1145/775152.775166, URL http://doi.acm.org/10.1145/775152.775166, New York, NY, USA, pp.90-101.
-
(2003)
Proceedings of the 12th International Conference on World Wide Web, WWW '03
, pp. 90-101
-
-
Gravano, L.1
Panagiotis, G.2
Koudas, I.N.3
Srivastava, D.4
-
18
-
-
84976856849
-
-
SIGMOD Rec., May, ISSN 0163-5808, doi URL http://doi.acm.org/10.1145/568271.223807
-
Hernandez, M.A. and Stolfo, S.J. (1995) 'The merge/purge problem for large databases', SIGMOD Rec., May, ISSN 0163-5808, doi: http://doi.acm.org/10.1145/568271.223807, URL http://doi.acm.org/10.1145/568271.223807, Vol. 24, pp.127-138.
-
(1995)
The Merge/purge Problem for Large Databases
, vol.24
, pp. 127-138
-
-
Hernandez, M.A.1
Stolfo, S.J.2
-
19
-
-
84950419860
-
Advances in record-linkage methodology as applied to matching the 1985 census of Tampa
-
June, URL published by American Statistical Association, Florida
-
Jaro, M.A. (1989) 'Advances in record-linkage methodology as applied to matching the 1985 census of Tampa', Journal of the American Statistical Association, June, URL http://www.jstor.org/stable/2289924, published by American Statistical Association, Florida, Vol. 84, No. 406, pp.414-420.
-
(1989)
Journal of the American Statistical Association
, vol.84
, Issue.406
, pp. 414-420
-
-
Jaro, M.A.1
-
20
-
-
56749169489
-
Cragan and adolfo correa, fine-grained record integration and linkage tool
-
ISSN 1542-0760 10.1002/bdra.20521 URL
-
Jurczyk, P., James, J., Lu, Li. and Janet, D.X. (2008) 'Cragan and adolfo correa, fine-grained record integration and linkage tool', Birth Defects Research Part A: Clinical and Molecular Teratology, ISSN 1542-0760, doi: 10.1002/bdra.20521. URL http://dx.doi.org/10.1002/bdra.20521, Vol. 82, No. 11, pp.822-829.
-
(2008)
Birth Defects Research Part A: Clinical and Molecular Teratology
, vol.82
, Issue.11
, pp. 822-829
-
-
Jurczyk, P.1
James, J.2
Li, L.3
Janet, D.X.4
-
21
-
-
33745266392
-
Domain-independent data cleaning via analysis of entity-relationship graph
-
DOI 10.1145/1138394.1138401
-
Kalashnikov, D.V. and Mehrotra, S. (2006) 'Domain-independent data cleaning via analysis of entity-relationship graph', ACM Transactions on Database Systems (TODS), Vol. 31, No. 2, pp.716-767. (Pubitemid 43924953)
-
(2006)
ACM Transactions on Database Systems
, vol.31
, Issue.2
, pp. 716-767
-
-
Kalashnikov, D.V.1
Mehrotra, S.2
-
22
-
-
47649126673
-
Interactive entity resolution in relational data: A visual analytic tool and its evaluation
-
Kang, H., Getoor, L., Shneiderman, B., Bilgic, M. and Licamele, L. (2008) 'Interactive entity resolution in relational data: A visual analytic tool and its evaluation', IEEE Transactions on Visualization and Computer Graphics, Vol. 14, No. 5, pp.999-1014.
-
(2008)
IEEE Transactions on Visualization and Computer Graphics
, vol.14
, Issue.5
, pp. 999-1014
-
-
Kang, H.1
Getoor, L.2
Shneiderman, B.3
Bilgic, M.4
Licamele, L.5
-
23
-
-
72649095071
-
Frameworks for entity matching: A comparison
-
ISSN 0169-023X doi: 10.1016/j.datak.2009.10.003. URL
-
Koepcke, H. and Rahm, E. (2010) 'Frameworks for entity matching: a comparison', Data & Knowledge Engineering, ISSN 0169-023X. doi: 10.1016/j.datak.2009.10.003. URL http://www.sciencedirect.com/science/article/pii/S0169023X09001451, Vol. 69, No. 2, pp.197-210.
-
(2010)
Data & Knowledge Engineering
, vol.69
, Issue.2
, pp. 197-210
-
-
Koepcke, H.1
Rahm, E.2
-
24
-
-
84882976599
-
Training selection for tuning entity matching
-
Koepcke, H. and Rahm, E. (2008) 'Training selection for tuning entity matching', In QDB/MUD, pp.3-12.
-
(2008)
QDB/MUD
, pp. 3-12
-
-
Koepcke, H.1
Rahm, E.2
-
25
-
-
0035786730
-
The open archives initiative: Building a low-barrier interoperability framework
-
ACM Press, ISBN 1-58113-345-6. doi
-
Lagoze, C. and Van de Sompel, H. (2001) 'The open archives initiative: building a low-barrier interoperability framework', in Proceedings of the first ACM/IEEE-CS Joint Conference on Digital Libraries, ACM Press, ISBN 1-58113-345-6. doi: http://doi.acm.org/10.1145/379437.379449, pp.54-62.
-
(2001)
Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries
, pp. 54-62
-
-
Lagoze, C.1
Van De Sompel, H.2
-
27
-
-
77955933052
-
Cassandra: A decentralized structured storage system
-
April, ISSN 0163-5980. doi: http://doi.acm.org/10.1145/1773912.1773922. URL http://doi.acm.org/10.1145/1773912.1773922
-
Lakshman, A. and Malik, P. (2010) 'Cassandra: a decentralized structured storage system', SIGOPS Oper. Syst. Rev., April, ISSN 0163-5980. doi: http://doi.acm.org/10.1145/1773912.1773922. URL http://doi.acm.org/10.1145/1773912.1773922, Vol. 44, pp.35-40.
-
(2010)
SIGOPS Oper. Syst. Rev
, vol.44
, pp. 35-40
-
-
Lakshman, A.1
Malik, P.2
-
28
-
-
14744286115
-
The making of the open archives initiative protocol for metadata harvesting
-
Lagoze, C. and Van de Sompel, H. (2003) 'The making of the open archives initiative protocol for metadata harvesting', Library Hi Tech, Vol. 21, No. 2, pp.118-128.
-
(2003)
Library Hi Tech
, vol.21
, Issue.2
, pp. 118-128
-
-
Lagoze, C.1
Van De Sompel, H.2
-
29
-
-
80054062837
-
PACE: A general-purpose tool for authority control
-
(Ed.): Garcia-Barriocanal, E., Cebeci, Z., Okur, M.C. and Ozturk, A 10.1007/978-3-642-24731-6, 8 Springer Berlin, Heidelberg, ISBN 978-3-642-24731-6, URL
-
Manghi, P. and Mikulicic, M. (2011) 'PACE: a general-purpose tool for authority control', in (Ed.): Garcia-Barriocanal, E., Cebeci, Z., Okur, M.C. and Ozturk, A., Metadata and Semantic Research, of Communications in Computer and Information Science, Springer Berlin, Heidelberg, ISBN 978-3-642-24731-6, URL http://dx.doi.org/10.1007/978-3-642-24731-6-8, 10.1007/978-3-642-24731-6, 8, Vol. 240, pp.80-92.
-
(2011)
Metadata and Semantic Research, of Communications in Computer and Information Science
, vol.240
, pp. 80-92
-
-
Manghi, P.1
Mikulicic, M.2
-
30
-
-
35348911985
-
Detecting near-duplicates for web crawling
-
May, Banff, Alberta, Canada
-
Manku, G.S., Jain, A. and Sarma, A.D. (2007) 'Detecting near-duplicates for web crawling', In 16th International World Wide Web Conference, May, Banff, Alberta, Canada.
-
(2007)
16th International World Wide Web Conference
-
-
Manku, G.S.1
Jain, A.2
Sarma, A.D.3
-
31
-
-
70449112601
-
Virtual international authority file: linking the Deutsche Nationalbibliothek and Library of Congress name authority files
-
Rick, B., Hengel-Dittrich, C., ONeill, E.T. and Viaf, T.B. (2007) 'Virtual international authority file: linking the Deutsche Nationalbibliothek and Library of Congress name authority files', in International Cataloging and Bibliographic Control, Vol. 1, No. 36, pp.12-19.
-
(2007)
International Cataloging and Bibliographic Control
, vol.1
, Issue.36
, pp. 12-19
-
-
Rick, B.1
Hengel-Dittrich, C.2
Oneill, E.T.3
Viaf, T.B.4
-
32
-
-
77954714476
-
Detecting duplicate biological entities using shortest path edit distance
-
doi: 10.1504/IJDMB.2010.034196. URL http://inderscience.metapress.com/content/-TQ3737625VK1R573
-
Rudniy, A., Song, M. and Geller, J. (2010) 'Detecting duplicate biological entities using shortest path edit distance', International Journal of Data Mining and Bioinformatics, doi: 10.1504/IJDMB.2010.034196. URL http://inderscience.metapress.com/content/-TQ3737625VK1R573, Vol. 4, No. 4, pp.395-410.
-
(2010)
International Journal of Data Mining and Bioinformatics
, vol.4
, Issue.4
, pp. 395-410
-
-
Rudniy, A.1
Song, M.2
Geller, J.3
-
33
-
-
0035545848
-
Learning object identification rules for information integration
-
DOI 10.1016/S0306-4379(01)00042-4, Data Extraction, Cleaning and Reconciliation
-
Tejada, S., Knoblock, C. and Minton, S. (2001) 'Learning object identification rules for information integration', Information Systems, Vol. 26, No. 8, pp.607-633. (Pubitemid 33046273)
-
(2001)
Information Systems
, vol.26
, Issue.8
, pp. 607-633
-
-
Tejada, S.1
Knoblock, C.A.2
Minton, S.3
-
34
-
-
84868232618
-
Authority control: State of the art and new perspectives
-
Florence, Italy
-
Tillett, B.T. (2003) 'Authority control: state of the art and new perspectives', in Authority Control International Conference, Florence, Italy.
-
(2003)
Authority Control International Conference
-
-
Tillett, B.T.1
-
35
-
-
84868228418
-
-
arXiv.org:cs/0502028
-
Van de Sompel, H., Bekaert, J., Liu, X., Balakireva, L. and Schwander, T. (2005) aDORe: a modular, standards-based Digital Object Repository, URL http://www.citebase.org/abstract?id=oai:-arXiv.org:cs/0502028.
-
(2005)
aDORe: A Modular, Standards-based Digital Object Repository
-
-
Van De Sompel, H.1
Bekaert, J.2
Liu, X.3
Balakireva, L.4
Schwander, T.5
-
36
-
-
84868226364
-
Linking medical records: A machine learning approach
-
doi: 10.1504/IJCENT.2010.03836. URL
-
Wang, X. and Alexander, S.M. (2010) 'Linking medical records: a machine learning approach', International Journal of Collaborative Enterprise, doi: 10.1504/IJCENT.2010.03836. URL http://inderscience.metapress.com/content/-T93552025P150H36, Vol. 1, No. 3, pp.394-406.
-
(2010)
International Journal of Collaborative Enterprise
, vol.1
, Issue.3
, pp. 394-406
-
-
Wang, X.1
Alexander, S.M.2