



Volume , Issue , 2005, Pages 37-52

Assessing deduplication and data linkage quality: What to measure?

Author keywords

Data integration and matching; Data mining pre-processing; Data or record linkage; Deduplication; Quality measures

Indexed keywords

DATA PREPARATION; DATABASE RESEARCH; DE DUPLICATIONS; PAIR COMPARISONS; PRE-PROCESSING; QUALITY MEASURES; RECORD LINKAGE; RESEARCH INTERESTS

EID: 84884345501     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 13

References (41)
  • 1
    • 77954319072
    • ACM SIGKDD '03 Workshop on Data Cleaning, Record Linkage, and Object Consolidation, August 27, 2003, Washington, DC
    • Baxter, R., Christen, P. and Churches, T.: A Comparison of Fast Blocking Methods for Record Linkage. ACM SIGKDD '03 Workshop on Data Cleaning, Record Linkage, and Object Consolidation, August 27, 2003, Washington, DC, pp. 25-27.
    • A Comparison of Fast Blocking Methods for Record Linkage , pp. 25-27
    • Baxter, R.1    Christen, P.2    Churches, T.3
  • 4
    • 84884326216
    • On evaluation and training-set construction for duplicate detection
    • Proceedings of the KDD-2003 workshop on data cleaning, record linkage, and object consolidation, Washington DC, August
    • Bilenko, M. and Mooney, R.J.: On evaluation and training-set construction for duplicate detection. Proceedings of the KDD-2003 workshop on data cleaning, record linkage, and object consolidation, Washington DC, August 2003.
    • (2003)
    • Bilenko, M.1    Mooney, R.J.2
  • 5
    • 1142279457
    • Robust and efficient fuzzy match for online data cleaning
    • Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, USA
    • Chaudhuri, S., Ganjam, K., Ganti, V. and Motwani, R.: Robust and efficient fuzzy match for online data cleaning. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, USA, 2003, pp. 313-324.
    • (2003) , pp. 313-324
    • Chaudhuri, S.1    Ganjam, K.2    Ganti, V.3    Motwani, R.4
  • 13
    • 0017918892
    • Foundations of Probabilistic and Utility-Theoretic Indexing
    • Cooper, W.S. and Maron, M.E.: Foundations of Probabilistic and Utility-Theoretic Indexing. Journal of the ACM, vol. 25, no. 1, pp. 67-80, January 1978.
    • (1978) Journal of the ACM , vol.25 , Issue.1 , pp. 67-80
    • Cooper, W.S.1    Maron, M.E.2
  • 14
    • 0036203458
    • TAILOR: A record linkage toolbox
    • Proceedings of the ICDE' 2002, San Jose, USA, March
    • Elfeky, M.G., Verykios, V.S. and Elmagarmid, A.K.: TAILOR: A record linkage toolbox. Proceedings of the ICDE' 2002, San Jose, USA, March 2002.
    • (2002)
    • Elfeky, M.G.1    Verykios, V.S.2    Elmagarmid, A.K.3
  • 15
    • 84884321126
    • ROC Graphs: Notes and Practical Considerations for Researchers, HP Labs Tech Report HPL-2003-4, HP Laboratories, Palo Alto, March
    • Fawcett, T.: ROC Graphs: Notes and Practical Considerations for Researchers, HP Labs Tech Report HPL-2003-4, HP Laboratories, Palo Alto, March 2004.
    • (2004)
    • Fawcett, T.1
  • 18
    • 1642332418
    • Methods for Automatic Record Matching and Linking and their use in National Statistics
    • London
    • Gill, L.: Methods for Automatic Record Matching and Linking and their use in National Statistics. National Statistics Methodology Series No. 25, London, 2001.
    • (2001) National Statistics Methodology Series , Issue.25
    • Gill, L.1
  • 21
    • 84884301739
    • Proceedings of the 3rd Australasian data mining conference, Cairns, December
    • Gu, L. and Baxter, R.: Decision models for record linkage. Proceedings of the 3rd Australasian data mining conference, pp. 241-254, Cairns, December 2004.
    • (2004) Decision models for record linkage , pp. 241-254
    • Gu, L.1    Baxter, R.2
  • 25
    • 84884301913
    • AutoStan and AutoMatch, User's Manuals, MatchWare Technologies
    • AutoStan and AutoMatch, User's Manuals, MatchWare Technologies, 1998.
    • (1998)
  • 26
  • 27
    • 0034592784
    • Efficient clustering of high-dimensional data sets with application to reference matching
    • Boston, August
    • McCallum, A., Nigam, K. and Ungar, L.H.: Efficient clustering of high-dimensional data sets with application to reference matching. Proceedings of the 6th ACM SIGKDD conference, pp. 169-178, Boston, August 2000.
    • (2000) Proceedings of the 6th ACM SIGKDD conference , pp. 169-178
    • McCallum, A.1    Nigam, K.2    Ungar, L.H.3
  • 28
    • 84884336267
    • Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, August
    • Monge, A. and Elkan, C.: The field-matching problem: Algorithm and applications. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, August 1996.
    • (1996) The field-matching problem: Algorithm and applications
    • Monge, A.1    Elkan, C.2
  • 31
    • 33746091742
    • Centre for Epidemiology and Research, NSW Department of Health. New South Wales Mothers and Babies 2001. NSW Public Health Bull
    • Centre for Epidemiology and Research, NSW Department of Health. New South Wales Mothers and Babies 2001. NSW Public Health Bull 2002; 13(4).
    • (2002) , vol.13 , Issue.4
  • 34
    • 84884337510
    • Proceedings of the 20th conference on uncertainty in artificial intelligence, Banff, Canada, July
    • Ravikumar, P. and Cohen, W.W.: A hierarchical graphical model for record linkage. Proceedings of the 20th conference on uncertainty in artificial intelligence, Banff, Canada, July 2004.
    • (2004) A hierarchical graphical model for record linkage
    • Ravikumar, P.1    Cohen, W.W.2
  • 39
    • 33846411033
    • Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage
    • RR 2000-05, US Bureau of the Census
    • Winkler, W.E.: Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage. RR 2000-05, US Bureau of the Census, 2000.
    • (2000)
    • Winkler, W.E.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.