2008, Pages 151-159

Automatic record linkage using seeded nearest neighbour and support vector machine classification

Author keywords

Data linkage; Data matching; Deduplication; Entity resolution; Nearest neighbour; Support vector machine

Indexed keywords

DATA LINKAGE; DATA MATCHING; DEDUPLICATION; ENTITY RESOLUTION; NEAREST NEIGHBOUR

EID: 65449139594     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1401890.1401913     Document Type: Conference Paper
Times cited: 148

References (25)
  • 3
    • 77952372966
    • Adaptive duplicate detection using learnable string similarity measures
    • Washington DC
    • M. Bilenko and R. J. Mooney. Adaptive duplicate detection using learnable string similarity measures. In ACM KDD'03, pages 39-48, Washington DC, 2003.
    • (2003) ACM KDD'03 , pp. 39-48
    • Bilenko, M.1    Mooney, R.J.2
  • 4
    • 0003710380
    • Department of Computer Science, National Taiwan University
    • C.-C. Chang and C-J. Lin. LIBSVM: A library for support vector machines. Manual, Department of Computer Science, National Taiwan University, 2001. Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm.
    • (2001) Manual, LIBSVM: A library for support vector machines
    • Chang, C.-C.1    Lin, C.-J.2
  • 5
    • 26444478506
    • Probabilistic data generation for deduplication and data linkage
    • IDEAL'05, Brisbane
    • P. Christen. Probabilistic data generation for deduplication and data linkage. In IDEAL'05, Springer LNCS 3578, pages 109-116, Brisbane, 2005.
    • (2005) Springer LNCS , vol.3578 , pp. 109-116
    • Christen, P.1
  • 6
    • 44649135932
    • A two-step classification approach to unsupervised record linkage
    • Gold Coast, Australia
    • P. Christen. A two-step classification approach to unsupervised record linkage. In AusDM'07, CRPIT vol. 70, pages 111-119, Gold Coast, Australia, 2007.
    • (2007) AusDM'07, CRPIT , vol.70 , pp. 111-119
    • Christen, P.1
  • 7
    • 44649093306
    • Automatic training example selection for scalable unsupervised record linkage
    • PAKDD'08, Springer, Osaka
    • P. Christen. Automatic training example selection for scalable unsupervised record linkage. In PAKDD'08, Springer LNAI 5012, pages 511-518, Osaka, 2008.
    • (2008) LNAI , vol.5012 , pp. 511-518
    • Christen, P.1
  • 8
    • 67649649496
    • Febrl - A freely available record linkage system with a graphical user interface
    • Wollongong, Australia
    • P. Christen. Febrl - A freely available record linkage system with a graphical user interface. In HDKM'08, CRPIT vol. 80, Wollongong, Australia, 2008.
    • (2008) HDKM'08, CRPIT , vol.80
    • Christen, P.1
  • 9
    • 33846428121
    • P. Christen and K. Goiser. Quality and complexity measures for data linkage and deduplication. In F. Guillet and H. Hamilton, editors, Quality Measures in Data Mining, volume 43 of Studies in Computational Intelligence. Springer, 2007.
  • 12
    • 0242540438
    • Learning to match and cluster large high-dimensional data sets for data integration
    • Edmonton
    • W. Cohen and J. Richman. Learning to match and cluster large high-dimensional data sets for data integration. In ACM KDD'02, pages 475-480, Edmonton, 2002.
    • (2002) ACM KDD'02 , pp. 475-480
    • Cohen, W.1    Richman, J.2
  • 13
    • 0036203458
    • TAILOR: A record linkage toolbox
    • San Jose
    • M. Elfeky, V. Verykios, and A. Elmagarmid. TAILOR: A record linkage toolbox. In ICDE'02, pages 17-28, San Jose, 2002.
    • (2002) ICDE'02 , pp. 17-28
    • Elfeky, M.1    Verykios, V.2    Elmagarmid, A.3
  • 16
    • 65449179112
    • Towards automated record linkage
    • Sydney
    • K. Goiser and P. Christen. Towards automated record linkage. In AusDM'06, CRPIT vol. 61, pages 23-31, Sydney, 2006.
    • (2006) AusDM'06, CRPIT , vol.61 , pp. 23-31
    • Goiser, K.1    Christen, P.2
  • 18
    • 45849148052
    • Effective counterterrorism and the limited role of predictive data mining
    • J. Jonas and J. Harper. Effective counterterrorism and the limited role of predictive data mining. Policy Analysis, (584), 2006.
    • (2006) Policy Analysis , vol.584
    • Jonas, J.1    Harper, J.2
  • 19
    • 0037867900
    • Two approaches to handling noisy variation in text mining
    • Sydney
    • U. Y. Nahm, M. Bilenko, and R. J. Mooney. Two approaches to handling noisy variation in text mining. In TextML'02, pages 18-27, Sydney, 2002.
    • (2002) TextML'02 , pp. 18-27
    • Nahm, U.Y.1    Bilenko, M.2    Mooney, R.J.3
  • 20
    • 15744370005
    • Efficient nearest neighbor classification with data reduction and fast search algorithms
    • IEEE International Conference on Systems, Man and Cybernetics
    • J. S. Sanchez, J. M. Sotoca, and F. Pla. Efficient nearest neighbor classification with data reduction and fast search algorithms. In IEEE International Conference on Systems, Man and Cybernetics, volume 5, pages 4757-4762, 2004.
    • (2004) IEEE International Conference on Systems, Man and Cybernetics , vol.5 , pp. 4757-4762
    • Sanchez, J.S.1    Sotoca, J.M.2    Pla, F.3
  • 21
    • 0242456811
    • Interactive deduplication using active learning
    • Edmonton
    • S. Sarawagi and A. Bhamidipaty. Interactive deduplication using active learning. In ACM KDD'02, pages 269-278, Edmonton, 2002.
    • (2002) ACM KDD'02 , pp. 269-278
    • Sarawagi, S.1    Bhamidipaty, A.2
  • 22
    • 0242456803
    • Learning domain-independent string transformation weights for high accuracy object identification
    • Edmonton
    • S. Tejada, C. Knoblock, and S. Minton. Learning domain-independent string transformation weights for high accuracy object identification. In ACM KDD'02, pages 350-359, Edmonton, 2002.
    • (2002) ACM KDD'02 , pp. 350-359
    • Tejada, S.1    Knoblock, C.2    Minton, S.3
  • 23
    • 33846411033
    • Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage
    • Technical Report RR2000/05, US Bureau of the Census
    • W. E. Winkler. Using the EM algorithm for weight computation in the Fellegi-Sunter model of record linkage. Technical Report RR2000/05, US Bureau of the Census, 2000.
    • (2000)
    • Winkler, W.E.1
  • 24
    • 2942709772
    • Methods for evaluating and creating data quality
    • W. E. Winkler. Methods for evaluating and creating data quality. Elsevier Information Systems, 29(7):531-550, 2004.
    • (2004) Elsevier Information Systems , vol.29 , Issue.7 , pp. 531-550
    • Winkler, W.E.1
  • 25
    • 18744413274
    • Text classification from positive and unlabeled documents
    • New Orleans
    • H. Yu, C. X. Zhai, and J. Han. Text classification from positive and unlabeled documents. In CIKM'03, pages 232-239, New Orleans, 2003.
    • (2003) CIKM'03 , pp. 232-239
    • Yu, H.1    Zhai, C.X.2    Han, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.