Volume 2015-August, 2015, Pages 279-288

A clustering-based framework to control block sizes for entity resolution

Author keywords

Blocking; Data cleaning; Indexing; Record linkage

Indexed keywords

CLUSTERING ALGORITHMS; DATA HANDLING; DATA MINING; DATA PRIVACY; ECONOMIC AND SOCIAL EFFECTS; INDEXING (OF INFORMATION); VIRTUAL REALITY

EID: 84954139187     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2783258.2783396     Document Type: Conference Paper
Times cited : (56)

References (25)
  • 1
    • 34248229658
    • Collective entity resolution in relational data
    • I. Bhattacharya and L. Getoor. Collective entity resolution in relational data. ACM TKDD, 1(1), 2007.
    • (2007) ACM TKDD , vol.1 , Issue.1
    • Bhattacharya, I.1    Getoor, L.2
  • 3
    • 84920595044
    • A survey of indexing techniques for scalable record linkage and deduplication
    • P. Christen. A survey of indexing techniques for scalable record linkage and deduplication. IEEE TKDE, 24(9), 2012.
    • (2012) IEEE TKDE , vol.24 , Issue.9
    • Christen, P.1
  • 4
    • 84889610061
    • Preparation of a real temporal voter data set for record linkage and duplicate detection research
    • P. Christen. Preparation of a real temporal voter data set for record linkage and duplicate detection research. Technical report, Australian National University, 2014.
    • (2014) Technical Report, Australian National University
    • Christen, P.1
  • 5
    • 84871075183
    • An automatic blocking mechanism for large-scale de-duplication tasks
    • A. Das Sarma, A. Jain, A. Machanavajjhala, and P. Bohannon. An automatic blocking mechanism for large-scale de-duplication tasks. In ACM CIKM, pages 1055-1064, 2012.
    • (2012) ACM CIKM , pp. 1055-1064
    • Das Sarma, A.1    Jain, A.2    Machanavajjhala, A.3    Bohannon, P.4
  • 6
    • 29844452555
    • Reference reconciliation in complex information spaces
    • X. Dong, A. Halevy, and J. Madhavan. Reference reconciliation in complex information spaces. In ACM SIGMOD, pages 85-96, 2005.
    • (2005) ACM SIGMOD , pp. 85-96
    • Dong, X.1    Halevy, A.2    Madhavan, J.3
  • 8
    • 84947399464
    • A theory for record linkage
    • I. P. Fellegi and A. B. Sunter. A theory for record linkage. JASA, 64(328):1183-1210, 1969.
    • (1969) JASA , vol.64 , Issue.328 , pp. 1183-1210
    • Fellegi, I.P.1    Sunter, A.B.2
  • 9
    • 84901288530
    • A graph matching method for historical census household linkage
    • Z. Fu, P. Christen, and J. Zhou. A graph matching method for historical census household linkage. In PAKDD, Springer LNAI Vol. 8443, pages 485-496. 2014.
    • (2014) PAKDD, Springer LNAI , vol.8443 , pp. 485-496
    • Fu, Z.1    Christen, P.2    Zhou, J.3
  • 10
    • 84921044566
    • Data clustering with cluster size constraints using a modified k-means algorithm
    • N. Ganganath, C.-T. Cheng, and C. Tse. Data clustering with cluster size constraints using a modified k-means algorithm. In CyberC, pages 158-161, Oct 2014.
    • (2014) CyberC , pp. 158-161
    • Ganganath, N.1    Cheng, C.-T.2    Tse, C.3
  • 11
    • 0013331361
    • Real-world data is dirty: Data cleansing and the merge/purge problem
    • M. A. Hernandez and S. J. Stolfo. Real-world data is dirty: Data cleansing and the merge/purge problem. Springer DMKD, 2(1):9-37, 1998.
    • (1998) Springer DMKD , vol.2 , Issue.1 , pp. 9-37
    • Hernandez, M.A.1    Stolfo, S.J.2
  • 12
    • 33745266392
    • Domain-independent data cleaning via analysis of entity-relationship graph
    • D. Kalashnikov and S. Mehrotra. Domain-independent data cleaning via analysis of entity-relationship graph. ACM TODS, 31(2):716-767, 2006.
    • (2006) ACM TODS , vol.31 , Issue.2 , pp. 716-767
    • Kalashnikov, D.1    Mehrotra, S.2
  • 13
    • 84894647271
    • An unsupervised algorithm for learning blocking schemes
    • M. Kejriwal and D. P. Miranker. An unsupervised algorithm for learning blocking schemes. In IEEE ICDM, pages 340-349, 2013.
    • (2013) IEEE ICDM , pp. 340-349
    • Kejriwal, M.1    Miranker, D.P.2
  • 14
    • 63449096255
    • Parallel linkage
    • H. Kim and D. Lee. Parallel linkage. In ACM CIKM, pages 283-292, 2007.
    • (2007) ACM CIKM , pp. 283-292
    • Kim, H.1    Lee, D.2
  • 15
    • 72649095071
    • Frameworks for entity matching: A comparison
    • H. Köpcke and E. Rahm. Frameworks for entity matching: A comparison. Elsevier DKE, 69(2):197-210, 2010.
    • (2010) Elsevier DKE , vol.69 , Issue.2 , pp. 197-210
    • Köpcke, H.1    Rahm, E.2
  • 16
    • 84906314498
    • Balanced k-means for clustering
    • M. Malinen and P. Fränti. Balanced k-means for clustering. In SSSPR, Springer LNCS Vol. 8621, pages 32-41. 2014.
    • (2014) SSSPR, Springer LNCS , vol.8621 , pp. 32-41
    • Malinen, M.1    Fränti, P.2
  • 17
    • 84945552843
    • Unsupervised blocking key selection for real-time entity resolution
    • B. Ramadan and P. Christen. Unsupervised blocking key selection for real-time entity resolution. In PAKDD, Springer LNAI Vol. 9078, pages 574-585, 2015.
    • (2015) PAKDD, Springer LNAI , vol.9078 , pp. 574-585
    • Ramadan, B.1    Christen, P.2
  • 18
    • 84904155169
    • Dynamic sorted neighborhood indexing for real-time entity resolution
    • B. Ramadan, P. Christen, and H. Liang. Dynamic sorted neighborhood indexing for real-time entity resolution. In ADC, Springer LNCS Vol. 8506, pages 1-12. 2014.
    • (2014) ADC, Springer LNCS , vol.8506 , pp. 1-12
    • Ramadan, B.1    Christen, P.2    Liang, H.3
  • 19
    • 84870054852
    • A modification of the k-means method for quasi-unsupervised learning
    • D. Rebollo-Monedero, M. Solé, J. Nin, and J. Forné. A modification of the k-means method for quasi-unsupervised learning. Elsevier KBS, 37(0):176-185, 2013.
    • (2013) Elsevier KBS , vol.37 , pp. 176-185
    • Rebollo-Monedero, D.1    Solé, M.2    Nin, J.3    Forné, J.4
  • 20
    • 84878044770
    • Entity resolution with Markov logic
    • P. Singla and P. Domingos. Entity resolution with Markov logic. In IEEE ICDM, pages 572-582, 2006.
    • (2006) IEEE ICDM , pp. 572-582
    • Singla, P.1    Domingos, P.2
  • 21
    • 84893619583
    • Sorted nearest neighborhood clustering for efficient private blocking
    • D. Vatsalan and P. Christen. Sorted nearest neighborhood clustering for efficient private blocking. In PAKDD, Volume 7819 of LNCS, pages 341-352. Springer, 2013.
    • (2013) PAKDD, Volume 7819 of LNCS , pp. 341-352
    • Vatsalan, D.1    Christen, P.2
  • 22
    • 84889599807
    • A taxonomy of privacy-preserving record linkage techniques
    • D. Vatsalan, P. Christen, and V. S. Verykios. A taxonomy of privacy-preserving record linkage techniques. Elsevier IS, 38(6):946-969, 2013.
    • (2013) Elsevier IS , vol.38 , Issue.6 , pp. 946-969
    • Vatsalan, D.1    Christen, P.2    Verykios, V.S.3
  • 24
    • 16444383160
    • Survey of clustering algorithms
    • R. Xu and D. Wunsch II. Survey of clustering algorithms. IEEE TNN, 16(3):645-678, May 2005.
    • (2005) IEEE TNN , vol.16 , Issue.3 , pp. 645-678
    • Xu, R.1    Wunsch II, D.2
  • 25
    • 77957894729
    • Data clustering with size constraints
    • S. Zhu, D. Wang, and T. Li. Data clustering with size constraints. Elsevier KBS, 23(8):883-889, 2010.
    • (2010) Elsevier KBS , vol.23 , Issue.8 , pp. 883-889
    • Zhu, S.1    Wang, D.2    Li, T.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.