2011, Pages 85-94

Eliminating the redundancy in blocking-based entity resolution methods

Author keywords

data cleaning; entity resolution; redundancy based blocking

Indexed keywords

ABSTRACT LEVELS; BLOCKING METHOD; CITATION MATCHING; COMPUTATIONAL COSTS; DATA CLEANING; ENTITY RESOLUTION; HETEROGENEOUS DATA; NOVEL TECHNIQUES; OPTIMAL SOLUTIONS; REAL WORLD DATA; REAL-WORLD OBJECTS; REDUNDANCY-BASED BLOCKING; RESOLUTION METHODS; SPACE COMPLEXITY; SPACE LIMITATIONS; TIME EFFICIENCIES;

EID: 79960519872     PISSN: 15525996     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1998076.1998093     Document Type: Conference Paper
Times cited: 46

References (29)
  • 1
    • 84878049861
    • Adaptive blocking: Learning to scale up record linkage
    • M. Bilenko, B. Kamath, and R. J. Mooney. Adaptive blocking: Learning to scale up record linkage. In ICDM, 2006.
    • (2006) ICDM
    • Bilenko, M.1    Kamath, B.2    Mooney, R.J.3
  • 2
    • 11144240583
    • A comparison of string distance metrics for name-matching tasks
    • W. W. Cohen, P. D. Ravikumar, and S. E. Fienberg. A comparison of string distance metrics for name-matching tasks. In IIWeb, 2003.
    • (2003) IIWeb
    • Cohen, W.W.1    Ravikumar, P.D.2    Fienberg, S.E.3
  • 3
    • 74549152150
    • Robust record linkage blocking using suffix arrays
    • T. de Vries, H. Ke, S. Chawla, and P. Christen. Robust record linkage blocking using suffix arrays. In CIKM, 2009.
    • (2009) CIKM
    • De Vries, T.1    Ke, H.2    Chawla, S.3    Christen, P.4
  • 4
    • 17244380794
    • Semantic integration research in the database community: A brief survey
    • A. Doan and A. Y. Halevy. Semantic integration research in the database community: A brief survey. AI Magazine, 2005.
    • (2005) AI Magazine
    • Doan, A.1    Halevy, A.Y.2
  • 5
    • 29844452555
    • Reference reconciliation in complex information spaces
    • X. Dong, A. Halevy, and J. Madhavan. Reference reconciliation in complex information spaces. In SIGMOD, 2005.
    • (2005) SIGMOD
    • Dong, X.1    Halevy, A.2    Madhavan, J.3
  • 7
    • 15044355327
    • Similarity search in high dimensions via hashing
    • A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In VLDB, 1999.
    • (1999) VLDB
    • Gionis, A.1    Indyk, P.2    Motwani, R.3
  • 9
    • 4944235920
    • Two supervised learning approaches for name disambiguation in author citations
    • H. Han, C. L. Giles, H. Zha, C. Li, and K. Tsioutsiouliklis. Two supervised learning approaches for name disambiguation in author citations. In JCDL, 2004.
    • (2004) JCDL
    • Han, H.1    Giles, C.L.2    Zha, H.3    Li, C.4    Tsioutsiouliklis, K.5
  • 10
    • 27544488429
    • Name disambiguation in author citations using a k-way spectral clustering method
    • H. Han, H. Zha, and C. L. Giles. Name disambiguation in author citations using a k-way spectral clustering method. In JCDL, 2005.
    • (2005) JCDL
    • Han, H.1    Zha, H.2    Giles, C.L.3
  • 12
    • 33745266392
    • Domain-independent data cleaning via analysis of entity-relationship graph
    • D. V. Kalashnikov and S. Mehrotra. Domain-independent data cleaning via analysis of entity-relationship graph. TODS, 2006.
    • (2006) TODS
    • Kalashnikov, D.V.1    Mehrotra, S.2
  • 13
    • 34250670467
    • Record linkage: Similarity measures and algorithms
    • N. Koudas, S. Sarawagi, and D. Srivastava. Record linkage: similarity measures and algorithms. In SIGMOD, 2006.
    • (2006) SIGMOD
    • Koudas, N.1    Sarawagi, S.2    Srivastava, D.3
  • 14
    • 85019781759
    • Effective and scalable solutions for mixed and split citation problems in digital libraries
    • D. Lee, B.-W. On, J. Kang, and S. Park. Effective and scalable solutions for mixed and split citation problems in digital libraries. In IQIS, 2005.
    • (2005) IQIS
    • Lee, D.1    On, B.-W.2    Kang, J.3    Park, S.4
  • 15
    • 33846320077
    • Supporting efficient record linkage for large data sets using mapping techniques
    • C. Li, L. Jin, and S. Mehrotra. Supporting efficient record linkage for large data sets using mapping techniques. WWW J., 9(4), 2006.
    • (2006) WWW J. , vol.9 , Issue.4
    • Li, C.1    Jin, L.2    Mehrotra, S.3
  • 17
    • 0034592784
    • Efficient clustering of high-dimensional data sets with application to reference matching
    • A. McCallum, K. Nigam, and L. H. Ungar. Efficient clustering of high-dimensional data sets with application to reference matching. In KDD, pages 169-178, 2000.
    • (2000) KDD , pp. 169-178
    • McCallum, A.1    Nigam, K.2    Ungar, L.H.3
  • 18
    • 36348932551
    • Learning blocking schemes for record linkage
    • M. Michelson and C. A. Knoblock. Learning blocking schemes for record linkage. In AAAI, 2006.
    • (2006) AAAI
    • Michelson, M.1    Knoblock, C.A.2
  • 20
    • 27544460727
    • Comparative study of name disambiguation problem using a scalable blocking-based framework
    • B.-W. On, D. Lee, J. Kang, and P. Mitra. Comparative study of name disambiguation problem using a scalable blocking-based framework. In JCDL, 2005.
    • (2005) JCDL
    • On, B.-W.1    Lee, D.2    Kang, J.3    Mitra, P.4
  • 21
    • 79952386495
    • Efficient entity resolution for large heterogeneous information spaces
    • G. Papadakis, E. Ioannou, C. Niederée, and P. Fankhauser. Efficient entity resolution for large heterogeneous information spaces. In WSDM, 2011.
    • (2011) WSDM
    • Papadakis, G.1    Ioannou, E.2    Niederée, C.3    Fankhauser, P.4
  • 23
    • 79960475885
    • XStreamCluster: An efficient algorithm for streaming XML data clustering
    • O. Papapetrou and L. Chen. XStreamCluster: An efficient algorithm for streaming XML data clustering. In DASFAA (to appear), 2011.
    • (2011) DASFAA
    • Papapetrou, O.1    Chen, L.2
  • 24
    • 77952280581
    • HARRA: Fast iterative hashed record linkage for large-scale data collections
    • H.-S. Kim and D. Lee. HARRA: Fast iterative hashed record linkage for large-scale data collections. In EDBT, 2010.
    • (2010) EDBT
    • Kim, H.-S.1    Lee, D.2
  • 25
    • 36348962507
    • Efficient topic-based unsupervised name disambiguation
    • Y. Song, J. Huang, I. G. Councill, J. Li, and C. L. Giles. Efficient topic-based unsupervised name disambiguation. In JCDL, 2007.
    • (2007) JCDL
    • Song, Y.1    Huang, J.2    Councill, I.G.3    Li, J.4    Giles, C.L.5
  • 26
    • 0242456803
    • Learning domain-independent string transformation weights for high accuracy object identification
    • S. Tejada, C. A. Knoblock, and S. Minton. Learning domain-independent string transformation weights for high accuracy object identification. In KDD, 2002.
    • (2002) KDD
    • Tejada, S.1    Knoblock, C.A.2    Minton, S.3
  • 27
    • 70450273106
    • Disambiguating authors in academic publications using random forests
    • P. Treeratpituk and C. L. Giles. Disambiguating authors in academic publications using random forests. In JCDL, 2009.
    • (2009) JCDL
    • Treeratpituk, P.1    Giles, C.L.2
  • 29
    • 36348961379
    • Adaptive sorted neighborhood methods for efficient record linkage
    • S. Yan, D. Lee, M.-Y. Kan, and C. L. Giles. Adaptive sorted neighborhood methods for efficient record linkage. In JCDL, 2007.
    • (2007) JCDL
    • Yan, S.1    Lee, D.2    Kan, M.-Y.3    Giles, C.L.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.