



2013, Pages 295-305

Optimal hashing schemes for entity matching

Author keywords

Blocking; Entity matching; Hashing

Indexed keywords

APPROXIMATION ALGORITHMS; BOOLEAN FUNCTIONS; SIGNAL THEORY; WORLD WIDE WEB

EID: 84893085168     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2488388.2488415     Document Type: Conference Paper
Times cited: 26

References (32)
  • 1
    • 85104914015
    • Efficient exact set-similarity joins
    • Arvind Arasu, Venkatesh Ganti, and Raghav Kaushik. Efficient exact set-similarity joins. In VLDB, pages 918-929, 2006.
    • (2006) VLDB , pp. 918-929
    • Arasu, A.1    Ganti, V.2    Kaushik, R.3
  • 2
    • 77954717287
    • On active learning of record matching packages
    • Arvind Arasu, Michaela Götz, and Raghav Kaushik. On active learning of record matching packages. In SIGMOD Conference, pages 783-794, 2010.
    • (2010) SIGMOD Conference , pp. 783-794
    • Arasu, A.1    Götz, M.2    Kaushik, R.3
  • 3
    • 29844440778
    • Towards a robust query optimizer: A principled and practical approach
    • Brian Babcock and Surajit Chaudhuri. Towards a robust query optimizer: a principled and practical approach. In SIGMOD, pages 119-130, 2005.
    • (2005) SIGMOD , pp. 119-130
    • Babcock, B.1    Chaudhuri, S.2
  • 4
    • 0034837020
    • Sampling algorithms: Lower bounds and applications
    • Ziv Bar-Yossef, Ravi Kumar, and D. Sivakumar. Sampling algorithms: lower bounds and applications. In STOC, pages 266-275, 2001.
    • (2001) STOC , pp. 266-275
    • Bar-Yossef, Z.1    Kumar, R.2    Sivakumar, D.3
  • 7
    • 84871099550
    • Adaptive blocking: Learning to scale up record linkage and clustering
    • Mikhail Bilenko, Beena Kamath, and Raymond J. Mooney. Adaptive blocking: Learning to scale up record linkage and clustering. In ICDM, 2006.
    • (2006) ICDM
    • Bilenko, M.1    Kamath, B.2    Mooney, R.J.3
  • 8
    • 33749597967
    • A primitive operator for similarity joins in data cleaning
    • Surajit Chaudhuri, Venkatesh Ganti, and Raghav Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
    • (2006) ICDE
    • Chaudhuri, S.1    Ganti, V.2    Kaushik, R.3
  • 9
    • 0000301097
    • A greedy heuristic for the set-covering problem
    • V. Chvatal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4:233-235, Aug 1979.
    • (1979) Mathematics of Operations Research , vol.4 , pp. 233-235
    • Chvatal, V.1
  • 12
    • 29844452555
    • Reference reconciliation in complex information spaces
    • Xin Dong, Alon Halevy, and Jayant Madhavan. Reference reconciliation in complex information spaces. In SIGMOD, pages 85-96, 2005.
    • (2005) SIGMOD , pp. 85-96
    • Dong, X.1    Halevy, A.2    Madhavan, J.3
  • 14
    • 85027780074
    • Creating probabilistic databases from information extraction models
    • Rahul Gupta and Sunita Sarawagi. Creating probabilistic databases from information extraction models. In VLDB, pages 965-976, 2006.
    • (2006) VLDB , pp. 965-976
    • Gupta, R.1    Sarawagi, S.2
  • 15
    • 65449122439
    • Unsupervised deduplication using cross-field dependencies
    • Rob Hall, Charles Sutton, and Andrew McCallum. Unsupervised deduplication using cross-field dependencies. In KDD, pages 310-317, 2008.
    • (2008) KDD , pp. 310-317
    • Hall, R.1    Sutton, C.2    McCallum, A.3
  • 16
    • 77956220112
    • Scalable similarity search with optimized kernel hashing
    • Junfeng He, Wei Liu, and Shih-Fu Chang. Scalable similarity search with optimized kernel hashing. In KDD, pages 1129-1138, 2010.
    • (2010) KDD , pp. 1129-1138
    • He, J.1    Liu, W.2    Chang, S.-F.3
  • 17
    • 0013331361
    • Real-world data is dirty: Data cleansing and the merge/purge problem
    • Mauricio A. Hernández and Salvatore J. Stolfo. Real-world data is dirty: Data cleansing and the merge/purge problem. Data Min. Knowl. Discov., 2:9-37, January 1998.
    • (1998) Data Min. Knowl. Discov. , vol.2 , pp. 9-37
    • Hernández, M.A.1    Stolfo, S.J.2
  • 18
    • 0036083445
    • A data complexity analysis of comparative advantages of decision forest constructors
    • Tin Kam Ho. A data complexity analysis of comparative advantages of decision forest constructors. Pattern Anal. Appl., pages 102-112, 2002.
    • (2002) Pattern Anal. Appl. , pp. 102-112
    • Ho, T.K.1
  • 21
    • 84950419860
    • Advances in record-linkage methodology as applied to matching the 1985 Census of Tampa, Florida
    • Matthew A. Jaro. Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida. Journal of the American Statistical Association, 84(406):414-420, 1989.
    • (1989) Journal of the American Statistical Association , vol.84 , Issue.406 , pp. 414-420
    • Jaro, M.A.1
  • 23
    • 0034592784
    • Efficient clustering of high-dimensional data sets with application to reference matching
    • Andrew McCallum, Kamal Nigam, and Lyle H. Ungar. Efficient clustering of high-dimensional data sets with application to reference matching. In Knowledge Discovery and Data Mining, pages 169-178, 2000.
    • (2000) Knowledge Discovery and Data Mining , pp. 169-178
    • McCallum, A.1    Nigam, K.2    Ungar, L.H.3
  • 25
    • 33750728911
    • Learning blocking schemes for record linkage
    • Matthew Michelson and Craig A. Knoblock. Learning blocking schemes for record linkage. In AAAI, pages 440-445, 2006.
    • (2006) AAAI , pp. 440-445
    • Michelson, M.1    Knoblock, C.A.2
  • 26
    • 0345566149
    • A guided tour to approximate string matching
    • Gonzalo Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001.
    • (2001) ACM Comput. Surv. , vol.33 , Issue.1 , pp. 31-88
    • Navarro, G.1
  • 30
    • 84891103393
    • An automatic blocking mechanism for large-scale de-duplication tasks
    • Anish Das Sarma, Ankur Jain, Ashwin Machanavajjhala, and Philip Bohannon. An automatic blocking mechanism for large-scale de-duplication tasks. In CIKM, 2012.
    • (2012) CIKM
    • Sarma, A.D.1    Jain, A.2    Machanavajjhala, A.3    Bohannon, P.4
  • 31
    • 84878044770
    • Entity resolution with Markov logic
    • Parag Singla and Pedro Domingos. Entity resolution with Markov logic. In ICDM, pages 572-582, 2006.
    • (2006) ICDM , pp. 572-582
    • Singla, P.1    Domingos, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.