Volume 1, Issue 1, 2008, Pages 933-944

Ed-Join: An efficient algorithm for similarity joins with edit distance constraints

Author keywords

[No Author keywords available]

Indexed keywords

PATTERN RECOGNITION;

EID: 70849105253     PISSN: None     EISSN: 21508097     Source Type: Conference Proceeding    
DOI: 10.14778/1453856.1453957     Document Type: Article
Times cited: 272

References (35)
  • 1
    • 0038754128
    • Lower bounds for embedding edit distance into normed spaces
    • A. Andoni, M. Deza, A. Gupta, P. Indyk, and S. Raskhodnikova. Lower bounds for embedding edit distance into normed spaces. In SODA, pages 523-526, 2003.
    • (2003) SODA , pp. 523-526
    • Andoni, A.1    Deza, M.2    Gupta, A.3    Indyk, P.4    Raskhodnikova, S.5
  • 2
    • 85104914015
    • Efficient exact set-similarity joins
    • A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, 2006.
    • (2006) VLDB
    • Arasu, A.1    Ganti, V.2    Kaushik, R.3
  • 3
    • 35348849154
    • Scaling up all pairs similarity search
    • R. J. Bayardo, Y. Ma, and R. Srikant. Scaling up all pairs similarity search. In WWW, 2007.
    • (2007) WWW
    • Bayardo, R.J.1    Ma, Y.2    Srikant, R.3
  • 5
    • 0034831593
    • Epsilon grid order: An algorithm for the similarity join on massive high-dimensional data
    • C. Böhm, B. Braunmüller, F. Krebs, and H.-P. Kriegel. Epsilon grid order: An algorithm for the similarity join on massive high-dimensional data. In SIGMOD, pages 379-388, 2001.
    • (2001) SIGMOD , pp. 379-388
    • Böhm, C.1    Braunmüller, B.2    Krebs, F.3    Kriegel, H.-P.4
  • 7
    • 45249103790
    • One-gapped q-gram filters for Levenshtein distance
    • S. Burkhardt and J. Kärkkäinen. One-gapped q-gram filters for Levenshtein distance. In CPM, pages 225-234, 2002.
    • (2002) CPM , pp. 225-234
    • Burkhardt, S.1    Kärkkäinen, J.2
  • 9
    • 0036040277
    • Similarity estimation techniques from rounding algorithms
    • M. Charikar. Similarity estimation techniques from rounding algorithms. In STOC, 2002.
    • (2002) STOC
    • Charikar, M.1
  • 10
    • 85011029434
    • Example-driven design of efficient record matching queries
    • S. Chaudhuri, B.-C. Chen, V. Ganti, and R. Kaushik. Example-driven design of efficient record matching queries. In VLDB, pages 327-338, 2007.
    • (2007) VLDB , pp. 327-338
    • Chaudhuri, S.1    Chen, B.-C.2    Ganti, V.3    Kaushik, R.4
  • 11
    • 84859202692
    • Data debugger: An operator-centric approach for data quality solutions
    • S. Chaudhuri, V. Ganti, and R. Kaushik. Data debugger: An operator-centric approach for data quality solutions. IEEE Data Eng. Bull., 29(2):60-66, 2006.
    • (2006) IEEE Data Eng. Bull. , vol.29 , Issue.2 , pp. 60-66
    • Chaudhuri, S.1    Ganti, V.2    Kaushik, R.3
  • 12
    • 33749597967
    • A primitive operator for similarity joins in data cleaning
    • S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
    • (2006) ICDE
    • Chaudhuri, S.1    Ganti, V.2    Kaushik, R.3
  • 13
    • 15044355327
    • Similarity search in high dimensions via hashing
    • A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In VLDB, 1999.
    • (1999) VLDB
    • Gionis, A.1    Indyk, P.2    Motwani, R.3
  • 16
    • 0004137004
    • Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology
    • D. Gusfield. Algorithms on Strings, Trees, and Sequences. Computer Science and Computational Biology. Cambridge University Press, 1997.
    • (1997) Algorithms on Strings, Trees, and Sequences
    • Gusfield, D.1
  • 17
    • 84994164621
    • Evaluation of main memory join algorithms for joins with set comparison join predicates
    • S. Helmer and G. Moerkotte. Evaluation of main memory join algorithms for joins with set comparison join predicates. In VLDB, pages 386-395, 1997.
    • (1997) VLDB , pp. 386-395
    • Helmer, S.1    Moerkotte, G.2
  • 18
    • 33750296887
    • Finding near-duplicate web pages: a large-scale evaluation of algorithms
    • M. R. Henzinger. Finding near-duplicate web pages: a large-scale evaluation of algorithms. In SIGIR, 2006.
    • (2006) SIGIR
    • Henzinger, M.R.1
  • 19
    • 0013331361
    • Real-world data is dirty: Data cleansing and the merge/purge problem
    • M. A. Hernández and S. J. Stolfo. Real-world data is dirty: Data cleansing and the merge/purge problem. Data Mining and Knowledge Discovery, 2(1):9-37, 1998.
    • (1998) Data Mining and Knowledge Discovery , vol.2 , Issue.1 , pp. 9-37
    • Hernández, M.A.1    Stolfo, S.J.2
  • 20
    • 85011072445
    • Extending q-grams to estimate selectivity of string matching with low edit distance
    • H. Lee, R. T. Ng, and K. Shim. Extending q-grams to estimate selectivity of string matching with low edit distance. In VLDB, pages 195-206, 2007.
    • (2007) VLDB , pp. 195-206
    • Lee, H.1    Ng, R.T.2    Shim, K.3
  • 21
    • 85011032600
    • VGRAM: Improving performance of approximate queries on string collections using variable-length grams
    • C. Li, B. Wang, and X. Yang. VGRAM: Improving performance of approximate queries on string collections using variable-length grams. In VLDB, 2007.
    • (2007) VLDB
    • Li, C.1    Wang, B.2    Yang, X.3
  • 22
    • 1142267356
    • Efficient processing of joins on set-valued attributes
    • N. Mamoulis. Efficient processing of joins on set-valued attributes. In SIGMOD, pages 157-168, 2003.
    • (2003) SIGMOD , pp. 157-168
    • Mamoulis, N.1
  • 23
    • 0018985316
    • A faster algorithm computing string edit distances
    • W. J. Masek and M. Paterson. A faster algorithm computing string edit distances. J. Comput. Syst. Sci., 20(1):18-31, 1980.
    • (1980) J. Comput. Syst. Sci. , vol.20 , Issue.1 , pp. 18-31
    • Masek, W.J.1    Paterson, M.2
  • 25
    • 0000541351
    • A fast bit-vector algorithm for approximate string matching based on dynamic programming
    • G. Myers. A fast bit-vector algorithm for approximate string matching based on dynamic programming. J. ACM, 46(3):395-415, 1999.
    • (1999) J. ACM , vol.46 , Issue.3 , pp. 395-415
    • Myers, G.1
  • 26
    • 0345566149
    • A guided tour to approximate string matching
    • G. Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001.
    • (2001) ACM Comput. Surv. , vol.33 , Issue.1 , pp. 31-88
    • Navarro, G.1
  • 27
    • 0001670844
    • Performance in practice of string hashing functions
    • M. V. Ramakrishna and J. Zobel. Performance in practice of string hashing functions. In DASFAA, pages 215-224, 1997.
    • (1997) DASFAA , pp. 215-224
    • Ramakrishna, M.V.1    Zobel, J.2
  • 28
    • 10644238464
    • Set containment joins: The good, the bad and the ugly
    • K. Ramasamy, J. M. Patel, J. F. Naughton, and R. Kaushik. Set containment joins: The good, the bad and the ugly. In VLDB, pages 351-362, 2000.
    • (2000) VLDB , pp. 351-362
    • Ramasamy, K.1    Patel, J.M.2    Naughton, J.F.3    Kaushik, R.4
  • 29
    • 0242456811
    • Interactive deduplication using active learning
    • S. Sarawagi and A. Bhamidipaty. Interactive deduplication using active learning. In KDD, 2002.
    • (2002) KDD
    • Sarawagi, S.1    Bhamidipaty, A.2
  • 30
    • 3142777876
    • Efficient set joins on similarity predicates
    • S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, 2004.
    • (2004) SIGMOD
    • Sarawagi, S.1    Kirpal, A.2
  • 31
    • 0004498253
    • On approximate string matching
    • E. Ukkonen. On approximate string matching. In FCT, 1983.
    • (1983) FCT
    • Ukkonen, E.1
  • 32
    • 0015960104
    • The string-to-string correction problem
    • R. A. Wagner and M. J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168-173, 1974.
    • (1974) J. ACM , vol.21 , Issue.1 , pp. 168-173
    • Wagner, R.A.1    Fischer, M.J.2
  • 34
    • 66249113620
    • Efficient similarity joins for near duplicate detection
    • C. Xiao, W. Wang, X. Lin, and J. X. Yu. Efficient similarity joins for near duplicate detection. In WWW, 2008.
    • (2008) WWW
    • Xiao, C.1    Wang, W.2    Lin, X.3    Yu, J.X.4
  • 35
    • 33747729581
    • Inverted files for text search engines
    • J. Zobel and A. Moffat. Inverted files for text search engines. ACM Comput. Surv., 38(2), 2006.
    • (2006) ACM Comput. Surv. , vol.38 , Issue.2
    • Zobel, J.1    Moffat, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.