2009, Pages 759-770

Efficient approximate entity extraction with edit distance constraints

Author keywords

[No Author keywords available]

Indexed keywords

ALTERNATIVE APPROACH; APPROXIMATE MATCHES; DATA SETS; DICTIONARY MATCHING; DOCUMENT-PROCESSING; DOMAIN KNOWLEDGE; EDIT DISTANCE; GENERATION METHOD; NAMED ENTITIES; NAMED ENTITY RECOGNITION; ORDER OF MAGNITUDE; POOR PERFORMANCE; PROBLEM DEFINITION; PRUNING TECHNIQUES; RECENT TRENDS;

EID: 70849115286     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1559845.1559925     Document Type: Conference Paper
Times cited: 99

References (42)
  • 1
    • 77950901996
    • Scalable ad-hoc entity extraction from text collections
    • S. Agrawal, K. Chakrabarti, S. Chaudhuri, and V. Ganti. Scalable ad-hoc entity extraction from text collections. PVLDB, 1(1):945-957, 2008.
    • (2008) PVLDB , vol.1 , Issue.1 , pp. 945-957
    • Agrawal, S.1    Chakrabarti, K.2    Chaudhuri, S.3    Ganti, V.4
  • 2
    • 8644265357
    • Web-a-where: geotagging web content
    • E. Amitay, N. Har'El, R. Sivan, and A. Soffer. Web-a-where: geotagging web content. In SIGIR, pages 273-280, 2004.
    • (2004) SIGIR , pp. 273-280
    • Amitay, E.1    Har'El, N.2    Sivan, R.3    Soffer, A.4
  • 3
    • 57149137338
    • Incorporating string transformations in record matching
    • A. Arasu, S. Chaudhuri, K. Ganjam, and R. Kaushik. Incorporating string transformations in record matching. In SIGMOD Conference, pages 1231-1234, 2008.
    • (2008) SIGMOD Conference , pp. 1231-1234
    • Arasu, A.1    Chaudhuri, S.2    Ganjam, K.3    Kaushik, R.4
  • 4
    • 85104914015
    • Efficient exact set-similarity joins
    • A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, 2006.
    • (2006) VLDB
    • Arasu, A.1    Ganti, V.2    Kaushik, R.3
  • 5
    • 35348849154
    • Scaling up all pairs similarity search
    • R. J. Bayardo, Y. Ma, and R. Srikant. Scaling up all pairs similarity search. In WWW, 2007.
    • (2007) WWW
    • Bayardo, R.J.1    Ma, Y.2    Srikant, R.3
  • 6
    • 84976793960
    • Using semi-joins to solve relational queries
    • P. A. Bernstein and D.-M. W. Chiu. Using semi-joins to solve relational queries. J. ACM, 28(1):25-40, 1981.
    • (1981) J. ACM , vol.28 , Issue.1 , pp. 25-40
    • Bernstein, P.A.1    Chiu, D.-M.W.2
  • 8
  • 10
    • 33749624541
    • Efficient batch top-k search for dictionary-based entity recognition
    • A. Chandel, P. C. Nagesh, and S. Sarawagi. Efficient batch top-k search for dictionary-based entity recognition. In ICDE, page 28, 2006.
    • (2006) ICDE , pp. 28
    • Chandel, A.1    Nagesh, P.C.2    Sarawagi, S.3
  • 11
    • 85011029434
    • Example-driven design of efficient record matching queries
    • S. Chaudhuri, B.-C. Chen, V. Ganti, and R. Kaushik. Example-driven design of efficient record matching queries. In VLDB, pages 327-338, 2007.
    • (2007) VLDB , pp. 327-338
    • Chaudhuri, S.1    Chen, B.-C.2    Ganti, V.3    Kaushik, R.4
  • 12
    • 33749597967
    • A primitive operator for similarity joins in data cleaning
    • S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
    • (2006) ICDE
    • Chaudhuri, S.1    Ganti, V.2    Kaushik, R.3
  • 14
    • 33750364461
    • Unsupervised gene/protein entity normalization using automatically extracted dictionaries
    • A. M. Cohen. Unsupervised gene/protein entity normalization using automatically extracted dictionaries. In Proceedings of the BioLINK2005 Workshop, 2005.
    • (2005) Proceedings of the BioLINK2005 Workshop
    • Cohen, A.M.1
  • 15
    • 12244290581
    • Exploiting dictionaries in named entity extraction: Combining semi-markov extraction processes and data integration methods
    • W. W. Cohen and S. Sarawagi. Exploiting dictionaries in named entity extraction: combining semi-markov extraction processes and data integration methods. In KDD, pages 89-98, 2004.
    • (2004) KDD , pp. 89-98
    • Cohen, W.W.1    Sarawagi, S.2
  • 17
    • 52649145249
    • Fast indexes and algorithms for set similarity selection queries
    • M. Hadjieleftheriou, A. Chandel, N. Koudas, and D. Srivastava. Fast indexes and algorithms for set similarity selection queries. In ICDE, pages 267-276, 2008.
    • (2008) ICDE , pp. 267-276
    • Hadjieleftheriou, M.1    Chandel, A.2    Koudas, N.3    Srivastava, D.4
  • 18
    • 70349659026
    • Hashed samples: Selectivity estimators for set similarity selection queries
    • M. Hadjieleftheriou, X. Yu, N. Koudas, and D. Srivastava. Hashed samples: selectivity estimators for set similarity selection queries. PVLDB, 1(1):201-212, 2008.
    • (2008) PVLDB , vol.1 , Issue.1 , pp. 201-212
    • Hadjieleftheriou, M.1    Yu, X.2    Koudas, N.3    Srivastava, D.4
  • 19
    • 33745607646
    • Selectivity estimation for fuzzy string predicates in large data sets
    • L. Jin and C. Li. Selectivity estimation for fuzzy string predicates in large data sets. In VLDB, pages 397-408, 2005.
    • (2005) VLDB , pp. 397-408
    • Jin, L.1    Li, C.2
  • 20
    • 85011072445
    • Extending q-grams to estimate selectivity of string matching with low edit distance
    • H. Lee, R. T. Ng, and K. Shim. Extending q-grams to estimate selectivity of string matching with low edit distance. In VLDB, pages 195-206, 2007.
    • (2007) VLDB , pp. 195-206
    • Lee, H.1    Ng, R.T.2    Shim, K.3
  • 21
    • 52649086729
    • Efficient merging and filtering algorithms for approximate string searches
    • C. Li, J. Lu, and Y. Lu. Efficient merging and filtering algorithms for approximate string searches. In ICDE, pages 257-266, 2008.
    • (2008) ICDE , pp. 257-266
    • Li, C.1    Lu, J.2    Lu, Y.3
  • 22
    • 85011032600
    • VGRAM: Improving performance of approximate queries on string collections using variable-length grams
    • C. Li, B. Wang, and X. Yang. VGRAM: Improving performance of approximate queries on string collections using variable-length grams. In VLDB, 2007.
    • (2007) VLDB
    • Li, C.1    Wang, B.2    Yang, X.3
  • 23
    • 52649161208
    • A fast similarity join algorithm using graphics processing units
    • M. D. Lieberman, J. Sankaranarayanan, and H. Samet. A fast similarity join algorithm using graphics processing units. In ICDE, pages 1111-1120, 2008.
    • (2008) ICDE , pp. 1111-1120
    • Lieberman, M.D.1    Sankaranarayanan, J.2    Samet, H.3
  • 25
    • 0028516571
    • A sublinear algorithm for approximate keyword searching
    • E. W. Myers. A sublinear algorithm for approximate keyword searching. Algorithmica, 12(4/5):345-374, 1994.
    • (1994) Algorithmica , vol.12 , Issue.4-5 , pp. 345-374
    • Myers, E.W.1
  • 26
    • 0345566149
    • A guided tour to approximate string matching
    • G. Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001.
    • (2001) ACM Comput. Surv. , vol.33 , Issue.1 , pp. 31-88
    • Navarro, G.1
  • 27
    • 0037228824
    • Matchsimile: A flexible approximate matching tool for searching proper names
    • G. Navarro, R. A. Baeza-Yates, and J. M. A. Arcoverde. Matchsimile: a flexible approximate matching tool for searching proper names. JASIST, 54(1):3-15, 2003.
    • (2003) JASIST , vol.54 , Issue.1 , pp. 3-15
    • Navarro, G.1    Baeza-Yates, R.A.2    Arcoverde, J.M.A.3
  • 29
    • 61949087310
    • Word sense disambiguation: A survey
    • R. Navigli. Word sense disambiguation: A survey. ACM Comput. Surv., 41(2), 2009.
    • (2009) ACM Comput. Surv. , vol.41 , Issue.2
    • Navigli, R.1
  • 31
    • 18744373348
    • Acquisition of categorized named entities for web search
    • M. Pasca. Acquisition of categorized named entities for web search. In CIKM, pages 137-145, 2004.
    • (2004) CIKM , pp. 137-145
    • Pasca, M.1
  • 32
    • 3142777876
    • Efficient set joins on similarity predicates
    • S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, 2004.
    • (2004) SIGMOD
    • Sarawagi, S.1    Kirpal, A.2
  • 33
    • 70849113473
    • Fast Similarity Search in Large Dictionaries
    • B. S. T. Bocek, E. Hunt. Fast Similarity Search in Large Dictionaries. Technical Report ifi-2007.02, Department of Informatics, University of Zurich, April 2007.
    • (2007) Technical Report ifi-2007.02
    • Bocek, B.S.T.1    Hunt, E.2
  • 34
    • 2342598678
    • Generation of a large gene/protein lexicon by morphological pattern analysis
    • L. Tanabe and W. J. Wilbur. Generation of a large gene/protein lexicon by morphological pattern analysis. Journal of Bioinformatics and Computational Biology, 1(4):1-16, 2004.
    • (2004) Journal of Bioinformatics and Computational Biology , vol.1 , Issue.4 , pp. 1-16
    • Tanabe, L.1    Wilbur, W.J.2
  • 35
    • 8444232801
    • Improving the performance of dictionary-based approaches in protein name recognition
    • Y. Tsuruoka and J. ichi Tsujii. Improving the performance of dictionary-based approaches in protein name recognition. Journal of Biomedical Informatics, 37(6):461-470, 2004.
    • (2004) Journal of Biomedical Informatics , vol.37 , Issue.6 , pp. 461-470
    • Tsuruoka, Y.1    Tsujii, J.I.2
  • 36
    • 0020494998
    • Algorithms for approximate string matching
    • E. Ukkonen. Algorithms for approximate string matching. Information and Control, 64(1-3):100-118, 1985.
    • (1985) Information and Control , vol.64 , Issue.1-3 , pp. 100-118
    • Ukkonen, E.1
  • 37
    • 0000386785
    • Finding approximate patterns in strings
    • E. Ukkonen. Finding approximate patterns in strings. J. Algorithms, 6(1):132-137, 1985.
    • (1985) J. Algorithms , vol.6 , Issue.1 , pp. 132-137
    • Ukkonen, E.1
  • 38
    • 23944499603
    • Assessment of approximate string matching in a biomedical text retrieval problem
    • J. Wang, Z. Li, C. Cai, and Y. Chen. Assessment of approximate string matching in a biomedical text retrieval problem. Computers in Biology and Medicine, 35(8):717-724, 2005.
    • (2005) Computers in Biology and Medicine , vol.35 , Issue.8 , pp. 717-724
    • Wang, J.1    Li, Z.2    Cai, C.3    Chen, Y.4
  • 39
    • 70849105253
    • Ed-join: An efficient algorithm for similarity joins with edit distance constraints
    • C. Xiao, W. Wang, and X. Lin. Ed-join: an efficient algorithm for similarity joins with edit distance constraints. PVLDB, 1(1):933-944, 2008.
    • (2008) PVLDB , vol.1 , Issue.1 , pp. 933-944
    • Xiao, C.1    Wang, W.2    Lin, X.3
  • 40
    • 66249113620
    • Efficient similarity joins for near duplicate detection
    • C. Xiao, W. Wang, X. Lin, and J. X. Yu. Efficient similarity joins for near duplicate detection. In WWW, 2008.
    • (2008) WWW
    • Xiao, C.1    Wang, W.2    Lin, X.3    Yu, J.X.4
  • 41
    • 57149130672
    • Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
    • X. Yang, B. Wang, and C. Li. Cost-based variable-length-gram selection for string collections to support approximate queries efficiently. In SIGMOD Conference, pages 353-364, 2008.
    • (2008) SIGMOD Conference , pp. 353-364
    • Yang, X.1    Wang, B.2    Li, C.3
  • 42
    • 0029271657
    • Finding approximate matches in large lexicons
    • J. Zobel and P. W. Dart. Finding approximate matches in large lexicons. Softw., Pract. Exper., 25(3):331-345, 1995.
    • (1995) Softw., Pract. Exper. , vol.25 , Issue.3 , pp. 331-345
    • Zobel, J.1    Dart, P.W.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.