SCOPUS Information Search Platform

SIGMOD-PODS'09 - Proceedings of the International Conference on Management of Data and 28th Symposium on Principles of Database Systems

Volume , Issue , 2009, Pages 759-770

Efficient approximate entity extraction with edit distance constraints

(4) Wang, Wei a Xiao, Chuan a Lin, Xuemin a Zhang, Chengqi b

a UNIVERSITY OF NEW SOUTH WALES (Australia)

b UNIVERSITY OF TECHNOLOGY SYDNEY (Australia)

Author keywords

[No Author keywords available]

Indexed keywords

ALTERNATIVE APPROACH; APPROXIMATE MATCHES; DATA SETS; DICTIONARY MATCHING; DOCUMENT-PROCESSING; DOMAIN KNOWLEDGE; EDIT DISTANCE; GENERATION METHOD; NAMED ENTITIES; NAMED ENTITY RECOGNITION; ORDER OF MAGNITUDE; POOR PERFORMANCE; PROBLEM DEFINITION; PRUNING TECHNIQUES; RECENT TRENDS;

ALGORITHMS; CHARACTER RECOGNITION; GRAPH THEORY; NATURAL LANGUAGE PROCESSING SYSTEMS;

DATABASE SYSTEMS;

EID: 70849115286 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1559845.1559925 Document Type: Conference Paper

Times cited : (99)

References (42)

1
- 77950901996
- Scalable ad-hoc entity extraction from text collections
- S. Agrawal, K. Chakrabarti, S. Chaudhuri, and V. Ganti. Scalable ad-hoc entity extraction from text collections. PVLDB, 1(1):945-957, 2008.
- (2008) PVLDB , vol.1 , Issue.1 , pp. 945-957
- Agrawal, S.¹ Chakrabarti, K.² Chaudhuri, S.³ Ganti, V.⁴

2
- 8644265357
- Web-awhere: geotagging web content
- E. Amitay, N. Har'El, R. Sivan, and A. Soffer. Web-awhere: geotagging web content. In SIGIR, pages 273-280, 2004.
- (2004) SIGIR , pp. 273-280
- Amitay, E.¹ Har'El, N.² Sivan, R.³ Soffer, A.⁴

3
- 57149137338
- Incorporating string transformations in record matching
- A. Arasu, S. Chaudhuri, K. Ganjam, and R. Kaushik. Incorporating string transformations in record matching. In SIGMOD Conference, pages 1231-1234, 2008.
- (2008) SIGMOD Conference , pp. 1231-1234
- Arasu, A.¹ Chaudhuri, S.² Ganjam, K.³ Kaushik, R.⁴

4
- 85104914015
- Efficient exact set-similarity joins
- A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, 2006.
- (2006) VLDB
- Arasu, A.¹ Ganti, V.² Kaushik, R.³

5
- 35348849154
- Scaling up all pairs similarity search
- R. J. Bayardo, Y. Ma, and R. Srikant. Scaling up all pairs similarity search. In WWW, 2007.
- (2007) WWW
- Bayardo, R.J.¹ Ma, Y.² Srikant, R.³

6
- 84976793960
- Using semi-joins to solve relational queries
- P. A. Bernstein and D.-M. W. Chiu. Using semi-joins to solve relational queries. J. ACM, 28(1):25-40, 1981.
- (1981) J. ACM , vol.28 , Issue.1 , pp. 25-40
- Bernstein, P.A.¹ Chiu, D.-M.W.²

7
- 2342447399
- Adaptive name matching in information integration
- M. Bilenko, R. J. Mooney, W. W. Cohen, P. Ravikumar, and S. E. Fienberg. Adaptive name matching in information integration. IEEE Intelligent Sys., 18(5):16-23, 2003.
- (2003) IEEE Intelligent Sys. , vol.18 , Issue.5 , pp. 16-23
- Bilenko, M.¹ Mooney, R.J.² Cohen, W.W.³ Ravikumar, P.⁴ Fienberg, S.E.⁵

8
- 57149127665
- An efficient filter for approximate membership checking
- K. Chakrabarti, S. Chaudhuri, V. Ganti, and D. Xin. An efficient filter for approximate membership checking. In SIGMOD Conference, pages 805-818, 2008.
- (2008) SIGMOD Conference , pp. 805-818
- Chakrabarti, K.¹ Chaudhuri, S.² Ganti, V.³ Xin, D.⁴

9
- 35448984015
- Benchmarking declarative approximate selection predicates
- A. Chandel, O. Hassanzadeh, N. Koudas, M. Sadoghi, and D. Srivastava. Benchmarking declarative approximate selection predicates. In SIGMOD Conference, pages 353-364, 2007.
- (2007) SIGMOD Conference , pp. 353-364
- Chandel, A.¹ Hassanzadeh, O.² Koudas, N.³ Sadoghi, M.⁴ Srivastava, D.⁵

10
- 33749624541
- Efficient batch top-k search for dictionary-based entity recognition
- A. Chandel, P. C. Nagesh, and S. Sarawagi. Efficient batch top-k search for dictionary-based entity recognition. In ICDE, page 28, 2006.
- (2006) ICDE , pp. 28
- Chandel, A.¹ Nagesh, P.C.² Sarawagi, S.³

11
- 85011029434
- Example-driven design of efficient record matching queries
- S. Chaudhuri, B.-C. Chen, V. Ganti, and R. Kaushik. Example-driven design of efficient record matching queries. In VLDB, pages 327-338, 2007.
- (2007) VLDB , pp. 327-338
- Chaudhuri, S.¹ Chen, B.-C.² Ganti, V.³ Kaushik, R.⁴

12
- 33749597967
- A primitive operator for similarity joins in data cleaning
- S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
- (2006) ICDE
- Chaudhuri, S.¹ Ganti, V.² Kaushik, R.³

13
- 35448937301
- Leveraging aggregate constraints for deduplication
- S. Chaudhuri, A. D. Sarma, V. Ganti, and R. Kaushik. Leveraging aggregate constraints for deduplication. In SIGMOD Conference, pages 437-448, 2007.
- (2007) SIGMOD Conference , pp. 437-448
- Chaudhuri, S.¹ Sarma, A.D.² Ganti, V.³ Kaushik, R.⁴

14
- 33750364461
- Unsupervised gene/protein entity normalization using automatically extracted dictionaries
- A. M. Cohen. Unsupervised gene/protein entity normalization using automatically extracted dictionaries. In Proceedings of the BioLINK2005 Workshop, 2005.
- (2005) Proceedings of the BioLINK2005 Workshop
- Cohen, A.M.¹

15
- 12244290581
- Exploiting dictionaries in named entity extraction: Combining semi-markov extraction processes and data integration methods
- W. W. Cohen and S. Sarawagi. Exploiting dictionaries in named entity extraction: combining semi-markov extraction processes and data integration methods. In KDD, pages 89-98, 2004.
- (2004) KDD , pp. 89-98
- Cohen, W.W.¹ Sarawagi, S.²

16
- 84944318804
- Approximate string joins in a database (almost) for free
- L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate string joins in a database (almost) for free. In VLDB, 2001.
- (2001) VLDB
- Gravano, L.¹ Ipeirotis, P.G.² Jagadish, H.V.³ Koudas, N.⁴ Muthukrishnan, S.⁵ Srivastava, D.⁶

17
- 52649145249
- Fast indexes and algorithms for set similarity selection queries
- M. Hadjieleftheriou, A. Chandel, N. Koudas, and D. Srivastava. Fast indexes and algorithms for set similarity selection queries. In ICDE, pages 267-276, 2008.
- (2008) ICDE , pp. 267-276
- Hadjieleftheriou, M.¹ Chandel, A.² Koudas, N.³ Srivastava, D.⁴

18
- 70349659026
- Hashed samples: Selectivity estimators for set similarity selection queries
- M. Hadjieleftheriou, X. Yu, N. Koudas, and D. Srivastava. Hashed samples: selectivity estimators for set similarity selection queries. PVLDB, 1(1):201-212, 2008.
- (2008) PVLDB , vol.1 , Issue.1 , pp. 201-212
- Hadjieleftheriou, M.¹ Yu, X.² Koudas, N.³ Srivastava, D.⁴

19
- 33745607646
- Selectivity estimation for fuzzy string predicates in large data sets
- L. Jin and C. Li. Selectivity estimation for fuzzy string predicates in large data sets. In VLDB, pages 397-408, 2005.
- (2005) VLDB , pp. 397-408
- Jin, L.¹ Li, C.²

20
- 85011072445
- Extending q-grams to estimate selectivity of string matching with low edit distance
- H. Lee, R. T. Ng, and K. Shim. Extending q-grams to estimate selectivity of string matching with low edit distance. In VLDB, pages 195-206, 2007.
- (2007) VLDB , pp. 195-206
- Lee, H.¹ Ng, R.T.² Shim, K.³

21
- 52649086729
- Efficient merging and filtering algorithms for approximate string searches
- C. Li, J. Lu, and Y. Lu. Efficient merging and filtering algorithms for approximate string searches. In ICDE, pages 257-266, 2008.
- (2008) ICDE , pp. 257-266
- Li, C.¹ Lu, J.² Lu, Y.³

22
- 85011032600
- VGRAM: Improving performance of approximate queries on string collections using variable-length grams
- C. Li, B. Wang, and X. Yang. VGRAM: Improving performance of approximate queries on string collections using variable-length grams. In VLDB, 2007.
- (2007) VLDB
- Li, C.¹ Wang, B.² Yang, X.³

23
- 52649161208
- A fast similarity join algorithm using graphics processing units
- M. D. Lieberman, J. Sankaranarayanan, and H. Samet. A fast similarity join algorithm using graphics processing units. In ICDE, pages 1111-1120, 2008.
- (2008) ICDE , pp. 1111-1120
- Lieberman, M.D.¹ Sankaranarayanan, J.² Samet, H.³

24
- 34547421874
- Estimating the selectivity of approximate string queries
- A. Mazeika, M. H. Böhlen, N. Koudas, and D. Srivastava. Estimating the selectivity of approximate string queries. ACM Trans. Database Syst., 32(2):12, 2007.
- (2007) ACM Trans. Database Syst. , vol.32 , Issue.2 , pp. 12
- Mazeika, A.¹ Böhlen, M.H.² Koudas, N.³ Srivastava, D.⁴

25
- 0028516571
- A sublinear algorithm for approximate keyword searching
- E. W. Myers. A sublinear algorithm for approximate keyword searching. Algorithmica, 12(4/5):345-374, 1994.
- (1994) Algorithmica , vol.12 , Issue.4-5 , pp. 345-374
- Myers, E.W.¹

26
- 0345566149
- A guided tour to approximate string matching
- G. Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001.
- (2001) ACM Comput. Surv. , vol.33 , Issue.1 , pp. 31-88
- Navarro, G.¹

27
- 0037228824
- Matchsimile: A flexible approximate matching tool for searching proper name
- G. Navarro, R. A. Baeza-Yates, and J. M. A. Arcoverde. Matchsimile: a flexible approximate matching tool for searching proper name. JASIST, 54(1):3-15, 2003.
- (2003) JASIST , vol.54 , Issue.1 , pp. 3-15
- Navarro, G.¹ Baeza-Yates, R.A.² Arcoverde, J.M.A.³

28
- 0001851762
- Indexing methods for approximate string matching
- G. Navarro, R. A. Baeza-Yates, E. Sutinen, and J. Tarhio. Indexing methods for approximate string matching. IEEE Data Eng. Bull., 24(4):19-27, 2001.
- (2001) IEEE Data Eng. Bull. , vol.24 , Issue.4 , pp. 19-27
- Navarro, G.¹ Baeza-Yates, R.A.² Sutinen, E.³ Tarhio, J.⁴

29
- 61949087310
- Word sense disambiguation: A survey
- R. Navigli. Word sense disambiguation: A survey. ACM Comput. Surv., 41(2), 2009.
- (2009) ACM Comput. Surv. , vol.41 , Issue.2
- Navigli, R.¹

30
- 34548725521
- Group linkage
- B.-W. On, N. Koudas, D. Lee, and D. Srivastava. Group linkage. In ICDE, pages 496-505, 2007.
- (2007) ICDE , pp. 496-505
- On, B.-W.¹ Koudas, N.² Lee, D.³ Srivastava, D.⁴

31
- 18744373348
- Acquisition of categorized named entities for web search
- M. Pasca. Acquisition of categorized named entities for web search. In CIKM, pages 137-145, 2004.
- (2004) CIKM , pp. 137-145
- Pasca, M.¹

32
- 3142777876
- Efficient set joins on similarity predicates
- S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, 2004.
- (2004) SIGMOD
- Sarawagi, S.¹ Kirpal, A.²

33
- 70849113473
- Fast Similarity Search in Large Dictionaries
- B. S. T. Bocek, E. Hunt. Fast Similarity Search in Large Dictionaries. Technical Report ifi-2007.02, Department of Informatics, University of Zurich, April 2007.
- (2007) Technical Report ifi-2007.02
- Bocek, B.S.T.¹ Hunt, E.²

34
- 2342598678
- Generation of a large gene/protein lexicon by morphological pattern analysis
- L. Tanabe and W. J. Wilbur. Generation of a large gene/protein lexicon by morphological pattern analysis. Journal of Bioinformatics and Computational Biology, 1(4):1-16, 2004.
- (2004) Journal of Bioinformatics and Computational Biology , vol.1 , Issue.4 , pp. 1-16
- Tanabe, L.¹ Wilbur, W.J.²

35
- 8444232801
- Improving the performance of dictionary-based approaches in protein name recognition
- Y. Tsuruoka and J. Tsujii. Improving the performance of dictionary-based approaches in protein name recognition. Journal of Biomedical Informatics, 37(6):461-470, 2004.
- (2004) Journal of Biomedical Informatics , vol.37 , Issue.6 , pp. 461-470
- Tsuruoka, Y.¹ Tsujii, J.I.²

36
- 0020494998
- Algorithms for approximate string matching
- E. Ukkonen. Algorithms for approximate string matching. Information and Control, 64(1-3):100-118, 1985.
- (1985) Information and Control , vol.64 , Issue.1-3 , pp. 100-118
- Ukkonen, E.¹

37
- 0000386785
- Finding approximate patterns in strings
- E. Ukkonen. Finding approximate patterns in strings. J. Algorithms, 6(1):132-137, 1985.
- (1985) J. Algorithms , vol.6 , Issue.1 , pp. 132-137
- Ukkonen, E.¹

38
- 23944499603
- Assessment of approximate string matching in a biomedical text retrieval problem
- J. Wang, Z. Li, C. Cai, and Y. Chen. Assessment of approximate string matching in a biomedical text retrieval problem. Computers in Biology and Medicine, 35(8):717-724, 2005.
- (2005) Computers in Biology and Medicine , vol.35 , Issue.8 , pp. 717-724
- Wang, J.¹ Li, Z.² Cai, C.³ Chen, Y.⁴

39
- 70849105253
- Ed-join: An efficient algorithm for similarity joins with edit distance constraints
- C. Xiao, W. Wang, and X. Lin. Ed-join: an efficient algorithm for similarity joins with edit distance constraints. PVLDB, 1(1):933-944, 2008.
- (2008) PVLDB , vol.1 , Issue.1 , pp. 933-944
- Xiao, C.¹ Wang, W.² Lin, X.³

40
- 66249113620
- Efficient similarity joins for near duplicate detection
- C. Xiao, W. Wang, X. Lin, and J. X. Yu. Efficient similarity joins for near duplicate detection. In WWW, 2008.
- (2008) WWW
- Xiao, C.¹ Wang, W.² Lin, X.³ Yu, J.X.⁴

41
- 57149130672
- Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
- X. Yang, B. Wang, and C. Li. Cost-based variable-length-gram selection for string collections to support approximate queries efficiently. In SIGMOD Conference, pages 353-364, 2008.
- (2008) SIGMOD Conference , pp. 353-364
- Yang, X.¹ Wang, B.² Li, C.³

42
- 0029271657
- Finding approximate matches in large lexicons
- J. Zobel and P. W. Dart. Finding approximate matches in large lexicons. Softw., Pract. Exper., 25(3):331-345, 1995.
- (1995) Softw., Pract. Exper. , vol.25 , Issue.3 , pp. 331-345
- Zobel, J.¹ Dart, P.W.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.