-
1
-
-
77950901996
-
Scalable ad-hoc entity extraction from text collections
-
S. Agrawal, K. Chakrabarti, S. Chaudhuri, and V. Ganti. Scalable ad-hoc entity extraction from text collections. PVLDB, 1(1):945-957, 2008.
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 945-957
-
-
Agrawal, S.1
Chakrabarti, K.2
Chaudhuri, S.3
Ganti, V.4
-
2
-
-
8644265357
-
Web-a-where: geotagging web content
-
E. Amitay, N. Har'El, R. Sivan, and A. Soffer. Web-a-where: geotagging web content. In SIGIR, pages 273-280, 2004.
-
(2004)
SIGIR
, pp. 273-280
-
-
Amitay, E.1
Har'El, N.2
Sivan, R.3
Soffer, A.4
-
3
-
-
57149137338
-
Incorporating string transformations in record matching
-
A. Arasu, S. Chaudhuri, K. Ganjam, and R. Kaushik. Incorporating string transformations in record matching. In SIGMOD Conference, pages 1231-1234, 2008.
-
(2008)
SIGMOD Conference
, pp. 1231-1234
-
-
Arasu, A.1
Chaudhuri, S.2
Ganjam, K.3
Kaushik, R.4
-
4
-
-
85104914015
-
Efficient exact set-similarity joins
-
A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, 2006.
-
(2006)
VLDB
-
-
Arasu, A.1
Ganti, V.2
Kaushik, R.3
-
5
-
-
35348849154
-
Scaling up all pairs similarity search
-
R. J. Bayardo, Y. Ma, and R. Srikant. Scaling up all pairs similarity search. In WWW, 2007.
-
(2007)
WWW
-
-
Bayardo, R.J.1
Ma, Y.2
Srikant, R.3
-
6
-
-
84976793960
-
Using semi-joins to solve relational queries
-
P. A. Bernstein and D.-M. W. Chiu. Using semi-joins to solve relational queries. J. ACM, 28(1):25-40, 1981.
-
(1981)
J. ACM
, vol.28
, Issue.1
, pp. 25-40
-
-
Bernstein, P.A.1
Chiu, D.-M.W.2
-
7
-
-
2342447399
-
Adaptive name matching in information integration
-
M. Bilenko, R. J. Mooney, W. W. Cohen, P. Ravikumar, and S. E. Fienberg. Adaptive name matching in information integration. IEEE Intelligent Sys., 18(5):16-23, 2003.
-
(2003)
IEEE Intelligent Sys.
, vol.18
, Issue.5
, pp. 16-23
-
-
Bilenko, M.1
Mooney, R.J.2
Cohen, W.W.3
Ravikumar, P.4
Fienberg, S.E.5
-
8
-
-
57149127665
-
An efficient filter for approximate membership checking
-
K. Chakrabarti, S. Chaudhuri, V. Ganti, and D. Xin. An efficient filter for approximate membership checking. In SIGMOD Conference, pages 805-818, 2008.
-
(2008)
SIGMOD Conference
, pp. 805-818
-
-
Chakrabarti, K.1
Chaudhuri, S.2
Ganti, V.3
Xin, D.4
-
9
-
-
35448984015
-
Benchmarking declarative approximate selection predicates
-
A. Chandel, O. Hassanzadeh, N. Koudas, M. Sadoghi, and D. Srivastava. Benchmarking declarative approximate selection predicates. In SIGMOD Conference, pages 353-364, 2007.
-
(2007)
SIGMOD Conference
, pp. 353-364
-
-
Chandel, A.1
Hassanzadeh, O.2
Koudas, N.3
Sadoghi, M.4
Srivastava, D.5
-
10
-
-
33749624541
-
Efficient batch top-k search for dictionary-based entity recognition
-
A. Chandel, P. C. Nagesh, and S. Sarawagi. Efficient batch top-k search for dictionary-based entity recognition. In ICDE, page 28, 2006.
-
(2006)
ICDE
, pp. 28
-
-
Chandel, A.1
Nagesh, P.C.2
Sarawagi, S.3
-
11
-
-
85011029434
-
Example-driven design of efficient record matching queries
-
S. Chaudhuri, B.-C. Chen, V. Ganti, and R. Kaushik. Example-driven design of efficient record matching queries. In VLDB, pages 327-338, 2007.
-
(2007)
VLDB
, pp. 327-338
-
-
Chaudhuri, S.1
Chen, B.-C.2
Ganti, V.3
Kaushik, R.4
-
12
-
-
33749597967
-
A primitive operator for similarity joins in data cleaning
-
S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
-
(2006)
ICDE
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
13
-
-
35448937301
-
Leveraging aggregate constraints for deduplication
-
S. Chaudhuri, A. D. Sarma, V. Ganti, and R. Kaushik. Leveraging aggregate constraints for deduplication. In SIGMOD Conference, pages 437-448, 2007.
-
(2007)
SIGMOD Conference
, pp. 437-448
-
-
Chaudhuri, S.1
Sarma, A.D.2
Ganti, V.3
Kaushik, R.4
-
14
-
-
33750364461
-
Unsupervised gene/protein entity normalization using automatically extracted dictionaries
-
A. M. Cohen. Unsupervised gene/protein entity normalization using automatically extracted dictionaries. In Proceedings of the BioLINK2005 Workshop, 2005.
-
(2005)
Proceedings of the BioLINK2005 Workshop
-
-
Cohen, A.M.1
-
15
-
-
12244290581
-
Exploiting dictionaries in named entity extraction: Combining semi-markov extraction processes and data integration methods
-
W. W. Cohen and S. Sarawagi. Exploiting dictionaries in named entity extraction: combining semi-markov extraction processes and data integration methods. In KDD, pages 89-98, 2004.
-
(2004)
KDD
, pp. 89-98
-
-
Cohen, W.W.1
Sarawagi, S.2
-
16
-
-
84944318804
-
Approximate string joins in a database (almost) for free
-
L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate string joins in a database (almost) for free. In VLDB, 2001.
-
(2001)
VLDB
-
-
Gravano, L.1
Ipeirotis, P.G.2
Jagadish, H.V.3
Koudas, N.4
Muthukrishnan, S.5
Srivastava, D.6
-
17
-
-
52649145249
-
Fast indexes and algorithms for set similarity selection queries
-
M. Hadjieleftheriou, A. Chandel, N. Koudas, and D. Srivastava. Fast indexes and algorithms for set similarity selection queries. In ICDE, pages 267-276, 2008.
-
(2008)
ICDE
, pp. 267-276
-
-
Hadjieleftheriou, M.1
Chandel, A.2
Koudas, N.3
Srivastava, D.4
-
18
-
-
70349659026
-
Hashed samples: Selectivity estimators for set similarity selection queries
-
M. Hadjieleftheriou, X. Yu, N. Koudas, and D. Srivastava. Hashed samples: selectivity estimators for set similarity selection queries. PVLDB, 1(1):201-212, 2008.
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 201-212
-
-
Hadjieleftheriou, M.1
Yu, X.2
Koudas, N.3
Srivastava, D.4
-
19
-
-
33745607646
-
Selectivity estimation for fuzzy string predicates in large data sets
-
L. Jin and C. Li. Selectivity estimation for fuzzy string predicates in large data sets. In VLDB, pages 397-408, 2005.
-
(2005)
VLDB
, pp. 397-408
-
-
Jin, L.1
Li, C.2
-
20
-
-
85011072445
-
Extending q-grams to estimate selectivity of string matching with low edit distance
-
H. Lee, R. T. Ng, and K. Shim. Extending q-grams to estimate selectivity of string matching with low edit distance. In VLDB, pages 195-206, 2007.
-
(2007)
VLDB
, pp. 195-206
-
-
Lee, H.1
Ng, R.T.2
Shim, K.3
-
21
-
-
52649086729
-
Efficient merging and filtering algorithms for approximate string searches
-
C. Li, J. Lu, and Y. Lu. Efficient merging and filtering algorithms for approximate string searches. In ICDE, pages 257-266, 2008.
-
(2008)
ICDE
, pp. 257-266
-
-
Li, C.1
Lu, J.2
Lu, Y.3
-
22
-
-
85011032600
-
VGRAM: Improving performance of approximate queries on string collections using variable-length grams
-
C. Li, B. Wang, and X. Yang. VGRAM: Improving performance of approximate queries on string collections using variable-length grams. In VLDB, 2007.
-
(2007)
VLDB
-
-
Li, C.1
Wang, B.2
Yang, X.3
-
23
-
-
52649161208
-
A fast similarity join algorithm using graphics processing units
-
M. D. Lieberman, J. Sankaranarayanan, and H. Samet. A fast similarity join algorithm using graphics processing units. In ICDE, pages 1111-1120, 2008.
-
(2008)
ICDE
, pp. 1111-1120
-
-
Lieberman, M.D.1
Sankaranarayanan, J.2
Samet, H.3
-
24
-
-
34547421874
-
Estimating the selectivity of approximate string queries
-
A. Mazeika, M. H. Böhlen, N. Koudas, and D. Srivastava. Estimating the selectivity of approximate string queries. ACM Trans. Database Syst., 32(2):12, 2007.
-
(2007)
ACM Trans. Database Syst.
, vol.32
, Issue.2
, pp. 12
-
-
Mazeika, A.1
Böhlen, M.H.2
Koudas, N.3
Srivastava, D.4
-
25
-
-
0028516571
-
A sublinear algorithm for approximate keyword searching
-
E. W. Myers. A sublinear algorithm for approximate keyword searching. Algorithmica, 12(4/5):345-374, 1994.
-
(1994)
Algorithmica
, vol.12
, Issue.4-5
, pp. 345-374
-
-
Myers, E.W.1
-
26
-
-
0345566149
-
A guided tour to approximate string matching
-
G. Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001.
-
(2001)
ACM Comput. Surv.
, vol.33
, Issue.1
, pp. 31-88
-
-
Navarro, G.1
-
27
-
-
0037228824
-
Matchsimile: A flexible approximate matching tool for searching proper names
-
G. Navarro, R. A. Baeza-Yates, and J. M. A. Arcoverde. Matchsimile: a flexible approximate matching tool for searching proper names. JASIST, 54(1):3-15, 2003.
-
(2003)
JASIST
, vol.54
, Issue.1
, pp. 3-15
-
-
Navarro, G.1
Baeza-Yates, R.A.2
Arcoverde, J.M.A.3
-
28
-
-
0001851762
-
Indexing methods for approximate string matching
-
G. Navarro, R. A. Baeza-Yates, E. Sutinen, and J. Tarhio. Indexing methods for approximate string matching. IEEE Data Eng. Bull., 24(4):19-27, 2001.
-
(2001)
IEEE Data Eng. Bull.
, vol.24
, Issue.4
, pp. 19-27
-
-
Navarro, G.1
Baeza-Yates, R.A.2
Sutinen, E.3
Tarhio, J.4
-
29
-
-
61949087310
-
Word sense disambiguation: A survey
-
R. Navigli. Word sense disambiguation: A survey. ACM Comput. Surv., 41(2), 2009.
-
(2009)
ACM Comput. Surv.
, vol.41
, Issue.2
-
-
Navigli, R.1
-
30
-
-
34548725521
-
Group linkage
-
B.-W. On, N. Koudas, D. Lee, and D. Srivastava. Group linkage. In ICDE, pages 496-505, 2007.
-
(2007)
ICDE
, pp. 496-505
-
-
On, B.-W.1
Koudas, N.2
Lee, D.3
Srivastava, D.4
-
31
-
-
18744373348
-
Acquisition of categorized named entities for web search
-
M. Pasca. Acquisition of categorized named entities for web search. In CIKM, pages 137-145, 2004.
-
(2004)
CIKM
, pp. 137-145
-
-
Pasca, M.1
-
32
-
-
3142777876
-
Efficient set joins on similarity predicates
-
S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, 2004.
-
(2004)
SIGMOD
-
-
Sarawagi, S.1
Kirpal, A.2
-
33
-
-
70849113473
-
-
Fast Similarity Search in Large Dictionaries
-
T. Bocek, E. Hunt, and B. Stiller. Fast Similarity Search in Large Dictionaries. Technical Report ifi-2007.02, Department of Informatics, University of Zurich, April 2007.
-
(2007)
Technical Report ifi-2007.02
-
-
Bocek, T.1
Hunt, E.2
Stiller, B.3
-
34
-
-
2342598678
-
Generation of a large gene/protein lexicon by morphological pattern analysis
-
L. Tanabe and W. J. Wilbur. Generation of a large gene/protein lexicon by morphological pattern analysis. Journal of Bioinformatics and Computational Biology, 1(4):1-16, 2004.
-
(2004)
Journal of Bioinformatics and Computational Biology
, vol.1
, Issue.4
, pp. 1-16
-
-
Tanabe, L.1
Wilbur, W.J.2
-
35
-
-
8444232801
-
Improving the performance of dictionary-based approaches in protein name recognition
-
Y. Tsuruoka and J.-I. Tsujii. Improving the performance of dictionary-based approaches in protein name recognition. Journal of Biomedical Informatics, 37(6):461-470, 2004.
-
(2004)
Journal of Biomedical Informatics
, vol.37
, Issue.6
, pp. 461-470
-
-
Tsuruoka, Y.1
Tsujii, J.I.2
-
36
-
-
0020494998
-
Algorithms for approximate string matching
-
E. Ukkonen. Algorithms for approximate string matching. Information and Control, 64(1-3):100-118, 1985.
-
(1985)
Information and Control
, vol.64
, Issue.1-3
, pp. 100-118
-
-
Ukkonen, E.1
-
37
-
-
0000386785
-
Finding approximate patterns in strings
-
E. Ukkonen. Finding approximate patterns in strings. J. Algorithms, 6(1):132-137, 1985.
-
(1985)
J. Algorithms
, vol.6
, Issue.1
, pp. 132-137
-
-
Ukkonen, E.1
-
38
-
-
23944499603
-
Assessment of approximate string matching in a biomedical text retrieval problem
-
J. Wang, Z. Li, C. Cai, and Y. Chen. Assessment of approximate string matching in a biomedical text retrieval problem. Computers in Biology and Medicine, 35(8):717-724, 2005.
-
(2005)
Computers in Biology and Medicine
, vol.35
, Issue.8
, pp. 717-724
-
-
Wang, J.1
Li, Z.2
Cai, C.3
Chen, Y.4
-
39
-
-
70849105253
-
Ed-join: An efficient algorithm for similarity joins with edit distance constraints
-
C. Xiao, W. Wang, and X. Lin. Ed-join: an efficient algorithm for similarity joins with edit distance constraints. PVLDB, 1(1):933-944, 2008.
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 933-944
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
-
40
-
-
66249113620
-
Efficient similarity joins for near duplicate detection
-
C. Xiao, W. Wang, X. Lin, and J. X. Yu. Efficient similarity joins for near duplicate detection. In WWW, 2008.
-
(2008)
WWW
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
Yu, J.X.4
-
41
-
-
57149130672
-
Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
-
X. Yang, B. Wang, and C. Li. Cost-based variable-length-gram selection for string collections to support approximate queries efficiently. In SIGMOD Conference, pages 353-364, 2008.
-
(2008)
SIGMOD Conference
, pp. 353-364
-
-
Yang, X.1
Wang, B.2
Li, C.3
-
42
-
-
0029271657
-
Finding approximate matches in large lexicons
-
J. Zobel and P. W. Dart. Finding approximate matches in large lexicons. Softw., Pract. Exper., 25(3):331-345, 1995.
-
(1995)
Softw., Pract. Exper.
, vol.25
, Issue.3
, pp. 331-345
-
-
Zobel, J.1
Dart, P.W.2