-
1
-
-
0025183708
-
Basic local alignment search tool
-
S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. Journal of molecular biology, 215(3):403-410, 1990.
-
(1990)
Journal of Molecular Biology
, vol.215
, Issue.3
, pp. 403-410
-
-
Altschul, S.F.1
Gish, W.2
Miller, W.3
Myers, E.W.4
Lipman, D.J.5
-
2
-
-
85104914015
-
Efficient exact set-similarity joins
-
A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, 2006.
-
(2006)
VLDB
-
-
Arasu, A.1
Ganti, V.2
Kaushik, R.3
-
3
-
-
35348849154
-
Scaling up all pairs similarity search
-
R. J. Bayardo, Y. Ma, and R. Srikant. Scaling up all pairs similarity search. In WWW, 2007.
-
(2007)
WWW
-
-
Bayardo, R.J.1
Ma, Y.2
Srikant, R.3
-
4
-
-
34547618938
-
On the resemblance and containment of documents
-
A. Z. Broder. On the resemblance and containment of documents. In SEQS, 1997.
-
(1997)
SEQS
-
-
Broder, A.Z.1
-
5
-
-
0010362121
-
Syntactic clustering of the Web
-
PII S0169755297000317
-
A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig. Syntactic clustering of the web. Computer Networks, 29(8-13):1157-1166, 1997.
-
(1997)
Computer Networks and ISDN Systems
, vol.29
, Issue.8-13
, pp. 1157-1166
-
-
Broder, A.Z.1
Glassman, S.C.2
Manasse, M.S.3
Zweig, G.4
-
6
-
-
0036040277
-
Similarity estimation techniques from rounding algorithms
-
M. Charikar. Similarity estimation techniques from rounding algorithms. In STOC, 2002.
-
(2002)
STOC
-
-
Charikar, M.1
-
7
-
-
33749597967
-
A primitive operator for similarity joins in data cleaning
-
S. Chaudhuri, V. Ganti, and R. Kaushik. A primitive operator for similarity joins in data cleaning. In ICDE, 2006.
-
(2006)
ICDE
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
9
-
-
0013206133
-
Collection statistics for fast duplicate document detection
-
DOI 10.1145/506309.506311
-
A. Chowdhury, O. Frieder, D. A. Grossman, and M. C. McCabe. Collection statistics for fast duplicate document detection. ACM Trans. Inf. Syst., 20(2):171-191, 2002.
-
(2002)
ACM Transactions on Information Systems
, vol.20
, Issue.2
, pp. 171-191
-
-
Chowdhury, A.1
Frieder, O.2
Grossman, D.3
McCabe, M.C.4
-
10
-
-
84993661659
-
M-tree: An efficient access method for similarity search in metric spaces
-
P. Ciaccia, M. Patella, and P. Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB, pages 426-435, 1997.
-
(1997)
VLDB
, pp. 426-435
-
-
Ciaccia, P.1
Patella, M.2
Zezula, P.3
-
11
-
-
35248897127
-
Similarity join in metric spaces
-
V. Dohnal, C. Gennaro, P. Savino, and P. Zezula. Similarity join in metric spaces. In ECIR, pages 452-467, 2003.
-
(2003)
ECIR
, pp. 452-467
-
-
Dohnal, V.1
Gennaro, C.2
Savino, P.3
Zezula, P.4
-
12
-
-
77952777875
-
Similarity join in metric spaces using ed-index
-
V. Dohnal, C. Gennaro, and P. Zezula. Similarity join in metric spaces using ed-index. In DEXA, 2003.
-
(2003)
DEXA
-
-
Dohnal, V.1
Gennaro, C.2
Zezula, P.3
-
13
-
-
32344441912
-
Finding similar files in large document repositories
-
G. Forman, K. Eshghi, and S. Chiocchetti. Finding similar files in large document repositories. In KDD, 2005.
-
(2005)
KDD
-
-
Forman, G.1
Eshghi, K.2
Chiocchetti, S.3
-
14
-
-
15044355327
-
Similarity search in high dimensions via hashing
-
A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In VLDB, 1999.
-
(1999)
VLDB
-
-
Gionis, A.1
Indyk, P.2
Motwani, R.3
-
15
-
-
84944318804
-
Approximate string joins in a database (almost) for free
-
L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate string joins in a database (almost) for free. In VLDB, 2001.
-
(2001)
VLDB
-
-
Gravano, L.1
Ipeirotis, P.G.2
Jagadish, H.V.3
Koudas, N.4
Muthukrishnan, S.5
Srivastava, D.6
-
16
-
-
79959981831
-
-
L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate string joins in a database (almost) for free (erratum). Technical Report CUCS-011-03, Columbia University, 2003.
-
(2003)
Approximate String Joins in a Database (Almost) for Free (Erratum). Technical Report CUCS-011-03, Columbia University
-
-
Gravano, L.1
Ipeirotis, P.G.2
Jagadish, H.V.3
Koudas, N.4
Muthukrishnan, S.5
Srivastava, D.6
-
17
-
-
0021615874
-
R-trees: A dynamic index structure for spatial searching
-
A. Guttman. R-trees: A dynamic index structure for spatial searching. In SIGMOD Conference, pages 47-57, 1984.
-
(1984)
SIGMOD Conference
, pp. 47-57
-
-
Guttman, A.1
-
18
-
-
77958004305
-
Efficient approximate search on string collections
-
M. Hadjieleftheriou and C. Li. Efficient approximate search on string collections. PVLDB, 2(2):1660-1661, 2009.
-
(2009)
PVLDB
, vol.2
, Issue.2
, pp. 1660-1661
-
-
Hadjieleftheriou, M.1
Li, C.2
-
19
-
-
84944324113
-
Efficient index structures for string databases
-
T. Kahveci and A. K. Singh. Efficient index structures for string databases. In VLDB, pages 351-360, 2001.
-
(2001)
VLDB
, pp. 351-360
-
-
Kahveci, T.1
Singh, A.K.2
-
20
-
-
0012906001
-
High dimensional similarity joins: Algorithms and performance evaluation
-
N. Koudas and K. C. Sevcik. High dimensional similarity joins: Algorithms and performance evaluation. IEEE Trans. Knowl. Data Eng., 12(1):3-18, 2000.
-
(2000)
IEEE Trans. Knowl. Data Eng.
, vol.12
, Issue.1
, pp. 3-18
-
-
Koudas, N.1
Sevcik, K.C.2
-
21
-
-
52649086729
-
Efficient merging and filtering algorithms for approximate string searches
-
C. Li, J. Lu, and Y. Lu. Efficient merging and filtering algorithms for approximate string searches. In ICDE, 2008.
-
(2008)
ICDE
-
-
Li, C.1
Lu, J.2
Lu, Y.3
-
22
-
-
85011032600
-
VGRAM: Improving performance of approximate queries on string collections using variable-length grams
-
C. Li, B. Wang, and X. Yang. VGRAM: Improving performance of approximate queries on string collections using variable-length grams. In VLDB, 2007.
-
(2007)
VLDB
-
-
Li, C.1
Wang, B.2
Yang, X.3
-
23
-
-
0345566149
-
A guided tour to approximate string matching
-
G. Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001.
-
(2001)
ACM Computing Surveys
, vol.33
, Issue.1
, pp. 31-88
-
-
Navarro, G.1
-
24
-
-
0012924426
-
A practical q-gram index for text retrieval allowing errors
-
G. Navarro and R. A. Baeza-Yates. A practical q-gram index for text retrieval allowing errors. CLEI Electron. J., 1(2), 1998.
-
(1998)
CLEI Electron. J.
, vol.1
, Issue.2
-
-
Navarro, G.1
Baeza-Yates, R.A.2
-
25
-
-
70350635615
-
Indexing variable length substrings for exact and approximate matching
-
G. Navarro and L. Salmela. Indexing variable length substrings for exact and approximate matching. In SPIRE, pages 214-221, 2009.
-
(2009)
SPIRE
, pp. 214-221
-
-
Navarro, G.1
Salmela, L.2
-
26
-
-
3142777876
-
Efficient set joins on similarity predicates
-
S. Sarawagi and A. Kirpal. Efficient set joins on similarity predicates. In SIGMOD, 2004.
-
(2004)
SIGMOD
-
-
Sarawagi, S.1
Kirpal, A.2
-
27
-
-
33846664609
-
Tandem repeats over the edit distance
-
D. Sokol, G. Benson, and J. Tojeira. Tandem repeats over the edit distance. Bioinformatics, 23(2):30-35, 2007.
-
(2007)
Bioinformatics
, vol.23
, Issue.2
, pp. 30-35
-
-
Sokol, D.1
Benson, G.2
Tojeira, J.3
-
28
-
-
36448954599
-
Principles of hash-based text retrieval
-
B. Stein. Principles of hash-based text retrieval. In SIGIR, pages 527-534, 2007.
-
(2007)
SIGIR
, pp. 527-534
-
-
Stein, B.1
-
29
-
-
72949094783
-
-
Technical Report ifi-2007.02, Department of Informatics, University of Zurich, April
-
T. Bocek, E. Hunt, and B. Stiller. Fast similarity search in large dictionaries. Technical Report ifi-2007.02, Department of Informatics, University of Zurich, April 2007.
-
(2007)
Fast Similarity Search in Large Dictionaries
-
-
Bocek, T.1
Hunt, E.2
Stiller, B.3
-
30
-
-
57349131623
-
Spotsigs: Robust and efficient near duplicate detection in large web collections
-
M. Theobald, J. Siddharth, and A. Paepcke. Spotsigs: robust and efficient near duplicate detection in large web collections. In SIGIR, pages 563-570, 2008.
-
(2008)
SIGIR
, pp. 563-570
-
-
Theobald, M.1
Siddharth, J.2
Paepcke, A.3
-
31
-
-
0015960104
-
The string-to-string correction problem
-
R. A. Wagner and M. J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168-173, 1974.
-
(1974)
J. ACM
, vol.21
, Issue.1
, pp. 168-173
-
-
Wagner, R.A.1
Fischer, M.J.2
-
32
-
-
79959957636
-
Trie-join: Efficient trie-based string similarity joins with edit
-
J. Wang, J. Feng, and G. Li. Trie-join: Efficient trie-based string similarity joins with edit. In VLDB, 2010.
-
(2010)
VLDB
-
-
Wang, J.1
Feng, J.2
Li, G.3
-
33
-
-
70849115286
-
Efficient approximate entity extraction with edit constraints
-
W. Wang, C. Xiao, X. Lin, and C. Zhang. Efficient approximate entity extraction with edit constraints. In SIGMOD, 2009.
-
(2009)
SIGMOD
-
-
Wang, W.1
Xiao, C.2
Lin, X.3
Zhang, C.4
-
34
-
-
70849105253
-
Ed-Join: An efficient algorithm for similarity joins with edit distance constraints
-
C. Xiao, W. Wang, and X. Lin. Ed-Join: an efficient algorithm for similarity joins with edit distance constraints. PVLDB, 1(1):933-944, 2008.
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 933-944
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
-
35
-
-
79955101629
-
Efficient and effective similarity search over probabilistic data based on earth mover's distance
-
J. Xu, Z. Zhang, A. K. H. Tung, and G. Yu. Efficient and effective similarity search over probabilistic data based on earth mover's distance. PVLDB, 3(1):758-769, 2010.
-
(2010)
PVLDB
, vol.3
, Issue.1
, pp. 758-769
-
-
Xu, J.1
Zhang, Z.2
Tung, A.K.H.3
Yu, G.4
-
36
-
-
57149130672
-
Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
-
X. Yang, B. Wang, and C. Li. Cost-based variable-length-gram selection for string collections to support approximate queries efficiently. In SIGMOD Conference, pages 353-364, 2008.
-
(2008)
SIGMOD Conference
, pp. 353-364
-
-
Yang, X.1
Wang, B.2
Li, C.3
-
37
-
-
2442567706
-
Making the pyramid technique robust to query types and workloads
-
R. Zhang, B. C. Ooi, and K.-L. Tan. Making the pyramid technique robust to query types and workloads. In ICDE, pages 313-324, 2004.
-
(2004)
ICDE
, pp. 313-324
-
-
Zhang, R.1
Ooi, B.C.2
Tan, K.-L.3
-
38
-
-
77954747181
-
Bed-tree: An all-purpose index structure for string similarity search based on edit distance
-
Z. Zhang, M. Hadjieleftheriou, B. C. Ooi, and D. Srivastava. Bed-tree: an all-purpose index structure for string similarity search based on edit distance. In SIGMOD Conference, pages 915-926, 2010.
-
(2010)
SIGMOD Conference
, pp. 915-926
-
-
Zhang, Z.1
Hadjieleftheriou, M.2
Ooi, B.C.3
Srivastava, D.4
-
39
-
-
77956960464
-
Similarity search on bregman divergence: Towards non-metric indexing
-
Z. Zhang, B. C. Ooi, S. Parthasarathy, and A. K. H. Tung. Similarity search on bregman divergence: Towards non-metric indexing. PVLDB, 2(1):13-24, 2009.
-
(2009)
PVLDB
, vol.2
, Issue.1
, pp. 13-24
-
-
Zhang, Z.1
Ooi, B.C.2
Parthasarathy, S.3
Tung, A.K.H.4