-
1
-
-
84864280321
-
-
http://secondstring.sourceforge.net/
-
-
-
-
2
-
-
84864278686
-
-
http://www.dcs.shef.ac.uk/~sam/simmetrics.html
-
-
-
-
3
-
-
77950901996
-
Scalable ad-hoc entity extraction from text collections
-
Agrawal S., Chakrabarti K., Chaudhuri S., Ganti V.: Scalable ad-hoc entity extraction from text collections. PVLDB 1(1), 945-957 (2008).
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 945-957
-
-
Agrawal, S.1
Chakrabarti, K.2
Chaudhuri, S.3
Ganti, V.4
-
4
-
-
52649137537
-
Transformation-based framework for record matching
-
Arasu, A., Chaudhuri, S., Kaushik, R.: Transformation-based framework for record matching. In: ICDE, pp. 40-49 (2008).
-
(2008)
ICDE
, pp. 40-49
-
-
Arasu, A.1
Chaudhuri, S.2
Kaushik, R.3
-
5
-
-
85104914015
-
Efficient exact set-similarity joins
-
Arasu, A., Ganti, V., Kaushik, R.: Efficient exact set-similarity joins. In: VLDB, pp. 918-929 (2006).
-
(2006)
VLDB
, pp. 918-929
-
-
Arasu, A.1
Ganti, V.2
Kaushik, R.3
-
6
-
-
52649127789
-
Approximate joins for data-centric xml
-
Augsten, N., Böhlen, M. H., Dyreson, C. E., Gamper, J.: Approximate joins for data-centric xml. In: ICDE, pp. 814-823 (2008).
-
(2008)
ICDE
, pp. 814-823
-
-
Augsten, N.1
Böhlen, M.H.2
Dyreson, C.E.3
Gamper, J.4
-
7
-
-
35348849154
-
Scaling up all pairs similarity search
-
Bayardo, R. J., Ma, Y., Srikant, R.: Scaling up all pairs similarity search. In: WWW, pp. 131-140 (2007).
-
(2007)
WWW
, pp. 131-140
-
-
Bayardo, R.J.1
Ma, Y.2
Srikant, R.3
-
8
-
-
52649109639
-
Compact similarity joins
-
Bryan, B., Eberhardt, F., Faloutsos, C.: Compact similarity joins. In: ICDE, pp. 346-355 (2008).
-
(2008)
ICDE
, pp. 346-355
-
-
Bryan, B.1
Eberhardt, F.2
Faloutsos, C.3
-
9
-
-
72949105984
-
Fast error-tolerant search on very large texts
-
Celikik, M., Bast, H.: Fast error-tolerant search on very large texts. In: SAC, pp. 1724-1731 (2009).
-
(2009)
SAC
, pp. 1724-1731
-
-
Celikik, M.1
Bast, H.2
-
10
-
-
57149127665
-
An efficient filter for approximate membership checking
-
Chakrabarti, K., Chaudhuri, S., Ganti, V., Xin, D.: An efficient filter for approximate membership checking. In: SIGMOD Conference, pp. 805-818 (2008).
-
(2008)
SIGMOD Conference
, pp. 805-818
-
-
Chakrabarti, K.1
Chaudhuri, S.2
Ganti, V.3
Xin, D.4
-
11
-
-
1142279457
-
Robust and efficient fuzzy match for online data cleaning
-
Chaudhuri, S., Ganjam, K., Ganti, V., Motwani, R.: Robust and efficient fuzzy match for online data cleaning. In: SIGMOD Conference, pp. 313-324 (2003).
-
(2003)
SIGMOD Conference
, pp. 313-324
-
-
Chaudhuri, S.1
Ganjam, K.2
Ganti, V.3
Motwani, R.4
-
12
-
-
84859202692
-
Data debugger: An operator-centric approach for data quality solutions
-
Chaudhuri S., Ganti V., Kaushik R.: Data debugger: An operator-centric approach for data quality solutions. IEEE Data Eng. Bull. 29(2), 60-66 (2006).
-
(2006)
IEEE Data Eng. Bull.
, vol.29
, Issue.2
, pp. 60-66
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
13
-
-
33749597967
-
A primitive operator for similarity joins in data cleaning
-
Chaudhuri, S., Ganti, V., Kaushik, R.: A primitive operator for similarity joins in data cleaning. In: ICDE, pp. 5-16 (2006).
-
(2006)
ICDE
, pp. 5-16
-
-
Chaudhuri, S.1
Ganti, V.2
Kaushik, R.3
-
14
-
-
70849103081
-
Extending autocompletion to tolerate errors
-
Chaudhuri, S., Kaushik, R.: Extending autocompletion to tolerate errors. In: SIGMOD Conference, pp. 707-718 (2009).
-
(2009)
SIGMOD Conference
, pp. 707-718
-
-
Chaudhuri, S.1
Kaushik, R.2
-
15
-
-
84993661659
-
M-tree: An efficient access method for similarity search in metric spaces
-
Ciaccia, P., Patella, M., Zezula, P.: M-tree: An efficient access method for similarity search in metric spaces. In: VLDB, pp. 426-435 (1997).
-
(1997)
VLDB
, pp. 426-435
-
-
Ciaccia, P.1
Patella, M.2
Zezula, P.3
-
16
-
-
4544388794
-
Dictionary matching and indexing with errors and don't cares
-
Cole, R., Gottlieb, L.-A., Lewenstein, M.: Dictionary matching and indexing with errors and don't cares. In: STOC, pp. 91-100 (2004).
-
(2004)
STOC
, pp. 91-100
-
-
Cole, R.1
Gottlieb, L.-A.2
Lewenstein, M.3
-
17
-
-
84945709825
-
Trie memory
-
Fredkin E.: Trie memory. Commun. ACM 3(9), 490-499 (1960).
-
(1960)
Commun. ACM
, vol.3
, Issue.9
, pp. 490-499
-
-
Fredkin, E.1
-
19
-
-
84944318804
-
Approximate string joins in a database (almost) for free
-
Gravano, L., Ipeirotis, P. G., Jagadish, H. V., Koudas, N., Muthukrishnan, S., Srivastava, D.: Approximate string joins in a database (almost) for free. In: VLDB, pp. 491-500 (2001).
-
(2001)
VLDB
, pp. 491-500
-
-
Gravano, L.1
Ipeirotis, P.G.2
Jagadish, H.V.3
Koudas, N.4
Muthukrishnan, S.5
Srivastava, D.6
-
20
-
-
0344496626
-
Index-based approximate xml joins
-
Guha, S., Koudas, N., Srivastava, D., Yu, T.: Index-based approximate xml joins. In: ICDE, pp. 708-710 (2003).
-
(2003)
ICDE
, pp. 708-710
-
-
Guha, S.1
Koudas, N.2
Srivastava, D.3
Yu, T.4
-
21
-
-
52649145249
-
Fast indexes and algorithms for set similarity selection queries
-
Hadjieleftheriou, M., Chandel, A., Koudas, N., Srivastava, D.: Fast indexes and algorithms for set similarity selection queries. In: ICDE, pp. 267-276 (2008).
-
(2008)
ICDE
, pp. 267-276
-
-
Hadjieleftheriou, M.1
Chandel, A.2
Koudas, N.3
Srivastava, D.4
-
22
-
-
70849096574
-
Incremental maintenance of length normalized indexes for approximate string matching
-
Hadjieleftheriou, M., Koudas, N., Srivastava, D.: Incremental maintenance of length normalized indexes for approximate string matching. In: SIGMOD Conference, pp. 429-440 (2009).
-
(2009)
SIGMOD Conference
, pp. 429-440
-
-
Hadjieleftheriou, M.1
Koudas, N.2
Srivastava, D.3
-
24
-
-
70349659026
-
Hashed samples: selectivity estimators for set similarity selection queries
-
Hadjieleftheriou M., Yu X., Koudas N., Srivastava D.: Hashed samples: selectivity estimators for set similarity selection queries. PVLDB 1(1), 201-212 (2008).
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 201-212
-
-
Hadjieleftheriou, M.1
Yu, X.2
Koudas, N.3
Srivastava, D.4
-
25
-
-
0038564328
-
Burst tries: a fast, efficient data structure for string keys
-
Heinz S., Zobel J., Williams H. E.: Burst tries: a fast, efficient data structure for string keys. ACM Trans. Inf. Syst. 20(2), 192-223 (2002).
-
(2002)
ACM Trans. Inf. Syst.
, vol.20
, Issue.2
, pp. 192-223
-
-
Heinz, S.1
Zobel, J.2
Williams, H.E.3
-
27
-
-
77954747849
-
Probabilistic string similarity joins
-
Jestes, J., Li, F., Yan, Z., Yi, K.: Probabilistic string similarity joins. In: SIGMOD Conference, pp. 327-338 (2010).
-
(2010)
SIGMOD Conference
, pp. 327-338
-
-
Jestes, J.1
Li, F.2
Yan, Z.3
Yi, K.4
-
28
-
-
84865633750
-
Efficient interactive fuzzy keyword search
-
Ji, S., Li, G., Li, C., Feng, J.: Efficient interactive fuzzy keyword search. In: WWW, pp. 433-439 (2009).
-
(2009)
WWW
, pp. 433-439
-
-
Ji, S.1
Li, G.2
Li, C.3
Feng, J.4
-
29
-
-
84944324113
-
Efficient index structures for string databases
-
Kahveci, T., Singh, A. K.: Efficient index structures for string databases. In: VLDB, pp. 351-360 (2001).
-
(2001)
VLDB
, pp. 351-360
-
-
Kahveci, T.1
Singh, A.K.2
-
30
-
-
33745621089
-
n-Gram/2L: A space and time efficient two-level n-gram inverted index structure
-
Kim, M.-S., Whang, K.-Y., Lee, J.-G., Lee, M.-J.: n-Gram/2L: A space and time efficient two-level n-gram inverted index structure. In: VLDB, pp. 325-336 (2005).
-
(2005)
VLDB
, pp. 325-336
-
-
Kim, M.-S.1
Whang, K.-Y.2
Lee, J.-G.3
Lee, M.-J.4
-
32
-
-
85011072445
-
Extending q-grams to estimate selectivity of string matching with low edit distance
-
Lee, H., Ng, R. T., Shim, K.: Extending q-grams to estimate selectivity of string matching with low edit distance. In: VLDB, pp. 195-206 (2007).
-
(2007)
VLDB
, pp. 195-206
-
-
Lee, H.1
Ng, R.T.2
Shim, K.3
-
33
-
-
77957718350
-
Power-law based estimation of set similarity join size
-
Lee H., Ng R. T., Shim K.: Power-law based estimation of set similarity join size. PVLDB 2(1), 658-669 (2009).
-
(2009)
PVLDB
, vol.2
, Issue.1
, pp. 658-669
-
-
Lee, H.1
Ng, R.T.2
Shim, K.3
-
34
-
-
52649086729
-
Efficient merging and filtering algorithms for approximate string searches
-
Li, C., Lu, J., Lu, Y.: Efficient merging and filtering algorithms for approximate string searches. In: ICDE, pp. 257-266 (2008).
-
(2008)
ICDE
, pp. 257-266
-
-
Li, C.1
Lu, J.2
Lu, Y.3
-
35
-
-
85011032600
-
VGRAM: Improving performance of approximate queries on string collections using variable-length grams
-
Li, C., Wang, B., Yang, X.: VGRAM: Improving performance of approximate queries on string collections using variable-length grams. In: VLDB, pp. 303-314 (2007).
-
(2007)
VLDB
, pp. 303-314
-
-
Li, C.1
Wang, B.2
Yang, X.3
-
36
-
-
79959922359
-
Faerie: Efficient filtering algorithms for approximate dictionary-based entity extraction
-
Li, G., Deng, D., Feng, J.: Faerie: Efficient filtering algorithms for approximate dictionary-based entity extraction. In: SIGMOD Conference, pp. 529-540 (2011).
-
(2011)
SIGMOD Conference
, pp. 529-540
-
-
Li, G.1
Deng, D.2
Feng, J.3
-
37
-
-
79960467518
-
Efficient fuzzy full-text type-ahead search
-
Li G., Ji S., Li C., Feng J.: Efficient fuzzy full-text type-ahead search. VLDB J. 20(4), 617-640 (2011).
-
(2011)
VLDB J.
, vol.20
, Issue.4
, pp. 617-640
-
-
Li, G.1
Ji, S.2
Li, C.3
Feng, J.4
-
38
-
-
84859260100
-
Set similarity join on probabilistic data
-
Lian X., Chen L.: Set similarity join on probabilistic data. PVLDB 3(1), 650-659 (2010).
-
(2010)
PVLDB
, vol.3
, Issue.1
, pp. 650-659
-
-
Lian, X.1
Chen, L.2
-
39
-
-
74549168398
-
Efficient algorithms for approximate member extraction using signature-based inverted lists
-
Lu, J., Han, J., Meng, X.: Efficient algorithms for approximate member extraction using signature-based inverted lists. In: CIKM, pp. 315-324 (2009).
-
(2009)
CIKM
, pp. 315-324
-
-
Lu, J.1
Han, J.2
Meng, X.3
-
40
-
-
38149018071
-
Patricia: practical algorithm to retrieve information coded in alphanumeric
-
Morrison D. R.: Patricia: practical algorithm to retrieve information coded in alphanumeric. J. ACM 15, 514-534 (1968).
-
(1968)
J. ACM
, vol.15
, pp. 514-534
-
-
Morrison, D.R.1
-
41
-
-
0345566149
-
A guided tour to approximate string matching
-
Navarro G.: A guided tour to approximate string matching. ACM Comput. Surv. 33(1), 31-88 (2001).
-
(2001)
ACM Comput. Surv.
, vol.33
, Issue.1
, pp. 31-88
-
-
Navarro, G.1
-
43
-
-
84976659272
-
Computer programs for detecting and correcting spelling errors
-
Peterson J. L.: Computer programs for detecting and correcting spelling errors. Commun. ACM 23(12), 676-687 (1980).
-
(1980)
Commun. ACM
, vol.23
, Issue.12
, pp. 676-687
-
-
Peterson, J.L.1
-
44
-
-
84864279560
-
-
Available at
-
Russell, R. C.: Available at http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=1261167 (1918).
-
(1918)
-
-
Russell, R.C.1
-
45
-
-
0344065611
-
Distance based indexing for string proximity search
-
Sahinalp, S. C., Tasan, M., Macker, J., Özsoyoglu, Z. M.: Distance based indexing for string proximity search. In: ICDE, pp. 125-136 (2003).
-
(2003)
ICDE
, pp. 125-136
-
-
Sahinalp, S.C.1
Tasan, M.2
Macker, J.3
Özsoyoglu, Z.M.4
-
46
-
-
0017930815
-
Dynamic programming algorithm optimization for spoken word recognition
-
Sakoe H., Chiba S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust Speech Signal Process 26, 43-49 (1978).
-
(1978)
IEEE Trans. Acoust Speech Signal Process
, vol.26
, pp. 43-49
-
-
Sakoe, H.1
Chiba, S.2
-
48
-
-
3142777876
-
Efficient set joins on similarity predicates
-
Sarawagi, S., Kirpal, A.: Efficient set joins on similarity predicates. In: SIGMOD Conference, pp. 743-754 (2004).
-
(2004)
SIGMOD Conference
, pp. 743-754
-
-
Sarawagi, S.1
Kirpal, A.2
-
49
-
-
84893371228
-
Fast string correction with levenshtein automata
-
Schulz K. U., Mihov S.: Fast string correction with levenshtein automata. Intl J Doc Anal Recognit 5(1), 67-85 (2002).
-
(2002)
Intl J Doc Anal Recognit
, vol.5
, Issue.1
, pp. 67-85
-
-
Schulz, K.U.1
Mihov, S.2
-
50
-
-
0005079936
-
Use of tree structures for processing files
-
Sussenguth E. H.: Use of tree structures for processing files. Commun. ACM 6, 272-279 (1963).
-
(1963)
Commun. ACM
, vol.6
, pp. 272-279
-
-
Sussenguth, E.H.1
-
51
-
-
77954744650
-
Efficient parallel set-similarity joins using mapreduce
-
Vernica, R., Carey, M. J., Li, C.: Efficient parallel set-similarity joins using mapreduce. In: SIGMOD Conference, pp. 495-506 (2010).
-
(2010)
SIGMOD Conference
, pp. 495-506
-
-
Vernica, R.1
Carey, M.J.2
Li, C.3
-
52
-
-
79957822983
-
Trie-join: Efficient trie-based string similarity joins with edit-distance constraints
-
Wang J., Li G., Feng J.: Trie-join: Efficient trie-based string similarity joins with edit-distance constraints. PVLDB 3(1), 1219-1230 (2010).
-
(2010)
PVLDB
, vol.3
, Issue.1
, pp. 1219-1230
-
-
Wang, J.1
Li, G.2
Feng, J.3
-
53
-
-
79957824788
-
Fast-join: An efficient method for fuzzy token matching based string similarity join
-
Wang, J., Li, G., Feng, J.: Fast-join: An efficient method for fuzzy token matching based string similarity join. In: ICDE, pp. 458-469 (2011).
-
(2011)
ICDE
, pp. 458-469
-
-
Wang, J.1
Li, G.2
Feng, J.3
-
54
-
-
84863541462
-
Entity matching: how similar is similar
-
Wang J., Li G., Yu J. X., Feng J.: Entity matching: how similar is similar. PVLDB 4(10), 622-633 (2011).
-
(2011)
PVLDB
, vol.4
, Issue.10
, pp. 622-633
-
-
Wang, J.1
Li, G.2
Yu, J.X.3
Feng, J.4
-
55
-
-
70849115286
-
Efficient approximate entity extraction with edit distance constraints
-
Wang, W., Xiao, C., Lin, X., Zhang, C.: Efficient approximate entity extraction with edit distance constraints. In: SIGMOD Conference, pp. 759-770 (2009).
-
(2009)
SIGMOD Conference
, pp. 759-770
-
-
Wang, W.1
Xiao, C.2
Lin, X.3
Zhang, C.4
-
56
-
-
70849105253
-
Ed-join: an efficient algorithm for similarity joins with edit distance constraints
-
Xiao C., Wang W., Lin X.: Ed-join: an efficient algorithm for similarity joins with edit distance constraints. PVLDB 1(1), 933-944 (2008).
-
(2008)
PVLDB
, vol.1
, Issue.1
, pp. 933-944
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
-
57
-
-
67649653766
-
Top-k set similarity joins
-
Xiao, C., Wang, W., Lin, X., Shang, H.: Top-k set similarity joins. In: ICDE, pp. 916-927 (2009).
-
(2009)
ICDE
, pp. 916-927
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
Shang, H.4
-
58
-
-
57349141410
-
Efficient similarity joins for near duplicate detection
-
Xiao, C., Wang, W., Lin, X., Yu, J. X.: Efficient similarity joins for near duplicate detection. In: WWW, pp. 131-140 (2008).
-
(2008)
WWW
, pp. 131-140
-
-
Xiao, C.1
Wang, W.2
Lin, X.3
Yu, J.X.4
-
59
-
-
57149130672
-
Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
-
Yang, X., Wang, B., Li, C.: Cost-based variable-length-gram selection for string collections to support approximate queries efficiently. In: SIGMOD Conference, pp. 353-364 (2008).
-
(2008)
SIGMOD Conference
, pp. 353-364
-
-
Yang, X.1
Wang, B.2
Li, C.3