-
1
-
-
0042960934
-
Fast and practical approximate string matching
-
R. A. Baeza-Yates and C. H. Perleberg. Fast and practical approximate string matching. Inf.Proc.Lett., 59(1):21-27, 1996.
-
(1996)
Inf.Proc.Lett.
, vol.59
, Issue.1
, pp. 21-27
-
-
Baeza-Yates, R.A.1
Perleberg, C.H.2
-
2
-
-
33645851469
-
A general-purpose compression scheme for large collections
-
A. Cannane and H. E. Williams. A general-purpose compression scheme for large collections. ACM Trans. Inf. Syst., 20(3):329-355, July 2002.
-
(2002)
ACM Trans. Inf. Syst.
, vol.20
, Issue.3
, pp. 329-355
-
-
Cannane, A.1
Williams, H.E.2
-
3
-
-
85011015609
-
EntityRank: Searching entities directly and holistically
-
In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB '07, pages. VLDB Endowment
-
T. Cheng, X. Yan, and K. C.-C. Chang. EntityRank: Searching entities directly and holistically. In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB '07, pages 387-398. VLDB Endowment, 2007.
-
(2007)
, pp. 387-398
-
-
Cheng, T.1
Yan, X.2
Chang, K.C.-C.3
-
4
-
-
83055168142
-
Indexes for highly repetitive document collections
-
In CIKM, pages, New York, NY, USA, ACM.
-
F. Claude, A. Fariña, et al. Indexes for highly repetitive document collections. In CIKM, pages 463-468, New York, NY, USA, 2011. ACM.
-
(2011)
, pp. 463-468
-
-
Claude, F.1
Fariña, A.2
-
5
-
-
0029716125
-
Parsing with prefix and suffix dictionaries
-
In Data Compression Conf., pages
-
M. Cohn and R. Khazan. Parsing with prefix and suffix dictionaries. In Data Compression Conf., pages 180-189, 1996.
-
(1996)
, pp. 180-189
-
-
Cohn, M.1
Khazan, R.2
-
6
-
-
85199273040
-
Indexing large genome collections on a PC
-
CoRR, abs/1403.7481
-
A. Danek, S. Deorowicz, and S. Grabowski. Indexing large genome collections on a PC. CoRR, abs/1403.7481, 2014.
-
(2014)
-
-
Danek, A.1
Deorowicz, S.2
Grabowski, S.3
-
9
-
-
39549090389
-
SeqAn: an efficient, generic C++ library for sequence analysis
-
A. Döring, D. Weese, et al. SeqAn: an efficient, generic C++ library for sequence analysis. BMC Bioinformatics, 9, 2008.
-
(2008)
BMC Bioinformatics
, vol.9
-
-
Döring, A.1
Weese, D.2
-
10
-
-
84904014664
-
AliBI: An Alignment-Based Index for Genomic Datasets
-
ArXiv e-prints, July
-
H. Ferrada, T. Gagie, et al. AliBI: An Alignment-Based Index for Genomic Datasets. ArXiv e-prints, July 2013.
-
(2013)
-
-
Ferrada, H.1
Gagie, T.2
-
11
-
-
84899088435
-
Hybrid indexes for repetitive datasets
-
H. Ferrada, T. Gagie, et al. Hybrid indexes for repetitive datasets. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 372(2016), 2014.
-
(2014)
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
, vol.372
, Issue.2016
-
-
Ferrada, H.1
Gagie, T.2
-
12
-
-
84884295366
-
Document listing on repetitive collections
-
In J. Fischer and P. Sanders, editors, Combinatorial Pattern Matching, volume 7922 of Lecture Notes in Computer Science, pages. Springer Berlin Heidelberg
-
T. Gagie, K. Karhu, et al. Document listing on repetitive collections. In J. Fischer and P. Sanders, editors, Combinatorial Pattern Matching, volume 7922 of Lecture Notes in Computer Science, pages 107-119. Springer Berlin Heidelberg, 2013.
-
(2013)
, pp. 107-119
-
-
Gagie, T.1
Karhu, K.2
-
13
-
-
84876802097
-
Wallbreaker: Overcoming the wall effect in similarity search
-
In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages , New York, NY, USA, ACM.
-
S. Gerdjikov, S. Mihov, et al. Wallbreaker: Overcoming the wall effect in similarity search. In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages 366-369, New York, NY, USA, 2013. ACM.
-
(2013)
, pp. 366-369
-
-
Gerdjikov, S.1
Mihov, S.2
-
14
-
-
65449144325
-
Evaluation of next generation sequencing platforms for population targeted sequencing studies
-
R32+
-
O. Harismendy, P. Ng, et al. Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biology, 10(3):R32+, 2009.
-
(2009)
Genome Biology
, vol.10
, Issue.3
-
-
Harismendy, O.1
Ng, P.2
-
15
-
-
65949095627
-
Breaking a time-and-space barrier in constructing full-text indices
-
W. Hon, K. Sadakane, and W. Sung. Breaking a time-and-space barrier in constructing full-text indices. SIAM Journal on Computing, 38(6):2162-2178, 2009.
-
(2009)
SIAM Journal on Computing
, vol.38
, Issue.6
, pp. 2162-2178
-
-
Hon, W.1
Sadakane, K.2
Sung, W.3
-
16
-
-
80052129821
-
Sample selection for dictionary-based corpus compression
-
In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, pages , New York, NY, USA, ACM.
-
C. Hoobin, S. Puglisi, and J. Zobel. Sample selection for dictionary-based corpus compression. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, pages 1137-1138, New York, NY, USA, 2011. ACM.
-
(2011)
, pp. 1137-1138
-
-
Hoobin, C.1
Puglisi, S.2
Zobel, J.3
-
17
-
-
84863731168
-
Relative Lempel-Ziv factorization for efficient storage and retrieval of web collections
-
C. Hoobin, S. J. Puglisi, and J. Zobel. Relative Lempel-Ziv factorization for efficient storage and retrieval of web collections. PVLDB, 5(3):265-273, 2011.
-
(2011)
PVLDB
, vol.5
, Issue.3
, pp. 265-273
-
-
Hoobin, C.1
Puglisi, S.J.2
Zobel, J.3
-
18
-
-
84876803927
-
Efficient parallel partition-based algorithms for similarity search and join with edit distance constraints
-
In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages, New York, NY, USA, ACM.
-
Y. Jiang, D. Deng, et al. Efficient parallel partition-based algorithms for similarity search and join with edit distance constraints. In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages 341-348, New York, NY, USA, 2013. ACM.
-
(2013)
, pp. 341-348
-
-
Jiang, Y.1
Deng, D.2
-
19
-
-
0000904908
-
Fast pattern matching in strings
-
D. E. Knuth, J. H. Morris, Jr, and V. R. Pratt. Fast pattern matching in strings. SIAM journal on computing, 6(2):323-350, 1977.
-
(1977)
SIAM journal on computing
, vol.6
, Issue.2
, pp. 323-350
-
-
Knuth, D.E.1
Morris Jr., J.H.2
Pratt, V.R.3
-
20
-
-
84876408746
-
On compressing and indexing repetitive sequences
-
S. Kreft and G. Navarro. On compressing and indexing repetitive sequences. Theoretical Computer Science, 483(0):115-133, 2013.
-
(2013)
Theoretical Computer Science
, vol.483
, pp. 115-133
-
-
Kreft, S.1
Navarro, G.2
-
21
-
-
78449295543
-
Relative Lempel-Ziv compression of genomes for large-scale storage and retrieval
-
In Proceedings of SPIRE 2010, pages, Berlin, Heidelberg, Springer-Verlag.
-
S. Kuruppu, S. J. Puglisi, and J. Zobel. Relative Lempel-Ziv compression of genomes for large-scale storage and retrieval. In Proceedings of SPIRE 2010, pages 201-206, Berlin, Heidelberg, 2010. Springer-Verlag.
-
(2010)
, pp. 201-206
-
-
Kuruppu, S.1
Puglisi, S.J.2
Zobel, J.3
-
22
-
-
0032647886
-
Offline dictionary-based compression
-
In Data Compression Conference, Proceedings. DCC '99, pages, Mar 1999.
-
N. Larsson and A. Moffat. Offline dictionary-based compression. In Data Compression Conference, 1999. Proceedings. DCC '99, pages 296-305, Mar 1999.
-
(1999)
, pp. 296-305
-
-
Larsson, N.1
Moffat, A.2
-
23
-
-
84863702145
-
Compressive genomics
-
P.-R. Loh, M. Baym, and B. Berger. Compressive genomics. Nature Biotechnology, 30(7):627-630, July 2012.
-
(2012)
Nature Biotechnology
, vol.30
, Issue.7
, pp. 627-630
-
-
Loh, P.-R.1
Baym, M.2
Berger, B.3
-
24
-
-
85175633764
-
Mining naturally-occurring corrections and paraphrases from Wikipedia's revision history
-
In N. Calzolari, K. Choukri, et al., editors, LREC 2010, Valletta, Malta. European Language Resources Association.
-
A. Max and G. Wisniewski. Mining naturally-occurring corrections and paraphrases from Wikipedia's revision history. In N. Calzolari, K. Choukri, et al., editors, LREC 2010, Valletta, Malta, 2010. European Language Resources Association.
-
(2010)
-
-
Max, A.1
Wisniewski, G.2
-
25
-
-
0038531207
-
Efficient algorithms for enumerating intersection intervals and rectangles
-
Technical report, Xerox Palo Alto Research Center
-
E. McCreight. Efficient algorithms for enumerating intersection intervals and rectangles. Technical report, Xerox Palo Alto Research Center, 1980.
-
(1980)
-
-
McCreight, E.1
-
26
-
-
79955130548
-
Wikipedia vandalism detection
-
In S. Srinivasan, K. Ramamritham, et al., editors, WWW (Companion Volume), pages. ACM
-
S. M. Mola-Velasco. Wikipedia vandalism detection. In S. Srinivasan, K. Ramamritham, et al., editors, WWW (Companion Volume), pages 391-396. ACM, 2011.
-
(2011)
, pp. 391-396
-
-
Mola-Velasco, S.M.1
-
27
-
-
84958529445
-
Document retrieval on repetitive collections
-
In A. Schulz and D. Wagner, editors, Algorithms - ESA 2014, volume 8737 of Lecture Notes in Computer Science, pages. Springer Berlin Heidelberg
-
G. Navarro, S. Puglisi, and J. Siren. Document retrieval on repetitive collections. In A. Schulz and D. Wagner, editors, Algorithms - ESA 2014, volume 8737 of Lecture Notes in Computer Science, pages 725-736. Springer Berlin Heidelberg, 2014.
-
(2014)
, pp. 725-736
-
-
Navarro, G.1
Puglisi, S.2
Siren, J.3
-
28
-
-
85022188413
-
CST++
-
In SPIRE'10, pages
-
E. Ohlebusch, J. Fischer, and S. Gog. CST++. In SPIRE'10, pages 322-333, 2010.
-
(2010)
, pp. 322-333
-
-
Ohlebusch, E.1
Fischer, J.2
Gog, S.3
-
29
-
-
80052126396
-
Inverted indexes for phrases and strings
-
In SIGIR 2011, pages , New York, NY, USA. ACM.
-
M. Patil, S. V. Thankachan, et al. Inverted indexes for phrases and strings. In SIGIR 2011, pages 555-564, New York, NY, USA, 2011. ACM.
-
(2011)
, pp. 555-564
-
-
Patil, M.1
Thankachan, S.V.2
-
30
-
-
84873187741
-
GReEn: a tool for efficient compression of genome resequencing data
-
A. J. Pinho, D. Pratas, and S. P. Garcia. GReEn: a tool for efficient compression of genome resequencing data. Nucleic Acids Research, Dec. 2011.
-
(2011)
Nucleic Acids Research
-
-
Pinho, A.J.1
Pratas, D.2
Garcia, S.P.3
-
31
-
-
49549106252
-
Segment-based multiple sequence alignment
-
T. Rausch, A.-K. Emde, et al. Segment-based multiple sequence alignment. Bioinformatics, 24(16):i187-i192, 2008.
-
(2008)
Bioinformatics
, vol.24
, Issue.16
, pp. i187-i192
-
-
Rausch, T.1
Emde, A.-K.2
-
32
-
-
76249113666
-
Simultaneous alignment of short reads against multiple genomes
-
R98+, Sept
-
K. Schneeberger, J. Hagmann, et al. Simultaneous alignment of short reads against multiple genomes. Genome biology, 10(9):R98+, Sept. 2009.
-
(2009)
Genome biology
, vol.10
, Issue.9
-
-
Schneeberger, K.1
Hagmann, J.2
-
33
-
-
85199279540
-
Indexing graphs for path queries with applications in genome research
-
IEEE/ACM Transactions on Computational Biology and Bioinformatics, (accepted).
-
J. Siren, N. Välimäki, and V. Mäkinen. Indexing graphs for path queries with applications in genome research. IEEE/ACM Transactions on Computational Biology and Bioinformatics, (accepted).
-
-
-
Siren, J.1
Välimäki, N.2
Mäkinen, V.3
-
34
-
-
84904582651
-
Principled dictionary pruning for low-memory corpus compression
-
In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '14, pages , New York, NY, USA, ACM.
-
J. Tong, A. Wirth, and J. Zobel. Principled dictionary pruning for low-memory corpus compression. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '14, pages 283-292, New York, NY, USA, 2014. ACM.
-
(2014)
, pp. 283-292
-
-
Tong, J.1
Wirth, A.2
Zobel, J.3
-
35
-
-
84904764719
-
Trends in genome compression
-
S. Wandelt, M. Bux, and U. Leser. Trends in genome compression. Current Bioinformatics, 9(3):315-326, 2014.
-
(2014)
Current Bioinformatics
, vol.9
, Issue.3
, pp. 315-326
-
-
Wandelt, S.1
Bux, M.2
Leser, U.3
-
36
-
-
84894514001
-
FRESCO: Referential compression of highly similar sequences
-
S. Wandelt and U. Leser. FRESCO: Referential compression of highly similar sequences. IEEE/ACM Trans. Comput. Biol. Bioinformatics, 10(5):1275-1288, Sept. 2013.
-
(2013)
IEEE/ACM Trans. Comput. Biol. Bioinformatics
, vol.10
, Issue.5
, pp. 1275-1288
-
-
Wandelt, S.1
Leser, U.2
-
37
-
-
84891054664
-
RCSI: Scalable similarity search in thousand(s) of genomes
-
S. Wandelt, J. Starlinger, et al. RCSI: Scalable similarity search in thousand(s) of genomes. PVLDB, 6(13):1534-1545, 2013.
-
(2013)
PVLDB
, vol.6
, Issue.13
, pp. 1534-1545
-
-
Wandelt, S.1
Starlinger, J.2
-
38
-
-
84881341729
-
Efficient direct search on compressed genomic data
-
In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, ICDE, pages. IEEE Computer Society
-
X. Yang, B. Wang, et al. Efficient direct search on compressed genomic data. In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, ICDE, pages 961-972. IEEE Computer Society, 2013.
-
(2013)
, pp. 961-972
-
-
Yang, X.1
Wang, B.2
-
39
-
-
33749646126
-
Super-scalar RAM-CPU cache compression
-
In ICDE, pages, Washington, DC, USA. IEEE Computer Society.
-
M. Zukowski, S. Heman, et al. Super-scalar RAM-CPU cache compression. In ICDE, pages 59-70, Washington, DC, USA, 2006. IEEE Computer Society.
-
(2006)
, pp. 59-70
-
-
Zukowski, M.1
Heman, S.2