



Volume 8, Issue 5, 2015, Pages 461-472

MRCSI: Compressing and Searching String Collections with Multiple References

Author keywords

Compression; Dissimilar strings; Indexing

Indexed keywords

COMPRESSION RATIO (MACHINERY);

EID: 84955565000     PISSN: None     EISSN: 21508097     Source Type: Journal    
DOI: 10.14778/2735479.2735480     Document Type: Chapter
Times cited : (11)

References (39)
  • 1
    • 0042960934
    • Fast and practical approximate string matching
    • R. A. Baeza-Yates and C. H. Perleberg. Fast and practical approximate string matching. Inf.Proc.Lett., 59(1):21-27, 1996.
    • (1996) Inf.Proc.Lett. , vol.59 , Issue.1 , pp. 21-27
    • Baeza-Yates, R.A.1    Perleberg, C.H.2
  • 2
    • 33645851469
    • A general-purpose compression scheme for large collections
    • A. Cannane and H. E. Williams. A general-purpose compression scheme for large collections. ACM Trans. Inf. Syst., 20(3):329-355, July 2002.
    • (2002) ACM Trans. Inf. Syst. , vol.20 , Issue.3 , pp. 329-355
    • Cannane, A.1    Williams, H.E.2
  • 3
    • 85011015609
    • Entityrank: Searching entities directly and holistically
    • In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB '07, pages. VLDB Endowment
    • T. Cheng, X. Yan, and K. C.-C. Chang. Entityrank: Searching entities directly and holistically. In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB '07, pages 387-398. VLDB Endowment, 2007.
    • (2007) , pp. 387-398
    • Cheng, T.1    Yan, X.2    Chang, K.C.-C.3
  • 4
    • 83055168142
    • Indexes for highly repetitive document collections
    • In CIKM, pages, New York, NY, USA, ACM.
    • F. Claude, A. Fariña, et al. Indexes for highly repetitive document collections. In CIKM, pages 463-468, New York, NY, USA, 2011. ACM.
    • (2011) , pp. 463-468
    • Claude, F.1    Fariña, A.2
  • 5
    • 0029716125
    • Parsing with prefix and suffix dictionaries
    • In Data Compression Conf., pages
    • M. Cohn and R. Khazan. Parsing with prefix and suffix dictionaries. In Data Compression Conf., pages 180-189, 1996.
    • (1996) , pp. 180-189
    • Cohn, M.1    Khazan, R.2
  • 6
    • 85199273040
    • Indexing large genome collections on a PC
    • CoRR, abs/1403.7481
    • A. Danek, S. Deorowicz, and S. Grabowski. Indexing large genome collections on a PC. CoRR, abs/1403.7481, 2014.
    • (2014)
    • Danek, A.1    Deorowicz, S.2    Grabowski, S.3
  • 9
    • 39549090389
    • Seqan an efficient, generic C++ library for sequence analysis
    • A. Döring, D. Weese, et al. Seqan an efficient, generic C++ library for sequence analysis. BMC Bioinformatics, 9, 2008.
    • (2008) BMC Bioinformatics , vol.9
    • Döring, A.1    Weese, D.2
  • 10
    • 84904014664
    • AliBI: An Alignment-Based Index for Genomic Datasets
    • ArXiv e-prints, July
    • H. Ferrada, T. Gagie, et al. AliBI: An Alignment-Based Index for Genomic Datasets. ArXiv e-prints, July 2013.
    • (2013)
    • Ferrada, H.1    Gagie, T.2
  • 12
    • 84884295366
    • Document listing on repetitive collections
    • In J. Fischer and P. Sanders, editors, Combinatorial Pattern Matching, volume 7922 of Lecture Notes in Computer Science, pages. Springer Berlin Heidelberg
    • T. Gagie, K. Karhu, et al. Document listing on repetitive collections. In J. Fischer and P. Sanders, editors, Combinatorial Pattern Matching, volume 7922 of Lecture Notes in Computer Science, pages 107-119. Springer Berlin Heidelberg, 2013.
    • (2013) , pp. 107-119
    • Gagie, T.1    Karhu, K.2
  • 13
    • 84876802097
    • Wallbreaker: Overcoming the wall effect in similarity search
    • In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages , New York, NY, USA, ACM.
    • S. Gerdjikov, S. Mihov, et al. Wallbreaker: Overcoming the wall effect in similarity search. In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages 366-369, New York, NY, USA, 2013. ACM.
    • (2013) , pp. 366-369
    • Gerdjikov, S.1    Mihov, S.2
  • 14
    • 65449144325
    • Evaluation of next generation sequencing platforms for population targeted sequencing studies
    • R32+
    • O. Harismendy, P. Ng, et al. Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biology, 10(3):R32+, 2009.
    • (2009) Genome Biology , vol.10 , Issue.3
    • Harismendy, O.1    Ng, P.2
  • 15
    • 65949095627
    • Breaking a time-and-space barrier in constructing full-text indices
    • W. Hon, K. Sadakane, and W. Sung. Breaking a time-and-space barrier in constructing full-text indices. SIAM Journal on Computing, 38(6):2162-2178, 2009.
    • (2009) SIAM Journal on Computing , vol.38 , Issue.6 , pp. 2162-2178
    • Hon, W.1    Sadakane, K.2    Sung, W.3
  • 16
    • 80052129821
    • Sample selection for dictionary-based corpus compression
    • In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, pages , New York, NY, USA, ACM.
    • C. Hoobin, S. Puglisi, and J. Zobel. Sample selection for dictionary-based corpus compression. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, pages 1137-1138, New York, NY, USA, 2011. ACM.
    • (2011) , pp. 1137-1138
    • Hoobin, C.1    Puglisi, S.2    Zobel, J.3
  • 17
    • 84863731168
    • Relative Lempel-Ziv factorization for efficient storage and retrieval of web collections
    • C. Hoobin, S. J. Puglisi, and J. Zobel. Relative Lempel-Ziv factorization for efficient storage and retrieval of web collections. PVLDB, 5(3):265-273, 2011.
    • (2011) PVLDB , vol.5 , Issue.3 , pp. 265-273
    • Hoobin, C.1    Puglisi, S.J.2    Zobel, J.3
  • 18
    • 84876803927
    • Efficient parallel partition-based algorithms for similarity search and join with edit distance constraints
    • In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages, New York, NY, USA, ACM.
    • Y. Jiang, D. Deng, et al. Efficient parallel partition-based algorithms for similarity search and join with edit distance constraints. In Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT '13, pages 341-348, New York, NY, USA, 2013. ACM.
    • (2013) , pp. 341-348
    • Jiang, Y.1    Deng, D.2
  • 20
    • 84876408746
    • On compressing and indexing repetitive sequences
    • S. Kreft and G. Navarro. On compressing and indexing repetitive sequences. Theoretical Computer Science, 483(0):115-133, 2013.
    • (2013) Theoretical Computer Science , vol.483 , pp. 115-133
    • Kreft, S.1    Navarro, G.2
  • 21
    • 78449295543
    • Relative lempel-ziv compression of genomes for large-scale storage and retrieval
    • In Proceedings of SPIRE 2010, pages, Berlin, Heidelberg, Springer-Verlag.
    • S. Kuruppu, S. J. Puglisi, and J. Zobel. Relative lempel-ziv compression of genomes for large-scale storage and retrieval. In Proceedings of SPIRE 2010, pages 201-206, Berlin, Heidelberg, 2010. Springer-Verlag.
    • (2010) , pp. 201-206
    • Kuruppu, S.1    Puglisi, S.J.2    Zobel, J.3
  • 22
    • 0032647886
    • Offline dictionary-based compression
    • In Data Compression Conference, Proceedings. DCC '99, pages, Mar 1999.
    • N. Larsson and A. Moffat. Offline dictionary-based compression. In Data Compression Conference, 1999. Proceedings. DCC '99, pages 296-305, Mar 1999.
    • (1999) , pp. 296-305
    • Larsson, N.1    Moffat, A.2
  • 24
    • 85175633764
    • Mining naturally-occurring corrections and paraphrases from Wikipedia's revision history
    • In N. Calzolari, K. Choukri, et al., editors, LREC 2010, Valletta, Malta. European Language Resources Association.
    • A. Max and G. Wisniewski. Mining naturally-occurring corrections and paraphrases from Wikipedia's revision history. In N. Calzolari, K. Choukri, et al., editors, LREC 2010, Valletta, Malta, 2010. European Language Resources Association.
    • (2010)
    • Max, A.1    Wisniewski, G.2
  • 25
    • 0038531207
    • Efficient algorithms for enumerating intersection intervals and rectangles
    • Technical report, Xerox Palo Alto Research Center
    • E. McCreight. Efficient algorithms for enumerating intersection intervals and rectangles. Technical report, Xerox Palo Alto Research Center, 1980.
    • (1980)
    • McCreight, E.1
  • 26
    • 79955130548
    • Wikipedia vandalism detection
    • In S. Srinivasan, K. Ramamritham, et al., editors, WWW (Companion Volume), pages. ACM
    • S. M. Mola-Velasco. Wikipedia vandalism detection. In S. Srinivasan, K. Ramamritham, et al., editors, WWW (Companion Volume), pages 391-396. ACM, 2011.
    • (2011) , pp. 391-396
    • Mola-Velasco, S.M.1
  • 27
    • 84958529445
    • Document retrieval on repetitive collections
    • In A. Schulz and D. Wagner, editors, Algorithms - ESA 2014, volume 8737 of Lecture Notes in Computer Science, pages. Springer Berlin Heidelberg
    • G. Navarro, S. Puglisi, and J. Siren. Document retrieval on repetitive collections. In A. Schulz and D. Wagner, editors, Algorithms - ESA 2014, volume 8737 of Lecture Notes in Computer Science, pages 725-736. Springer Berlin Heidelberg, 2014.
    • (2014) , pp. 725-736
    • Navarro, G.1    Puglisi, S.2    Siren, J.3
  • 28
    • 85022188413
    • CST++
    • In SPIRE'10, pages
    • E. Ohlebusch, J. Fischer, and S. Gog. CST++. In SPIRE'10, pages 322-333, 2010.
    • (2010) , pp. 322-333
    • Ohlebusch, E.1    Fischer, J.2    Gog, S.3
  • 29
    • 80052126396
    • Inverted indexes for phrases and strings
    • In SIGIR 2011, pages , New York, NY, USA. ACM.
    • M. Patil, S. V. Thankachan, et al. Inverted indexes for phrases and strings. In SIGIR 2011, pages 555-564, New York, NY, USA, 2011. ACM.
    • (2011) , pp. 555-564
    • Patil, M.1    Thankachan, S.V.2
  • 30
    • 84873187741
    • Green: a tool for efficient compression of genome resequencing data
    • A. J. Pinho, D. Pratas, and S. P. Garcia. Green: a tool for efficient compression of genome resequencing data. Nucleic Acids Research, Dec. 2011.
    • (2011) Nucleic Acids Research
    • Pinho, A.J.1    Pratas, D.2    Garcia, S.P.3
  • 31
    • 49549106252
    • Segment-based multiple sequence alignment
    • T. Rausch, A.-K. Emde, et al. Segment-based multiple sequence alignment. Bioinformatics, 24(16):i187-i192, 2008.
    • (2008) Bioinformatics , vol.24 , Issue.16 , pp. i187-i192
    • Rausch, T.1    Emde, A.-K.2
  • 32
    • 76249113666
    • Simultaneous alignment of short reads against multiple genomes
    • R98+, Sept
    • K. Schneeberger, J. Hagmann, et al. Simultaneous alignment of short reads against multiple genomes. Genome biology, 10(9):R98+, Sept. 2009.
    • (2009) Genome biology , vol.10 , Issue.9
    • Schneeberger, K.1    Hagmann, J.2
  • 33
    • 85199279540
    • Indexing graphs for path queries with applications in genome research
    • IEEE/ACM Transactions on Computational Biology and Bioinformatics, (accepted).
    • J. Siren, N. Välimäki, and V. Mäkinen. Indexing graphs for path queries with applications in genome research. IEEE/ACM Transactions on Computational Biology and Bioinformatics, (accepted).
    • Siren, J.1    Välimäki, N.2    Mäkinen, V.3
  • 34
    • 84904582651
    • Principled dictionary pruning for low-memory corpus compression
    • In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '14, pages , New York, NY, USA, ACM.
    • J. Tong, A. Wirth, and J. Zobel. Principled dictionary pruning for low-memory corpus compression. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '14, pages 283-292, New York, NY, USA, 2014. ACM.
    • (2014) , pp. 283-292
    • Tong, J.1    Wirth, A.2    Zobel, J.3
  • 36
    • 84894514001
    • FRESCO: Referential compression of highly similar sequences
    • S. Wandelt and U. Leser. FRESCO: Referential compression of highly similar sequences. IEEE/ACM Trans. Comput. Biol. Bioinformatics, 10(5):1275-1288, Sept. 2013.
    • (2013) IEEE/ACM Trans. Comput. Biol. Bioinformatics , vol.10 , Issue.5 , pp. 1275-1288
    • Wandelt, S.1    Leser, U.2
  • 37
    • 84891054664
    • RCSI: Scalable similarity search in thousand(s) of genomes
    • S. Wandelt, J. Starlinger, et al. RCSI: Scalable similarity search in thousand(s) of genomes. PVLDB, 6(13):1534-1545, 2013.
    • (2013) PVLDB , vol.6 , Issue.13 , pp. 1534-1545
    • Wandelt, S.1    Starlinger, J.2
  • 38
    • 84881341729
    • Efficient direct search on compressed genomic data
    • In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, ICDE, pages. IEEE Computer Society
    • X. Yang, B. Wang, et al. Efficient direct search on compressed genomic data. In C. S. Jensen, C. M. Jermaine, and X. Zhou, editors, ICDE, pages 961-972. IEEE Computer Society, 2013.
    • (2013) , pp. 961-972
    • Yang, X.1    Wang, B.2
  • 39
    • 33749646126
    • Super-scalar RAM-CPU cache compression
    • In ICDE, pages, Washington, DC, USA. IEEE Computer Society.
    • M. Zukowski, S. Heman, et al. Super-scalar RAM-CPU cache compression. In ICDE, pages 59-70, Washington, DC, USA, 2006. IEEE Computer Society.
    • (2006) , pp. 59-70
    • Zukowski, M.1    Heman, S.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.